Sunday, December 23, 2012

What is assembly language:
You have probably heard that a microprocessor understands only ones and zeros, i.e. the machine has to be fed its instructions and data as a series of ones and zeros (ON and OFF states).

Let us imagine a very simple microprocessor which understands only two instructions:
1.      Add two numbers and display the result on output.
2.      Subtract two numbers and display the result on output.

Assume the command, also called the operation code (op-code), for the first instruction (addition) is the byte ‘1000 0000’ and for the second instruction (subtraction) is the byte ‘1001 0000’.

So if we give the command to our processor as

‘1000 0000’ ‘0000 0001’ ‘0000 0001’

the first byte is the op-code and the whole three-byte sequence is one machine instruction.
Now the machine interprets the series of bits as follows:
1st Byte 1000 0000 -> Add the following two data bytes
2nd Byte 0000 0001 -> 1 decimal
3rd Byte 0000 0001 -> 1 decimal

So the machine takes the last two bytes, adds them and stores the result somewhere. This is hardcore machine language. It is very difficult to remember each command in the above form, so something called a mnemonic was invented: a short command resembling an English word now represents each op-code.

Our statement above can be written like

ADD 1,1 which is much simpler to understand. ADD is the mnemonic for the op-code ‘1000 0000’.


‘1000 0000’      ‘0000 0001’ ‘0000 0001’

<- OP-CODE ->    <------ OPERANDS ------>

Corresponding Assembler MNEMONIC is

ADD               1               ,     1          
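To make the idea concrete, here is a small Python sketch of the imaginary two-instruction machine described above. The names execute and assemble, and the mnemonic SUB for the second instruction, are made up for this illustration; the article only names ADD.

```python
# Toy two-instruction machine: one op-code byte followed by two operand bytes.
OP_ADD = 0b1000_0000  # mnemonic ADD
OP_SUB = 0b1001_0000  # mnemonic SUB (name assumed for illustration)


def assemble(mnemonic: str, a: int, b: int) -> bytes:
    """Translate a mnemonic and two operands into three bytes of machine code."""
    opcodes = {"ADD": OP_ADD, "SUB": OP_SUB}
    return bytes([opcodes[mnemonic], a, b])


def execute(instruction: bytes) -> int:
    """Decode a three-byte instruction and return its result."""
    opcode, a, b = instruction
    if opcode == OP_ADD:
        return a + b
    if opcode == OP_SUB:
        return a - b
    raise ValueError(f"unknown op-code {opcode:08b}")


machine_code = assemble("ADD", 1, 1)
print(machine_code.hex(" "))  # 80 01 01
print(execute(machine_code))  # 2
```

The assemble function plays the role of the assembler (mnemonic to op-code) and execute plays the role of the processor.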

So languages are classified as follows:

Machine language:
Raw op-codes and operands.
Eg: op code = 80,01,01 (hex) means ADD ‘1’ to ‘1’

Low level languages:
Assembler mnemonics have a one-to-one correspondence with op-codes.
Eg: ADD ‘1’,‘1’ is 80,01,01

High Level Languages:
Here one statement of the language generates many low level op-codes or statements. Examples: C, Java, etc.


There are several number formats that we need to be familiar with while using assembly language:


1)         Binary ,Base is 2.  (valid digits are 0,1)         

            eg  1000 0010 stands for

1*2^7 + 0*2^6 +  0*2^5 +  0*2^4 +  0*2^3 +  0*2^2 +  1*2^1 +  0*2^0  
which is equal to 128 + 2 = 130 decimal.

2)             Octal , base is 8 ( valid digits are 0,1,2,3,4,5,6,7)

Eg: 014 stands for 1*8^1 + 4*8^0
Which is equal to 8 + 4 = 12 decimal.
Here octal numbers are written with a 0 as the leading digit (the convention used in C).

3)         Decimal, base is 10 ( valid digits 0,1, …. 9)
this is our standard number system which we use daily.

4)         Hexadecimal, Base is 16 ( valid digits are 0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F)

            Here    A => 10
                    B => 11
                    C => 12
                    D => 13
                    E => 14
                    F => 15

eg: 0x1A stands for    1*16^1 + A*16^0 = 1*16^1 + 10*16^0 = 16 + 10 = 26 decimal.

DECIMAL     BINARY        OCTAL       HEXADECIMAL
135         1000 0111     0207        0x87
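As a quick check of the table above, Python's built-in int(text, base) and format() can parse and print the same value in each base. This is only a convenience sketch, not something the assembler itself provides.

```python
# The same value, 135, parsed from each notation with int(text, base).
decimal = int("135", 10)
binary = int("10000111", 2)
octal = int("207", 8)
hexadecimal = int("87", 16)
print(decimal, binary, octal, hexadecimal)  # 135 135 135 135

# format() goes the other way: binary (8 digits), octal and hex.
print(format(135, "08b"), format(135, "o"), format(135, "X"))  # 10000111 207 87
```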

Conversion of the various formats:

We will mainly be interested in the following conversions:

Decimal <-> Binary <-> HEX


Decimal to Binary Conversion

Decimal = 50
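Conversion: Keep dividing by 2 and write down the remainder (0 or 1) of each division. The first remainder is the least significant bit.

Division        Quotient    Remainder (bit)
50 / 2          25          0          Least Significant Bit
25 / 2          12          1
12 / 2          6           0
6 / 2           3           0
3 / 2           1           1
1 / 2           0           1          Most Significant Bit

Reading the remainders from bottom to top gives 11 0010.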

Corresponding Binary is  0011 0010
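The repeated-division procedure can be sketched in Python roughly as follows. The function name to_base is made up for this illustration.

```python
DIGITS = "0123456789ABCDEF"


def to_base(n: int, base: int) -> str:
    """Convert a non-negative integer by repeated division, collecting remainders."""
    if n == 0:
        return "0"
    out = []
    while n > 0:
        n, remainder = divmod(n, base)
        out.append(DIGITS[remainder])  # the first remainder is the least significant digit
    return "".join(reversed(out))


print(to_base(50, 2))           # 110010
print(to_base(50, 2).zfill(8))  # 00110010
```

The same function works for any base up to 16; for instance to_base(2058, 16) returns 80A.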

Binary to Hex Conversion

Binary = 1000 1011
Lower four bits = 1011 = decimal 11 = B
Higher four bits = 1000 = decimal 8

Corresponding HEX is 0x8B
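Grouping the bits four at a time is all there is to this conversion. A minimal sketch, with binary_to_hex as a made-up helper name:

```python
def binary_to_hex(bits: str) -> str:
    """Convert a string of binary digits to hex, one group of four bits per hex digit."""
    digits = "0123456789ABCDEF"
    bits = bits.replace(" ", "")
    # Pad on the left so the length is a multiple of four.
    bits = bits.zfill((len(bits) + 3) // 4 * 4)
    groups = [bits[i:i + 4] for i in range(0, len(bits), 4)]
    return "0x" + "".join(digits[int(group, 2)] for group in groups)


print(binary_to_hex("1000 1011"))  # 0x8B
```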

Hex to Decimal conversion
Hex = 0x82
            0x82 = 8*16^1 + 2*16^0 = 128 + 2 = 130

Decimal is 130
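The positional sum can be checked with a few lines of Python. The sketch below multiplies a running total by 16 for every new digit, which is equivalent to summing digit * 16^position; hex_to_decimal is a made-up name.

```python
def hex_to_decimal(text: str) -> int:
    """Evaluate a hex string digit by digit, most significant digit first."""
    digits = "0123456789ABCDEF"
    text = text.upper()
    if text.startswith("0X"):
        text = text[2:]
    value = 0
    for ch in text:
        value = value * 16 + digits.index(ch)
    return value


print(hex_to_decimal("0x82"))  # 130
print(hex_to_decimal("0x1A"))  # 26
```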

Decimal to Hex         
Decimal = 2058

Division        Quotient    Remainder
2058 / 16       128         10 = A
128 / 16        8           0
8 / 16          0           8

Reading the remainders from bottom to top:

Hex = 0x80A

Forms of data storage in the Mainframe: Character data is stored in EBCDIC (Extended Binary Coded Decimal …) format (similar in purpose to ASCII). Numeric data can be stored in pure binary, zoned decimal or packed decimal.

We will take examples to illustrate the data formats:

Character data requires one byte for one character:

Character       Stored As

‘A’             0xC1
‘a’             0x81
‘1’             0xF1

String data:
STRING= “snytle” is stored as 0xA2 0x95 0xA8 0xA3 0x93 0x85 in six bytes.
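These byte values can be reproduced on a PC. Python ships an EBCDIC codec called cp037; mainframes use several EBCDIC code pages, so treating cp037 as the one in use here is an assumption, although letters and digits are encoded the same way in the common ones.

```python
# Encode single characters with the cp037 EBCDIC codec (code page choice is an assumption).
for ch in ("A", "a", "1"):
    print(ch, ch.encode("cp037").hex().upper())  # A C1 / a 81 / 1 F1

encoded = "snytle".encode("cp037")
print(encoded.hex(" ").upper())  # A2 95 A8 A3 93 85
print(len(encoded))              # 6
```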

Numeric Data in Binary format:

123 in binary is stored in one byte as 01111011                                                        

Numeric Data stored in Zoned decimal format:

123 stored in zoned decimal requires three bytes
it is stored as

0xF1 0xF2 0xC3

Similarly –123 is stored as 0xF1 0xF2 0xD3

Note that the first four bits of the least significant (rightmost) byte indicate the sign of the data: C for positive and D for negative.
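A minimal sketch of the zoned decimal layout described above, using the sign codes C and D from the example. The helper name to_zoned is made up for this illustration.

```python
def to_zoned(n: int) -> bytes:
    """One byte per digit: zone F plus the digit, with the sign in the last byte's zone."""
    digits = [int(d) for d in str(abs(n))]
    sign = 0xD if n < 0 else 0xC
    out = [0xF0 | d for d in digits]
    out[-1] = (sign << 4) | digits[-1]  # replace the zone of the last byte with the sign
    return bytes(out)


print(to_zoned(123).hex(" ").upper())   # F1 F2 C3
print(to_zoned(-123).hex(" ").upper())  # F1 F2 D3
```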

Numeric Data stored in Packed Decimal format:

The same data 123 stored in packed decimal format results in more efficient storage: each byte holds two decimal digits, and the last four bits hold the sign.

123 is stored as 0x12 0x3C, so we need one byte less here.
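Packed decimal can be sketched the same way: two digits per byte, with the sign in the final half byte. The helper name to_packed is made up, and a leading zero is added when the digits plus sign do not fill whole bytes.

```python
def to_packed(n: int) -> bytes:
    """Pack two decimal digits per byte; the last half byte is the sign (C or D)."""
    nibbles = [int(d) for d in str(abs(n))]
    nibbles.append(0xD if n < 0 else 0xC)
    if len(nibbles) % 2:
        nibbles.insert(0, 0)  # pad with a leading zero to fill whole bytes
    return bytes((high << 4) | low for high, low in zip(nibbles[0::2], nibbles[1::2]))


print(to_packed(123).hex(" ").upper())   # 12 3C
print(to_packed(-123).hex(" ").upper())  # 12 3D
print(to_packed(1234).hex(" ").upper())  # 01 23 4C
```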

Concept of Memory Addressing:
To understand addressing we should know the basics of memory storage. Memory is referred to by the size of the data it can store: BYTE, HALF WORD, WORD or DOUBLE WORD.

Each byte in memory can be individually pointed to by its address.

The address of a Half Word is the address of its Most Significant Byte (leftmost byte).

The address of a Word is the address of its Most Significant Byte (leftmost byte).

The address of a Double Word is the address of its Most Significant Byte (leftmost byte).

In most OS/370 and OS/390 systems the maximum memory that can be addressed is 2^24 bytes, which is 16 MB; the highest byte address is 2^24 – 1.


BYTE:               8 bits  (1 byte)

| 0000 0000 |

HALF WORD:          16 bits (2 bytes)

| 0000 0000 | 0000 0000 |
  Most S. Byte

WORD:               32 bits (4 bytes)

| 0000 0000 | 0000 0000 | 0000 0000 | 0000 0000 |
  Most S. Byte

DOUBLE WORD:        64 bits (8 bytes)

| 0000 0000 | 0000 0000 | 0000 0000 | 0000 0000 |
  Most S. Byte
| 0000 0000 | 0000 0000 | 0000 0000 | 0000 0000 |

 The Memory map for the mainframe system looks like this:
Note that each HALF WORD has an address which is exactly divisible by 2.
Also note that each WORD has an address which is exactly divisible by 4.
The total memory shown is 16 MB = 2^24 bytes. Each byte has a 24-bit address of its own.
The first address is 0x000000 and the last byte address is 0xFFFFFF (2^24 – 1).

memory map.bmp
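The address arithmetic behind the memory map can be checked with a short Python sketch. The names MEMORY_SIZE, LAST_ADDRESS and is_aligned are made up for this illustration.

```python
MEMORY_SIZE = 2 ** 24           # number of bytes reachable with a 24-bit address
LAST_ADDRESS = MEMORY_SIZE - 1  # addresses start at 0


def is_aligned(address: int, size: int) -> bool:
    """True when the address is exactly divisible by the unit size in bytes."""
    return address % size == 0


print(MEMORY_SIZE // (1024 * 1024))  # 16
print(f"0x{LAST_ADDRESS:06X}")       # 0xFFFFFF
print(is_aligned(0x000004, 2), is_aligned(0x000004, 4))  # True True
print(is_aligned(0x000006, 2), is_aligned(0x000006, 4))  # True False
```

Address 0x000006 is a valid HALF WORD address but not a valid WORD address, because 6 is divisible by 2 and not by 4.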

Registers used in Mainframe:
There are 16 general purpose 32-bit storage locations inside the mainframe CPU; these locations are called registers. Each has a four-bit binary address: R0 = 0000, R1 = 0001, R2 = 0010 … R15 = 1111.
The registers are referred to by their binary address: 0001 (0x1) represents R1 and so on.


·  There are a total of 16 registers in the mainframe.

·  Each register size is 32 bits, i.e.  4 bytes.

·  Registers are referred to as R0, R1, R2 . . . . . R15.

·  The registers R0, R1, R12, R13, R14 and R15 are used for special purposes.  All the other registers can be used by the programmer for general programming.

·  The addresses of the registers are 0,1,2, …. 15.

·  There is one more storage location which has a special use, called the “program status word” (PSW). It holds the address that pinpoints where the executing program currently is (strictly, the address of the next instruction to be executed), together with other data related to the current state of the CPU, such as the condition code. On System/370 and System/390 the PSW is 8 bytes (64 bits) long, with the instruction address in its low-order bits; the 16-byte PSW, whose lower 8 bytes hold the instruction address, belongs to the later 64-bit z/Architecture.

Base Displacement form of Addressing:
The memory shown in the memory map above is accessed in the program in a particular way called the base displacement form.

Addressing each memory byte takes 24 bits because there are 2^24 bytes. But an OS/370 or OS/390 instruction sets aside only 16 bits as a memory pointer to reach the whole of the 2^24 byte memory.

The 16 bits are divided into two parts, the base of 4 bits and the displacement of 12 bits.
So the memory is mapped into segments of 2^12 (4096) bytes each, and each segment's absolute address is kept in the register pointed to by the first 4 bits. Thus a total of 4096 * 16 = 65536 bytes can be mapped (addressed) at any one time if we use all the registers we have. (In practice slightly less, because a base field of 0 means that no base register is used.) This is still far less than the total of 2^24 = 16,777,216 bytes, but we seldom need that much memory at once.

The advantage of the base displacement form of addressing is that programs can be made relocatable.

Base Displacement Addressing Scheme:

Base (4 bits)                            Displacement (12 bits)
Points to a register R0 – R15            Can address a maximum area of 4096 bytes (2^12)

Memory Address = contents of the register pointed to by the first four bits + displacement.


0 1 0 0         0 0 0 0 0 0 0 0 1 1 0 0
Base            Displacement

Memory Address = contents of R4   +   12
                             = 2005   + 12       decimal (assuming R4 contains 2005 )      
                             = 2017        decimal.                                  
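The calculation above can be sketched in Python as follows. The function names and the register list are made up for this illustration; the 16-bit field used is base 4 with displacement 12, i.e. 0x400C.

```python
def split_field(field: int):
    """Split a 16-bit address field into its 4-bit base and 12-bit displacement."""
    base = (field >> 12) & 0xF
    displacement = field & 0xFFF
    return base, displacement


def effective_address(field: int, registers) -> int:
    """Contents of the base register plus the displacement, kept to 24 bits."""
    base, displacement = split_field(field)
    # A base field of 0 means "no base register", so it contributes nothing.
    base_value = registers[base] if base != 0 else 0
    return (base_value + displacement) & 0xFFFFFF


registers = [0] * 16
registers[4] = 2005                  # assume R4 contains 2005, as in the example
field = 0b0100_0000_0000_1100        # base = 4, displacement = 12

print(split_field(field))                   # (4, 12)
print(effective_address(field, registers))  # 2017
```

Relocating the program only requires loading a different value into R4; the displacement stored in the instruction stays the same.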

Assembly statement format:
The assembler language is column sensitive so we have to be careful about the placement of the commands. The assembler  interprets the symbol by referring to the column number in which it is found.




instruction 2.bmp


The MVS Assembler language statements are divided into the following types depending upon the type of operands, memory addressing, number of bytes etc.

rr TYPE.bmp

For example, the RR-type instruction BALR R14,R15 has the general form OPCODE R1,R2.

            The corresponding  instruction in HEX is
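A rough sketch of how that hex value can be derived: an RR-type instruction is two bytes long, an 8-bit op-code followed by two 4-bit register numbers. The function encode_rr is a made-up name, and the op-code value 0x05 for BALR is quoted from memory, so check it against a reference before relying on it.

```python
def encode_rr(opcode: int, r1: int, r2: int) -> bytes:
    """Build a two-byte RR instruction: op-code byte, then R1 and R2 as half bytes."""
    if not (0 <= r1 <= 15 and 0 <= r2 <= 15):
        raise ValueError("register numbers must be between 0 and 15")
    return bytes([opcode, (r1 << 4) | r2])


BALR = 0x05  # assumed op-code for BALR (quoted from memory)

print(encode_rr(BALR, 14, 15).hex().upper())  # 05EF
```

Under that assumption BALR R14,R15 assembles to 05EF: 05 is the op-code, E is register 14 and F is register 15.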




SS1 TYPE.bmp

SS2 TYPE.bmp
Implicit addressing: The examples we have seen so far all use explicit addressing, i.e. we specify the actual address in the base displacement form ( D1(B1) ). Instead of specifying the register and displacement, we can use a symbol to point to this address. The symbol can be thought of as a label for that memory location.
So now we can actually use meaningful variable names to represent memory locations. The following example illustrates implicit addressing:

Let us take the example of the LOAD instruction that we had seen before.


L         R1,D2(X2,B2)

Replacing D2(X2,B2) by a variable MYMEMORY we will have,

L         R1,MYMEMORY

MYMEMORY is a variable name given to the address = contents of B2 + displacement D2 (the index is taken as zero). So the effective statement is the same, i.e. register R1 is loaded with the WORD (4 bytes) located at the address MYMEMORY.
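One way to picture what the assembler does with such a symbol is the sketch below. It is a simplification under stated assumptions: the symbol offset 0x01C, the choice of R12 as base register and the helper name resolve are all hypothetical.

```python
def resolve(symbol: str, symbols: dict, base_register: int, base_address: int):
    """Turn a symbol into a (displacement, base register) pair."""
    displacement = symbols[symbol] - base_address
    if not 0 <= displacement <= 0xFFF:
        raise ValueError(f"{symbol} is out of range of base register R{base_register}")
    return displacement, base_register


# Hypothetical symbol table: MYMEMORY sits 0x01C bytes from the start of the program.
symbols = {"MYMEMORY": 0x01C}

# Assume R12 has been set up to hold the address of the start of the program.
displacement, base = resolve("MYMEMORY", symbols, base_register=12, base_address=0x000)
print(f"L  R1,MYMEMORY   becomes   L  R1,{displacement}(0,{base})")
```

The programmer writes the symbol, and the explicit displacement and base register are worked out from the symbol's address.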


The source code written in mnemonic form has to be assembled by an assembler program, which is similar to the compiler used for higher level languages.

The assembler translates the source program into object code in two passes: the first pass assigns values to symbols and addresses to literals, the second creates the object module and the listing file.
The listing file displays the object code produced in the object file and all source module statements, a cross reference list of all literals and symbols used in the program, and a diagnostic section showing the errors that occurred during assembly.
The final load module, or executable code, is produced by the linkage editor, which resolves all the addresses of called subroutines and modules.
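The two-pass idea can be illustrated with a toy sketch. This is not how the real assembler is implemented: every statement is assumed to occupy 4 bytes, and the tiny source program, pass_one and pass_two are invented for the illustration.

```python
# Toy source: (label, mnemonic, operand). Every statement is assumed to be 4 bytes long.
SOURCE = [
    ("START", "L", "VALUE"),
    (None, "A", "VALUE"),
    (None, "ST", "RESULT"),
    ("VALUE", "DC", "5"),
    ("RESULT", "DS", "0"),
]
LENGTH = 4


def pass_one(source):
    """First pass: assign an address to every label."""
    symbols, location = {}, 0
    for label, _, _ in source:
        if label:
            symbols[label] = location
        location += LENGTH
    return symbols


def pass_two(source, symbols):
    """Second pass: replace symbolic operands by the addresses found in pass one."""
    listing = []
    for index, (_, mnemonic, operand) in enumerate(source):
        listing.append((index * LENGTH, mnemonic, symbols.get(operand, operand)))
    return listing


symbols = pass_one(SOURCE)
print(symbols)  # {'START': 0, 'VALUE': 12, 'RESULT': 16}
for address, mnemonic, operand in pass_two(SOURCE, symbols):
    print(f"{address:04X}  {mnemonic:<3} {operand}")
```

Two passes are needed because the first statement refers to VALUE before the assembler has seen where VALUE is defined.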

Source program
      |
      v   (Assembler)
Object Module   +   Listing file
      |
      v   (Linkage editor)
Load Module
