What is assembly language:
You must have heard that a microprocessor understands only ones and zeros, i.e. the machine has to be fed its instructions and data as a series of ones and zeros (ON and OFF states).
Let us imagine a very simple microprocessor which understands only two instructions:
1. Add two numbers and display the result on output.
2. Subtract two numbers and display the result on output.
Assume that the command (also called the operation code, or op-code) for the first instruction, addition, is the byte ‘1000 0000’ and for the second instruction, subtraction, is the byte ‘1001 0000’.
So suppose we give our processor the byte sequence
‘1000 0000’ ‘0000 0001’ ‘0000 0001’
The first byte is the op-code and the two bytes that follow are its operands.
Now the machine interprets the series of bits as follows:
1st Byte 1000 0000 -> Add the following two data bytes
2nd Byte 0000 0001 -> 1 decimal
3rd Byte 0000 0001 -> 1 decimal
So what happens is that it takes the last two bytes, adds them and stores the result somewhere. This is hardcore machine language. It is very difficult to remember each command in the above form, so something called a mnemonic was invented: a short command, similar to an English word, now represents each op-code.
Our statement above can be written as
ADD 1,1
which is much simpler to understand. ADD is the mnemonic for the op-code ‘1000 0000’.
OP-CODE and MNEMONIC
‘1000 0000’ ‘0000 0001’ ‘0000 0001’
<-OP-CODE-> <------ OPERANDS ------>
Corresponding Assembler MNEMONIC is
ADD 1 , 1
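The relation between a mnemonic and its op-code can be tried out with a small sketch of the imaginary two-instruction processor. The mnemonic SUB for the subtraction op-code and the function names are assumptions made for this illustration; the text only names ADD.

```python
# Toy instruction set from the text: one op-code byte followed by two operand bytes.
# "SUB" is an assumed mnemonic for the subtraction op-code 1001 0000.
OPCODES = {"ADD": 0b1000_0000, "SUB": 0b1001_0000}
MNEMONICS = {code: name for name, code in OPCODES.items()}

def assemble(statement):
    """Translate a statement such as 'ADD 1,1' into its three machine-code bytes."""
    mnemonic, operands = statement.split(maxsplit=1)
    first, second = (int(part) for part in operands.split(","))
    return bytes([OPCODES[mnemonic], first, second])

def execute(machine_code):
    """Interpret three bytes the way the imaginary processor would."""
    opcode, first, second = machine_code
    if MNEMONICS[opcode] == "ADD":
        return first + second
    return first - second

code = assemble("ADD 1 , 1")
print(" ".join(f"{byte:08b}" for byte in code))  # 10000000 00000001 00000001
print(execute(code))                             # 2
```

The assembler step is a plain table lookup, which is why mnemonics and op-codes correspond one to one.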
So languages are classified as follows:
Machine languages:
Op-codes
Eg: op-code = 80,01,01 (hex) means ADD ‘1’ to ‘1’
Low level languages:
Assembler mnemonics have a one to one correspondence with op-codes
Eg: ADD ‘1’,‘1’ is 80,01,01
High Level Languages:
Here one statement of the language generates many low level op-codes or statements. Examples: C, Java, etc.
NUMBER FORMATS
There are several number formats that we need to be familiar with while using assembly language:
1) Binary, base is 2 (valid digits are 0,1)
Eg: 1000 0010 stands for
1*2^7 + 0*2^6 + 0*2^5 + 0*2^4 + 0*2^3 + 0*2^2 + 1*2^1 + 0*2^0
which is equal to 128 + 2 = 130 decimal.
2) Octal, base is 8 (valid digits are 0,1,2,3,4,5,6,7)
Eg: 014 stands for 1*8^1 + 4*8^0
which is equal to 8 + 4 = 12 decimal.
Octal numbers are conventionally written with a 0 as the leading digit.
3) Decimal, base is 10 (valid digits 0,1, …. 9)
This is our standard number system which we use daily.
4) Hexadecimal, base is 16 (valid digits are 0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F)
Here A => 10
B => 11
C => 12
D => 13
E => 14
F => 15
Eg: 0x1A stands for 1*16^1 + A*16^0 = 1*16^1 + 10*16^0 = 16 + 10 = 26 decimal.
Eg:
DECIMAL   BINARY      OCTAL   HEXADECIMAL
135       1000 0111   0207    0x87
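The four notations in the example row can be checked quickly in Python. Note one difference in notation: Python writes octal literals with the prefix 0o rather than a bare leading 0.

```python
# The same value, 135, written in binary, octal, hexadecimal and decimal.
values = [0b1000_0111, 0o207, 0x87, 135]
print(all(value == 135 for value in values))  # True

# Formatting one number in every base.
n = 135
print(f"{n:d} {n:08b} {n:o} {n:X}")  # 135 10000111 207 87
```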
Conversion of the various formats:
We will basically be interested in the following conversions:
Decimal <-> Binary <-> HEX
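NUMBER FORMAT CONVERSIONS:
Decimal to Binary Conversion
Decimal = 50
Conversion:
Keep dividing by 2, starting with 50. At each step the remainder (0 or 1) is the next bit, starting from the least significant bit. Read the table from the bottom row upwards; the bits from top to bottom give the binary number.
Divisor | Number   Bit (remainder)
-------------------------------------
        | 1        1
      2 | 3        1
      2 | 6        0
      2 | 12       0
      2 | 25       1
      2 | 50       0   Least Significant Bit
Corresponding Binary is 0011 0010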
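The repeated division can be written as a short loop. This is a minimal sketch; the function name to_binary is chosen only for this illustration.

```python
def to_binary(n):
    """Convert a non-negative integer to a binary string by repeated division by 2."""
    bits = []
    while n > 0:
        n, remainder = divmod(n, 2)   # the remainder is the next bit, LSB first
        bits.append(str(remainder))
    return "".join(reversed(bits)) or "0"

print(to_binary(50).zfill(8))  # 00110010
```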
Binary to Hex Conversion
Binary = 1000 1011
Conversion:
Lower four bits = 1011 = decimal 11 = B
Higher four bits = 1000 = decimal 8 = 8
Corresponding HEX is 0x8B
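Since every group of four bits maps to exactly one hex digit, the conversion can be sketched as a lookup per group. The function name binary_to_hex is an assumption of this example.

```python
HEX_DIGITS = "0123456789ABCDEF"

def binary_to_hex(bits):
    """Convert a binary string (spaces allowed) to hex, four bits at a time."""
    bits = bits.replace(" ", "")
    padded_length = -(-len(bits) // 4) * 4      # round up to a multiple of 4
    bits = bits.zfill(padded_length)
    groups = [bits[i:i + 4] for i in range(0, len(bits), 4)]
    return "0x" + "".join(HEX_DIGITS[int(group, 2)] for group in groups)

print(binary_to_hex("1000 1011"))  # 0x8B
```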
Hex to Decimal conversion
Hex = 0x82
Conversion:
0x82 = 8*16^1 + 2*16^0 = 128 + 2 = 130
Decimal is 130
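The positional expansion can be checked with a few lines of code. For 0x82 it gives 8*16 + 2 = 130. The function name hex_to_decimal is chosen for this sketch only.

```python
def hex_to_decimal(text):
    """Expand a hex string digit by digit: each step multiplies by 16 and adds the digit."""
    if text.lower().startswith("0x"):
        text = text[2:]
    total = 0
    for digit in text.upper():
        total = total * 16 + "0123456789ABCDEF".index(digit)
    return total

print(hex_to_decimal("0x82"))  # 130
print(hex_to_decimal("0x1A"))  # 26
```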
Decimal to Hex
Decimal = 2058
Conversion:
Divisor | Number   Remainder
        | 8        8
     16 | 128      0
     16 | 2058     10 = A
Hex = 0x80A
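The same repeated division works for base 16; only the divisor and the digit table change. A minimal sketch, with to_hex as an illustrative name:

```python
def to_hex(n):
    """Convert a non-negative integer to hex by repeated division by 16."""
    digits = []
    while n > 0:
        n, remainder = divmod(n, 16)            # the remainder is the next digit, lowest first
        digits.append("0123456789ABCDEF"[remainder])
    return "0x" + ("".join(reversed(digits)) or "0")

print(to_hex(2058))  # 0x80A
```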
Forms of data storage in the Mainframe:
Character data is stored in EBCDIC (Extended Binary Coded Decimal …) format (similar in purpose to ASCII); numeric data can be stored in pure binary, zoned decimal or packed decimal.
We will take examples to illustrate the data formats:
Character Data requires one byte for one character:
Character   Stored As
‘A’         0xC1
‘a’         0x81
‘1’         0xF1
String data:
STRING = “snytle” is stored as 0xA2 0x95 0xA8 0xA3 0x93 0x85 in six bytes.
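These byte values can be reproduced with Python's built-in cp037 codec. EBCDIC exists in several code pages, and using cp037 here is an assumption; for the letters and digits shown it yields the values listed above.

```python
def ebcdic_hex(text):
    """Encode text with the cp037 EBCDIC code page and show each byte in hex."""
    return " ".join(f"0x{byte:02X}" for byte in text.encode("cp037"))

for character in "Aa1":
    print(character, ebcdic_hex(character))   # A 0xC1, a 0x81, 1 0xF1

print(ebcdic_hex("snytle"))  # 0xA2 0x95 0xA8 0xA3 0x93 0x85
```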
Numeric Data in Binary format:
123 in binary is stored in one byte as 0111 1011
Numeric Data stored in Zoned decimal format:
123 stored in zoned decimal requires three bytes, one per digit. It is stored as
0xF1 0xF2 0xC3
Similarly -123 is stored as 0xF1 0xF2 0xD3
Note that the first four bits of the least significant byte indicate the sign of the data (C for positive, D for negative).
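A short sketch of the zoned decimal layout: every digit gets the zone 0xF in its upper four bits, except the last byte, whose upper four bits carry the sign (0xC for plus, 0xD for minus). The function name to_zoned is chosen for this illustration.

```python
def to_zoned(n):
    """Encode an integer as zoned decimal: one byte per digit, sign in the last zone."""
    digits = str(abs(n))
    sign = 0xD if n < 0 else 0xC
    encoded = [0xF0 | int(digit) for digit in digits[:-1]]   # zone F + digit
    encoded.append((sign << 4) | int(digits[-1]))            # sign + last digit
    return bytes(encoded)

print(" ".join(f"0x{byte:02X}" for byte in to_zoned(123)))   # 0xF1 0xF2 0xC3
print(" ".join(f"0x{byte:02X}" for byte in to_zoned(-123)))  # 0xF1 0xF2 0xD3
```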
Numeric Data stored in Packed Decimal format:
The same data 123 stored in packed decimal format will result in more efficient storage, because each byte holds two digits.
123 is stored as 0x12 0x3C, thus we require one byte less here. The last four bits hold the sign.
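Packed decimal can be sketched the same way: two digits per byte, with the sign (0xC or 0xD) in the last four bits and a leading zero added when the count of digits plus sign is odd. The function name to_packed is an assumption of this example.

```python
def to_packed(n):
    """Encode an integer as packed decimal: two digits per byte, sign in the last nibble."""
    nibbles = [int(digit) for digit in str(abs(n))]
    nibbles.append(0xD if n < 0 else 0xC)        # sign nibble goes last
    if len(nibbles) % 2:
        nibbles.insert(0, 0)                     # pad with a leading zero digit
    pairs = zip(nibbles[0::2], nibbles[1::2])
    return bytes((high << 4) | low for high, low in pairs)

print(" ".join(f"0x{byte:02X}" for byte in to_packed(123)))   # 0x12 0x3C
print(" ".join(f"0x{byte:02X}" for byte in to_packed(-123)))  # 0x12 0x3D
```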
Concept of Memory Addressing:
To understand addressing we should know the basics of memory storage. Memory is referred to by the size of the data it can store: BYTE, HALF WORD, WORD, DOUBLE WORD.
Each Byte in memory can be individually pointed to by its address.
The address of each Half Word is the address of its Most Significant Byte (left most byte).
The address of a Word is the address of its Most Significant Byte (left most byte).
The address of a Double Word is the address of its Most Significant Byte (left most byte).
In most OS 370/390 systems the maximum memory that can be addressed is 2^24 bytes, which is 16 MB; the addresses run from 0 to 2^24 - 1.
BITS AND BYTES
BYTE: 8 bits (1 byte)
| 0 0 0 0 0 0 0 0 |
HALF WORD: 16 bits (2 bytes)
| 0 0 0 0 0 0 0 0 | 0 0 0 0 0 0 0 0 |
  Most S. Byte
WORD: 32 bits (4 bytes)
| 0 0 0 0 0 0 0 0 | 0 0 0 0 0 0 0 0 | 0 0 0 0 0 0 0 0 | 0 0 0 0 0 0 0 0 |
  Most S. Byte
DOUBLE WORD: 64 bits (8 bytes)
| 0 0 0 0 0 0 0 0 | 0 0 0 0 0 0 0 0 | 0 0 0 0 0 0 0 0 | 0 0 0 0 0 0 0 0 |
  Most S. Byte
| 0 0 0 0 0 0 0 0 | 0 0 0 0 0 0 0 0 | 0 0 0 0 0 0 0 0 | 0 0 0 0 0 0 0 0 |
The Memory map for the mainframe system looks like this:
Note that each HALF WORD has an address which is exactly divisible by 2.
Also note that each WORD has an address which is exactly divisible by 4.
The total memory shown is 16 MB = 2^24 bytes. Each byte has a 24 bit address of its own.
The first address is 0x000000 and the last byte address is 0xFFFFFF (2^24 - 1).
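The address range and the alignment rules can be expressed in a few lines. The names below are illustrative only.

```python
MEMORY_SIZE = 2 ** 24             # 16 MB with 24 bit addresses
LAST_ADDRESS = MEMORY_SIZE - 1    # 0xFFFFFF

SIZES = {"BYTE": 1, "HALF WORD": 2, "WORD": 4, "DOUBLE WORD": 8}

def is_aligned(address, unit):
    """True if the address is exactly divisible by the size of the unit."""
    return address % SIZES[unit] == 0

print(f"0x{LAST_ADDRESS:06X}")          # 0xFFFFFF
print(is_aligned(0x000006, "HALF WORD"))  # True
print(is_aligned(0x000006, "WORD"))       # False
```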
Registers used in Mainframe:
There are 16 general purpose 32 bit storage locations in the mainframe processor; these locations are called registers. Each has a four bit binary address: R0 = 0000, R1 = 0001, R2 = 0010 …. R15 = 1111.
The registers are referred to by their binary address; 0x1 represents R1 and so on.
REGISTERS
· There are a total of 16 registers in the mainframe.
· Each register size is 32 bits, i.e. 4 bytes.
· Registers are referred to as R0, R1, R2 . . . . . R15.
· By convention the registers R0, R1, R12, R13, R14 and R15 are used for special purposes. All the other registers can be used by the programmer for general programming.
· The addresses of the registers are 0,1,2, …. 15.
· There is one more location which has a special use and it is called the “program status word”. It stores the current memory location of the executing program and is used to pinpoint the currently executing instruction. The PSW also stores other data related to the current state of the CPU: condition code, etc. On System/370 and ESA/390 the PSW is 8 bytes and its low-order bits hold the address of the current instruction; on the later 64 bit z/Architecture the PSW is 16 bytes and the lower 8 bytes hold that address.
Base Displacement form of Addressing:
The memory shown in the memory map above is accessed in the program in a unique way called base displacement form.
The total number of bits required to address each memory byte is 24, because there are 2^24 bytes. But an OS/370 or OS/390 instruction allocates only 16 bits to act as a memory pointer into the whole of the 2^24 byte memory.
The 16 bits are divided into two parts, the base of 4 bits and the displacement of 12 bits.
So the memory is viewed as segments of 2^12 (4096) bytes each, and each segment's absolute starting address is kept in the register pointed to by the first 4 bits. Thus a total memory of 4096 * 16 = 65536 bytes can be mapped (addressed) at one time if we use all the registers we have. This is still less than the total of 2^24 = 16,777,216 bytes, but then we seldom need that much memory at once.
The advantage of the base displacement form of addressing is that programs can be made relocatable.
Base Displacement Addressing Scheme:
| Base 4 bits | Displacement 12 bits |
The base points to a register R0 - R15. The 12 bit displacement can address a maximum area of 4096 bytes (2^12).
Memory Address = contents of the register pointed to by the first four bits + displacement.
Example:
| 0 1 0 0 | 0 0 0 0 0 0 0 0 1 1 0 0 |
  Base      Displacement
Memory Address = contents of R4 + 12
= 2005 + 12 decimal (assuming R4 contains 2005)
= 2017 decimal.
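The address calculation can be sketched as follows, using a base field of 0100 (R4) and a displacement of 12. The register contents are the assumed value 2005 from the example, and the sketch ignores special cases of the real machine, such as a base field of 0, which means that no base register is applied.

```python
def effective_address(field, registers):
    """Split a 16 bit base displacement field and compute the memory address."""
    base = (field >> 12) & 0xF        # upper 4 bits select the base register
    displacement = field & 0xFFF      # lower 12 bits are the displacement
    return registers[base] + displacement

registers = [0] * 16
registers[4] = 2005                   # assumed contents of R4

field = 0b0100_0000_0000_1100         # base = 4, displacement = 12
print(effective_address(field, registers))  # 2017
```

Loading a different value into R4 moves every address built on it, which is what makes a program relocatable.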
Assembly statement format:
The assembler language is column sensitive, so we have to be careful about the placement of the commands. The assembler interprets a symbol according to the column number in which it is found.
ASSEMBLER STATEMENT FORMAT 1.
ASSEMBLER STATEMENT FORMAT 2.
INSTRUCTION TYPES:
The MVS Assembler language statements are divided into the following types depending upon the type of operands, memory addressing, number of bytes etc.
· LENGTH OF THE INSTRUCTION IS 16 BITS, THERE ARE TWO OPERANDS
· BOTH OPERANDS ARE REGISTERS.
Example:
BALR R14,R15 (general form: OPCODE R1,R2)
The corresponding instruction in HEX is
| 0x05 | 0xE | 0xF |
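The 16 bit layout of this example, one op-code byte followed by two 4 bit register numbers, can be reproduced with a small sketch. The function name encode_rr is an assumption made for this illustration.

```python
def encode_rr(opcode, r1, r2):
    """Pack an 8 bit op-code and two 4 bit register numbers into 2 bytes."""
    if not (0 <= r1 <= 15 and 0 <= r2 <= 15):
        raise ValueError("register numbers must be in the range 0 to 15")
    return bytes([opcode, (r1 << 4) | r2])

# BALR R14,R15: op-code 0x05, first register 14 (0xE), second register 15 (0xF)
print(encode_rr(0x05, 14, 15).hex().upper())  # 05EF
```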
Implicit addressing: The examples that we have seen so far all use explicit addressing, i.e. we specify the actual address in the base displacement form ( D1(B1) ). Instead of specifying the register and displacement we can use a symbol to point to this address. The symbol can be thought of as a label for that memory location.
So now we can actually use meaningful variable names to represent the memory locations. The following example illustrates implicit addressing:
Let us take the example of the LOAD instruction that we had seen before.
L R1,D2(X2,B2)
Replacing D2(X2,B2) by a variable MYMEMORY we will have,
L R1,MYMEMORY
MYMEMORY is a variable name given to the address = contents of B2 + displacement D2 (the index is taken as zero), so the effective statement is the same, i.e. register R1 is loaded with the WORD (4 bytes) located at the address MYMEMORY.
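How a symbol turns back into a displacement and a base register can be sketched as below. All values are hypothetical: MYMEMORY is assumed to sit at address 0x2010 and R12 is assumed to be the base register, holding 0x2000. The real assembler has to be told which base register to use; this sketch only shows the arithmetic.

```python
def resolve(symbol, symbol_table, base_register, base_value):
    """Turn a symbol into (displacement, base register) for a given base value."""
    displacement = symbol_table[symbol] - base_value
    if not 0 <= displacement < 4096:
        raise ValueError(f"{symbol} is not reachable from R{base_register}")
    return displacement, base_register

symbol_table = {"MYMEMORY": 0x2010}   # hypothetical address of the symbol
displacement, base = resolve("MYMEMORY", symbol_table, 12, 0x2000)
print(f"L R1,MYMEMORY  ->  L R1,{displacement}(0,R{base})")  # L R1,16(0,R12)
```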
PREPARING AN ASSEMBLY PROGRAM FOR RUN:
The source code written in mnemonic form has to be assembled by using an assembler program; it is similar to the compiler used for higher level languages.
The source program is assembled by the assembler into object code in two passes: the first pass assigns values to symbols and addresses to literals, the second creates the object module and the listing file.
The listing file displays the object code produced in the object file and all source module statements, a cross reference list of all literals and symbols used in the program, and a diagnostic section showing errors that occurred during assembly.
The final load module, or executable code, is produced by the linkage editor, which resolves all the addresses of called subroutines and modules.
Source program --(Assemble)--> Object Module and Listing file
Object Module --(LinkEdit)--> Load Module
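The reason for two passes can be seen in a toy sketch: a statement may refer to a symbol that is defined further down, so all symbols need addresses before any operand can be resolved. The three statement source and the fixed statement length of 4 bytes are assumptions made for this illustration, not the behaviour of the real assembler.

```python
# Toy source: (label, mnemonic, operand). Every statement is assumed to occupy 4 bytes.
SOURCE = [
    ("START", "L",  "VALUE"),   # refers to VALUE before it is defined
    (None,    "B",  "START"),
    ("VALUE", "DC", "123"),
]
STATEMENT_LENGTH = 4

def pass_one(source):
    """Pass 1: assign an address to every label using a location counter."""
    symbols, location = {}, 0
    for label, _mnemonic, _operand in source:
        if label:
            symbols[label] = location
        location += STATEMENT_LENGTH
    return symbols

def pass_two(source, symbols):
    """Pass 2: replace symbolic operands by the addresses found in pass 1."""
    listing = []
    for index, (_label, mnemonic, operand) in enumerate(source):
        resolved = symbols.get(operand, operand)
        listing.append((index * STATEMENT_LENGTH, mnemonic, resolved))
    return listing

symbols = pass_one(SOURCE)
print(symbols)                       # {'START': 0, 'VALUE': 8}
for line in pass_two(SOURCE, symbols):
    print(line)                      # (0, 'L', 8) then (4, 'B', 0) then (8, 'DC', '123')
```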