8086 Microprocessor: Instruction Formats and Assembler Directives
8086 Instruction Formats
A machine language instruction format has one or more fields associated with it.
- The first field is called the operation code field or opcode field, which indicates the type of operation to be performed by the CPU.
- The instruction format also contains other fields known as operand fields.
- The CPU executes the instruction using the information which resides in these fields.
- There are six general formats of instructions in the 8086 instruction set.
- The length of an instruction may vary from 1 byte to 6 bytes. The instruction formats are described as follows:
One Byte Instruction
- This format is only one byte long and may have implied data or register operands.
- The least significant 3 bits of the opcode are used for specifying the register operand, if any.
- Otherwise, all 8 bits form an opcode, and the operands are implied.
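As a rough illustration of the register-in-opcode case, the sketch below builds the single byte for INC and PUSH on a 16-bit register. The function name `encode_one_byte` and the two lookup tables are hypothetical helpers introduced here; only the base opcodes 40H (INC) and 50H (PUSH) and the register numbers come from the 8086 encoding.

```python
# Register numbers the 8086 uses for its 16-bit registers.
REG16 = {"AX": 0, "CX": 1, "DX": 2, "BX": 3, "SP": 4, "BP": 5, "SI": 6, "DI": 7}

# Base opcodes of two one-byte instructions whose low 3 bits hold the register.
BASE_OPCODE = {"INC": 0x40, "PUSH": 0x50}


def encode_one_byte(mnemonic, reg):
    """Return the single opcode byte for e.g. INC CX or PUSH BX (hypothetical helper)."""
    return BASE_OPCODE[mnemonic] | REG16[reg]


print(hex(encode_one_byte("INC", "CX")))   # 0x41
print(hex(encode_one_byte("PUSH", "BX")))  # 0x53
```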
Register to Register
- This format is 2 bytes long.
- The first byte holds the operation code together with the ‘w’ bit, which selects the operand width (byte or word).
- The second byte holds the MOD, REG, and R/M fields; for register-to-register operations, MOD is 11.
- The register represented by the REG field is one of the operands.
- The R/M field specifies another register or memory location, i.e., the other operand.
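The two-byte layout can be sketched for MOV between two registers. This is a minimal sketch, assuming the direction bit is set to 1 so that REG names the destination; the helper name `encode_mov_reg_reg` is hypothetical, and an assembler may equally emit the other valid encoding with the direction bit cleared.

```python
REG16 = {"AX": 0, "CX": 1, "DX": 2, "BX": 3, "SP": 4, "BP": 5, "SI": 6, "DI": 7}
REG8 = {"AL": 0, "CL": 1, "DL": 2, "BL": 3, "AH": 4, "CH": 5, "DH": 6, "BH": 7}


def encode_mov_reg_reg(dst, src):
    """Encode MOV dst, src for two registers of the same width (hypothetical helper)."""
    if dst in REG16 and src in REG16:
        w, table = 1, REG16          # word operands
    elif dst in REG8 and src in REG8:
        w, table = 0, REG8           # byte operands
    else:
        raise ValueError("operands must be two registers of the same width")
    d = 1                            # assumption: REG field holds the destination
    opcode = 0b10001000 | (d << 1) | w
    modrm = (0b11 << 6) | (table[dst] << 3) | table[src]  # MOD = 11: R/M is a register
    return bytes([opcode, modrm])


print(encode_mov_reg_reg("AX", "BX").hex())  # 8bc3
print(encode_mov_reg_reg("CL", "DL").hex())  # 8aca
```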
Register to/from Memory with No Displacement
- This format is also 2 bytes long and is similar to the Register to Register format except for the value of the MOD field.
- The MOD field selects the addressing mode. Together with the R/M, REG, and ‘W’ fields, it determines which register and which memory location the instruction uses.
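A small sketch of this format for 16-bit MOV, assuming MOD = 00 and one of the register-indirect R/M codes. The helper `encode_mov_mem_no_disp` and the string keys such as `"[BX]"` are hypothetical conveniences, not assembler syntax handling.

```python
REG16 = {"AX": 0, "CX": 1, "DX": 2, "BX": 3, "SP": 4, "BP": 5, "SI": 6, "DI": 7}

# R/M codes for memory operands when MOD = 00.
# R/M = 110 is left out on purpose: with MOD = 00 it means a direct
# 16-bit address, which needs two extra bytes.
RM_INDIRECT = {
    "[BX+SI]": 0, "[BX+DI]": 1, "[BP+SI]": 2, "[BP+DI]": 3,
    "[SI]": 4, "[DI]": 5, "[BX]": 7,
}


def encode_mov_mem_no_disp(reg, mem, load):
    """load=True encodes MOV reg, mem; load=False encodes MOV mem, reg (16-bit only)."""
    d = 1 if load else 0             # d = 1: REG is the destination
    opcode = 0b10001000 | (d << 1) | 1
    modrm = (0b00 << 6) | (REG16[reg] << 3) | RM_INDIRECT[mem]
    return bytes([opcode, modrm])


print(encode_mov_mem_no_disp("AX", "[BX]", load=True).hex())   # 8b07
print(encode_mov_mem_no_disp("CX", "[SI]", load=False).hex())  # 890c
```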
Register to/from Memory with Displacement
- This format adds 1 or 2 displacement bytes to the 2-byte register to/from memory format, so the instruction is 3 or 4 bytes long.
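The choice between a 1-byte and a 2-byte displacement can be sketched as follows for MOV reg16, [base + disp]. The helper name `encode_mov_reg_mem_disp` is hypothetical, and as a simplifying assumption it always emits a displacement, whereas a real assembler would normally drop a zero displacement for most base registers.

```python
REG16 = {"AX": 0, "CX": 1, "DX": 2, "BX": 3, "SP": 4, "BP": 5, "SI": 6, "DI": 7}
RM_BASE = {"BX+SI": 0, "BX+DI": 1, "BP+SI": 2, "BP+DI": 3,
           "SI": 4, "DI": 5, "BP": 6, "BX": 7}


def encode_mov_reg_mem_disp(reg, base, disp):
    """Encode MOV reg16, [base + disp] (hypothetical helper)."""
    opcode = 0x8B                                # MOV, d = 1, w = 1
    if -128 <= disp <= 127:
        mod = 0b01                               # 8-bit signed displacement
        disp_bytes = disp.to_bytes(1, "little", signed=True)
    elif -32768 <= disp <= 65535:
        mod = 0b10                               # 16-bit displacement, low byte first
        disp_bytes = (disp & 0xFFFF).to_bytes(2, "little")
    else:
        raise ValueError("displacement does not fit in 16 bits")
    modrm = (mod << 6) | (REG16[reg] << 3) | RM_BASE[base]
    return bytes([opcode, modrm]) + disp_bytes


print(encode_mov_reg_mem_disp("AX", "BX", 4).hex())       # 8b4704
print(encode_mov_reg_mem_disp("AX", "BX", 0x1234).hex())  # 8b873412
print(encode_mov_reg_mem_disp("DX", "BP", -2).hex())      # 8b56fe
```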
Immediate Operand to Register
- In this format, the opcode occupies the first byte plus the 3 bits of the second byte that hold the REG field in the register-to-register format.
- It also contains one or two bytes of immediate data.
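One family that uses this layout is the immediate arithmetic and logic group, where the 3 bits in the REG position select the operation. The sketch below is an illustration under assumptions: the helper `encode_alu_imm_reg` is hypothetical, the sign-extension bit of the opcode is left at 0, and a real assembler may choose a shorter encoding for some operands.

```python
REG16 = {"AX": 0, "CX": 1, "DX": 2, "BX": 3, "SP": 4, "BP": 5, "SI": 6, "DI": 7}
REG8 = {"AL": 0, "CL": 1, "DL": 2, "BL": 3, "AH": 4, "CH": 5, "DH": 6, "BH": 7}

# The 3 bits in the REG position act as an opcode extension.
ALU_EXT = {"ADD": 0, "OR": 1, "ADC": 2, "SBB": 3, "AND": 4, "SUB": 5, "XOR": 6, "CMP": 7}


def encode_alu_imm_reg(op, reg, imm):
    """Encode e.g. ADD BX, 1234H in the general immediate-to-register form."""
    if reg in REG16:
        w, code, size = 1, REG16[reg], 2
    else:
        w, code, size = 0, REG8[reg], 1
    opcode = 0b10000000 | w                      # assumption: s bit kept at 0
    modrm = (0b11 << 6) | (ALU_EXT[op] << 3) | code
    mask = (1 << (8 * size)) - 1
    return bytes([opcode, modrm]) + (imm & mask).to_bytes(size, "little")


print(encode_alu_imm_reg("ADD", "BX", 0x1234).hex())  # 81c33412
print(encode_alu_imm_reg("SUB", "CL", 5).hex())       # 80e905
```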
Immediate Operand to Memory with 16-bit Displacement
- This type of instruction format requires 5 or 6 bytes for coding.
- The first 2 bytes contain the information regarding OPCODE, MOD, and R/M fields.
- The remaining 3 or 4 bytes contain 2 bytes of displacement followed by 1 or 2 bytes of immediate data.
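The 5-byte and 6-byte cases can be seen by encoding MOV of an immediate value into memory addressed as [base + 16-bit displacement]. This is a minimal sketch; the helper name `encode_mov_imm_mem_disp16` and its `word` flag are hypothetical.

```python
RM_BASE = {"BX+SI": 0, "BX+DI": 1, "BP+SI": 2, "BP+DI": 3,
           "SI": 4, "DI": 5, "BP": 6, "BX": 7}


def encode_mov_imm_mem_disp16(base, disp, imm, word):
    """Encode MOV BYTE/WORD PTR [base + disp16], imm (hypothetical helper)."""
    w = 1 if word else 0
    opcode = 0b11000110 | w                      # C6H for byte data, C7H for word data
    modrm = (0b10 << 6) | (0b000 << 3) | RM_BASE[base]  # MOD = 10: 16-bit displacement
    data_size = 2 if word else 1
    data_mask = (1 << (8 * data_size)) - 1
    return (bytes([opcode, modrm])
            + (disp & 0xFFFF).to_bytes(2, "little")
            + (imm & data_mask).to_bytes(data_size, "little"))


word_form = encode_mov_imm_mem_disp16("BX", 0x1234, 0x5678, word=True)
byte_form = encode_mov_imm_mem_disp16("BX", 0x1234, 0x56, word=False)
print(word_form.hex(), len(word_form))  # c78734127856 6
print(byte_form.hex(), len(byte_form))  # c687341256 5
```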
Assembler Directives
Assembler directives are instructions to the assembler that indicate how an operand or a section of the program is to be processed. They are also called pseudo-operations and are not executed by the microprocessor. The following section explains the basic assembler directives for the 8086.
ASSUME
The ASSUME directive tells the assembler which logical segment it should associate with a specified segment register.
For example, ASSUME DS:DATA tells the assembler that for any program instruction that refers to the data segment, it should use the logical segment called DATA.
DB (Define Byte)
It is used to declare a byte variable or set aside one or more storage locations of type byte in memory.
For example, CURRENT_VALUE DB 36H tells the assembler to reserve 1 byte of memory for a variable named CURRENT_VALUE and to put the value 36H in that memory location when the program is loaded into RAM.
DW (Define Word)
It tells the assembler to define a variable of type word or to reserve storage locations of type word in memory.
DD (Define Double Word)
This directive is used to declare a variable of type double word or to reserve memory locations that can be accessed as type double word.
DQ (Define Quadword)
This directive tells the assembler to declare a variable 4 words (8 bytes) in length or to reserve 4 words of storage in memory.
DT (Define Ten Bytes)
It is used to inform the assembler to define a variable that is 10 bytes in length or to reserve 10 bytes of storage in memory.
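The storage reserved by the five data-definition directives can be compared with a short sketch. The helper `define` is hypothetical, and it makes a simplifying assumption: every value is treated as a non-negative integer stored low byte first, which ignores special uses such as packed BCD with DT.

```python
# Number of bytes reserved per element by each data-definition directive.
SIZES = {"DB": 1, "DW": 2, "DD": 4, "DQ": 8, "DT": 10}


def define(directive, value):
    """Return the bytes a directive would place in memory for one integer value."""
    # Assumption: plain non-negative integer, stored little-endian (low byte first).
    return value.to_bytes(SIZES[directive], "little")


print(define("DB", 0x36).hex())          # 36
print(define("DW", 0x1234).hex())        # 3412
print(define("DD", 0x12345678).hex())    # 78563412
print(len(define("DQ", 0)), len(define("DT", 0)))  # 8 10
```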
EQU (Equate)
It is used to give a name to some value or symbol. Every time the assembler finds the given name in the program, it will replace the name with the value or symbol we have equated with that name.
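The replacement behaviour described above can be imitated with a few lines of text processing. This is only a sketch of the idea, not how any particular assembler is implemented; the function `expand_equates` is hypothetical and matches names case-sensitively.

```python
import re


def expand_equates(lines):
    """Collect NAME EQU value lines and substitute NAME in the remaining lines."""
    equates = {}
    expanded = []
    for line in lines:
        match = re.match(r"\s*(\w+)\s+EQU\s+(.+?)\s*$", line, re.IGNORECASE)
        if match:
            equates[match.group(1)] = match.group(2)   # remember the equated value
            continue
        # Replace every whole word that has been equated; leave other words alone.
        expanded.append(
            re.sub(r"\b\w+\b", lambda tok: equates.get(tok.group(0), tok.group(0)), line)
        )
    return expanded


source = ["COUNT EQU 10", "MOV CX, COUNT", "ADD AX, BX"]
print(expand_equates(source))  # ['MOV CX, 10', 'ADD AX, BX']
```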
ORG (Originate)
The ORG directive changes the offset address at which the code or data that follows it is placed.
It allows you to set the location counter to a desired value at any point in the program. For example, the statement ORG 3000H tells the assembler to set the location counter to 3000H.
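The effect of ORG on the location counter can be modelled with a toy pass over a list of statements. The tuple format and the function `assign_offsets` are assumptions made for this sketch; labels such as TABLE and BUFFER are illustrative only.

```python
SIZES = {"DB": 1, "DW": 2, "DD": 4}


def assign_offsets(statements):
    """Walk the statements and record the offset given to each label.

    Each statement is either ("ORG", value) or (label, directive, element_count).
    """
    location = 0                      # the location counter starts at 0
    offsets = {}
    for statement in statements:
        if statement[0] == "ORG":
            location = statement[1]   # ORG overrides the location counter
        else:
            label, directive, count = statement
            offsets[label] = location
            location += SIZES[directive] * count
    return offsets


program = [
    ("ORG", 0x3000),
    ("TABLE", "DB", 3),
    ("TOTAL", "DW", 1),
    ("ORG", 0x4000),
    ("BUFFER", "DB", 16),
]
print({name: hex(offset) for name, offset in assign_offsets(program).items()})
# {'TABLE': '0x3000', 'TOTAL': '0x3003', 'BUFFER': '0x4000'}
```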
PROC (Procedure)
It is used to identify the start of a procedure or subroutine.
END (End Program)
This directive indicates to the assembler that this is the end of the program module. The assembler ignores any statements after an END directive.
ENDP (End Procedure)
It indicates the end of the procedure (subroutine) to the assembler.
LENGTH Operator
LENGTH is an operator that tells the assembler to determine the number of elements in some named data item, such as a string or an array. When the assembler reads the statement MOV CX, LENGTH STRING1, for example, it will determine the number of elements in STRING1 and load it into CX. If the string was declared as a string of bytes, LENGTH will produce the number of bytes in the string. If the string was declared as a word string, LENGTH will produce the number of words in the string.
OFFSET Operator
OFFSET is an operator that tells the assembler to determine the offset or displacement of a named data item (variable) or a procedure from the start of the segment that contains it. When the assembler reads the statement MOV BX, OFFSET PRICES, for example, it will determine the offset of the variable PRICES from the start of the segment in which PRICES is defined and will load this value into BX.
SHORT Operator
The SHORT operator is used to tell the assembler that only a 1-byte displacement is needed to code a jump instruction in the program. The destination must be in the range of -128 bytes to +127 bytes from the address of the instruction after the jump. The statement JMP SHORT NEARBY_LABEL is an example of the use of SHORT.
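The range rule follows from how the 1-byte displacement is computed. The sketch below assumes the 2-byte short jump encoding (opcode EBH followed by a signed displacement); the function name `encode_short_jmp` and the example addresses are hypothetical.

```python
def encode_short_jmp(jmp_address, target_address):
    """Return the 2 bytes of JMP SHORT, or raise if the target is out of range."""
    next_instruction = jmp_address + 2           # the short jump itself is 2 bytes long
    displacement = target_address - next_instruction
    if not -128 <= displacement <= 127:
        raise ValueError("target is outside the range of a SHORT jump")
    return bytes([0xEB]) + displacement.to_bytes(1, "little", signed=True)


print(encode_short_jmp(0x0100, 0x0110).hex())  # eb0e  (forward jump)
print(encode_short_jmp(0x0110, 0x0100).hex())  # ebee  (backward jump, -18)
```

A target 200 bytes ahead, for instance, makes the function raise ValueError, which mirrors the assembler rejecting SHORT for a label that is too far away.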
TYPE Operator
The TYPE operator tells the assembler to determine the type of a specified variable. The assembler actually determines the number of bytes in the type of the variable. For a byte-type variable, the assembler will give a value of 1; for a word-type variable, the assembler will give a value of 2; and for a double word-type variable, it will give a value of 4. It can be used in instructions such as ADD BX, TYPE WORD_ARRAY, where we want to increment BX to point to the next word in an array of words.
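Stepping a pointer by TYPE can be mimicked by walking a byte buffer in element-sized steps. This is a sketch under assumptions: `walk_array` is a hypothetical helper, the variable `bx` merely stands in for the BX register, and the buffer holds an illustrative word array stored low byte first.

```python
# Value the TYPE operator yields for variables declared with each directive.
TYPE_OF = {"DB": 1, "DW": 2, "DD": 4}


def walk_array(memory, start, directive, count):
    """Read `count` elements, advancing the pointer by TYPE after each one."""
    step = TYPE_OF[directive]
    bx = start                                   # stands in for the BX register
    values = []
    for _ in range(count):
        values.append(int.from_bytes(memory[bx:bx + step], "little"))
        bx += step                               # same role as ADD BX, TYPE WORD_ARRAY
    return values


# Illustrative contents of WORD_ARRAY DW 1234H, 5678H, 9ABCH in memory.
memory = bytes.fromhex("34127856bc9a")
print([hex(value) for value in walk_array(memory, 0, "DW", 3)])
# ['0x1234', '0x5678', '0x9abc']
```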