Assembly Language: Stack, Addressing, RISC, and CISC

Stack

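  • Activation record (aka stack frame): The section of the stack that holds one procedure call's parameters, return address, saved EBP, local variables, and saved registers
  • Call stack: The activation records of all active procedure calls, stacked on top of one another
  • stdcall (STDCALL): Calling convention in which parameters are PUSHed onto the stack before the call and the called procedure removes them with RET n, where n is the number of parameter bytes
  • The stack grows toward the heap: it starts at a high address, and ESP is decremented each time a value is pushed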
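  • PUSH OFFSET value: Pushes the address of value, which is always 32 bits (4 bytes) regardless of the size of the data, e.g., PUSH OFFSET z where z BYTE "Why are you looking at this?", 0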

Order of Adding to Stack:

  1. Passed parameters: By PUSHing before the procedure call
  2. Return address: By the procedure CALL
  3. Old value of Base Pointer: By PUSH EBP or assembler directive
  4. Local Variables: By directly decrementing ESP or by assembler directive LOCAL
  5. Saved Registers: Individually by PUSH, as a group by PUSHAD, or by assembler directive USES
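
The order above can be traced with a small simulation. This is only a sketch in Python, assuming a 32-bit stack where every PUSH subtracts 4 from ESP; the function name build_frame, the starting address 0x1000, and the labels are hypothetical and chosen for illustration.

```python
def build_frame(esp, params, n_local_bytes, saved_regs):
    """Model one procedure call; return (final ESP, EBP, {address: label})."""
    layout = {}

    def push(label):
        nonlocal esp
        esp -= 4                       # a 32-bit PUSH subtracts 4 from ESP
        layout[esp] = label

    for p in params:                   # 1. parameters are PUSHed before the call
        push(f"param {p}")
    push("return address")             # 2. CALL pushes the return address
    push("old EBP")                    # 3. PUSH EBP
    ebp = esp                          #    MOV EBP, ESP
    esp -= n_local_bytes               # 4. ESP is decremented to reserve locals
    for off in range(4, n_local_bytes + 1, 4):
        layout[ebp - off] = f"local at [EBP-{off}]"
    for reg in saved_regs:             # 5. saved registers are PUSHed last
        push(f"saved {reg}")
    return esp, ebp, layout


esp, ebp, layout = build_frame(0x1000, ["a", "b"], 8, ["ESI", "EDI"])
for addr in sorted(layout, reverse=True):      # highest address first
    print(f"{addr:#06x}: {layout[addr]}")
```

In this model the last parameter pushed ends up at [EBP + 8], directly above the return address at [EBP + 4], and the locals sit below EBP.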

Addressing Types

  • Direct addressing: Access data using the variable name, e.g., MOV EAX, maxVal
  • Base + Offset: MOV EAX, [EDX + 12], where EDX holds the base address and stays unchanged while the offset selects the element
  • Register Indirect: The concept of a base pointer is abandoned and the register itself is incremented or decremented to change the pointer to another element of the array
  • Indirect Operand: Any of the 4-byte general-purpose registers, surrounded by brackets (e.g., [EBP])
  • Indexed Operand: Combines an array name with a byte-distance offset to the element of interest. [myArr + 12], myArr[8]

Element Access

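  • Address of the nth element of list (counting from 1) = (Address of list) + ((n-1) x (TYPE of element))
  • With a zero-based index k, this is (Address of list) + (k x (TYPE of element)), so index 11 refers to the 12th item.
  • In MASM the bracketed value is a byte offset, not an element index: list[11] means 11 bytes past the start of list.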
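
The address arithmetic can be checked with a short sketch. It is written in Python and assumes a zero-based index; the function name element_address and the base address 0x1000 are hypothetical.

```python
def element_address(base, index, type_size):
    """Address of the element at zero-based `index` in an array starting at `base`."""
    return base + index * type_size


BASE = 0x1000   # hypothetical address of list
DWORD = 4       # TYPE of a DWORD element, in bytes

# Third element (position n = 3, zero-based index n - 1 = 2) of a DWORD array.
print(hex(element_address(BASE, 3 - 1, DWORD)))
```

Position n counted from 1 corresponds to index n - 1, which is where the (n-1) factor in the formula comes from.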

Commands

  • PTR: Allows us to explicitly specify the number of bytes to write to memory. MOV DWORD PTR [EDI], 10
  • PUSHAD: Pushes all 32-bit registers onto the stack in order EAX, ECX, EDX, EBX, ESP (value before executing PUSHAD), EBP, ESI, and EDI
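  • PUSHA: Pushes all 16-bit general-purpose registers (AX, CX, DX, BX, SP, BP, SI, DI)
  • PUSHFD/POPFD: Push/pop the EFLAGS register
  • USES: Saves and restores registers used in a procedure
  • PUSH: Decrements ESP by 2 for a 16-bit operand and by 4 for a 32-bit operand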

Little Endian vs Big Endian

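  • Little Endian: myArr1 WORD 11AAh | The least-significant byte is stored at the lowest address | Memory: 100: AA 101: 11
  • Big Endian: myArr1 WORD 11AAh | The most-significant byte is stored at the lowest address | Memory: 100: 11 101: AA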
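
As a quick check of the byte order, the sketch below uses Python's built-in int.to_bytes to lay out the same 16-bit value both ways. The starting address 100 is taken from the example above; the variable names are hypothetical.

```python
value = 0x11AA

little = value.to_bytes(2, "little")   # least-significant byte first
big = value.to_bytes(2, "big")         # most-significant byte first

for name, layout in (("little", little), ("big", big)):
    cells = " ".join(f"{100 + i}: {b:02X}" for i, b in enumerate(layout))
    print(f"{name:>6} endian -> {cells}")
```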

Arrays

  • LENGTHOF: Number of elements
  • SIZEOF: Number of bytes

Different Ways to Declare Arrays

  • list DWORD MAXSIZE DUP(?)
  • valArr DWORD 100,200,0AAh,10101010b,800
  • myStr BYTE "One Byte Per Character!",0

Module 8

Direction Flag

  • CLD: Clear Direction Flag (primitives will increment pointer)
    • Primitives will increment by the size (in bytes) of the TYPE
    • Used to move “forward” through an array
  • STD: Set Direction Flag (primitives will decrement pointer)
    • Primitives will decrement by the size (in bytes) of the TYPE
    • Used to move “backward” through an array

Accumulators: AL, AX, EAX

| Operation | Explanation | BYTE Instruction | WORD Instruction | DWORD Instruction |
|---|---|---|---|---|
| Store | Store accumulator contents into memory addressed by EDI | STOSB | STOSW | STOSD |
| Load | Load memory addressed by ESI into accumulator | LODSB | LODSW | LODSD |
| Move | Copy data from memory addressed by ESI into memory addressed by EDI | MOVSB | MOVSW | MOVSD |
| Compare | Compare contents of two memory locations addressed by ESI and EDI | CMPSB | CMPSW | CMPSD |
| Scan | Compare the accumulator to memory addressed by EDI | SCASB | SCASW | SCASD |
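
To see how the Direction flag and the primitives interact, here is a rough Python model of LODSB and STOSB. It is a sketch only: the class StringUnit and its fields are hypothetical, memory is a bytearray, and only the byte-sized primitives are modeled. The example reverses a string by loading backward (STD) and storing forward (CLD).

```python
class StringUnit:
    """Tiny model of AL, ESI, EDI and the Direction flag (DF)."""

    def __init__(self, memory):
        self.mem = memory
        self.al = 0
        self.esi = 0
        self.edi = 0
        self.df = 0                      # 0 after CLD, 1 after STD

    def _step(self):
        return -1 if self.df else 1      # BYTE primitives step by 1 byte

    def lodsb(self):
        self.al = self.mem[self.esi]     # load [ESI] into AL
        self.esi += self._step()

    def stosb(self):
        self.mem[self.edi] = self.al     # store AL into [EDI]
        self.edi += self._step()


mem = bytearray(b"abc\x00\x00\x00")      # source at 0..2, destination at 3..5
cpu = StringUnit(mem)
cpu.esi, cpu.edi = 2, 3                  # ESI at the last source byte
for _ in range(3):
    cpu.df = 1                           # STD: walk the source backward
    cpu.lodsb()
    cpu.df = 0                           # CLD: walk the destination forward
    cpu.stosb()

print(bytes(mem[3:6]))
```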

REP Instructions

| Instruction | Function |
|---|---|
| REP | Repeat string primitive and decrement ECX while ECX > 0 |
| REPZ / REPE | Repeat string primitive and decrement ECX while the Zero flag is set and ECX > 0 |
| REPNZ / REPNE | Repeat string primitive and decrement ECX while the Zero flag is clear and ECX > 0 |

Caution: REPZ/REPE and REPNZ/REPNE should only be used with SCAS* or CMPS*, since only these string primitives modify the Zero flag.
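
The repeat logic can be sketched in Python as follows. This models REPNE SCASB with the Direction flag clear; the function name repne_scasb is hypothetical, and the returned Zero flag is simply False when ECX starts at 0 (real hardware would leave it unchanged).

```python
def repne_scasb(mem, edi, ecx, al):
    """Scan bytes at mem[edi...] for `al`; return (EDI, ECX, ZF) afterwards."""
    zf = False
    while ecx > 0:
        zf = mem[edi] == al    # SCASB compares AL with [EDI] and sets ZF
        edi += 1               # DF clear: EDI is incremented
        ecx -= 1               # the REP prefix decrements ECX
        if zf:                 # REPNE stops once the Zero flag is set
            break
    return edi, ecx, zf


text = b"hello\x00"
edi, ecx, zf = repne_scasb(text, 0, len(text), 0)   # look for the null terminator
print("length:", edi - 1)                           # EDI ends one past the match
```

Because EDI is advanced before the flag is tested, it points one byte past the match, so the matching position is EDI - 1.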

Algorithm to Convert an ASCII Digit String to an Integer

numInt = 0
get numString
for numChar in numString:
  # 48 through 57 are the ASCII codes of '0' through '9'
  if 48 <= numChar <= 57:
    numInt = 10 * numInt + (numChar - 48)
  else:
    break
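
A runnable version of the pseudocode above, as a minimal Python sketch. The function name ascii_to_int is hypothetical, and ord() is used to get each character's ASCII code; like the pseudocode, it stops at the first non-digit and does not handle signs or overflow.

```python
def ascii_to_int(num_string):
    """Convert leading ASCII digits of num_string to an integer."""
    num_int = 0
    for num_char in num_string:
        code = ord(num_char)              # ASCII code of the character
        if 48 <= code <= 57:              # '0' .. '9'
            num_int = 10 * num_int + (code - 48)
        else:
            break                         # stop at the first non-digit
    return num_int


print(ascii_to_int("1234"))
print(ascii_to_int("42abc"))
```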

Macros vs Procedures

| Procedures… | Macros… | Same? |
|---|---|---|
| are a separate, named section of code | are a separate, named section of code | Yes |
| are used to implement a module of program logic | are used to implement some small task or to simplify writing/reading a program | No |
| have call / return mechanisms which modify the instruction pointer during runtime | are replaced inline by the macro body as a preprocessing step | No |
| may have parameters, passed in registers/on the stack | may have arguments which replace parameter placeholders in the macro definition | No |
| may be used multiple times without bloating the code segment | are expanded every time they are invoked, possibly bloating the code segment | No |

The MASM preprocessor expands each MACRO invocation inline.

LOCAL Directive in Procedures (a RET is still required)

  1. Creates the stack frame. Inserts two lines of code at the beginning of a procedure: PUSH EBP | MOV EBP, ESP
  2. Terminates the stack frame. Inserts two lines of code at the end of a procedure: MOV ESP, EBP | POP EBP
  3. Makes space on the stack for any local variables.
    • Local variables are stored on the stack immediately after the saved value of EBP, at lower addresses (for example, [EBP - 4] and [EBP - 8] for two DWORDs).
    • The LOCAL directive subtracts from ESP the total number of bytes of the local variables you declare.
    • If you declare two LOCAL DWORDs, the operation ADD ESP, -8 is performed.
  4. Provides the software engineer with labels which reference these stack locations.

Module 9

An operand is an input to an expression (a number, a variable, etc.), and an operator is the operation applied to the operands (+, -, *, /, etc.).

Convert Infix to Postfix

  1. When encountering an operand, dump it directly to the output.
  2. When encountering an operator
    • If the operator is (, push it to the stack.
    • If the operator is ), repeatedly pop the top of the stack to the output until you reach the topmost (, then discard both the ) from the input and the ( from the top of the stack.
    • If it is any other operator
      • If the stack is empty or the operator has higher precedence than the top of the stack, push it to the stack.
      • If the operator has lower or equal precedence than the top of the stack, pop the top of the stack to the output and repeat from “If it is any other operator”.
  3. At end of the input, dump the stack to output

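There is a modified order of precedence for operators, listed from highest to lowest, for evaluating the conditions above. The opening parenthesis ranks below the arithmetic operators so that they are always pushed on top of it:

  1. ^ (Exponent)
  2. *, / (Multiplication, Division)
  3. +, - (Addition, Subtraction)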
  4. ( (Opening Parenthesis)
  5. = (Evaluation)
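
The conversion steps and the precedence list above can be sketched in Python as follows. Assumptions: operands are single letters or digits, the input is well formed (balanced parentheses), and, as in the rules above, equal precedence always pops, so every operator including ^ is treated as left-associative. The names PRECEDENCE and infix_to_postfix are hypothetical.

```python
# Larger number = higher precedence, following the modified order above.
PRECEDENCE = {"^": 4, "*": 3, "/": 3, "+": 2, "-": 2, "(": 1, "=": 0}


def infix_to_postfix(expr):
    output, stack = [], []
    for tok in expr.replace(" ", ""):
        if tok.isalnum():                      # operand: straight to the output
            output.append(tok)
        elif tok == "(":
            stack.append(tok)
        elif tok == ")":
            while stack[-1] != "(":            # pop back to the topmost "("
                output.append(stack.pop())
            stack.pop()                        # discard the "(" itself
        else:
            # Pop while the top of the stack has higher or equal precedence.
            while stack and PRECEDENCE[tok] <= PRECEDENCE[stack[-1]]:
                output.append(stack.pop())
            stack.append(tok)
    while stack:                               # end of input: dump the stack
        output.append(stack.pop())
    return "".join(output)


print(infix_to_postfix("a + b * c"))
print(infix_to_postfix("(a + b) * c"))
```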

Convert Postfix to Infix

  1. Scan the postfix string from left to right.
  2. If the character is an operand, write it down to the right of the previous operand (leave space for operators).
  3. If the character is an operator, bracket the two previous operands with parentheses and insert the operator between them.
  4. Anything enclosed in parentheses is considered a single operand.
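
The same procedure can be expressed with a stack, where each bracketed group is pushed back as a single operand. This is a minimal Python sketch assuming single-character operands, binary operators only, and a valid postfix string; the function name postfix_to_infix is hypothetical.

```python
def postfix_to_infix(postfix):
    stack = []
    for tok in postfix.replace(" ", ""):
        if tok.isalnum():                  # operand: keep it for later
            stack.append(tok)
        else:                              # operator: bracket the two previous operands
            right = stack.pop()
            left = stack.pop()
            stack.append(f"({left}{tok}{right})")
    return stack.pop()


print(postfix_to_infix("abc*+"))
print(postfix_to_infix("ab+c*"))
```

The result is fully parenthesized, so it may contain more parentheses than the original infix expression needed.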

Module 10

Reduced Instruction Set Computers (RISC) have a much smaller set of instructions at the ISA level, and their instructions are executed directly by hardware rather than being interpreted by a micro-program.

  • As a tradeoff, RISC Assembly Language programs look much longer than CISC Assembly Language programs because CISC instructions do more on a per-instruction basis.
  • However, RISC programs tend to execute more quickly due to the lack of micro-program execution overhead.

RISC Design Principles

  • Prioritize single-cycle instruction execution
    • Instructions are executed directly by hardware (no micro-programs)
  • Maximize rate of Fetching instructions by using an Instruction Cache
    • Fetch multiple instructions rather than just one
    • Reduces idle time due to Instruction Fetch
    • Allows deep instruction pipelining.
    • May also be present in CISC architectures.
  • Only two memory-access instructions (LOAD and STORE)
    • Compare to CISC, with its multiple MOV implementations, FPU memory access mechanisms, ALU memory access mechanisms, string primitives, etc…
    • The decoupling of memory and processing allows for more efficient pipelining.
  • Simplify Instructions and use few Addressing Modes
    • As a result, there are very few instructions and the CPU can process instructions quickly.
  • Another interesting aspect, which isn’t really a design principle, is that RISC designs tend to have more registers than CISC designs.

RISC Benefits

  • Physically smaller and less expensive (less silicon in the CPU integrated circuit itself) for the same level of performance, which allows the products that use them to be smaller and lower cost.
  • As an alternative to minimizing the CPU size, more circuitry can be added onto the CPU integrated circuit without it becoming too complicated to manufacture efficiently. This allows product designers to integrate electronic functions that would otherwise be separate parts, which allows even further reduction in product size and cost.
  • Lower complexity of the CPU circuitry boosts long-term reliability, so products can have a longer operating life.
  • Lower power consumption than their CISC equivalents, so the products using them can be more environmentally friendly due to their better energy efficiency. Battery-powered products get an added bonus. Designers can choose to keep the same batteries and have a longer run time, which helps the product user. Or designers can use smaller, less-expensive batteries for the same run time, which makes the product smaller and lower cost.
  • Lower operating temperature (directly related to the lower power consumption) means products can reduce the cost of mechanical components that manage heat dissipation (such as fans, cooling systems, heat sinks), which reduce product cost and complexity, and increase product reliability.

Hardware Parallelism

  • Instruction Level Parallelism
    • Instruction Caching – (a fundamental design method in RISC architectures) involves bringing a group of instructions into a cache, rather than fetching them individually. This makes these instructions available for immediate execution when the processor is ready for them. This method works great when the structure of the program is mostly sequential. Decision structures, repetition, procedure calls, or any place where execution might transfer to a different part of the program, cause it to lose efficiency.

Branching

Flags

| Status Flag | Abbreviation | EFLAGS bit # | Description |
|---|---|---|---|
| Overflow | OF or OV or O | Bit 11 | Used in signed-integer (two’s complement) arithmetic: Set if the integer result is too large a positive number or too small a negative number (excluding the sign-bit) to fit in the destination operand; cleared otherwise. This flag indicates an overflow condition. |
| Direction | DF or UP or D | Bit 10 | Sets the direction in which ESI and EDI are stepped when executing string-processing instructions. |
| Interrupt Enable | IF or EI or I | Bit 9 | Set indicates hardware-generated interrupts will be issued. Cleared indicates they are masked (i.e., the interrupts will not be issued). |
| Sign | SF or PL or S | Bit 7 | Set equal to the most-significant bit of the result, which is the sign bit of a signed integer. (0 indicates a positive value and 1 indicates a negative value.) |
| Zero | ZF or ZR or Z | Bit 6 | Set if the result is zero; cleared otherwise. |
| Auxiliary Carry | AF or AC or A | Bit 4 | Used in binary-coded decimal (BCD) arithmetic: Set if an arithmetic operation generates a carry or a borrow out of bit 3 of the result; cleared otherwise. |
| Parity | PF or PE or P | Bit 2 | Set if the least-significant byte of the result contains an even number of 1 bits; cleared otherwise. |
| Carry | CF or CY or C | Bit 0 | Used in unsigned arithmetic: Set if an arithmetic operation generates a carry or a borrow out of the most significant bit of the result; cleared otherwise. It is also used in multiple-precision arithmetic. |
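
To make the flag descriptions concrete, the sketch below computes the six arithmetic status flags for an 8-bit addition in Python. It is an illustration under stated assumptions: operands are unsigned values 0..255, only ADD is modeled, and the function name add8_flags is hypothetical.

```python
def add8_flags(a, b):
    """Return (8-bit result, flags) for a + b, with a and b in 0..255."""
    total = a + b
    result = total & 0xFF
    flags = {
        "CF": total > 0xFF,                               # carry out of bit 7
        "ZF": result == 0,                                # result is zero
        "SF": bool(result & 0x80),                        # copy of the sign bit
        # Overflow: both operands share a sign that the result does not.
        "OF": bool(~(a ^ b) & (a ^ result) & 0x80),
        "AF": (a & 0x0F) + (b & 0x0F) > 0x0F,             # carry out of bit 3
        "PF": bin(result).count("1") % 2 == 0,            # even number of 1 bits
    }
    return result, flags


# 127 + 1 overflows the signed range; 255 + 1 carries out of the unsigned range.
for a, b in ((0x7F, 0x01), (0xFF, 0x01)):
    result, flags = add8_flags(a, b)
    set_flags = [name for name, on in flags.items() if on]
    print(f"{a:#04x} + {b:#04x} = {result:#04x}  set: {set_flags}")
```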