Assembly Language: Stack, Addressing, RISC, and CISC

Posted on Dec 18, 2024 in Computers

Stack

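Activation record (aka Stack Frame): Section of the stack holding one procedure call's parameters, return address, saved EBP, local variables, and saved registers
Call stack: Activation records stacked on top of each other, one per active procedure call
STDCALL: Calling convention in which parameters are PUSHed onto the stack before the call and the called procedure removes them with RET n
The stack grows toward the heap: it starts at a high address, and ESP decrements as values are added
PUSH OFFSET value: Pushes the 32-bit (4-byte) address of value, e.g. for z BYTE "Why are you looking at this?", 0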

Order of Adding to Stack:

Passed parameters: By PUSHing before the procedure call
Return address: By the procedure CALL
Old value of Base Pointer: By PUSH EBP or assembler directive
Local Variables: By directly decrementing ESP or by assembler directive LOCAL
Saved Registers: Individually by PUSH, as a group by PUSHAD, or by assembler directive USES
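
The push order above can be traced with a small simulation. This is only a sketch in Python, not MASM: the starting address 0x1000, the two parameters, the two locals, and the saved ESI/EDI are all made-up illustration values, and every slot is assumed to be 4 bytes.

```python
# Toy model of a 32-bit stack: ESP starts high and drops by 4 on every PUSH.
# All addresses, names, and values here are hypothetical illustration values.

class Stack:
    def __init__(self, esp=0x1000):
        self.esp = esp
        self.memory = {}                         # address -> (label, value)

    def push(self, label, value=0):
        self.esp -= 4                            # PUSH decrements ESP first...
        self.memory[self.esp] = (label, value)   # ...then stores at [ESP]

    def reserve(self, label, count):
        # Local variables: ESP is decremented directly, nothing is stored.
        for i in range(count):
            self.esp -= 4
            self.memory[self.esp] = (f"{label}{i + 1}", None)


def build_frame():
    s = Stack()
    s.push("param2", 20)          # 1. passed parameters, pushed before the CALL
    s.push("param1", 10)
    s.push("return address")      # 2. pushed by CALL
    s.push("old EBP")             # 3. PUSH EBP
    ebp = s.esp                   #    MOV EBP, ESP
    s.reserve("local", 2)         # 4. ESP decremented by 8 for two DWORD locals
    s.push("saved ESI")           # 5. saved registers
    s.push("saved EDI")
    return s, ebp


stack, ebp = build_frame()
for addr in sorted(stack.memory, reverse=True):
    label, _ = stack.memory[addr]
    print(f"{addr:#06x}  EBP{addr - ebp:+d}  {label}")
```

The printout lists the frame from high to low addresses, with parameters at positive offsets from EBP and locals at negative offsets.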

Addressing Types

Direct addressing: Access data by variable name, e.g. MOV EAX, maxVal
Base + Offset: MOV EAX, [EDX + 12], where EDX holds a base address that stays constant and the offset selects the element
Register Indirect: The concept of a base pointer is abandoned and the register itself is incremented or decremented to point to another element of the array
Indirect Operand: Any of the 4-byte general-purpose registers, surrounded by brackets (e.g., [EBP])
Indexed Operand: Combines an array name with a byte-distance offset to the element of interest. [myArr + 12], myArr[8]

Element Access

Address of the nth element of list = (Address of list) + ((n - 1) x (TYPE of element)), counting elements from 1
With zero-based indexing, list[11] accesses the 12th item.
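
As a quick check of the address formula, here is a minimal Python sketch. The base address 0x4000 is a made-up value, and the two helper names are hypothetical; the first counts elements from 1 as the formula does, the second takes a zero-based index.

```python
def element_address(base, n, type_size):
    """Address of the n-th element, counting from 1, as in the formula above."""
    return base + (n - 1) * type_size


def index_address(base, i, type_size):
    """Same computation for a zero-based index i."""
    return base + i * type_size


base = 0x4000                               # hypothetical address of the array
print(hex(element_address(base, 3, 4)))     # 3rd DWORD: 0x4008
print(hex(index_address(base, 2, 4)))       # zero-based index 2, same element: 0x4008
```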

Commands

PTR: Allows us to explicitly specify the number of bytes to write to memory. MOV DWORD PTR [EDI], 10
PUSHAD: Pushes all 32-bit registers onto the stack in order EAX, ECX, EDX, EBX, ESP (value before executing PUSHAD), EBP, ESI, and EDI
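PUSHA: Pushes the 16-bit registers in the same order (AX, CX, DX, BX, SP, BP, SI, DI)
PUSHFD/POPFD: Push and pop the 32-bit EFLAGS register
USES: Saves and restores registers used in a procedure
PUSH: Decrements ESP by 2 for a 16-bit operand and by 4 for a 32-bit operand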

Little Endian vs Big Endian

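Little Endian: The least-significant byte of a multi-byte value is stored at the lowest address. myArr1 WORD 11AAh | Memory: 100: AA 101: 11
Big Endian: The most-significant byte is stored at the lowest address. The same value would be stored as 100: 11 101: AA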
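
The byte order can be checked with Python's built-in int.to_bytes. This is only an illustration of the memory layout for the value 11AAh, reusing address 100 from the example above.

```python
value = 0x11AA

little = value.to_bytes(2, "little")    # least-significant byte first
big = value.to_bytes(2, "big")          # most-significant byte first

base = 0x100                            # address used in the example above
for name, data in (("little", little), ("big", big)):
    layout = "  ".join(f"{base + i:X}: {b:02X}" for i, b in enumerate(data))
    print(f"{name:>6}  {layout}")
```

The little-endian row prints AA at address 100 and 11 at address 101, matching the x86 layout.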

Arrays

LENGTHOF: Number of elements
SIZEOF: Number of bytes

Different Ways to Declare Arrays

list DWORD MAXSIZE DUP(?)
valArr DWORD 100,200,0AAh,10101010b,800
myStr BYTE "One Byte Per Character!",0
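
To see how LENGTHOF and SIZEOF relate for these declarations, here is a rough Python model. The TYPE_SIZE table and the helper names are assumptions made for illustration; MASM computes these values at assembly time.

```python
TYPE_SIZE = {"BYTE": 1, "WORD": 2, "DWORD": 4}    # bytes per element


def lengthof(values):
    """Number of elements, like LENGTHOF."""
    return len(values)


def sizeof(type_name, values):
    """Number of bytes, like SIZEOF: LENGTHOF times the element TYPE."""
    return TYPE_SIZE[type_name] * len(values)


val_arr = [100, 200, 0xAA, 0b10101010, 800]         # valArr DWORD ...
my_str = list(b"One Byte Per Character!") + [0]     # myStr BYTE "...",0

print(lengthof(val_arr), sizeof("DWORD", val_arr))  # 5 20
print(lengthof(my_str), sizeof("BYTE", my_str))     # 24 24
```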

Module 8

Direction Flag

CLD: Clear Direction Flag (primitives will increment pointer)
- Primitives will increment by the size (in bytes) of the TYPE
- Used to move “forward” through an array
STD: Set Direction Flag (primitives will decrement pointer)
- Primitives will decrement by the size (in bytes) of the TYPE
- Used to move “backward” through an array

Accumulators: AL, AX, EAX

Operation	Explanation	BYTE Instruction	WORD Instruction	DWORD Instruction
Store	Store accumulator contents into memory addressed by EDI	STOSB	STOSW	STOSD
Load	Load memory addressed by ESI into accumulator	LODSB	LODSW	LODSD
Move	Copy data from memory addressed by ESI into memory addressed by EDI	MOVSB	MOVSW	MOVSD
Compare	Compare contents of two memory locations addressed by ESI and EDI	CMPSB	CMPSW	CMPSD
Scan	Compare the accumulator to memory addressed by EDI	SCASB	SCASW	SCASD
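
The interaction between the Direction flag and the BYTE primitives can be sketched with a toy model. This Python class is hypothetical and only mimics LODSB, STOSB, and MOVSB on a flat byte buffer; it assumes 1-byte elements and ignores segments and every other register.

```python
class Cpu:
    """Toy model: flat byte memory, ESI/EDI pointers, AL, and the Direction flag."""

    def __init__(self, memory):
        self.mem = bytearray(memory)
        self.esi = self.edi = 0
        self.al = 0
        self.df = 0                      # CLD state: move forward

    def _step(self):
        return -1 if self.df else 1      # BYTE primitives step by 1

    def lodsb(self):                     # AL <- [ESI]
        self.al = self.mem[self.esi]
        self.esi += self._step()

    def stosb(self):                     # [EDI] <- AL
        self.mem[self.edi] = self.al
        self.edi += self._step()

    def movsb(self):                     # [EDI] <- [ESI]
        self.mem[self.edi] = self.mem[self.esi]
        self.esi += self._step()
        self.edi += self._step()


def reverse_copy(text):
    """Copy `text` reversed: read backward with LODSB, write forward with STOSB."""
    n = len(text)
    cpu = Cpu(text + bytes(n))           # source followed by an empty destination
    cpu.esi, cpu.edi = n - 1, n
    for _ in range(n):
        cpu.df = 1                       # STD
        cpu.lodsb()
        cpu.df = 0                       # CLD
        cpu.stosb()
    return bytes(cpu.mem[n:])


print(reverse_copy(b"MASM"))             # b'MSAM'
```

Toggling the flag between the load and the store is what lets one pointer walk backward while the other walks forward.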

REP Instructions

Instruction	Function
REP	Repeat string primitive and decrement ECX while ECX > 0
REPZ, REPE	Repeat string primitive and decrement ECX while the Zero flag is set and ECX > 0. Caution: Should only be used with SCAS* or CMPS* since only these modify the Zero flag.
REPNZ, REPNE	Repeat string primitive and decrement ECX while the Zero flag is clear and ECX > 0. Caution: Should only be used with SCAS* or CMPS* since only these modify the Zero flag.
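
The repeat prefixes can be modeled as ordinary loops. The two Python functions below are a sketch, not real instructions: they assume the Direction flag is clear, work on a bytearray, and return the final register values instead of updating a CPU state. The model also reports ZF as 0 when ECX starts at 0, whereas the real flag would simply be left unchanged.

```python
def rep_stosb(mem, edi, ecx, al):
    """REP STOSB with DF clear: fill ECX bytes at EDI with AL."""
    while ecx > 0:
        mem[edi] = al
        edi += 1
        ecx -= 1
    return edi, ecx


def repne_scasb(mem, edi, ecx, al):
    """REPNE SCASB with DF clear: scan until AL matches or ECX reaches 0."""
    zf = 0
    while ecx > 0:
        zf = 1 if mem[edi] == al else 0    # SCASB sets ZF when AL equals [EDI]
        edi += 1
        ecx -= 1
        if zf:                             # REPNE stops once ZF is set
            break
    return edi, ecx, zf


buf = bytearray(b"hello\x00pad")
edi, ecx, zf = repne_scasb(buf, 0, len(buf), 0)   # look for the null terminator
print(edi - 1, ecx, zf)                           # 5 3 1
```

Note that EDI ends up one past the matching byte, so the match is at EDI - 1.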

Algorithm to Convert ASCII Character to Integer

numInt = 0
get numString
for numChar in numString:
    if 48 <= numChar <= 57:
        numInt = 10 * numInt + (numChar - 48)
    else:
        break
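
A runnable version of the algorithm above might look like the following Python sketch. The function name ascii_to_int is made up for illustration, and the input is assumed to be an ordinary string rather than a byte buffer read from the user.

```python
def ascii_to_int(num_string):
    """Convert leading ASCII digits to an integer, stopping at the first non-digit."""
    num_int = 0
    for ch in num_string:
        code = ord(ch)
        if 48 <= code <= 57:                      # ASCII '0'..'9'
            num_int = 10 * num_int + (code - 48)
        else:
            break
    return num_int


print(ascii_to_int("271"))      # 271
print(ascii_to_int("42abc"))    # 42
```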

Macros vs Procedures

Procedures…	Macros…	Same?
are a separate, named section of code	are a separate, named section of code	Yes
are used to implement a module of program logic	are used to implement some small task or to simplify writing/reading a program.	No
have call / return mechanisms which modify the instruction pointer during runtime.	are replaced inline by the macro body as a preprocessing step.	No
may have parameters, passed in registers/on the stack.	may have arguments which replace parameter placeholders in the macro definition.	No
may be used multiple times without bloating the code segment.	are expanded every time they are called, possibly bloating the code segment.	No

MASM preprocessor implements inline expansion for MACRO

Local Directive in Procedures STILL REQUIRES RET

Creates the stack frame. Inserts two lines of code at the beginning of a procedure: PUSH EBP | MOV EBP, ESP
Terminates the stack frame. Inserts two lines of code at the end of a procedure: MOV ESP, EBP | POP EBP
Makes space on the stack for any local variables.
- Local variables are stored on the stack immediately past the old value of EBP, at lower addresses (e.g., [EBP - 4], [EBP - 8]).
- The LOCAL directive directly subtracts from ESP a number of bytes equivalent to the number of bytes of local variables you declare.
- If you declare two LOCAL DWORDs, the operation ADD ESP, -8 is performed.
Provides the software engineer with labels which reference these stack locations.

Module 9

An operand is an input to an expression (numeric values, variables, etc.), and an operator controls what the expression does with its operands (+, -, *, /, etc.).

Convert Infix to Postfix

When encountering an operand, dump it directly to the output.
When encountering an operator
- If operator (, push to stack
- If operator ), repeatedly pop top of stack to output until you hit the topmost ( then discard ) from input and ( from top of stack.
- If any other operator
  - If operator is higher precedence than the top of the stack, push to stack.
  - If operator is lower or equal precedence to the top of the stack, pop top of stack to output and repeat from “If any other operator”
At end of the input, dump the stack to output

The conditions above use a modified order of operator precedence, listed here from highest to lowest:

^ (Exponent)
*, / (Multiplication, Division)
+, - (Addition, Subtraction)
( (Opening Parenthesis)
= (Evaluation)
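
The conversion rules and the precedence list above can be sketched in Python as follows. The numeric precedence values are arbitrary (only their order matters), tokens are assumed to be separated by spaces, and the input is assumed to be well formed with balanced parentheses.

```python
PRECEDENCE = {"^": 4, "*": 3, "/": 3, "+": 2, "-": 2, "(": 1, "=": 0}


def infix_to_postfix(tokens):
    output, stack = [], []
    for tok in tokens:
        if tok == "(":
            stack.append(tok)
        elif tok == ")":
            while stack[-1] != "(":
                output.append(stack.pop())
            stack.pop()                       # discard the matching "("
        elif tok in PRECEDENCE:
            # Pop while the new operator is lower or equal precedence.
            while stack and PRECEDENCE[tok] <= PRECEDENCE[stack[-1]]:
                output.append(stack.pop())
            stack.append(tok)
        else:
            output.append(tok)                # operand goes straight to output
    while stack:                              # end of input: dump the stack
        output.append(stack.pop())
    return output


print(" ".join(infix_to_postfix("a + b * ( c - d ) / e".split())))
# a b c d - * e / +
```

Because the rules pop on equal precedence, this sketch treats every operator, including ^, as left-associative.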

Convert Postfix to Infix

Scan the Postfix String from Left to Right.
If the character is an Operand, write it down to the right of the previous Operand (leave space for Operators).
If the character is an Operator, bracket the two previous Operands with parentheses and insert the Operator between them.
Anything enclosed in parentheses is considered a single Operand.
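
The same procedure can be expressed with a stack, which plays the role of "the previous Operands". This Python sketch assumes space-separated tokens, binary operators only, and a well-formed postfix string.

```python
OPERATORS = {"^", "*", "/", "+", "-"}


def postfix_to_infix(tokens):
    stack = []
    for tok in tokens:
        if tok in OPERATORS:
            right = stack.pop()                        # most recent operand
            left = stack.pop()
            stack.append(f"({left} {tok} {right})")    # now a single operand
        else:
            stack.append(tok)
    return stack.pop()


print(postfix_to_infix("a b c d - * e / +".split()))
# (a + ((b * (c - d)) / e))
```

The result is fully parenthesized, so some of the parentheses are redundant under normal precedence rules.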

Module 10

Reduced Instruction Set Computers have a much smaller set of instructions at the ISA level, and their instructions are executed directly by hardware rather than being interpreted by a micro-program.

- As a tradeoff, RISC Assembly Language programs look much longer than CISC Assembly Language programs because CISC instructions do more on a per-instruction basis.

- However, RISC programs tend to execute more quickly due to the lack of micro-program execution overhead.

RISC Design Principles

Prioritize single-cycle instruction execution
- Instructions are executed directly by hardware (no micro-programs)
Maximize rate of Fetching instructions by using an Instruction Cache
- Fetch multiple instructions rather than just one
- Reduces idle time due to Instruction Fetch
- Allows deep instruction pipelining.
- May also be present in CISC architectures.
Only two memory-access instructions (LOAD and STORE)
- Compare to CISC, with its multiple MOV implementations, FPU memory access mechanisms, ALU memory access mechanisms, string primitives, etc…
- The decoupling of memory and processing allows for more efficient pipelining.
Simplify Instructions and use few Addressing Modes
- As a result, there are very few instructions and the CPU can process instructions quickly.
Another interesting aspect, which isn’t really a design principle, is that RISC designs tend to have more registers than CISC designs.

RISC Benefits

Physically smaller and less expensive (less silicon in the CPU integrated circuit itself) for the same level of performance, which allows the products that use them to be smaller and lower cost.
As an alternative to minimizing the CPU size, more circuitry can be added onto the CPU integrated circuit without it becoming too complicated to manufacture efficiently. This allows product designers to integrate electronic functions that would otherwise be separate parts, which allows even further reduction in product size and cost.
Lower complexity of the CPU circuitry boosts long-term reliability, so products can have a longer operating life.
Lower power consumption than their CISC equivalents, so the products using them can be more environmentally friendly due to their better energy efficiency. Battery-powered products get an added bonus. Designers can choose to keep the same batteries and have a longer run time, which helps the product user. Or designers can use smaller, less-expensive batteries for the same run time, which makes the product smaller and lower cost.
Lower operating temperature (directly related to the lower power consumption) means products can reduce the cost of mechanical components that manage heat dissipation (such as fans, cooling systems, heat sinks), which reduces product cost and complexity and increases product reliability.

Hardware Parallelism

Instruction Level Parallelism
- Instruction Caching – (a fundamental design method in RISC architectures) involves bringing a group of instructions into a cache, rather than fetching them individually. This makes these instructions available for immediate execution when the processor is ready for them. This method works great when the structure of the program is mostly sequential. Decision structures, repetition, procedure calls, or any place where execution might transfer to a different part of the program, cause it to lose efficiency.

Branching

Flags

Status Flag	Abbreviation	EFLAGS bit #	Description
Overflow	OF or OV or O	Bit 11	Used in signed-integer (two’s complement) arithmetic: Set if the integer result is too large a positive number or too small a negative number (excluding the sign-bit) to fit in the destination operand; cleared otherwise. This flag indicates an overflow condition.
Direction	DF or UP or D	Bit 10	Sets the direction in which ESI and EDI are stepped when executing string processing instructions: cleared steps forward (increment), set steps backward (decrement).
Interrupt Enable	IF or EI or I	Bit 9	When set, maskable hardware-generated interrupts will be issued. When cleared, they are masked (i.e., the interrupts will not be issued).
Sign	SF or PL or S	Bit 7	Set equal to the most-significant bit of the result, which is the sign bit of a signed integer. (0 indicates a positive value and 1 indicates a negative value.)
Zero	ZF or ZR or Z	Bit 6	Set if the result is zero; cleared otherwise.
Auxiliary Carry	AF or AC or A	Bit 4	Used in binary-coded decimal (BCD) arithmetic: Set if an arithmetic operation generates a carry or a borrow out of bit 3 of the result; cleared otherwise.
Parity	PF or PE or P	Bit 2	Set if the least-significant byte of the result contains an even number of 1 bits; cleared otherwise.
Carry	CF or CY or C	Bit 0	Used in unsigned arithmetic: Set if an arithmetic operation generates a carry or a borrow out of the most significant bit of the result; cleared otherwise. It is also used in multiple-precision arithmetic.
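
To make the flag definitions concrete, the following Python sketch computes the six arithmetic status flags for an 8-bit ADD. The function name add8_flags is hypothetical, the inputs are assumed to be in the range 0 to 255, and only addition is modeled.

```python
def add8_flags(a, b):
    """Add two 8-bit values and return (result, flags) following the table above."""
    total = a + b
    result = total & 0xFF
    flags = {
        "CF": int(total > 0xFF),                     # carry out of bit 7
        "ZF": int(result == 0),
        "SF": result >> 7,                           # copy of the most-significant bit
        # Overflow: both operands have the same sign and the result's sign differs.
        "OF": int(((a ^ result) & (b ^ result) & 0x80) != 0),
        "AF": int(((a & 0xF) + (b & 0xF)) > 0xF),    # carry out of bit 3
        "PF": int(bin(result).count("1") % 2 == 0),  # even number of 1 bits
    }
    return result, flags


print(add8_flags(0x7F, 0x01))   # signed overflow: 127 + 1 does not fit in a signed byte
print(add8_flags(0xFF, 0x01))   # unsigned carry: 255 + 1 wraps to 0
```

The first call sets OF but not CF, and the second sets CF but not OF, which is the usual way to see that the two flags track signed and unsigned results separately.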
