Linking, Loading, and Distributed Systems in Operating Systems
I. Linking & Loading
1. Syntactic and Semantic Analysis
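The parser performs syntactic analysis. Semantic analysis (for example, type checking) is a separate phase, although it is often driven by the parser as it recognizes each construct. Parsers can be generated automatically using tools like Yacc or Bison from a grammar, typically written in a notation based on Backus-Naur Form (BNF).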
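To make the role of the grammar concrete, here is a minimal sketch of a hand-written recursive descent parser. It is not what Yacc or Bison produce (they generate table-driven parsers), and the three-rule expression grammar, the function names, and the tuple-based tree are all assumptions chosen for illustration.

```python
import re

# Hypothetical grammar in a BNF-like notation (not taken from the notes):
#   <expr>   ::= <term>   { ("+" | "-") <term> }
#   <term>   ::= <factor> { ("*" | "/") <factor> }
#   <factor> ::= NUMBER | "(" <expr> ")"

def tokenize(text):
    # Lexical analysis: numbers and single-character operators.
    # Whitespace and any other characters are simply skipped in this sketch.
    return re.findall(r"\d+|[-+*/()]", text)

def parse(tokens):
    """Syntactic analysis: return a nested-tuple syntax tree or raise SyntaxError."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def factor():
        tok = take()
        if tok == "(":
            node = expr()
            if take() != ")":
                raise SyntaxError("expected ')'")
            return node
        if tok is not None and tok.isdigit():
            return int(tok)
        raise SyntaxError(f"unexpected token {tok!r}")

    def term():
        node = factor()
        while peek() in ("*", "/"):
            node = (take(), node, factor())
        return node

    def expr():
        node = term()
        while peek() in ("+", "-"):
            node = (take(), node, term())
        return node

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"trailing input starting at {peek()!r}")
    return tree

tree = parse(tokenize("1 + 2 * 3"))
```

Each grammar rule maps to one function, which is why the tree for "1 + 2 * 3" groups the multiplication below the addition.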
2. Intermediate Code
An example of intermediate code is Java bytecode, which the Java compiler stores in ‘.class’ files.
3. Background
The operating system (OS) is responsible for starting programs. For a program to execute, it must be loaded into memory within a process’s address space. User programs undergo several steps before execution. Linkers and loaders prepare the program for execution. They bind abstract names used by programmers to concrete numeric values, which are memory addresses.
4. Linker vs. Loader
Linkers and loaders perform related tasks, and both are highly sensitive to the architectural details of the CPU and OS. The loader handles program loading (copying the program from secondary storage to main memory) and relocation. The linker performs symbol resolution (resolving references between separately compiled subprograms by symbol name) and relocation (needed because each object file is generated as if its code started at address 0). Some systems combine both roles in a linking loader.
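The loader's two jobs can be sketched in a few lines. This is a toy model under stated assumptions: the program image is a list of integer words, and the list of offsets that hold addresses needing adjustment (`relocation_offsets`) is a hypothetical stand-in for real relocation information.

```python
def load(image, relocation_offsets, load_base):
    """Copy an image into 'memory' and relocate it for load_base."""
    memory = list(image)                 # loading: copy the program image
    for offset in relocation_offsets:    # relocation: patch each address field
        memory[offset] += load_base
    return memory

# Image generated as if it started at address 0; words 1 and 3 hold addresses.
image = [10, 3, 20, 0]
relocs = [1, 3]
loaded = load(image, relocs, load_base=1000)
```

Only the words listed in the relocation information change; the other words are treated as instructions or constants and are copied as they are.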
5. Binding Instructions and Data to Memory
- Compile Time: If the memory location is known beforehand, absolute code can be generated. However, recompilation is necessary if the starting location changes.
- Load Time: Relocatable code is generated when the memory location is unknown at compile time.
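- Execution Time: If the process can be moved from one memory segment to another during execution, binding is delayed until run time. This requires hardware support for address mapping, such as base and limit registers.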
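Execution-time binding can be illustrated with a relocation (base) register and a limit register. The sketch below is an assumption-laden model, not a description of any particular hardware: `Process`, `translate`, and the chosen addresses are hypothetical names and values.

```python
from dataclasses import dataclass

@dataclass
class Process:
    base: int    # physical address where the process currently resides
    limit: int   # size of the process's logical address space

def translate(proc, logical_address):
    """Map a logical address to a physical one on every memory access."""
    if not 0 <= logical_address < proc.limit:
        raise IndexError("address outside the process's address space")
    return proc.base + logical_address

proc = Process(base=4000, limit=100)
before_move = translate(proc, 20)

# The OS moves the process; only the base register changes, not the program.
proc.base = 9000
after_move = translate(proc, 20)
```

Because the addition happens at each access, the same logical address 20 remains valid after the move, which is what compile-time and load-time binding cannot offer.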
6. Two-Pass Linking
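The linker takes object files, libraries, and command files as input. Its output is an executable file and, optionally, a link/load map and a debug symbol file. Linkers often use a two-pass approach: the first pass scans the input files to collect segment sizes and the symbols each file defines and references, and the second pass uses that information to relocate the code, resolve the references, and write the executable file.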
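A minimal sketch of the two passes, assuming a made-up object-file representation: each module is a dictionary with a list of integer words (`code`), the symbols it defines (`defs`, as offsets), the places that refer to external symbols (`refs`), and the offsets of internal addresses (`relocs`). None of these field names come from a real object format.

```python
def link(modules):
    """Return (image, symbol_table) for a list of toy object modules."""
    # Pass 1: assign each module a base address and collect global symbols.
    symbol_table = {}
    bases = {}
    next_address = 0
    for module in modules:
        bases[module["name"]] = next_address
        for symbol, offset in module["defs"].items():
            if symbol in symbol_table:
                raise ValueError(f"duplicate symbol: {symbol}")
            symbol_table[symbol] = next_address + offset
        next_address += len(module["code"])

    # Pass 2: relocate internal addresses, patch external references, emit code.
    image = []
    for module in modules:
        code = list(module["code"])
        base = bases[module["name"]]
        for offset in module["relocs"]:
            code[offset] += base
        for offset, symbol in module["refs"]:
            if symbol not in symbol_table:
                raise ValueError(f"undefined symbol: {symbol}")
            code[offset] = symbol_table[symbol]
        image.extend(code)
    return image, symbol_table

main_o = {"name": "main.o", "code": [1, 0, 2],
          "defs": {"main": 0}, "refs": [(1, "helper")], "relocs": []}
util_o = {"name": "util.o", "code": [7, 1],
          "defs": {"helper": 0}, "refs": [], "relocs": [1]}

image, symbols = link([main_o, util_o])
```

The first pass is needed because `main.o` refers to `helper` before the linker has seen `util.o`; the returned symbol table plays the role of a simple link map.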
7. Object Files
Compilers and assemblers generate object files from source files. These files contain header information, object code (binary instructions and data), relocation information, symbols, and debugging information.
8. Libraries
A library is a collection of object modules. Dependencies can exist between libraries, and between modules within a library, so linking against libraries is an iterative process. The linker reads the object files within a library, looking for definitions of external symbols that the program still leaves undefined. If a library member defines such a symbol, the linker adds that object file to the program and adds the external symbols it references to the program’s list of external symbols. This process repeats until a pass adds no new external symbols or object files.
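The iteration described above is a fixed-point computation, sketched here under simplifying assumptions: each object is a dictionary with a `name`, the set of symbols it defines (`defs`), and the set it references (`refs`). The member names are invented for the example.

```python
def select_members(objects, library):
    """Return (selected object names, symbols still undefined)."""
    selected = [obj["name"] for obj in objects]
    defined, undefined = set(), set()
    for obj in objects:
        defined |= obj["defs"]
        undefined |= obj["refs"]
    undefined -= defined

    changed = True
    while changed:                       # repeat until a pass adds nothing
        changed = False
        for member in library:
            if member["name"] in selected:
                continue
            if member["defs"] & undefined:   # member defines a needed symbol
                selected.append(member["name"])
                defined |= member["defs"]
                undefined |= member["refs"]
                undefined -= defined
                changed = True
    return selected, undefined

program = [{"name": "main.o", "defs": {"main"}, "refs": {"printf"}}]
library = [
    {"name": "write.o", "defs": {"write"}, "refs": set()},
    {"name": "printf.o", "defs": {"printf"}, "refs": {"write"}},
    {"name": "unused.o", "defs": {"qsort"}, "refs": set()},
]
selected, unresolved = select_members(program, library)
```

In this example `write.o` is skipped on the first pass because nothing needs `write` yet; it is only pulled in on the second pass, after `printf.o` has introduced that reference. `unused.o` is never added.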
9. PE Sections
In Portable Executable (PE) format, each section has an address and size within the file, as well as a memory address and size. Each section also has read, write, and execute permissions. The linker creates the PE file for a specific target address, known as the image base.
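The two address/size pairs of a section can be related with a short sketch. The field names loosely follow the PE section header (VirtualAddress, VirtualSize, PointerToRawData, SizeOfRawData), but the section values, the image base, and the helper name `rva_to_file_offset` are assumptions made up for the example.

```python
from dataclasses import dataclass

@dataclass
class Section:
    name: str
    virtual_address: int   # address in memory, relative to the image base (RVA)
    virtual_size: int      # size in memory
    raw_offset: int        # position of the section's data in the file
    raw_size: int          # size of the section's data in the file
    permissions: str       # e.g. "r-x" or "rw-"

def rva_to_file_offset(sections, rva):
    """Find where a relative virtual address lives in the file, if anywhere."""
    for s in sections:
        if s.virtual_address <= rva < s.virtual_address + s.virtual_size:
            delta = rva - s.virtual_address
            if delta >= s.raw_size:
                return None            # zero-filled in memory, no bytes on disk
            return s.raw_offset + delta
    raise ValueError(f"RVA {rva:#x} is not inside any section")

IMAGE_BASE = 0x400000   # hypothetical target address chosen by the linker
sections = [
    Section(".text", 0x1000, 0x800, 0x400, 0x800, "r-x"),
    Section(".data", 0x2000, 0x400, 0xC00, 0x200, "rw-"),
]

file_offset = rva_to_file_offset(sections, 0x1010)
virtual_address = IMAGE_BASE + 0x1010
```

The memory size of a section can exceed its file size, as with `.data` here, in which case the tail exists only in memory.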
10. Shared Libraries
Shared libraries offer efficiency by avoiding the need to link a separate copy of the same library into every program. When a program starts, the loader locates the required libraries and maps them into the program’s address space. Systems typically share the libraries’ read-only pages among all processes that use them. Static shared libraries are bound to fixed addresses at link time, so each library must be assigned its own non-overlapping address range, and managing these address assignments is complex.
11. Dynamic Libraries
Dynamic libraries can be relocated to free address space, simplifying updates and sharing. Dynamic linking allows programs to load and unload routines during runtime. This leads to better memory utilization, as unused routines are never loaded. It’s particularly useful when handling infrequent cases that require substantial code.
12. Dynamic Linking Libraries (DLLs)
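Similar to ELF dynamic libraries, DLLs can be relocated if the required address space is unavailable. The dynamic linker is built into the operating system rather than being a separate program like ld.so; in current Windows versions the loader code runs mostly in user mode (in ntdll.dll), with the kernel mapping the image into memory. Lazy binding defers resolving an imported function until it is first needed at execution time. Each exported DLL function is identified by a numeric ordinal and, optionally, a name. Function addresses are stored in the Export Address Table.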
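Lookup by ordinal and by name can be sketched as follows. The layout mirrors the general idea of a PE export directory (a sorted name list, a parallel list of indices, and an address table indexed by ordinal minus a base), but the class, its method names, and the sample functions and addresses are hypothetical.

```python
import bisect

class ExportTable:
    def __init__(self, ordinal_base, addresses, names):
        # addresses: Export Address Table, indexed by (ordinal - ordinal_base)
        # names: mapping from exported name to an index into addresses
        self.ordinal_base = ordinal_base
        self.addresses = addresses
        self.sorted_names = sorted(names)
        self.name_indices = [names[n] for n in self.sorted_names]

    def by_ordinal(self, ordinal):
        index = ordinal - self.ordinal_base
        if not 0 <= index < len(self.addresses):
            raise KeyError(ordinal)
        return self.addresses[index]

    def by_name(self, name):
        # Names are kept sorted, so a binary search finds the entry.
        i = bisect.bisect_left(self.sorted_names, name)
        if i == len(self.sorted_names) or self.sorted_names[i] != name:
            raise KeyError(name)
        return self.addresses[self.name_indices[i]]

exports = ExportTable(
    ordinal_base=1,
    addresses=[0x1000, 0x1040, 0x1080],
    names={"Open": 0, "Close": 2},      # ordinal 2 is exported without a name
)
```

Importing by ordinal is a direct index, while importing by name costs a search but keeps working if the library's ordinals are renumbered.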
13. ELF Dynamic Libraries
Executable and Linkable Format (ELF) dynamic libraries can be loaded at any address because they are compiled as position-independent code (PIC). The Global Offset Table (GOT) holds pointers to the static data referenced in the program, so the code itself contains no absolute addresses that would need patching. Lazy procedure linkage is achieved using the Procedure Linkage Table (PLT), which resolves each function the first time it is called. The dynamic loader (ld.so) locates the library by its name, major version, and minor version.
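Lazy procedure linkage can be modelled in a few lines. This is only an analogy in Python, not how a PLT stub is implemented in machine code: `LazyLinker`, its `got` dictionary, and the `resolutions` counter are assumptions introduced to make the first-call behaviour observable.

```python
class LazyLinker:
    def __init__(self, library):
        self.library = library     # symbol name -> function, the "shared library"
        self.got = {}              # filled in as functions are resolved
        self.resolutions = 0       # how many times the slow path ran

    def call(self, name, *args):
        """Stand-in for a call through a PLT entry."""
        target = self.got.get(name)
        if target is None:
            target = self._resolve(name)   # first call: look the symbol up
        return target(*args)               # later calls: jump straight through

    def _resolve(self, name):
        self.resolutions += 1
        target = self.library[name]
        self.got[name] = target            # cache the address for next time
        return target

linker = LazyLinker({"square": lambda x: x * x, "never_called": lambda: None})
first = linker.call("square", 4)
second = linker.call("square", 5)
```

Functions that are never called, such as `never_called` here, are never resolved, which is the saving that lazy binding aims for.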
J. Networking
1. Distributed Systems
A distributed system is a collection of loosely coupled processors connected through a communication network. Benefits include:
- Resource sharing
- Computation speedup through load sharing
- Increased reliability
- Communication via message passing
Types of Distributed Systems:
- Network OS: Users are aware of multiple machines and access them via remote login (e.g., Telnet, SSH) or data transfer (e.g., FTP).
- Distributed OS: Users are unaware of multiple machines. Access to remote resources resembles local access. Features include data migration, computation migration, and process migration. Reasons for migrating a process include load balancing, computation speedup, hardware preference, software preference, and data access.
2. Local Area Network (LAN)
A LAN covers a small area, typically connecting workstations or PCs. It can be a multi-access bus, ring, or star network, operating at speeds of 100 Mb/s or higher. Broadcasting on a LAN is fast and inexpensive.
3. Wide Area Network (WAN)
A WAN connects geographically distant sites using point-to-point connections over long-haul lines. Broadcasting on a WAN usually requires multiple messages.
4. Network Topology
Systems can be physically connected in various ways, each with its own characteristics. Network topologies are evaluated based on:
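- Basic cost (link cost): how expensive it is to link the sites in the system
- Communication cost: how long it takes to send a message from one site to another
- Reliability: whether the remaining sites can still communicate if a link or site fails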
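The three criteria can be compared on small examples. The sketch below uses simple proxies that are assumptions of this example, not definitions from the notes: number of links for basic cost, worst-case hop count for communication cost, and survival of any single link failure for reliability. Site 0 is assumed to be the hub of the star.

```python
from collections import deque

def fully_connected(n):
    return {(a, b) for a in range(n) for b in range(a + 1, n)}

def ring(n):
    return {tuple(sorted((a, (a + 1) % n))) for a in range(n)}

def star(n):
    return {(0, a) for a in range(1, n)}

def max_hops(n, links):
    """Worst-case hop count between two sites, or None if disconnected."""
    neighbours = {site: set() for site in range(n)}
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)
    worst = 0
    for source in range(n):
        dist = {source: 0}
        queue = deque([source])
        while queue:                      # breadth-first search from source
            here = queue.popleft()
            for nxt in neighbours[here]:
                if nxt not in dist:
                    dist[nxt] = dist[here] + 1
                    queue.append(nxt)
        if len(dist) < n:
            return None
        worst = max(worst, max(dist.values()))
    return worst

def evaluate(n, links):
    return {
        "links": len(links),
        "max_hops": max_hops(n, links),
        "survives_one_link_failure": all(
            max_hops(n, links - {link}) is not None for link in links
        ),
    }

results = {name: evaluate(6, build(6))
           for name, build in [("full", fully_connected), ("ring", ring), ("star", star)]}
```

For six sites this shows the usual trade-off: the fully connected network is fastest and most robust but needs the most links, while the star needs the fewest links and fails as soon as one of them breaks.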