File Systems: Essential Concepts and Operations

Essential Conditions for Long-Term Information Storage

The essential conditions for the long-term storage of information are:

  • It should be possible to store a large amount of information.
  • The information must survive the completion of the process that uses it.
  • Several processes must be able to access the information concurrently.

The Solution: Files on External Media

The solution is to store information on disks and other external media in units called files:

  • Files should be persistent, i.e., they should not be affected by the creation or termination of a process.
  • A file is a named collection of data.
  • A file can be manipulated as a unit by operations such as open, close, create, destroy, copy, rename, list.
  • Individual data elements within the file can be manipulated by operations such as read, write, update, insert, delete.

File System: Part of Storage Management

The file system is the part of the storage management system mainly responsible for administering files on secondary storage.

It is also the part of the OS responsible for allowing the controlled sharing of the information in files.

Functions of the File System

  • Users should be able to create, modify, and delete files.
  • Users should be able to share files in a carefully controlled manner.
  • The mechanism for sharing files must provide several types of controlled access, e.g., read access, write access, execute access, and various combinations of these.
  • It should be possible to structure the files most appropriately for each application.
  • Users should be able to request the transfer of information between files.
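
The read, write, and execute access types mentioned above can be illustrated with POSIX-style permission bits. The following is a minimal sketch, assuming a Unix-like system; the helper name `describe_access` is hypothetical and exists only for illustration.

```python
import os
import stat
import tempfile

def describe_access(mode: int) -> str:
    """Return the owner's permissions as a string such as 'rw-'."""
    return "".join(
        flag if mode & bit else "-"
        for flag, bit in (("r", stat.S_IRUSR), ("w", stat.S_IWUSR), ("x", stat.S_IXUSR))
    )

# Create a scratch file and restrict it to owner read-only access.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    path = tmp.name
os.chmod(path, stat.S_IRUSR)
owner_bits = describe_access(os.stat(path).st_mode)

# Restore write access before cleaning up the scratch file.
os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
os.remove(path)
```

Here `describe_access(0o640)` yields `"rw-"`: the owner may read and write but not execute the file.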

Backup and Recovery

The file system must provide backup and recovery facilities to guard against:

  • The accidental loss of information.
  • The malicious destruction of information.

Symbolic Names and Device Independence

  • It should be possible to reference files using symbolic names, providing device independence.
  • In sensitive environments, the file system must provide opportunities for encryption and decryption.

User-Friendly Interface

The file system must provide a user-friendly interface:

  • It must provide a logical view of the data and of the functions to be executed, rather than a physical view.

The user should not have to worry about:

  • Particular devices.
  • Where data will be stored.
  • The data format on devices.
  • Physical means of transferring data to and from devices.

Components of a File System

The File System is an important component of an OS and usually contains:

  • Access methods: concerned with how the data stored in files is accessed.
  • File management: provides the mechanisms for files to be stored, referenced, shared, and secured.
  • Auxiliary storage management: allocates space to files on secondary storage devices.
  • File integrity: guarantees the integrity of the information stored in files.

The file system is primarily concerned with managing secondary storage space, mainly disk storage.

Organization of a File System

One way of organizing a file system is the following:

  • A root is used to indicate where on the disk the root directory begins.
  • The root directory points to the user directories.
  • A user directory contains one entry for each user’s files.
  • Each file entry points to the place on the disk where the referenced file is stored.
  • File names need only be unique within a given user directory.
  • The system name for a given file must be unique within the file system.
  • In a hierarchical file system, the system name for a file is usually formed as the path name from the root directory to the file.
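
The organization described above can be sketched with a toy in-memory structure. This is only an illustration under simplifying assumptions: directories are Python dicts, files are byte strings, and the names `root` and `resolve` are hypothetical. It shows how two users can each own a file called `notes.txt` while the full path name remains unique.

```python
# A toy in-memory file system: directories are dicts, files are byte strings.
root = {
    "alice": {"notes.txt": b"alice's notes"},
    "bob": {"notes.txt": b"bob's notes"},
}

def resolve(path: str):
    """Follow a path name from the root directory to a file or directory."""
    node = root
    for part in path.strip("/").split("/"):
        if part:  # an empty part means the path was just "/"
            node = node[part]  # raises KeyError if the entry does not exist
    return node
```

For example, `resolve("/alice/notes.txt")` and `resolve("/bob/notes.txt")` return different contents even though the file names are identical.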

File Naming

  • Exact rules for file names vary from system to system.
  • Some file systems distinguish between uppercase and lowercase letters, while others do not.
  • Many OS use file names with two parts, separated by a dot. The part after the dot is the file extension and usually indicates something about the file, although extensions are often mere conventions.
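
As a small illustration of two-part names, Python's standard `os.path.splitext` splits a name at its last dot. The file names used here are made up for the example, and the case-insensitive comparison is only an approximation of what a case-insensitive file system does.

```python
import os.path

# Split a name into its base part and its extension (the part after the last dot).
name, extension = os.path.splitext("report.final.txt")

# Whether two names refer to the same file depends on the file system;
# a case-insensitive comparison can be approximated with casefold().
same_ignoring_case = "README.TXT".casefold() == "readme.txt".casefold()
```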

Structure of a File

Files can be structured in various ways. The most common are:

Sequence of Bytes

  • The file is an unstructured series of bytes.
  • This structure offers maximum flexibility.
  • The OS neither helps nor interferes.

Sequence of Records

The file is a sequence of fixed-length records, each with its own internal structure.
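
A sequence of fixed-length records can be sketched with Python's `struct` module. The record layout below (a 4-byte id, a 16-byte name, and an 8-byte floating-point balance) is an assumption chosen for illustration, as are the helper names `pack_record` and `read_record`.

```python
import struct

# Hypothetical record layout: 4-byte unsigned id, 16-byte name, 8-byte float.
# The "<" prefix selects little-endian order with no padding between fields.
RECORD = struct.Struct("<I16sd")

def pack_record(rec_id: int, name: str, balance: float) -> bytes:
    """Encode one record; short names are padded with zero bytes."""
    return RECORD.pack(rec_id, name.encode("ascii"), balance)

def read_record(data: bytes, index: int):
    """Return record number `index` (0-based) from a buffer of packed records."""
    rec_id, raw_name, balance = RECORD.unpack_from(data, index * RECORD.size)
    return rec_id, raw_name.rstrip(b"\x00").decode("ascii"), balance

data = pack_record(1, "ana", 10.5) + pack_record(2, "luis", 99.0)
```

Because every record has the same size, record `i` always starts at byte offset `i * RECORD.size`.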

Tree

  • The file consists of a tree of records, not necessarily of the same length.
  • Each record has a key field in a fixed position within the record.
  • The tree is sorted by the key field to allow a quick search for a particular key.
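
The benefit of keeping records ordered by key can be sketched as follows. Note the substitution: a sorted list with binary search stands in for the tree, since both give a logarithmic-time lookup; the record contents and the helper name `find` are assumptions for illustration.

```python
import bisect

# Records of different lengths, each starting with its key field.
records = [(17, "hen"), (3, "ant"), (42, "zebra and friends"), (8, "cow")]
records.sort(key=lambda rec: rec[0])  # keep the records ordered by key
keys = [rec[0] for rec in records]

def find(key: int):
    """Binary search for `key`; return the record, or None if absent."""
    i = bisect.bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        return records[i]
    return None
```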

File Types

Many OS support several types of files, e.g., regular files, directories, character special files, and block special files, where:

  • Regular files are those that contain user information.
  • Directories are system files for maintaining a file system structure.

Character and Block Special Files

  • Character special files are related to I/O and are used to model serial I/O devices (terminals, printers, networks, etc.).
  • Block special files are used to model disks.
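
On systems that expose these file types, a program can ask which type a given path has. The sketch below uses Python's `os.stat` and the `stat` module; the helper name `file_type` is hypothetical.

```python
import os
import stat
import tempfile

def file_type(path: str) -> str:
    """Classify a path using the type bits of its mode."""
    mode = os.stat(path).st_mode
    if stat.S_ISREG(mode):
        return "regular"
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISCHR(mode):
        return "character special"
    if stat.S_ISBLK(mode):
        return "block special"
    return "other"

# Classify a scratch directory and a regular file created inside it.
with tempfile.TemporaryDirectory() as d:
    regular = os.path.join(d, "data.bin")
    open(regular, "wb").close()
    kinds = (file_type(d), file_type(regular))
```

On a typical Linux system, calling `file_type` on a terminal device such as `/dev/tty` would report a character special file.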

File Access

The most popular types of access are:

Sequential Access

The process reads all the records in the file in order, starting at the beginning, and cannot:

  • Skip records.
  • Read them in another order.

Random Access

The process can read the records in any order. Two methods are used to specify where reading starts:

  • Each read operation (read) gives the position in the file with which to start.
  • A special operation (seek) sets the working position; the file can then be read sequentially.
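
The difference between the two access types can be sketched with a temporary file. The 4-byte record size is an assumption for the example. The sketch uses the second method (seek, then read); the first method corresponds to positional reads such as `os.pread` on Unix-like systems, which is not shown here.

```python
import tempfile

RECORD_SIZE = 4  # assumed fixed record length in bytes

with tempfile.TemporaryFile() as f:
    f.write(b"rec0rec1rec2rec3")

    # Sequential access: rewind and read the records in order.
    f.seek(0)
    sequential = [f.read(RECORD_SIZE) for _ in range(4)]

    # Random access: seek sets the current position directly.
    f.seek(2 * RECORD_SIZE)
    third = f.read(RECORD_SIZE)
    fourth = f.read(RECORD_SIZE)  # reading continues sequentially after the seek
```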

Working with Files

The most common system calls related to files are:

  • Create: The file is created with no data.
  • Delete: If the file is no longer needed, it should be removed to free up disk space. Certain OS automatically delete any file that has not been used for N days.
  • Open: Before using a file, a process must open it. This lets the system load the attributes and the list of disk addresses into main memory for quick access on subsequent calls.
  • Close: When all accesses are finished, the attributes and disk addresses are no longer needed, so the file should be closed to free the internal table space.
  • Read: Data is read from the file. The caller must specify how much data is needed and provide a buffer to hold it.
  • Write: Data is written to the file at the current position. The file size may increase (when records are added) or stay the same (when records are updated).
  • Append: A restricted form of write. Data can only be added to the end of the file.
  • Seek: Moves the current position pointer to a specified location in the file.
  • Get attributes: Allows processes to obtain the file attributes.
  • Set attributes: Some attributes can be set by the user and modified after the file has been created. The protection mode and most of the flags are obvious examples.
  • Rename: Changes the name of an existing file.
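
Most of the calls above have close counterparts in Python's `os` module, which wraps the underlying system calls. The following is a minimal sketch assuming a Unix-like system; the file names are made up, and the mapping of each line to a call in the list is indicated in the comments.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
old_path = os.path.join(workdir, "draft.txt")
new_path = os.path.join(workdir, "final.txt")

fd = os.open(old_path, os.O_CREAT | os.O_RDWR, 0o600)  # create + open
os.write(fd, b"hello world")                           # write at current position
os.lseek(fd, 6, os.SEEK_SET)                           # seek to byte offset 6
tail = os.read(fd, 5)                                  # read up to 5 bytes
os.close(fd)                                           # close

fd = os.open(old_path, os.O_WRONLY | os.O_APPEND)      # open for append only
os.write(fd, b"!")                                     # lands at the end of the file
os.close(fd)

size = os.stat(old_path).st_size                       # get attributes
os.chmod(old_path, 0o400)                              # set attributes
os.rename(old_path, new_path)                          # rename
os.remove(new_path)                                    # delete
os.rmdir(workdir)
```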

Memory-Mapped Files

Some OS allow files to be mapped into the address space of a running process.

System Calls for Mapping and Unmapping

  • Map: Given a file name and a virtual address, the OS associates the file with that virtual address in the address space, so that reads and writes to the associated memory region are also carried out on the mapped file.
  • Unmap: Removes the file from the address space, ending the association.
  • File mapping eliminates the need to program the I/O explicitly, which simplifies programming.
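
Mapping and unmapping can be sketched with Python's standard `mmap` module. This is an illustration rather than a description of any particular OS interface: here the mapping is created from an open file descriptor instead of a file name, and the OS chooses the virtual address.

```python
import mmap
import os
import tempfile

fd, path = tempfile.mkstemp()
os.write(fd, b"hello world")

# Map: associate the file with a region of this process's address space.
with mmap.mmap(fd, 0) as region:   # length 0 maps the whole file
    first = bytes(region[:5])      # reading memory reads the file contents
    region[:5] = b"HELLO"          # writing memory modifies the mapped file
    region.flush()                 # ask the OS to write modified pages back
# Leaving the with-block closes the mapping (unmap).

# Read the file through ordinary I/O to confirm the change.
os.lseek(fd, 0, os.SEEK_SET)
on_disk = os.read(fd, 11)
os.close(fd)
os.remove(path)
```

No explicit read or write call is needed while the file is mapped; the explicit `flush()` requests that modified pages be written back rather than waiting for the OS to do so.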

Main Problems with Memory-Mapped Files

  • The length of the output file cannot be known in advance and could exceed the available memory.
  • Sharing mapped files without inconsistencies is difficult, because modifications made to the pages are not reflected on disk until those pages are evicted from memory.

Directories

Directories are generally used by the OS to keep track of files. In many systems, directories are themselves files.
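
As an illustration of a directory acting as the record of the files it contains, the sketch below creates a scratch directory and lists its entries with Python's `os.scandir`. The entry names are made up for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    os.mkdir(os.path.join(d, "sub"))
    open(os.path.join(d, "a.txt"), "w").close()

    # Each directory entry carries a name and lets the OS locate the object.
    with os.scandir(d) as entries:
        listing = sorted(
            (entry.name, "dir" if entry.is_dir() else "file")
            for entry in entries
        )
```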