Database Management Systems: Concepts and Best Practices

Key Definitions

  • Data: Information that a computer stores and records.
  • Record: Discrete piece of information in a register.
  • Registration: Information related to a product, event, or person.
  • File or Data File: A collection of related information, stored like a file. A file is a set of records.
  • Query: The search for a specific record or a request to select records that meet a set of criteria. There are select, delete, table creation, and update queries.
  • Report: Ordered list of selected fields and records in an easy-to-read format.

Concept of File

A file is a wealth of information structured in the same item, treated as a storage unit. Its format and file type determine how to interpret the information it contains, as it is only a set of 0s and 1s. Database files are homogeneous records containing information on each object in the database. Each piece of information in a record is called a field.

File Types

Permanent Archives

(Lifespan is at least as long as the application that uses them)

  • Master Files: Contain the current state of the data relevant to the application.
  • Constant Files: Their data are not changed, only queried.
  • Historical Archives: Keep a trace of the values of the files that interest us.

Temporary Archives

(Created and deleted during the execution of the application)

  • Intermediate Files: Saved results are used by another application process.
  • Maneuver Files: Data stored in main memory do not change.
  • Output File: Save data to be sent to an output device.

File Operations

  1. Creating and deleting data (mark the life of a file).
  2. Changing the design or structure of the database.
  3. Modification of records.
  4. Insertion/deletion of records.
  5. Information retrieval (query).
  6. Maintenance:
    • Repairs (check the records and repair errors if possible).
    • Compaction (eliminates gaps in the file produced by the deletion of records).

Data File Organization

  • Sequential: Keeps records contiguously. Records can only be added at the end, and querying requires reading the preceding records. Update operations rewrite the entire file.
  • Chained: All registers contain a pointer to the next. Insertion is also done at the end. Amendment does not rewrite deleted files and leaves room in the data file. The consultation is still sequential.
  • Index: The index file is processed sequentially, and each index stores the address where the log information is saved. The deletion is logical (it does not free the space occupied by the log information).
  • Direct or Random: We can calculate its position in the file based on the address.
    • Direct Routing: Use a numerical formula.
    • Partner Address: Stores the record address.
    • Calculated Address: The address is calculated by a hash function.

Sorting and Searching

In a file, it can only be ordered, not searched, for sequential management. Algorithms include direct exchange, mergers, and Quick Sort.

Other Aspects to Consider

  • Maintenance: Updates, screening, compaction, etc. If there is a lot of redundancy, maintenance will be expensive.
  • Privacy and Security: Withholding information from unauthorized persons to prevent unwanted changes.

Limitations of Traditional File Systems

  • Data redundancy.
  • Poor data control.
  • Poor data handling capabilities.
  • Excessive programming effort.

Characteristics of Data

  • Should not change with time (not be ephemeral).
  • Structured to facilitate data needs.
  • Operational and transactional, which can be used by operators to obtain certain results.
  • With a specific meaning (semantic sense).
  • Complete, i.e., must exist in reality.

Database Management System

A Database Management System is a coordinated set of programs, languages, or procedures that provide us with the means to describe, retrieve, and manipulate data while maintaining its integrity, confidentiality, and security.

Structure

USER <-> APPLICATION PROGRAM <-> DATABASE MANAGER <-> DATABASE

Functions

  • Description or Definition: Specifies the data in the database, its structure, and the relationships between them.
  • Handling: Allows us to find, delete, or modify data in the database.
  • Use: Provides the necessary interfaces to allow users to communicate with the database.

Basic Features of a Database

  • Integrity: The unification of several separate files can be viewed as a database, partially or totally eliminating redundancy between them.
  • Sharing: Individual parts of a database should be able to be shared among different users while maintaining data integrity.
  • Security and Confidentiality Restrictions: It has to create a hierarchy of access to maintain secure access to data, allowing or prohibiting users from performing certain actions on the database.
  • Multiple Data Views: It must be possible to retrieve the desired information from the database in the required format (lists, graphics, etc.).
  • Fault Protection: It is necessary to check data integrity at all times and create backups.
  • High-Level Interface: You must be able to access and modify data using SQL.

Advantages of Databases

Referring to Data

  1. Independent of those on treatment and vice versa.
  2. Better availability of them.
  3. Greater storage efficiency, coding, and entry.

Referring to Results

  1. Greater consistency.
  2. More informative.
  3. Improved and more standardized documentation of information.

Referring to Users

  1. Faster and easier access for end users.
  2. More facilities to share data among all users.
  3. Greater flexibility to meet changing demands.

Disadvantages of Databases

Relating to Implementation

  1. Costly in software and hardware equipment.
  2. Absence of standards.
  3. Long and difficult implementation.
  4. Medium-term profitability.

Relating to Users

  1. Personnel.
  2. Gap between theory and practice.

Database Architecture

(3 levels of abstraction)

  1. External: The highest level. Contains external schemas that describe the set of data that each user or group needs. The same data can have different views.
  2. Conceptual Level: Has a conceptual framework that describes the overall structure of the database with these features:
    1. Independent physical structure.
    2. Describes entities, attributes, relationships, and restrictions.
    3. Is defined from the logical data model.
  3. Internal Level: Describes the physical storage structure of the database. Influences the efficiency of the database.

Data Models

(3 types)

  1. Object-Based Logical Models: Entity-relationship models and object-oriented.
  2. Logical Models Based on Records: The most widely used is the relational model. The hierarchical model and the network model are in decline. Used to specify the overall logical structure of the database.
    1. Network Model: Made up of collections of related records with pointers or edges in graphs with no restrictions.
    2. Hierarchical Model: Like the network log, but the records are organized as collections of trees. Do not allow relations N:M. The same record cannot belong to two sets.
    3. Relational Model: Information is stored in tables where rows have the same structure (fields).
  3. Physical Data Models: Data models are based on old files. They are little used as a conceptual model of the database.

The Entity-Relationship Model

Based on a perception of the world composed of objects, called entities, and relations between them. Entities differ in their attributes.

ER Model Elements

  1. Entity: This object about which you want to store information in the database and should be characterized.
  2. Value: The association or correspondence between entities.
    • Name: Identifies the relationship and through which the reference is made.
    • Grade: Number of types of entities involved in a relationship.
    • Cardinality: Maximum/minimum.