Database Design: Relationships, Normalization, and Entity Modeling
Database Design Fundamentals
Types of Relationships
- One-to-One: An entity in set A is associated with at most one entity in set B, and vice versa. Example: A car’s VIN is linked to its license plate.
- One-to-Many: An entity in set A can be associated with multiple entities in set B, but an entity in set B can only be associated with one entity in set A. Example: A teacher (identified by employee ID) teaches multiple courses (identified by CRN).
- Many-to-One: An entity in set B can be associated with multiple entities in set A, but an entity in set A can only be associated with one entity in set B.
- Many-to-Many: Entities in both sets can be associated with multiple entities in the other set. Example: Students can enroll in multiple courses, and each course can have multiple students.
Data Dictionary
A data dictionary describes the meaning, structure, and other characteristics of data elements within a database. It serves as a valuable resource for understanding and managing the database.
Hierarchical Database
A hierarchical database system stores data in a tree-like structure. A parent node can have multiple child nodes. The top-level node is called the root, and nodes without children are called leaves. A key limitation of this model is its difficulty in handling data redundancy.
Entity
An entity represents a real-world object or concept and is a fundamental unit of data in a database. Entities have attributes that describe their characteristics. Examples: Student, Employee, Product.
Specialization and Generalization
Specialization creates lower-level entities from a subset of a higher-level entity, highlighting their differences. Generalization combines lower-level entities to form a higher-level entity, emphasizing their commonalities.
Normal Forms
- First Normal Form (1NF): Each field contains only one value. Example: Instead of a single “name” field, separate fields for first name, middle name, and last name.
- Second Normal Form (2NF): All non-key attributes must be fully dependent on the entire primary key, not just part of it. The table must also be in 1NF. Example: In a table with a composite key (customer ID and product ID), a product description should not be included as it depends only on the product ID.
- Third Normal Form (3NF): All non-key attributes must not depend on other non-key attributes. The table must also be in 2NF and 1NF. Example: In an address table, the city should not depend on the zip code, as both depend on the primary key.
MER Diagrams (Extended Entity-Relationship Diagrams)
Extended Entity-Relationship diagrams incorporate the concepts of subclasses and superclasses, along with specialization and generalization, to model complex relationships.
Superkey
A superkey is a set of one or more attributes that uniquely identifies an entity within an entity set. Example: A customer ID can uniquely identify a customer.
Candidate Key
A candidate key is a minimal superkey. When an entity has more than one candidate key, one is chosen as the primary key. Example: Product ID (primary key), or a unique product code.
Simple and Composite Attributes
Simple attributes are indivisible. Example: Gender. Composite attributes can be further divided into meaningful sub-attributes. Example: Birthdate (day, month, year).
Multivalued and Single-valued Attributes
Multivalued attributes can have multiple values for a single entity. Example: Phone numbers. Single-valued attributes can have only one value for a single entity. Example: Date of birth.
Cardinality and Participation Constraints
Cardinality specifies the maximum number of entities that can be related to another entity in a relationship. Participation constraints determine whether an entity’s existence depends on its relationship with another entity.
Transaction
A transaction is a sequence of operations treated as a single, indivisible unit of work. Example: Transferring money between bank accounts.