Database Concepts: From SQLite to Normalization
Q. Describe SQLite
SQLite is a lightweight, self-contained relational database management system (RDBMS) that doesn’t require a separate server process. It’s embedded directly into the end program, making it ideal for small to medium-sized applications, especially those with low to medium traffic. It supports standard SQL syntax and provides most of the features of a traditional database system, including transactions and ACID compliance. It’s widely used in mobile apps, embedded systems, and as an alternative data storage option for desktop applications.
Q. Describe Google Bigtable
Google Bigtable is a distributed storage system designed to handle massive amounts of structured data across thousands of commodity servers. It’s a NoSQL database that provides high scalability, reliability, and performance for storing and querying large datasets. Bigtable organizes data in tables, each indexed by a row key, column key, and timestamp. It’s ideal for applications requiring real-time analytics, time-series data processing, and high-throughput transactional workloads. Bigtable is used internally by Google for various services like Google Search, Maps, and Gmail, and it’s also available as a managed service on Google Cloud Platform for external users.
Q. 1NF
First Normal Form (1NF) is a fundamental concept in the field of database normalization. It lays down the foundational rules for organizing data in a relational database to minimize redundancy and inconsistencies. In 1NF:
- Atomic Values: Each attribute in a table must have atomic values, meaning each value in a column should be indivisible. There should be no repeating groups or arrays within a single cell.
- Unique Column Names: Every column in a table must have a unique name. This ensures that each attribute is distinctly identifiable within the table.
- Ordering: There’s no inherent ordering of rows or columns in 1NF. The data is considered a set of unordered tuples.
By adhering to 1NF, databases become more efficient, less prone to anomalies, and easier to manage and manipulate.
Q. 2NF
Second Normal Form (2NF) is a level of database normalization that builds upon the requirements of the First Normal Form (1NF). In order for a relation (table) to be in 2NF, it must meet the following criteria:
- It must be in 1NF.
- It should eliminate partial dependencies, meaning that no non-prime attribute (an attribute not part of any candidate key) should be dependent on only a portion of the primary key.
In simpler terms, 2NF ensures that every non-prime attribute in a table is fully functionally dependent on the entire primary key, not just part of it. This helps to avoid redundancy and anomalies in the database.
For example, consider a table representing orders, with columns for Order ID (primary key), Customer ID, Customer Name, and Product ID. If Customer Name is functionally dependent only on Customer ID (part of the primary key) and not on the entire primary key (Order ID), then it violates 2NF. To adhere to 2NF, you would need to separate the Customer Name into a separate table where it’s functionally dependent on the Customer ID alone.
Q. Describe BCNF
Boyce-Codd Normal Form (BCNF) is a further normalization concept in relational database design. It’s a higher level of normalization than the Third Normal Form (3NF). A relation (table) is in BCNF if and only if every determinant (attribute or set of attributes that uniquely determines other attributes) is a candidate key.
To clarify:
- It must be in 3NF: The relation must already be in the Third Normal Form.
- Every determinant must be a candidate key: This means that for every functional dependency X→Y, where X is a set of attributes and Y is a set of attributes, X must be a superkey. In simpler terms, every non-trivial functional dependency must be supported by a superkey.
BCNF aims to eliminate all redundancy and anomalies that may arise from functional dependencies. When a table is in BCNF, it’s considered to be in a state of optimal normalization.
For example, if you have a table of employees with attributes like Employee ID, Employee Name, and Department, where both {Employee ID} and {Employee Name} are candidate keys, BCNF requires that every non-trivial functional dependency be determined by one of these candidate keys. If, for instance, Department is functionally dependent only on Employee Name, then this table is not in BCNF because Employee Name is not a superkey. To conform to BCNF, you might need to decompose the table into separate tables.
Q. 4NF
Fourth Normal Form (4NF) is a level of database normalization that goes beyond Third Normal Form (3NF) and addresses certain types of multi-valued dependencies (MVDs) in a relation (table). A relation is in 4NF if it is in 3NF and contains no non-trivial multi-valued dependencies.
To clarify:
- It must be in 3NF: The relation must already be in the Third Normal Form.
- No non-trivial multi-valued dependencies: This means that there are no non-trivial dependencies between non-prime attributes (attributes not part of any candidate key) of the relation.
4NF helps to further reduce redundancy and anomalies in the database schema by eliminating certain types of dependency among attributes.
For example, consider a relation (table) storing information about books, with attributes for Book ID, Author, and Genre. If a book can have multiple authors and multiple genres, and the attributes Author and Genre are independent of each other (meaning changing the value of one does not affect the value of the other), then this table would have a multi-valued dependency. To conform to 4NF, you might decompose this table into separate tables to eliminate this dependency.