Relational Database Concepts and SQL Queries
Relational Algebra
Selection (σ): Horizontal tuples
Projection (π): Vertical tuples
Union Compatible: Same number of attributes, same domain, only unique tuples returned
- Union (∪): “or”, all
- Intersection (∩): “and”, similar
- Difference (-): “not”
Inner Joins
- Equijoin: Attribute used to join is repeated
- Natural Join: Gets rid of repeated attribute. Tuples without a matching attribute or null value are eliminated from the join result
Outer Joins
- Left Outer Join (LOJ): Keeps every tuple in the first or left relation. If no matching tuple in S, then filled with null
- Right Outer Join (ROJ): Keeps every tuple in the second or right relation. If no matching tuple in R, then filled with null
- Full Outer Join (FOJ): Keeps all tuples in both the left and right, no matching then null
SQL Queries
SELECT <column list> FROM <table list> WHERE <condition>
Aggregate Functions
GROUP BY expression
, HAVING condition
, ORDER BY column names (ASC def.)
COUNT
, AVG
, MIN
, MAX
, SUM
, AS
(aliases) before FROM
. Ignore nulls
LIKE wildcards:
%
many characters, ex.SELECT * FROM employees WHERE name LIKE '%G%'
_
single character, ex.SELECT * FROM employees WHERE name LIKE '_I_A'
IN
used to identify values in list or sub-query. Must use IS
operator for null values
Data Manipulation
INSERT
, INTO
, VALUES
, UPDATE
, SET
SELECT * FROM course C, department D WHERE C.dcode = D.dcode;
is exactly the same as SELECT * FROM course, department WHERE course.dcode = department.dcode;
Cartesian Product: SELECT * FROM table1, table2;
Equijoin: SELECT * FROM C INNER JOIN D ON C.dcode = D.dcode;
Natural Join: Attributes with the same name are required
SELECT * FROM table1 NATURAL JOIN table2;
OR SELECT * FROM table1 JOIN table2 USING attributes;
OR SELECT table1.*, table2.attr1, ...table2.attr# FROM table1 INNER JOIN table2 ON table1.joining_attr = table2.joining_attr;
LOJ: SELECT * FROM D LEFT OUTER JOIN C ON D.dcode = C.dcode;
Union/Intersection/Difference: SELECT * FROM R UNION/INTERSECT/MINUS SELECT * FROM S;
INSERT: INSERT INTO table (attributes) VALUES (values);
UPDATE: UPDATE table SET attribute = value [WHERE <search-condition>]
DELETE: DELETE FROM table [WHERE <search-condition>]
Functional Dependencies
Specify a relationship between attributes in a relations schema. A→B, A is the determinant, B is the dependent
Anomalies
Insert, Delete, Update. Decompose to get rid of them
Working with Functional Dependencies
Armstrong’s Axioms: Rules for systematically understanding what attributes are determined by what other attributes
Cover and Closure (Fc and F+):
- Cover: Minimal FDs that express all dependencies in F
- Closure: All possible FDs in F(F+)
Armstrong’s Axioms
- Reflexivity: Deals with trivial dependencies. Impossible not to satisfy and an FD is trivial if and only if the dependent is a subset of the determinant. If Y is a subset of X, then X→Y (and Y→X)
- Augmentation: We can add anything to the determinant and it will still determine the dependent. If X→Y, then {X,Z}→Y, {X,A,B,C,D,E,F}→Y. We can add the same attribute to both sides and still determine the dependent. {X,A}→{Y,A}
- Transitivity: If X→Y and Y→Z, then X→Z
- Union: If X→Y and X→Z, then X→{Y,Z}
- Decomposition: Opposite of Union
- Composition: If A→B and C→D, then {A,C}→{B,D}
- Pseudotransitivity: If X→Y and {Y,W}→Z, then {X,W}→Z
Derivation of Candidate Keys
Synthesis and Decomposition
Database Keys
Superkey: A set of one or more attributes, which taken collectively, uniquely identifies a tuple of a relation {uniqueness property}
Candidate Key: A superkey with no proper subset that uniquely identifies a tuple of a relation {uniqueness property + irreducibility}
Primary Key: A candidate key with no missing values for the constituent attributes {uniqueness property + irreducibility + entity integrity constraint}
Alternate Key: Any candidate key that is not serving the role of the primary key
Key Attribute: Any attribute that is a proper subset of a candidate key
Non-Key Attribute: Any attribute that is not a subset of a candidate key
Prime Attribute: Any attribute that is a proper subset of the primary key
Non-Prime Attribute: Any attribute that is not a proper subset of the primary key (except when it is a primary or candidate key)
Normal Forms
First Normal Form (1NF)
- No composite attributes are allowed (i.e., an attribute’s domain only allows for atomic values)
- An attribute value in a relation’s tuple must be a single value from the domain
- Each relation must be able to be identified by a primary key
Second Normal Form (2NF)
- The relation is in 1NF
- There are no Partial Dependencies
- A Partial Dependency exists a) if the relation has a composite primary key and b) if attributes in the relation depend only on part of the primary key and not the complete primary key
Third Normal Form (3NF)
- The relation is in 2NF
- There are no Transitive Dependencies
- A transitive dependency exists when a non-prime attribute depends on other non-prime attributes rather than depending upon the prime attributes or primary key
Boyce-Codd Normal Form (BCNF)
- The relation is in 3NF
- For a dependency A → B, A should be a super key
SQL Functions
Concatenation: Allows you to join multiple strings together using two vertical bars. SELECT 'My pilot is ' || oil_pilotname AS "Pilot Name" FROM pilots;
SUBSTR: SUBSTR (char, m [,n])
returns a portion of char, beginning at character m, n characters long (if n is omitted, to the end of char). The first position of char is 1. SELECT SUBSTR ('ABCDEFG', 3, 4) AS "Substring" FROM DUAL;
returns CDEF
Returns the length of the character string char. SELECT LENGTH ('Jones, John') FROM DUAL;
returns 11
LTRIM: (char [, set])
function removes unwanted characters from the left of char, with initial characters removed up to the first character not in set. If no set of characters is specified, set defaults to ‘ ‘ (a blank space) and the function trims off leading blank spaces. SELECT LTRIM('xxxXxxLAST WORD', 'x') AS "Left trim Example" FROM DUAL;
returns XxxLASTWORD. Is case sensitive
LPAD/RPAD: Allow you to “pad” the left (and right) side of a column or character string with a set of characters. LPAD/RPAD(string, length, [,'set'])
. string is the name of the character column (or a literal string), length is the total number of characters long that the result should be (i.e., its width), and set is the set of characters that do the padding
INSTR: Is used to return the numeric value of the location of a character string within a character column or character literal. INSTR(char1, char2 [,n[,m]])
. Its purpose is to locate the position of the mth occurrence of char2 in char1, beginning the search at position n
TO_CHAR
TO_DATE function is used to convert character strings to a valid date format
If the values are equal, the DECODE
function returns result; otherwise, default_value is returned. The DECODE
function allows you to perform if-then-else logic in SQL within a row
The CUBE
Operator subtotals all possible combinations of the group by columns. GROUP BY CUBE(attribute(s))
. RANK()
leaves a gap in the sequence when there is a tie, but DENSE_RANK()
leaves no gap