Normalization

Atomicity

An attribute is atomic, if and only if,

Examples

The attribute courseID, with values like ECE208, ECE250, ECE356, is not atomic in the context of a university database, since there are other departmental prefixes, such as SE, CS and MATH. However, in the context of a departmental database, where the only departmental prefix is ECE, the attribute courseID is atomic, since the division into course prefixes and course codes does not serve a major purpose given the context of the database.

Additionally, a datetime object in the MySQL database engine can be described as atomic. While it is divisible into second, minute, hour, day, month and year, this object type is defined in the MySQL database and is made up of distinct, indivisible attributes. Hence it can be considered atomic.

Simple Functions

A function is a simple function if and only if it is computable without the use of explicit mapping.

Examples

f(x)=x2 is a simple function.

f(x) as defined by the table

f(x)x
00
11
24
39

is not a simple function.

Non-Redundancy

An attribute is non-redundant if and only if it is neither an input to, nor the result of, a simple function.

In contrast, an attribute is redundant if and only if it is either an input to, or the result of, a simple function.

Examples

Consider a section relation defined with attributes as follows:

The attribute semester is a term code combined with the last two digits of the year. For eg. W21 represents "winter 2021".

Hence, there exists a simple function that can extract the year out of the semester.

This means neither of the attributes semester or year are non-redundant.

Lossless Decomposition

A decomposition of relation r into relations r1 and r2 is lossless if and only if

r=r1r2

In other words, it is a lossless decomposition if and only if joining relations r1 and r2 using all the attributes that are common in both results in the relation r.

More specifically, if relation r(R) is has attributes Ai for i=1,2,3,...,10:

R=(A1,A2,A3,...,An)

then, relations r1(R1) and r2(R2) derived from the lossless decomposition of relation r(R) are defined as:

R1=(Ai,Aj,...,Ak,Al)R2=(Ai,Aj,...,Ap,Aq)

where some attributes are in both R1 and R2 (i.e. Ai,Aj) and some attributes are in only R1 (i.e. Ak,Al) or in only R2 (i.e. Ap,Aq). Note that there are no attributes that are not in either of R1 and R2.

Functional Dependencies

Given attributes in a relation, a functional dependency is a function that uniquely maps the values of a set of attributes to the values of another set of attributes. The notation for functional dependencies is as follows:

a1,a2,...,anb1,b2,...,bm

Formal Definition

Consider a relation schema R, and α, β as subsets of attributes of R:

αR,βR

Then, the functional dependency,

αβ

holds if and only if, for any relation r(R), whenever any two tuples t1 and t2 agree on attributes α, they also agree on attributes β. That is,

t1[α]=t2[α]t1[β]=t2[β]

Examples

Because everyone in Canada has a Social Insurance Number (SIN), there exists a functional dependency such that SIN Name. This is because a person's social insurance number functionally determines their name. In other words, if we know the SIN, we know their name.

Properties

Identity

Reflexivity

Augmentation

Transitivity

Closures

Given a set of functional dependencies F, the closure of F, denoted as F+ is defined as the set of all possible functional dependencies logically implied by F using the properties of functional dependencies. This also implies FF+

Attribute-Set Closures

Given a relation r, a set of functional dependencies F and a set of attributes α, the attribute-set closure α+ is defined as the set of all attributes that can be logically determined using the attributes in α.

Extraneous Attributes

Given a set of functional dependencies F, functional dependency (αβ)F, an attribute Aα is extraneous in αβ if and only if,

F(F{αβ}){(αA)β}

In other words, A is not a necessary attribute in the functional dependency αβ.

Examples

Consider a set of functional dependencies F:

F={AC,ABC}

Here, B is an extraneous attribute in ABC, since we already know that AC. Hence B is not needed in that functional dependency.

Canonical Covers

Given a set of functional dependencies F, the canonical cover of F, is a minimal set of functional dependencies Fc such that

Examples

Consider a set of functional dependencies F:

F={AB,BC,AC}

We know that,

AB,BCAC

Hence, AC is a redundant functional dependency, which means F is not a canonical cover. However, Fc defined as,

Fc={AB,BC}

is a canonical cover.

Keys

Super Keys

Consider the set of all attributes R in relation r, and set of attributes αR. Then, α is a super key of r if and only if α+R.

In other words, in order for α to be a super key, all the attributes in r must be logically determinable by the attributes in α.

Candidate Keys

If α is a super key of r, and there exists no βα that is also a super key, then α is a candidate key of r.

In other words, α is a candidate key if and only if removing any attribute from α will result in it losing its super key status.

Determining Candidate Key Attributes

Consider the set of all attributes R in relation r and a canonical cover Fc. We can make the following statements about candidate keys for r.

Required Present

Any attribute that does not appear on the right-hand side of any functional dependency fFc must be present in every candidate key of r.

Required Absent 1

Any attribute that does not appear on the left-hand side of any functional dependency fFc but appears on the right-hand side of at least one functional dependency fFc must be absent in every candidate key of r.

Required Absent 2

Any attribute that appears on the right-hand side of any functional dependency fFc where all the attributes on the left-hand side are required, must be absent in every candidate key of r.

Maybe Present

Any attribute that appears on the left-hand side of some functional dependency fFc and the right hand-side of some other functional dependency fFc must be absent in at least one candidate key of r.

First Normal Form

Given a relation r with the set of all attributes R, r is in first normal form if and only if every attribute in R is atomic. In other words, if any of the attributes can be divided into multiple attributes for major uses within the context of the database and is not a type that is supported as multiple values in the database engine, the relation r is not in first normal form.

Third Normal Form

Given a relation r with the set of all attributes R and a set of functional dependencies F, r is in third normal form if and only if, for every functional dependency αβ in F, at least one of the following is true:

Intuitively, third normal form is just enforcing a table to describe a single entity. By requiring every functional dependency to either have α as a super key or have every attribute in β in at least one candidate key, it is enforcing that every attribute in this relation is a part of the same entity and cannot be neatly broken down into multiple entities.

Third Normal Form Decomposition

The steps for decomposing relation r to multiple relations that are in third normal form are as follows:

Boyce-Codd Normal Form

Given a relation r with the set of all attributes R and a set of functional dependencies F, r is in Boyce-Codd normal form if and only if, for every functional dependency αβ in F, at least one of the following is true:

Boyce-Codd Normal Form Decomposition

The steps for decomposing relation r to multiple relations that are in Boyce-Codd normal form are as follows: