4.38M

Normalization

1.

Database Normalization
Learning Objectives
11.4.1.5 define the connections between tables in a
database(1-3NF)
Assessment criteria
-Brings connections to the third normal form (3NF)
-Knows rules of the third normal form (3NF)

2.

Normalization

3.

There are three goals of the normalization
process:
• eliminate redundant data (for example, storing the same data
in more than one table)
• ensure data dependencies make sense (only storing related
data in a table)
• no data is unnecessarily duplicated (i.e. the same data item
held in more than one table)

4.

What Is Database Normalization?
Store only the minimal amount of
information ('atomic' data).
Remove redundancies.
Remove anomalies.
Restructure data.

5.

There are a few rules for
database normalization.
Each rule is called a
"normal form."

6.

Normalizations forms

7.

Rules of First Normal Form
● Each table cell must contain one value (atomic
data)
● Each record must be unique
● Must be Primary key
● No repeated groups of attributes
!!!

8.

Unnormalised invoice table
OrderID
1
2
3
Date
29/09/2018
01/10/2018
16/10/2018
CustomerID
1
2
3
CustomerName
Fred Green
Thomas James
Christian Cox
CustomerAddress
Item
ID
It4mDescription ItemQuantity
1
Mouse
3
3
keyboard
4
2
Speakers
5
1
Mouse
8
2
Speakers
2
3
Keyboard
4
4
Microphone
10
21 Church Road
20 Oxford Street
41 Shakespeare
Road
ItemPrice
£
4.00
£
10.00
£
7.00
£
4.00
£
7.00
£
10.00
£
5.00

9.

1NF normalised table
The data are atomic
Remove repeating attributes (columns) from table.
Each record (row) needs a unique identifier
CustomerI CustomerNam CustomerAddre
ItemDescriptio ItemQuantit
Item ID
ItemPrice
D
e
ss
n
y
£
29/09/2018
1
Fred Green
21 Church Road
1
Mouse
3
4.00
£
29/09/2018
1
Fred Green
21 Church Road
3
keyboard
4
10.00
£
01/10/2018
2
Thomas James 20 Oxford Street
2
Speakers
5
7.00
£
01/10/2018
2
Thomas James 20 Oxford Street
1
Mouse
8
4.00
£
01/10/2018
2
Thomas James 20 Oxford Street
2
Speakers
2
7.00
41 Shakespeare
£
16/10/2018
3
Christian Cox
3
Keyboard
4
Road
10.00
41 Shakespeare
£
17/10/2018
3
Christian Cox
4
Microphone
10
Road
5.00
OrderID Date
1
1
2
2
2
3
3
Order ID and Item ID taken together are a unique identifier for each record so
we have a composite primary key

10.

1NF Orders Table
OrderID
Date
CustomerID
CustomerName
CustomerAddress
Item ID
Item Description
Item Quantity
Item price
Orders(OrderID, Date, CustomerID, CustomerName, CustomerAddress, Item ID,
Item Description, Item Quantity, Item price)

11.

Second normal form (2NF)
Need to be in 1NF (so data are atomic and no repeating
groups of attributes)
No partial key dependencies of non-key attributes

12.

Partial key dependencies in Orders
For each non key attribute we determine whether it is dependent
on any of the attributes that make up the primary key, which in
this case is order ID and Item ID.
OrderID
Date
CustomerID
CustomerName
CustomerAddress
Item ID
Item Description
Item Quantity
Item price
Non key
attributes
Order
ID
Item ID
Explanation
Date
Yes
No
Date is dependent on order ID, but exists independently of Item ID
Customer ID
No
No
CustomerName
No
No
CustomerAddre
ss
No
No
Item Description
No
Yes
Item description needs to have an Item ID but can exist without an orderID, so is not
dependent on order ID by is dependent Item ID
Item Quantity
Yes
Yes
Item quantity is dependent on Item ID because you cannot have a quantity of nothing
and it is dependent on Order ID.
Item Price
No
Yes
Item Price needs to have an Item ID but can exist without an orderID, so is not
dependent on order ID by is dependent Item ID
A customer can exist without an orderID and Item ID so is dependent on neither of these
A customer can exist without an orderID and Item ID so is dependent on neither of these
A customer can exist without an orderID and Item ID so is dependent on neither of these

13.

Splitting table into Orders and OrderItems
Orders(OrderID, Date, CustomerID,
CustomerName, CustomerAddress)
Non key
attributes
Order ID
Item ID
Date
Yes
No
Customer ID
No
No
CustomerName
No
No
No
No
No
Yes
Item Quantity
Yes
Yes
Item Price
No
Yes
OrderItems(OrderID, Item ID, Item Description, CustomerAddress
Item Quantity, Item price)
Item Description

14.

2NF – Two tables
Order
OrderID
Date
CustomerID
CustomerName
CustomerAddress
OrderItems
OrderID
Item ID
Item Description
Item Quantity
Item price
Orders(OrderID, Date, CustomerID, CustomerName, CustomerAddress)
OrderItems(OrderID, Item ID, Item Description, Item Quantity, Item price)
The relatoinship between the two tables is one-to-many. An order can have multiple
different items.

15.

2NF normalised table
OrderItems
OrderID
Orders
Item ID ItemQuantity
OrderID
1
1
3
1
3
4
2
2
5
2
1
8
2
2
2
3
3
4
3
4
10
composite primary key
Date
CustomerID CustomerName
1
29/09/2018
1
Fred Green
21 Church Road
2
01/10/2018
2
Thomas James
20 Oxford Street
3
16/10/2018
3
Christian Cox
41 Shakespeare
Road
Items
Item ID ItemDescription ItemPrice
As it stands the
database is in 2NF
CustomerAddres
s
1
Mouse
£
4.00
2
Speakers
£
7.00
3
Keyboard
£
10.00
4
Microphone
£
5.00

16.

2NF
The relationship between orderItems and Items is one-to-many. An Item can appear
in multiple orders.
Order
OrderID
Date
CustomerID
CustomerName
CustomerAddress
OrderItems
OrderID
Item ID
Item Quantity
Orders(OrderID, Date, CustomerID, CustomerName, CustomerAddress)
OrderItems(OrderID, Item ID, Item Quantity)
Items(Item ID, Item Description, Item price)
Items
Item ID
Item Description
Item price

17.

3NF - Third normal form
Need to be in 1NF (so data are atomic and no
repeating groups of attributes)
Need to be in 2NF (no partial key dependencies of
non-key attributes)
Eliminate non-Key dependencies or transitive
dependencies

18.

Transitive Dependency
Three attributes A, B, and C connected in such a way that A→B and B→C.
In other words A→C.

19.

Non key attributes
It is clear in our orders table that CustomerName and CustomerAddress are dependent on
CustomerID which is a non key attribute, so we need to create an additional table to make it
3NF.
Orders
CustomerAddres
s
Non key attributes
CustomerID
Date
No
1
29/09/2018
1
Fred Green
21 Church Road
CustomerName
Yes
2
01/10/2018
2
Thomas James
20 Oxford Street
CustomerAddress
Yes
3
16/10/2018
3
Christian Cox
41 Shakespeare
Road
OrderID
Date
CustomerID CustomerName
Non-key attribute

20.

3NF normalised database
OrderItems
OrderID
OrderID
Item ID ItemQuantity
1
1
3
1
3
4
2
2
5
2
1
8
2
2
2
3
3
4
4
10
3
Customers
Orders
1
2
3
Date
29/09/2018
01/10/2018
16/10/2018
CustomerID
CustomerID CustomerName
CustomerAddres
s
1
1
Fred Green
21 Church Road
2
Thomas James
20 Oxford Street
3
Christian Cox
41 Shakespeare
Road
2
3
Items
Item ID ItemDescription ItemPrice
1
Mouse
£
4.00
2
Speakers
£
7.00
3
Keyboard
£
10.00
4
Microphone
£
5.00

21.

3NF
Order
OrderID
Date
CustomerID
OrderItems
OrderID
Item ID
Item Quantity
Items
Item ID
Item Description
Item price
CustomerID
CustomerName
CustomerAddress
Orders(OrderID, Date, CustomerID)
OrderItems(OrderID, Item ID, Item Description, Item Quantity, Item price)
Items(Item ID, Item Description, Item price)
Customers(CustomerID, CustomerName, CustomerAddress)

22.

Is it 1NF?

23.

No. There are repeating groups (subject, subjectcost,
grade)
How can you make it 1NF?

24.

Create new rows so each cell contains only one value
But now look – is the studentID primary
key still valid?

25.

No – the studentID no longer uniquely identifies each row
You now need to declare studentID and subject
together to uniquely identify each row and
create composite key.
So the new key is StudentID and Subject.

26.

So. We now have 1NF.
Is it 2NF?

27.

Studentname and address are dependent on studentID
(which is part of the key)
This is good.
But they are not dependent
on Subject (the other part of
the key)

28.

And 2NF requires…
Partial dependence
(studentID + subject)

29.

So it’s not 2NF
How can we fix it?

30.

Make new tables
Make a new table for each primary key field
Give each new table its own primary key
Move columns from the original table to the new table that
matches their primary key…

31.

Step 1
STUDENT TABLE (key = StudentID)

32.

Step 2
STUDENT TABLE (key = StudentID)
SUBJECTS TABLE (key = Subject)

33.

Step 3
STUDENT TABLE (key = StudentID)
SUBJECTS TABLE (key = Subject)
RESULTS TABLE (key = StudentID+Subject)

34.

Step 3
STUDENT TABLE (key = StudentID)
SUBJECTS TABLE (key = Subject)
RESULTS TABLE (key = StudentID+Subject)

35.

A 2NF check
STUDENT TABLE (key = StudentID)
1
SUBJECTS TABLE (key = Subject)
8
8
1
RESULTS TABLE (key = StudentID+Subject)
SubjectCost is only
dependent on the
primary key,
Subject

36.

A 2NF check
STUDENT TABLE (key = StudentID)
1
SUBJECTS TABLE (key = Subject)
8
8
1
Grade is only dependent
on the primary key
(studentID + subject)
RESULTS TABLE (key = StudentID+Subject)

37.

A 2NF check
STUDENT TABLE (key = StudentID)
1
SUBJECTS TABLE (key = Subject)
1
8
8
So it is
2NF!
But is it 3NF?
RESULTS TABLE (key = StudentID+Subject)

38.

A 2NF check
STUDENT TABLE (key = StudentID)
Oh oh…
1
1
8
8
What?
SUBJECTS TABLE (key = Subject)
But is it 3NF?
RESULTS TABLE (key = StudentID+Subject)

39.

A 2NF check
STUDENT TABLE (key = StudentID)
SUBJECTS TABLE (key = Subject)
1
8
HouseName is
dependent on
both StudentID +
HouseColour
8
1
But is it 3NF?
RESULTS TABLE (key = StudentID+Subject)

40.

A 2NF check
STUDENT TABLE (key = StudentID)
SUBJECTS TABLE (key = Subject)
1
8
Or HouseColour is
dependent on
both StudentID +
HouseName
8
1
But is it 3NF?
RESULTS TABLE (key = StudentID+Subject)

41.

A 2NF check
STUDENT TABLE (key = StudentID)
SUBJECTS TABLE (key = Subject)
1
8
But either way,
non-key fields are
dependent on MORE
THAN THE PRIMARY
KEY (studentID)
8
1
But is it 3NF?
RESULTS TABLE (key = StudentID+Subject)

42.

A 2NF check
STUDENT TABLE (key = StudentID)
SUBJECTS TABLE (key = Subject)
1
8
And 3NF says that
non-key fields must
depend on nothing
but the key
8
1
But is it 3NF?
RESULTS TABLE (key = StudentID+Subject)

43.

A 3NF check
STUDENT TABLE (key = StudentID)
1
8
8
WHAT DO
WE DO?
RESULTS TABLE (key = StudentID+Subject)
SUBJECTS TABLE (key = Subject)
1

44.

Again, carve off the offending fields
1
SUBJECTS TABLE (key = Subject)
8
8
1
RESULTS TABLE (key = StudentID+Subject)

45.

8
A 3NF fix
1
1
1
8
8
SUBJECTS TABLE (key = Subject)
RESULTS TABLE (key = StudentID+Subject)

46.

8
A 3NF win!
1
8
8
1
1
SUBJECTS TABLE (key = Subject)
RESULTS TABLE (key = StudentID+Subject)
Or…

47.

The Reveal
Before…
After…
8
1
1
8
8
1
SUBJECTS TABLE (key = Subject)
RESULTS TABLE (key =
StudentID+Subject)

48.

Summary
Normal
Forms:
1 NF
2 NF
3 NF
Rules
No multivalued attributes or repeating
groups.(atomic)
Be in 1 NF plus no partial
dependencies
Be in 2 NF plus remove non-Key
dependencies or transitive
dependencies

49.

Formative Assessment 2

50.

Formative Assessment 3

51.

1NF

52.

2 normal form
Book
Author
BookAuthors

53.

3 normal form
Book
Author
BookAuthors
Publisher
Genre

54.

Formative Assessment 2

55.

Answer
1 NF

56.

2 NF

57.

3 NF

58.

References
https://sway.office.com/BkRBWLUUmZF9Ax7d?ref=Link&loc=play
More about Normalisation https://www.studytonight.com
Normalization on Wikibooks
Relationship example on BBC.co.uk
English     Русский Rules