What is Normalization in Database Design?

What is Normalization in Database Design?

What is Normalization in Database Design?

Normalization is the process of organizing a relational database in a way that reduces redundancy and dependency by splitting large tables into smaller ones and ensuring relationships between them are maintained. This process helps to minimize the chances of anomalies like insert, update, and delete anomalies.

Normalization involves several steps known as normal forms (NF). The higher the normal form, the more normalized the database is. Below is a detailed breakdown of the first four normal forms:

1. First Normal Form (1NF)

A table is in 1NF if:

  • All columns contain atomic (indivisible) values.

  • Each column contains only one value per row.

  • Each column has a unique name.

  • The order in which data is stored does not matter.

Example of 1NF

Consider a table storing customer orders:

OrderIDCustomerNameItemsOrdered
1AliceApple, Banana
2BobOrange, Grapes, Mango

In the above table, the ItemsOrdered column contains multiple values (i.e., a list of items). This violates the rule of atomicity.

To convert this into 1NF, we would split the items into separate rows:

OrderIDCustomerNameItem
1AliceApple
1AliceBanana
2BobOrange
2BobGrapes
2BobMango

Now, each row has atomic values, and the table is in 1NF.

2. Second Normal Form (2NF)

A table is in 2NF if:

  • It is in 1NF.

  • All non-key columns are fully dependent on the primary key. There should be no partial dependency, meaning that no non-key column should depend on just part of a composite primary key.

Example of 2NF

Consider a table with a composite primary key (OrderID, Item):

OrderIDItemCustomerNameQuantity
1AppleAlice10
1BananaAlice5
2OrangeBob8

In this table, CustomerName depends only on OrderID (and not on Item), while Quantity depends on both OrderID and Item. This is a case of partial dependency.

To convert the table into 2NF, we remove the partial dependency by splitting the table into two:

Orders Table:

OrderIDCustomerName
1Alice
2Bob

OrderItems Table:

OrderIDItemQuantity
1Apple10
1Banana5
2Orange8

Now, the CustomerName depends only on the OrderID, and the Quantity depends on both OrderID and Item. This table is in 2NF.

3. Third Normal Form (3NF)

A table is in 3NF if:

  • It is in 2NF.

  • There is no transitive dependency, meaning no non-key column depends on another non-key column.

Example of 3NF

Consider the following table:

OrderIDCustomerNameCustomerAddressItemQuantity
1Alice123 Main StApple10
2Bob456 Elm StOrange8

In this table, CustomerAddress depends on CustomerName, which is a non-key column. This is a transitive dependency because CustomerAddress depends on CustomerName, which is not the primary key.

To convert this table into 3NF, we remove the transitive dependency by creating a new table for customer information:

Orders Table:

OrderIDCustomerNameItemQuantity
1AliceApple10
2BobOrange8

Customers Table:

CustomerNameCustomerAddress
Alice123 Main St
Bob456 Elm St

Now, CustomerAddress depends only on CustomerName, and the table is in 3NF.

4. Boyce-Codd Normal Form (BCNF)

A table is in BCNF if:

  • It is in 3NF.

  • Every determinant (a column that determines another column) is a candidate key.

Example of BCNF

Consider the following table:

CourseIDInstructorRoom
CS101Dr. Smith101
CS102Dr. Jones102
CS101Dr. Brown101

In this case, Room is dependent on CourseID, but Instructor is not a candidate key (it doesn’t uniquely identify a row). This violates BCNF, since a non-candidate key (Instructor) is determining another attribute (Room).

To convert this table into BCNF, we break the table into two:

Courses Table:

CourseIDInstructor
CS101Dr. Smith
CS102Dr. Jones

CourseRooms Table:

CourseIDRoom
CS101101
CS102102

Now, Instructor is not determining Room anymore, and both tables are in BCNF.

Summary of Normal Forms

Normal FormKey Condition
1NFAtomic values, no repeating groups
2NF1NF + no partial dependency (non-key attributes depend on the full primary key)
3NF2NF + no transitive dependency (non-key attributes don't depend on other non-key attributes)
BCNF3NF + every determinant is a candidate key

Normalization is a fundamental part of relational database design. It helps in minimizing redundancy, ensuring consistency, and improving the efficiency of database operations. However, higher normal forms may require additional tables and relationships, which can sometimes lead to performance trade-offs. Therefore, it's essential to strike a balance between normalization and performance needs when designing your database.

Soeng Souy

Soeng Souy

Website that learns and reads, PHP, Framework Laravel, How to and download Admin template sample source code free.

Post a Comment

CAN FEEDBACK
close