Normalization is a process that helps analysts or databases designers to design table structures for an application. The focus of normalization is to attempt to reduce redundant table data to the very minimum. Through the normalization the collection of data in a single table is replaced, by the same data being distributed over multiple tables with a specific relationship being setup between the tables. By this process RDBMS schema designers try their best to reduce table data to the very minimum.
Note:- It is essential to remember that redundant data cannot be reduced to zero in any database management system.
Having said this, when the process of normalization is applied to table data and this data is spread across several associated (i.e. a specific relationship bas been established) tables, it takes a query much longer to run and retrieve user data from the set of tables.
Hence, often in a commercial application after 100% normalization is carried out across the master tables often the table structures are de-normalized deliberately to make SQL queries run faster. This means that in commercial application there is a often a trade off between redundant table data and the speed of query execution.
Normalization is carried out for the following reasons:
1. To structure the data between tables so that data maintenance is simplified
2. To allow data retrieval at optimal speed
3. To simplify data maintenance through updates, inserts and deletes
4. To reduce the need to restructure tables as new application requirements arise
5. To improve the quality of design for an application by relationship by rationalization of table data
Normalization is a technique that:
1. Decomposes data into two-dimension tables
2. Eliminates any relationship in which table data does fully depend upon the primary key of a record
3. Eliminates any relationship that contains transitive dependencies
A description of the three forms of Normalization is as mentioned below.
First Normal Form
When a table is decomposed into two-dimensional tables with all repeating groups of data eliminated, the table data is said to be in its first normal form.
The repetitive portion of data belonging to the record is termed as repeating groups.
To understand the application of normalization to table data the following table structure will be taken as an example:
Note: 1-n indicated that there are many occurrences of this field - it is a repeating group.
Data held in the above table structure:
|Project Number||Project Name||Empoyee Number||Empoyee Name||Rate Category||Hourly Rate|
|P001||Using MySQL On Linux||E001||Sharanam Shah||A||7000|
|P001||Using MySL On Linux||E002||Vaishali Shah||B||6500|
|P001||Using MySQL On Linux||E006||Hansel Colaco||C||4500|
|P002||Using MySQL On Linux||E001||Sharanam Shah||A||7000|
|P002||Using MySQL On Linux||E007||Chhaya Banker||B||4000|
In the above data there are a few problems:
⦁ The Project Name in the second record is misspelled. This can be solved by removing duplicate. Do this using normalization
⦁ Data is repeated and thus occupies more space
A table is in 1st normal form it:
⦁ There are no repeating groups
⦁ All the key attributes are defined
⦁ All attributes are dependent on a primary key
So far there are no keys, and there are repeating groups. So remove the repeating groups, and define the primary key.
To convert a table to its First Normal Form:
1. The unnormalized data in the first table is the entire table
2. A key that will uniquely identify each record should be assigned to the table. This key has to be unique because it should be capable of identifying any specific row from the table for extracting information for use. This key is called the table's Primary key.
|Project number||Primary key|
|Employee number||Primary Key|
This table is now in 1st normal form.
Second Normal Form
A table is said to be in its second normal form when each record in the table is in the first normal form and each column in the record is fully dependent on its primary key.
A table is in 2nd form if:
⦁ It's in 1st normal form
⦁ It includes no partial dependencies (Where an attribute is dependent on only a part of a primary key)
The steps to convert a table to its Second Normal Form:
1. Find and remove fields that are related to the only part of the key
2. Group the removed items in the another table
3. Assign the new table with the key i.e. part of a whole composite key
Going through all the fields reveals the following:
⦁ Project name is only dependent on Project number
⦁ Employee name, Rate category and Hourly rate are dependent only on Employee number
To convert the table into the second normal form remove and place these fields in a separate table, with the key being that part of the original key they are dependent on.