Normalization
Normalization is the process of organizing data in a relational database. The goal is to eliminate data redundancy.
It also reduces record locking, so more users can work with the database at the same time, which improves efficiency and concurrency.
Normalization is accomplished by dividing larger tables into smaller tables.
Normalization of Database
Database normalization is a technique for organizing the data in a database. It is a systematic approach of decomposing tables to eliminate data redundancy and undesirable characteristics such as insertion, update and deletion anomalies. It is a multi-step process that puts data into tabular form by removing duplicated data from the relation tables. Normalization is used mainly for two purposes:
- Eliminating redundant (useless) data.
- Ensuring data dependencies make sense, i.e. data is logically stored.
Problem Without Normalization
Without normalization, it becomes difficult to handle and update the database without risking data loss. Insertion, update and deletion anomalies are very frequent if a database is not normalized. To understand these anomalies, let us take the example of a Student table.
S_id | S_Name | S_Address | Subject_opted |
---|---|---|---|
401 | Adam | Noida | Bio |
402 | Alex | Panipat | Maths |
403 | Stuart | Jammu | Maths |
404 | Adam | Noida | Physics |
- Update anomaly: To update the address of a student who occurs twice or more in the table, we have to update the S_Address column in all of that student's rows, otherwise the data becomes inconsistent.
- Insertion anomaly: Suppose that for a new admission we have the student id (S_id), name and address, but the student has not opted for any subject yet. We cannot record the student without inserting NULL in Subject_opted.
- Deletion anomaly: If student (S_id) 401 has only one subject and temporarily drops it, deleting that row deletes the entire student record along with it.
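The update and deletion anomalies can be reproduced with a few lines of Python and the standard-library `sqlite3` module. This is only a minimal sketch: it assumes, as the text does, that the two Adam rows describe the same student, the new address 'Delhi' is a made-up value, and the deletion step uses Alex (402) because he has exactly one row in the sample data.

```python
import sqlite3

# In-memory database holding the unnormalized Student table from the example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student "
    "(S_id INTEGER, S_Name TEXT, S_Address TEXT, Subject_opted TEXT)"
)
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?, ?)",
    [
        (401, "Adam", "Noida", "Bio"),
        (402, "Alex", "Panipat", "Maths"),
        (403, "Stuart", "Jammu", "Maths"),
        (404, "Adam", "Noida", "Physics"),
    ],
)

# Update anomaly: only one of Adam's rows is changed
# ('Delhi' is a hypothetical new address), so he now has two addresses.
conn.execute("UPDATE Student SET S_Address = 'Delhi' WHERE S_id = 401")
adam_addresses = sorted(
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT S_Address FROM Student WHERE S_Name = 'Adam'"
    )
)
print(adam_addresses)

# Deletion anomaly: Alex drops his only subject, and deleting that row
# removes every trace of him, including his name and address.
conn.execute("DELETE FROM Student WHERE S_id = 402")
alex_rows = conn.execute(
    "SELECT COUNT(*) FROM Student WHERE S_Name = 'Alex'"
).fetchone()[0]
print(alex_rows)
```

After the partial update, the table holds two different addresses for Adam, and after the delete no row for Alex remains.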
Normalization Rule
Normalization rules are divided into the following normal forms.
- First Normal Form
- Second Normal Form
- Third Normal Form
- BCNF
First Normal Form (1NF)
As per First Normal Form, no row may contain a repeating group of information, i.e. each column must hold a single (atomic) value in every row. Each table should be organized into rows, and each row should have a primary key that distinguishes it as unique. The primary key is usually a single column, but sometimes more than one column can be combined to create a single primary key. For example, consider a table which is not in First Normal Form.
Student Table :
Student | Age | Subject |
---|---|---|
Adam | 15 | Biology, Maths |
Alex | 14 | Maths |
Stuart | 17 | Maths |
Student Table following 1NF will be :
Student | Age | Subject |
---|---|---|
Adam | 15 | Biology |
Adam | 15 | Maths |
Alex | 14 | Maths |
Stuart | 17 | Maths |
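The conversion shown in the two tables above can be sketched in Python. The function name `to_first_normal_form` is hypothetical, and the sketch assumes that multiple subjects are stored as a comma-separated string, as in the first table.

```python
# Rows of the Student table that is not in 1NF:
# the Subject column may hold several values at once.
unnormalized = [
    ("Adam", 15, "Biology, Maths"),
    ("Alex", 14, "Maths"),
    ("Stuart", 17, "Maths"),
]


def to_first_normal_form(rows):
    """Return one row per (student, subject) so each Subject value is atomic."""
    result = []
    for student, age, subjects in rows:
        for subject in subjects.split(","):
            result.append((student, age, subject.strip()))
    return result


first_nf = to_first_normal_form(unnormalized)
for row in first_nf:
    print(row)
```

Each printed row corresponds to one row of the 1NF table, with Adam appearing once per subject.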
Second Normal Form (2NF)
As per the Second Normal Form, there must not be any partial dependency of any column on the primary key. This means that for a table with a composite (concatenated) primary key, each column that is not part of the primary key must depend on the entire composite key. If any column depends on only one part of the composite key, the table fails Second Normal Form.
In the First Normal Form example there are two rows for Adam, to include the multiple subjects he has opted for. While this is searchable and follows First Normal Form, it is an inefficient use of space. Also, in that table the candidate key is {Student, Subject}, but Age depends only on the Student column, which violates Second Normal Form. To achieve Second Normal Form, split the subjects out into an independent table and match them up using the student names as foreign keys.
New Student Table following 2NF will be :
Student | Age |
---|---|
Adam | 15 |
Alex | 14 |
Stuart | 17 |
New Subject Table introduced for 2NF will be :
Student | Subject |
---|---|
Adam | Biology |
Adam | Maths |
Alex | Maths |
Stuart | Maths |
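The two tables above can be sketched with Python's standard-library `sqlite3` module. The column types and key declarations are assumptions, since the text does not give a schema. The join shows that the 1NF table can still be reconstructed from the two smaller tables, and the final update shows that a student's age is now stored in exactly one row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema: Student is keyed by name, Subject by the (Student, Subject) pair.
conn.executescript(
    """
    CREATE TABLE Student (
        Student TEXT PRIMARY KEY,
        Age     INTEGER
    );
    CREATE TABLE Subject (
        Student TEXT REFERENCES Student(Student),
        Subject TEXT,
        PRIMARY KEY (Student, Subject)
    );
    """
)
conn.executemany(
    "INSERT INTO Student VALUES (?, ?)",
    [("Adam", 15), ("Alex", 14), ("Stuart", 17)],
)
conn.executemany(
    "INSERT INTO Subject VALUES (?, ?)",
    [("Adam", "Biology"), ("Adam", "Maths"), ("Alex", "Maths"), ("Stuart", "Maths")],
)

# Joining on the foreign key rebuilds the rows of the 1NF table.
joined = conn.execute(
    """
    SELECT s.Student, s.Age, j.Subject
    FROM Student AS s
    JOIN Subject AS j ON j.Student = s.Student
    ORDER BY s.Student, j.Subject
    """
).fetchall()
for row in joined:
    print(row)

# Age depends only on Student, so changing it touches a single row.
rows_updated = conn.execute(
    "UPDATE Student SET Age = 16 WHERE Student = 'Adam'"
).rowcount
print(rows_updated)
```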
Third Normal Form (3NF)
Third Normal Form requires that every non-prime attribute of a table depend directly on the primary key, so any transitive functional dependency must be removed from the table. The table must also already be in Second Normal Form. For example, consider a table with the following fields, where Street, city and State depend on Zip, and Zip in turn depends on Student_id.
Student_Detail Table :
Student_id | Student_name | DOB | Street | city | State | Zip |
---|---|---|---|---|---|---|
New Student_Detail Table :
Student_id | Student_name | DOB | Zip |
---|---|---|---|
Address Table :
Zip | Street | city | State |
---|---|---|---|
The advantages of removing transitive dependencies are:
- The amount of data duplication is reduced.
- Data integrity is improved.
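Both advantages can be seen in a small `sqlite3` sketch of the split tables. The table name `Address`, the column types and every sample value below are hypothetical, since the text gives only column names. The sketch follows the example's premise that Street, city and State are determined by Zip.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Address (
        Zip    TEXT PRIMARY KEY,
        Street TEXT,
        city   TEXT,
        State  TEXT
    );
    CREATE TABLE Student_Detail (
        Student_id   INTEGER PRIMARY KEY,
        Student_name TEXT,
        DOB          TEXT,
        Zip          TEXT REFERENCES Address(Zip)
    );
    """
)

# Hypothetical sample data: two students share one Zip,
# but the address details are stored only once.
conn.execute(
    "INSERT INTO Address VALUES ('10001', 'Main Street', 'Springfield', 'Example State')"
)
conn.executemany(
    "INSERT INTO Student_Detail VALUES (?, ?, ?, ?)",
    [
        (1, "Student A", "2000-01-01", "10001"),
        (2, "Student B", "2000-02-02", "10001"),
    ],
)

# Correcting the city changes one Address row...
changed = conn.execute(
    "UPDATE Address SET city = 'Shelbyville' WHERE Zip = '10001'"
).rowcount

# ...and every student with that Zip sees the corrected value.
cities = conn.execute(
    """
    SELECT d.Student_id, a.city
    FROM Student_Detail AS d
    JOIN Address AS a ON a.Zip = d.Zip
    ORDER BY d.Student_id
    """
).fetchall()
print(changed, cities)
```

Because the city is stored once per Zip, the two students cannot end up with conflicting values for it.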