What is normalization in database?

0 votes
in Computers & Internet by

3 Answers

0 votes
by
Normalization is the process of efficiently organizing data in a database. There are two goals of the normalization process: eliminating redundant data (for example, storing the same data in more than one table) and ensuring data dependencies make sense (only storing related data in a table). Both of these are worthy goals as they reduce the amount of space a database consumes and ensure that data is logically stored.

The Normal Forms

The database community has developed a series of guidelines for ensuring that databases are normalized. These are referred to as normal forms and are numbered from one (the lowest form of normalization, referred to as first normal form or 1NF) through five (fifth normal form or 5NF). In practical applications, you'll often see 1NF, 2NF, and 3NF along with the occasional 4NF. Fifth normal form is very rarely seen and won't be discussed in this article.

Before we begin our discussion of the normal forms, it's important to point out that they are guidelines and guidelines only. Occasionally, it becomes necessary to stray from them to meet practical business requirements. However, when variations take place, it's extremely important to evaluate any possible ramifications they could have on your system and account for possible inconsistencies. That said, let's explore the normal forms.

First Normal Form (1NF)

First normal form (1NF) sets the very basic rules for an organized database:

Eliminate duplicative columns from the same table.

Create separate tables for each group of related data and identify each row with a unique column or set of columns (the primary key).

Second Normal Form (2NF)

Second normal form (2NF) further addresses the concept of removing duplicative data:

Meet all the requirements of the first normal form.

Remove subsets of data that apply to multiple rows of a table and place them in separate tables.

Create relationships between these new tables and their predecessors through the use of foreign keys.

Third Normal Form (3NF)

Third normal form (3NF) goes one large step further:

Meet all the requirements of the second normal form.

Remove columns that are not dependent upon the primary key.

Fourth Normal Form (4NF)

Finally, the fourth normal form (4NF) has one additional requirement:

Meet all the requirements of the third normal form.

A relation is in 4NF if it has no multi-valued dependencies.

Remember, these normalization guidelines are cumulative. For a database to be in 2NF, it must first fulfill all the criteria of a 1NF database.
0 votes
by
Normalization is the process of efficiently organizing data in a database. There are two goals of the normalization process: eliminating redundant data (for example, storing the same data in more than one table) and ensuring data dependencies make sense (only storing related data in a table). Both of these are worthy goals as they reduce the amount of space a database consumes and ensure that data is logically stored
0 votes
by
Database normalization is a design technique by which relational database tables are structured in such a way as to make them less vulnerable to certain types of logical inconsistencies and anomalies. Tables can be normalized to varying degrees: relational database theory defines "normal forms" of successively higher degrees of stringency, so, for example, a table in third normal form is less open to logical inconsistencies and anomalies than a table that is only in second normal form

First normal form

    Main article: First normal form

    The criteria for first normal form (1NF) are:

        * A table must be guaranteed not to have any duplicate records; therefore it must have at least one candidate key.

        * There must be no repeating groups, i.e. no attributes which occur a different number of times on different records. For example, suppose that an employee can have multiple skills: a possible representation of employees' skills is {Employee ID, Skill1, Skill2, Skill3 ...}, where {Employee ID} is the unique identifier for a record. This representation would not be in 1NF.

[edit] Second normal form

    Main article: Second normal form

    The criteria for second normal form (2NF) are:

        * The table must be in 1NF.

        * None of the non-prime attributes of the table are functionally dependent on a part (proper subset) of a candidate key; in other words, all functional dependencies of non-prime attributes on candidate keys are full functional dependencies. For example, consider a "Department Members" table whose attributes are Department ID, Employee ID, and Employee Date of Birth; and suppose that an employee works in one or more departments. The combination of Department ID and Employee ID uniquely identifies records within the table. Given that Employee Date of Birth depends on only one of those attributes – namely, Employee ID – the table is not in 2NF.

        * Note that if none of a 1NF table's candidate keys are composite – i.e. every candidate key consists of just one attribute – then we can say immediately that the table is in 2NF

[edit] Third normal form

    Main article: Third normal form

    The criteria for third normal form (3NF) are:

        * The table must be in 2NF.

        * There are no non-trivial functional dependencies between non-prime attributes. A violation of 3NF would mean that at least one non-prime attribute is only indirectly dependent (transitively dependent) on a candidate key, by virtue of being functionally dependent on another non-prime attribute. For example, consider a "Departments" table whose attributes are Department ID, Department Name, Manager ID, and Manager Hire Date; and suppose that each manager can manage one or more departments. {Department ID} is a candidate key. Although Manager Hire Date is functionally dependent on {Department ID}, it is also functionally dependent on the non-prime attribute Manager ID. This means the table is not in 3NF.

[edit] Boyce-Codd normal form

    Main article: Boyce-Codd normal form

    The criteria for Boyce-Codd normal form (BCNF) are:

        * The table must be in 3NF.

        * Every non-trivial functional dependency must be a dependency on a superkey.

[edit] Fourth normal form

    Main article: Fourth normal form

    The criteria for fourth normal form (4NF) are:

        * The table must be in BCNF.

        * There must be no non-trivial multivalued dependencies on something other than a superkey. A BCNF table is said to be in 4NF if and only if all of its multivalued dependencies are functional dependencies.

[edit] Fifth normal form

    Main article: Fifth normal form

    The criteria for fifth normal form (5NF and also PJ/NF) are:

        * The table must be in 4NF.

        * There must be no non-trivial join dependencies that do not follow from the key constraints. A 4NF table is said to be in the 5NF if and only if every join dependency in it is implied by the candidate keys.

[edit] Domain/key normal form

    Main article: Domain/key normal form

    Domain/key normal form (or DKNF) requires that a table not be subject to any constraints other than domain constraints and key constraints.

[edit] Sixth normal form

It has been suggested that this section be split into a new article entitled Sixth normal form. (Discuss)

This normal form was, as of 2005, only recently proposed: the sixth normal form (6NF) was only defined when extending the relational model to take into account the temporal dimension. Unfortunately, most current SQL technologies as of 2005 do not take into account this work, and most temporal extensions to SQL are not relational. See work by Date, Darwen and Lorentzos[3] for a relational temporal extension, or see TSQL2 for a different approach
...