DBT | Notion

1. What is normalization? What is its need? Explain 1NF, 2NF, 3NF and BCNF in detail.

Normalization is a design technique used in relational databases to organize data to reduce redundancy and improve data integrity. Its main needs include:

• Eliminating Data Redundancy: Avoids duplicate data storage.

• Preventing Anomalies: Minimizes insertion, update, and deletion anomalies.

• Improving Data Consistency: Ensures that data modifications are propagated correctly throughout the database.

Normal Forms:

First Normal Form (1NF):

– Requires that each column contains atomic (indivisible) values and that each record is unique.

– For example, a table storing multiple phone numbers in a single field violates 1NF; splitting them into separate rows or columns makes the data atomic.
Second Normal Form (2NF):

– Builds on 1NF by ensuring that every non-key attribute is fully functionally dependent on the entire primary key (i.e., no partial dependency).

– This is particularly important for tables with composite primary keys. For instance, if a table’s primary key is (CourseID, StudentID), then attributes such as “StudentName” must depend on the entire key, not just one part.
Third Normal Form (3NF):

– Requires that the table is in 2NF and that no non-key attribute is transitively dependent on the primary key (i.e., no dependency on other non-key attributes).

– For example, if a table contains “StudentID”, “DepartmentID”, and “DepartmentName”, and “DepartmentName” depends on “DepartmentID” (which is not part of the primary key), then it violates 3NF.
Boyce-Codd Normal Form (BCNF):

– A stricter version of 3NF where every determinant (an attribute on which some other attribute is fully functionally dependent) must be a candidate key.

– BCNF addresses certain anomalies that 3NF might miss. For instance, in a table where one candidate key is “Professor” but a non-candidate attribute “Course” uniquely determines “Professor,” the table is in 3NF but not in BCNF.

2. Explain entity relationship diagram and explain all types of relationships with examples.

An Entity Relationship Diagram (ERD) is a visual representation of the data and its relationships within a system. It consists of:

Entities: Represent objects or concepts (e.g., Student, Course).
Attributes: Characteristics of entities (e.g., StudentID, StudentName).
Relationships: How entities are connected.

Types of Relationships:

One-to-One (1:1):

– Each instance of Entity A is related to one instance of Entity B and vice versa.

– Example: A person and their unique passport.
One-to-Many (1:N) / Many-to-One:

– One instance of Entity A can relate to many instances of Entity B, but each instance of Entity B is linked to only one instance of Entity A.

– Example: A department (A) has many employees (B), but each employee belongs to one department.
Many-to-Many (M:N):

– Multiple instances of Entity A can relate to multiple instances of Entity B.

– Example: Students and courses—students can enroll in multiple courses, and each course can have many students. This relationship is usually implemented via an associative table.