Unit 3 – Relational Database Design | DBMS SPPU

Unit 3 – Relational Database Design | DBMS SPPU

By Vinay Bhadane19 May 202612 min read

Introduction to the Relational Model

The Relational Model is the theoretical foundation of most modern Database Management Systems (DBMS). Proposed by Edgar F. Codd in 1970, this model organizes data into two-dimensional tables called relations. By separating the logical structure of data from its physical storage, the relational model provides a simple, highly mathematical, and structured way to store and retrieve information.

Unlike older hierarchical or network models, the relational model allows users to represent complex data relationships purely through table columns and values without needing complex pointers or rigid storage structures.

Basic Concepts of the Relational Model

To understand how relational databases work, one must first understand the standard terminology used in the relational model.

  • Relation: A relation is a two-dimensional table that contains data. It consists of rows and columns.

  • Tuple: A single row or record in a relation is called a tuple. It represents a single entity or instance of data (e.g., one specific student's record).

  • Attribute: A column in a relation is called an attribute. It represents a specific property or characteristic of the entity (e.g., Roll_Number, Name, Age).

  • Degree: The total number of attributes (columns) present in a relation is called its degree.

  • Cardinality: The total number of tuples (rows) present in a relation is called its cardinality.

  • Relational Schema: The logical design or structure of the relation, typically written as Table_Name(Attribute1, Attribute2, Attribute3).

  • Relational Instance: The actual data (rows) present in the table at any specific given moment.

Hinglish Explanation: Relational model mein, ek 'Relation' ka matlab ek simple table hota hai. Table ki row ko 'Tuple' kehte hain (jo ek poori record hoti hai), aur column ko 'Attribute' kehte hain. 'Degree' matlab kitne columns hain, aur 'Cardinality' matlab kitni rows hain.

Attributes and Domains

Every attribute in a relation must have a predefined set of permitted values. This set of allowed values is called the Domain of the attribute.

  • Definition: A domain is the original set of atomic values used to model data. For example, the domain of an attribute named "Month" must be integers from 1 to 12 or string values from "January" to "December".

  • Atomic Values: The values inside a domain must be indivisible. For example, a single cell in a table should not contain multiple phone numbers.

  • NULL Values: A special value called NULL is used to represent a missing, unknown, or not applicable value. NULL is not the same as zero or a blank space.

CODD’s 12 Rules for Relational Databases

To define what truly qualifies as a Relational Database Management System (RDBMS), E.F. Codd established a set of 13 rules (numbered 0 to 12). For a database to be considered strictly relational, it must follow these rules.

  • Rule 0: Foundation Rule: A relational database system must manage all its data entirely through its relational capabilities.

  • Rule 1: Information Rule: All information in the database is represented in exactly one way: as values in table cells.

  • Rule 2: Guaranteed Access Rule: Every single data element must be logically accessible using a combination of the Table Name, Primary Key, and Column Name. No data should be hidden.

  • Rule 3: Systematic Treatment of NULL Values: The database must support NULL values to represent missing or unknown information systematically, distinct from default values or zeros.

  • Rule 4: Dynamic Online Catalog: The database dictionary (metadata) must be stored in the form of relational tables and queried using the same SQL language used for regular data.

  • Rule 5: Comprehensive Data Sublanguage Rule: The system must support at least one well-defined language (like SQL) that supports data definition, data manipulation, integrity constraints, and transaction management.

  • Rule 6: View Updating Rule: Any view that is theoretically updatable must also be practically updatable by the system.

  • Rule 7: High-Level Insert, Update, and Delete: The system must support inserting, updating, and deleting multiple rows of data at a time, not just one row at a time.

  • Rule 8: Physical Data Independence: Changes made to the physical storage (like moving files to a different hard drive) must not change how the applications or users interact with the database.

  • Rule 9: Logical Data Independence: Changes to the logical schema (like adding a new column to a table) must not break existing applications that do not rely on the new column.

  • Rule 10: Integrity Independence: Integrity constraints (like primary keys and foreign keys) must be specified separately from application programs and stored in the database catalog.

  • Rule 11: Distribution Independence: If the database is distributed across multiple locations, it should remain invisible to the end-user. The query format must remain exactly the same.

  • Rule 12: Nonsubversion Rule: If the system provides a low-level (record-at-a-time) interface, that interface cannot be used to bypass the security and integrity rules defined at the high level.

Hinglish Explanation: Codd ke rules ek strict checklist ki tarah hain. Agar koi software khud ko RDBMS (Relational Database) kehta hai, toh use ye saare rules follow karne padenge. Jaise Rule 1 kehta hai ki sab kuch table format mein hona chahiye, aur Rule 2 kehta hai ki har data point ko Table Name + Column + ID se dhundha ja sake.

Relational Integrity Constraints

Integrity constraints are a set of rules applied to the database to ensure that the data stored is accurate, valid, and consistent.

1. Domain Integrity

Domain integrity ensures that all data items in a column fall within a defined set of valid values. It restricts the data type, format, and range of values.

  • Example: An attribute "Age" must be a numeric integer and cannot be negative.

2. Entity Integrity

Entity integrity states that no primary key attribute can contain a NULL value.

  • Reason: The primary key is used to uniquely identify each row in a table. If it is NULL, the database cannot identify that specific record.

3. Referential Integrity

Referential integrity maintains consistency between two related tables. It is enforced using Foreign Keys.

  • Rule: If a foreign key in Table B refers to a primary key in Table A, then the foreign key value must either exactly match an existing primary key value in Table A, or it must be NULL. You cannot have a foreign key pointing to a primary key that does not exist.

Hinglish Explanation: Referential integrity ka matlab hai ki agar ek table (Student) mein koi foreign key hai (jaise Department_ID), toh wo ID dusri table (Department) mein maujood honi chahiye. Aap kisi aise department ka ID nahi daal sakte jo exist hi nahi karta.

Database Design: Features of Good Relational Designs

A poorly designed database leads to redundant data and inconsistent results. A good relational database design should minimize data duplication and prevent "anomalies" (errors or inconsistencies) during data manipulation.

Types of Database Anomalies

If a database is not properly normalized, it will suffer from three major anomalies:

  1. Insertion Anomaly: Occurs when certain attributes cannot be inserted into the database without the presence of other attributes. For example, if a table stores both Student and Course details, you cannot add a new Course until at least one Student enrolls in it.

  2. Update Anomaly: Occurs when data is duplicated, and updating it in one place requires updating it in multiple places. If one instance is missed, the database becomes inconsistent.

  3. Deletion Anomaly: Occurs when deleting a row unintentionally deletes other important information. For example, deleting the last student enrolled in a course might also delete the details of the course itself.

Introduction to Normalization

Normalization is the systematic process of organizing data in a database to reduce redundancy (data duplication) and eliminate undesirable characteristics like insertion, update, and deletion anomalies.

The process involves dividing large, poorly structured tables into smaller, well-structured tables and defining relationships between them using foreign keys.

Functional Dependencies

Before learning normal forms, one must thoroughly understand Functional Dependency (FD).

A functional dependency is a constraint between two attributes in a relation. It is denoted as:

X -> Y

This is read as "X determines Y" or "Y is functionally dependent on X".

  • Determinant: The attribute on the left side (X) that determines the value of another attribute.

  • Dependent: The attribute on the right side (Y) whose value is determined by the determinant.

  • Example: In a Student table, Roll_Number -> Student_Name. If you know the Roll_Number, you can exactly determine the Student_Name.

Hinglish Explanation: Functional dependency (X -> Y) ka simple matlab hai ki agar mujhe X ki value pata hai, toh main Y ki value definitely bata sakta hu. Jaise Roll Number se Name pata chal jata hai. Yahan Roll Number 'Determinant' hai.

Atomic Domains and First Normal Form (1NF)

Definition of 1NF

A relation is in First Normal Form (1NF) if and only if every attribute in that relation contains only atomic (indivisible) values.

Rules for 1NF

  1. Each column must contain only single values (no arrays, lists, or multiple values separated by commas).

  2. Each column must have a unique name.

  3. The order in which data is stored does not matter.

Example and Decomposition

Unnormalized Table (Violates 1NF):

Here, the Subjects column contains multiple values, which violates atomic domains.

Roll_No

Name

Subjects

101

Alice

DBMS, OS

102

Bob

Networks

Converted to 1NF:

To decompose this into 1NF, we create a new row for each subject. Data redundancy increases slightly, but the atomic rule is satisfied.

Roll_No

Name

Subject

101

Alice

DBMS

101

Alice

OS

102

Bob

Networks

Second Normal Form (2NF)

Definition of 2NF

A relation is in Second Normal Form (2NF) if:

  1. It is already in 1NF.

  2. It contains NO Partial Dependencies.

What is Partial Dependency?

Partial dependency occurs when a non-prime attribute (an attribute that is not part of any candidate key) depends on only a part of a composite primary key, rather than the whole key.

If a table has a single-column primary key, and it is in 1NF, it is automatically in 2NF. 2NF issues only arise with composite primary keys (keys made of two or more columns).

Example and Decomposition

Assume a table storing student grades for courses.

Primary Key: {Student_ID, Course_ID}

Student_ID

Course_ID

Course_Name

Grade

S1

C1

DBMS

A

S1

C2

OS

B

The Problem:

  • Grade depends on both Student_ID and Course_ID. (Fully functionally dependent).

  • However, Course_Name depends ONLY on Course_ID. It does not depend on Student_ID.

  • Because Course_Name depends on only a part of the composite primary key, this is a Partial Dependency.

Converted to 2NF:

We decompose the table into two separate tables based on the functional dependencies.

Table 1: Course_Details

Course_ID (PK)

Course_Name

C1

DBMS

C2

OS

Table 2: Student_Grades

Student_ID (PK)

Course_ID (PK)

Grade

S1

C1

A

S1

C2

B

Hinglish Explanation: 2NF ka rule simple hai: Agar primary key do columns se mil kar bani hai, toh baaki saare columns ko primary key ke dono columns par depend hona chahiye. Agar koi column sirf ek aadhe primary key par depend karta hai, toh usko alag table mein nikal do.

Third Normal Form (3NF)

Definition of 3NF

A relation is in Third Normal Form (3NF) if:

  1. It is already in 2NF.

  2. It contains NO Transitive Dependencies.

What is Transitive Dependency?

Transitive dependency occurs when a non-prime attribute depends on another non-prime attribute.

Mathematically, if A -> B, and B -> C, then A -> C is a transitive dependency. In a database, every non-key column must depend only on the primary key, and nothing but the primary key.

Example and Decomposition

Assume a table storing employee details.

Primary Key: Employee_ID

Employee_ID

Emp_Name

Zip_Code

City

1001

John

422001

Nashik

1002

Mary

411001

Pune

The Problem:

  • Emp_Name and Zip_Code depend on Employee_ID.

  • However, City depends on Zip_Code (Zip_Code -> City).

  • Since Zip_Code is not a primary key, City depends on a non-prime attribute. This is a Transitive Dependency.

Converted to 3NF:

We decompose the table to remove the transitive dependency.

Table 1: Employee_Info

Employee_ID (PK)

Emp_Name

Zip_Code (FK)

1001

John

422001

1002

Mary

411001

Table 2: Location_Details

Zip_Code (PK)

City

422001

Nashik

411001

Pune

Hinglish Explanation: 3NF kehta hai ki table ka har column sirf aur sirf Primary Key par depend karna chahiye. Agar koi normal column kisi dusre normal column par depend kar raha hai (jaise City, Zip Code par depend kar raha hai), toh usko bahar nikal kar nayi table bana do.

Boyce-Codd Normal Form (BCNF)

Definition of BCNF

BCNF is an advanced, stricter version of 3NF. It is sometimes referred to as 3.5 Normal Form.

A relation is in BCNF if:

  1. It is in 3NF.

  2. For every non-trivial functional dependency X -> Y, X must be a Super Key.

Understanding the Difference Between 3NF and BCNF

A table can be in 3NF but fail BCNF if it contains overlapping candidate keys. In 3NF, it is acceptable for a non-prime attribute to determine a prime attribute. In BCNF, this is strictly forbidden. The rule is absolute: "The determinant must always be a candidate key."

Example and Decomposition

Assume a college scenario where:

  • A student can enroll in multiple subjects.

  • For each subject, a specific professor teaches the student.

  • Each professor teaches only one subject.

Student_ID

Subject

Professor

S1

DBMS

Prof. Smith

S1

OS

Prof. Allen

S2

DBMS

Prof. Smith

Candidate Keys: {Student_ID, Subject} or {Student_ID, Professor}

Functional Dependencies:

  1. {Student_ID, Subject} -> Professor (Student and subject determine the professor).

  2. Professor -> Subject (Because each professor teaches only one subject).

The Problem:

The table is in 3NF (no partial or transitive dependencies for non-prime attributes). However, look at the dependency: Professor -> Subject.

Here, Professor is determining Subject. But Professor is NOT a super key (one professor can teach multiple students, so professor alone cannot identify a unique row). This violates BCNF.

Converted to BCNF:

We decompose based on the violating functional dependency (Professor -> Subject).

Table 1: Professor_Subject

Professor (PK)

Subject

Prof. Smith

DBMS

Prof. Allen

OS

Table 2: Student_Professor

Student_ID (PK)

Professor (PK)

S1

Prof. Smith

S1

Prof. Allen

S2

Prof. Smith

Now, in Table 1, Professor is a super key. In Table 2, the combination of both is a super key. The BCNF rule is satisfied.

Hinglish Explanation: BCNF simple language mein ye kehta hai ki "Jo attribute kisi dusre ko determine kar raha hai (yani Left side of arrow), wo pakka koi primary key ya super key hona hi chahiye." Agar wo key nahi hai, toh table BCNF mein nahi hai.

Key Takeaways and Summary

  • Relational Model: Uses tables (relations), rows (tuples), and columns (attributes) to organize data.

  • CODD’s Rules: A set of 12 strict rules that define what makes a database a true Relational Database.

  • Integrity Constraints: Domain, Entity, and Referential integrities guarantee that the data remains accurate, valid, and structurally sound.

  • Anomalies: Poor design causes insertion, update, and deletion anomalies, which compromise data accuracy.

  • Normalization: The structural process of breaking down tables to eliminate redundancy using functional dependencies.

  • 1NF: Requires all attributes to have atomic, single values.

  • 2NF: Removes partial dependencies (requires full dependency on composite keys).

  • 3NF: Removes transitive dependencies (non-key attributes must not depend on other non-key attributes).

  • BCNF: A stricter 3NF where every determinant in a functional dependency must be a candidate/super key.

SEO Keywords Section

Search keywords related to this topic:

Relational Model in DBMS, Basic concepts of Relational Model, Attributes and Domains in DBMS, CODD's 12 Rules for Relational Databases, Relational Integrity constraints, Referential Integrity vs Entity Integrity, Features of Good Relational Database Design, Database Normalization explained, Insertion Update Deletion Anomalies, Atomic Domains and First Normal Form 1NF, Functional Dependencies in DBMS, Decomposition using Functional Dependencies, Second Normal Form 2NF partial dependency, Third Normal Form 3NF transitive dependency, Boyce-Codd Normal Form BCNF examples, difference between 3NF and BCNF, Computer Engineering DBMS notes, Diploma computer engineering database concepts, Relational schema and instances.

 

Download PDF Notes & Get Updates

Join our WhatsApp channel for free PDF downloads and instant notifications when new notes drop.

Join WhatsApp

Advertisement

Comments (0)

Sign in to join the discussion