As per Database wiki: “Database normalization is the process of structuring a relational database in accordance with a series of so-called normal forms in order to reduce data redundancy and improve data integrity. “

Ok, what is this saying? The same information should not be in 2 different places. Normalization is about not duplicating information. 

Since there are a lot of school admission samples available, let’s look at a scenario involving students, courses and payments.  There will be 3 entities involved in this example to illustrate the data.

We need a way to store student information, the courses they are enrolled in and the payments for those courses.  A student will sign up for a course and then need to pay for the course in order to be enrolled.

The Entities will be:

  • Student: all information related to a student
  • Course: all information related to a course
  • Enrollment: all information related to who is enrolled and what they are enrolled in, along with payment details.

Structuring Your Tables

Below is an example of two structures, the first one is the incorrect structure and the second will be the correct, normalized structure. 

Incorrect Structure

In this scenario tbl_Enrollment, has the name of the student and the name of the course they enrolled in. 

On the surface this seems sensible, however, there’s no value in duplicating information.  The student and course table already have this information so why duplicate it in the enrollment table.

Correct Structure

If we say, “I want to enroll and my information is found over here”, then we can create that through a query. 

If we put pointers between the entities we can tie the bits of information together.

CREATING A QUERY

What we want to do is store information about the transaction of purchasing and enrolling in a course.  Next we want to be able to find the information related to the entities and display it.  The following query illustrates how we find the information and what’s involved in making it visible.  Once we understand queries and how to connect the entities we can see how we can tie information together. 

 The query below says:

I know there’s a transaction that happened in tbl_Enrollment that is related to students and courses, find the information about the student and the course that is recorded in the transaction.

The database engine is smart enough to uncover information in many different places; we just have to tell it where the information is found.  By putting the Stud_ID in tbl_Enrollment we in effect are giving the Enrollment table this yellow sticky note that says: “Mr. Database Engine, here’s a little note that will lead you to where my other stuff is found. 

 Think of the tables as houses that rent rooms.  In this case the landlord who owns the house (tbl_Enrollment), does not want his house and rooms cluttered with stuff.  The Landlord is concerned with having to clean thing up and isn’t thrilled with spending valuable time managing clutter in his house.  The Landlord has a quiet little request of his tenant, you’re welcome to rent this room, but please keep all your stuff at your parent’s house.

 Joining Tables – Parent / Child Relationship

Understanding how tables are joined is the key to creating those sticky notes.  Joins are made by assigning a parent and child relationship through the use of Primary and Foreign keys.  Think of it like this: you’re a student, renting a room in this house, and when other people want to know about your grades, assignments, your age, and other details, you hand them this sticky note with your parents phone number on it.  “Here call my parents, they’ll give you this information.”

 This query represents the action of someone calling the number on the sticky note the student provides.

Joins tell the database engine what information needs to be returned through indicating the type of relationship between the parent and child.  We can ask for all the records from the parent table, or only records from the child table.  This is a subject for another time.

 Turning to the student enrolling in a course, the same query as above would display the following:

Joins tell the database engine what information needs to be returned through indicating the type of relationship between the parent and child.  We can ask for all the records from the parent table, or only records from the child table.  This is a subject for another time.

 Turning to the student enrolling in a course, the same query as above would display the following:

QUESTIONS TO ASK

 The purpose of Normalization is to keep the house clean so to speak.  When you’re designing a database always consider shortcuts and ask yourself the question, does this information exist somewhere in another table, if it does, then think sticky note.  The way I start building tables is by asking the question, “do these fields have anything to do with this type of entity?”

Eg:

  • Does enrolling in a course have anything to do with my student name ?  no
  • Does student name tell me about enrolling?  No
  • Does student name describe an enrollment?  No

 REAL LIFE EXAMPLE

 Entities are objects that have properties or attributes.  Using an unrelated example to make a point, let’s look at two objects, cars and drivers.  I think we can all agree that a car can not have a brain, and a driver couldn’t have a wheel.  This is no different than an enrollment can not have a student name.

This being said, if we want to associate a driver with a car, the driver will get insurance and the insurance paper will have the driver’s licence and the car’s VIN on it, and these are the sticky note pointers to the car and driver information.  When you’re pulled over by the Police, they could use the Driver’s License Number on the policy to look up your personal driving information, and the VIN to find the car information to make sure the car isn’t stolen.

 In conclusion, build the objects first with all the properties and attributes which that object should have.  Then ask the question how can I connect these two objects so they can talk to each other.  Keep it clean logical and efficient.