This is the third in a series of posts teaching normalization.
The second post focused on the first normal form, its definition, and examples to hammer it home.
Now it is time to take a look at the second normal form. I like to think the reason we place tables in 2nd normal form is to narrow them to a single purpose. Doing so brings clarity to the database design, makes it easier for us to describe and use a table, and tends to eliminate modification anomalies.
This stems from the primary key identifying the main topic at hand, such as identifying buildings, employees, or classes, and the columns, serving to add meaning through descriptive attributes.
EmployeeID isn’t much on its own, but add a name, height, hair color and age, and now you’re starting to describe a real person.
So what is the definition of 2nd normal form?
2NF – Second Normal Form Definition
A table is in 2nd Normal Form if:
- The table is in 1st normal form, and
- All the non-key columns are dependent on the table’s primary key.
We already know about the 1st normal form, but what about the second requirement? Let me try to explain.
The primary key provides a means to uniquely identify each row in a table. When we talk about columns depending on the primary key, we mean that in order to find a particular value, such as what color is Kris’ hair, you would first have to know the primary key, such as an
EmployeeID, to look up the answer.
Once you identify a table’s purpose, then look at each of the table’s columns and ask yourself, “Does this column serve to describe what the primary key identifies?”
- If you answer “yes,” then the column is dependent on the primary key and belongs in the table.
- If you answer “no,” then the column should be moved to a different table.
When all the columns relate to the primary key, they naturally share a common purpose, such as describing an employee. That is why I say that when a table is in second normal form, it has a single purpose, such as storing employee information.
Issues with our Example Data Model
So far, we have taken our example to the first normal form, and it has several issues.
The first issue is the
SalesStaffInformation table has two columns which aren’t dependent on the
EmployeeID. Though they are used to describe which office the
SalesPerson is based out of, the
OfficeNumber columns themselves don’t serve to describe who the employee is.
The second issue is that there are several attributes which don’t completely rely on the entire
Customer table primary key. For a given
customer, it doesn’t make sense that you should have to know both the
EmployeeID to find the
It stands to reason you should only need to know the
CustomerID. Given this, the
Customer table isn’t in 2nd normal form as there are columns that aren’t dependent on the full primary key. They should be moved to another table.
These issues are identified below in red.
Fix the Model to 2NF Standards
Since the columns identified in red aren’t completely dependent on the table’s primary key, it stands to reason they belong elsewhere. In both cases, the columns are moved to new tables.
In the case of
SalesOffice was created. A foreign key was then added to
SalesStaffInformation so we can still describe in which office a sales person is based.
The changes to make
Customer a second normal form table are a little trickier. Rather than move the offending columns
CustomerPostalCode to new table, recognize that the issue is
EmployeeID! The three columns don’t depend on this part of the key. Really, this table is trying to serve two purposes:
- To indicate which customers are called upon by each employee
- To identify customers and their locations.
For the moment, remove
EmployeeID from the table. Now the table’s purpose is clear, it is to identify and describe each
Now let’s create a table named
SalesStaffCustomer to describe which
customers a sales person calls upon. This table has two columns,
EmployeeID. Together, they form a primary key. Separately, they are foreign keys to the
SalesStaffInformation tables respectively.
With these changes made, the data model, in second normal form, is shown below:
To better visualize this, here are the tables with data.
As you review the data in the tables, notice that the redundancy is mostly eliminated. Also, see if you can find any update, insert, or deletion anomalies. Those too are gone. You can now eliminate all the sales people, yet retain customer records. Also, if all the
SalesOffices close, it doesn’t mean you have to delete the records containing sales people.
SalesStaffCustomer table is a strange one. It’s just all keys! This type of table is called an intersection table. An intersection table is useful when you need to model a many-to-many relationship.
Each column is a foreign key. If you look at the data model, you’ll notice that there is a one to many relationship to this table from
SalesStaffInformation and another from
Customer. In effect, the table allows you to bridge the two tables together.
For all practical purposes, this is a pretty workable database. Three out of the four tables are even in third normal form, but there is one table which still has a minor issue, preventing it from being so.
More tutorials are to follow! Remember! I want to remind you all that if you have other questions you want answered, then post a comment or tweet me. I’m here to help you. What other topics would you like to know more about?
Hello my name is Kris. I’m here because I am passionate about helping non-techie people to overcome their fear of learning SQL.
I know what it is like to not know where to start or whether the time spent learning is worth the effort. That is why I am here to help you to:
- Get started in an easy to follow step-by-step manner.
- Use your time wisely so you focus on what is important to learn to get the most value from your time.
- Answer your questions. Really! Just post a comment and I’ll respond. I’m here to help.
It wasn’t long ago that I was helping a colleague with some reporting. She didn’t know where to start and soon got overwhelmed and lost as she didn’t know SQL.
I felt really bad, as she was under pressure to get some summary information to her boss, the built-in reports were falling short, and to make them better would require her to know SQL. At that time that seemed impossible! It in dawned on me, it doesn’t have to be that way.
Then I discovered a way for anyone with the desire to easily learn SQL. I worked with my co-worker, started to teach her what I learned and soon she was able to write reports and answer her boss’ questions without getting stressed or ploughing hours into manipulating data in Excel.
It hasn’t always been easy. Sometimes the information seems abstract or too conceptual. In this case I’ve found out that a visual explanation is best. I really like to use diagrams or videos to explain hard-to-grasp ideas.
Having video, pictures, and text really help to reinforce the point and enable learning.
And now I want to help you get the same results.
The first step is simple, click here http://www.essentialsql.com/get-started-with-sql-server/