The author selected Apache Software Foundation to receive a donation as part of the Write for DOnations program.
When working on a large SQL project, you must maintain the accuracy and consistency of data across all tables with foreign keys. A foreign key is a column or group of columns in a relational database table that provides a link between data in two tables. In this use case, this is where referential integrity comes into play. For instance, you can have an employees
table with a column named job_title_id
that refers back to a lookup table named job_titles
.
Another example can be demonstrated in an e-commerce database where you might create a category_id
column in a products
table that links back to a parent products_categories
table.
Referential integrity ensures all data references are valid and prevents inconsistent entries or orphaned records. Referential integrity is also useful for preventing entries of invalid data in a multi-user database environment.
In this guide, you’ll enforce referential integrity with foreign keys on your database. Although this guide is tested on a MySQL database, it can still work in other SQL-based databases with just a few syntax changes.
To complete this tutorial, you’ll require the following:
An Ubuntu 20.04 server secured with a non-root sudo
user and a basic firewall. Follow the Initial Server Setup for Ubuntu 20.04 guide to create a non-root user and enable a firewall.
A MySQL database server. Review the tutorial on How To Install MySQL on Ubuntu 20.04 server to set up and configure a database server.
In this step, you’ll create a sample database and set up a few tables. You’ll also insert some sample data that you’ll use to work with foreign keys throughout the guide.
Begin by connecting to your server as a non-root user. Then, execute the following command to log in to your MySQL server. Replace example_user
with the exact name for your non-root account.
- sudo mysql -u example_user -p
When prompted, enter the non-root user account password of your MySQL server and press ENTER
or RETURN
to proceed. Next, issue the following SQL command to create a sample company_db
database:
- CREATE DATABASE company_db;
Confirm the following output to make sure the database has been created without errors.
OutputQuery OK, 1 row affected (0.01 sec)
Once you’ve successfully created the database with no error messages in your output, apply the SQL USE
keyword to switch to your new company_db
database:
- USE company_db;
You should see the following confirmation that shows you’ve successfully switched to the company_db
database:
OutputDatabase changed
Next, set up a job_titles
table using the CREATE TABLE
command. This table works as a look-up table for all job titles available in your database. The job_title_id
is a primary key that uniquely identifies each job title in your database using the BIGINT
data type that can accommodate up to 2^63-1
records. You’re using the AUTO_INCREMENT
keyword to let MySQL automatically assign sequential numeric values every time you insert a new job title.
In the CREATE TABLE
command, include a job_title_name
column that stores a human-readable value for a job title. This column stores string values with the maximum length at 50
characters. You will define this data type with the syntax VARCHAR(50)
.
Following CREATE TABLE
command, instruct MySQL to use the InnoDB
database engine by including the ENGINE = InnoDB
keyword. This is a transaction-ready general-purpose storage engine that handles concurrency while still ensuring high reliability and high performance in your database application.
Execute the following command to create the job_titles
table:
- CREATE TABLE job_titles (
- job_title_id BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
- job_title_name VARCHAR(50)
- ) ENGINE = InnoDB;
After running the CREATE TABLE job_titles...
statement, make sure your command was completed successfully by confirming the following output:
OutputQuery OK, 0 rows affected (0.03 sec)
You now have a look up table for all valid positions available in your example company. Next, insert some sample positions into the job_titles
table:
- INSERT INTO job_titles (job_title_name) VALUES ('BRANCH MANAGER');
- INSERT INTO job_titles (job_title_name) VALUES ('CLERK');
- INSERT INTO job_titles (job_title_name) VALUES ('LEVEL 1 SUPERVISOR');
After each command, you should get the following confirmation message:
OutputQuery OK, 1 row affected (0.00 sec)
...
Now that you’ve inserted the available job titles, use the MySQL SELECT
keyword to query the job_titles
table to verify your data:
- SELECT
- job_title_id,
- job_title_name
- FROM job_titles;
You should now see a list of all available positions listed as shown below:
Output+--------------+--------------------+
| job_title_id | job_title_name |
+--------------+--------------------+
| 1 | BRANCH MANAGER |
| 2 | CLERK |
| 3 | LEVEL 1 SUPERVISOR |
+--------------+--------------------+
3 rows in set (0.00 sec)
Next, create an employees
table. This table holds records for all staff members in the company. The job_title_id
column in the employees
table points back to the same column in the job_titles
table. You’re achieving this by issuing the statement FOREIGN KEY (job_title_id) REFERENCES job_titles (job_title_id)
. For consistency purposes, you’re using the BIGINT
data type, which you used for the related columns.
In the following employees
table, the employees_id
is the PRIMARY KEY
and you’ve used the AUTO_INCREMENT
keyword to generated new employees_ids
as you insert new values.
You’re capturing the employees’ names using the first_name
and last_name
text fields with a maximum length of 50
characters. This data type is also perfect for the phone number. Therefore, a VARCHAR(50)
data type should work for the first_name
, last_name
, and phone
fields.
To improve the speed when retrieving data from the two interlinked tables, use the statement INDEX (job_title_id)
to index the job_title_id
column. Again, make sure to include the keyword ENGINE = InnoDB
to take advantage of the InnoDB
storage engine as outlined in Step 1.
To create the employees
table, run the following command:
- CREATE TABLE employees (
- employee_id BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
- job_title_id BIGINT NOT NULL,
- first_name VARCHAR(50),
- last_name VARCHAR(50),
- phone VARCHAR(50),
- INDEX (job_title_id),
- FOREIGN KEY (job_title_id) REFERENCES job_titles (job_title_id)
- ) ENGINE = InnoDB;
Ensure you get the following output confirming you’ve created the table:
OutputQuery OK, 0 rows affected (0.04 sec)
Now that you’ve set up the right database and tables for testing purposes, you’ll now look at what occurs next when inserting data into the table.
In this step, you’ll insert some orphaned records into the employees
table. Orphaned records in this case are records with invalid job_title_ids
. From your job_titles
table, you only have 3 valid job titles as shown below.
BRANCH MANAGER
CLERK
LEVEL 1 SUPERVISOR
Now, attempt adding some invalid records into the employees
table by running the following INSERT
statements:
- INSERT INTO employees (job_title_id, first_name, last_name, phone) VALUES (4, 'JOHN', 'DOE', '11111');
- INSERT INTO employees (job_title_id, first_name, last_name, phone) VALUES (15, 'MARY', 'SMITH', '22222');
- INSERT INTO employees (job_title_id, first_name, last_name, phone) VALUES (7, 'JANE', 'MIKE', '33333');
The above INSERT
statements should all fail and display the following errors since 4
, 15
, and 7
are invalid job_title_ids
.
OutputERROR 1452 (23000): Cannot add or update a child row: a foreign key constraint fails (`company_db`.`employees`, CONSTRAINT `employees_ibfk_1` FOREIGN KEY (`job_title_id`) REFERENCES `job_titles` (`job_title_id`))
...
In the next step, you’ll enter valid data into the employees
table and check if the commands succeed.
You’ve seen how referential integrity prevents the entry of invalid data when tables are interlinked with foreign keys. In other words, using foreign keys keeps your database in a consistent state even without necessarily coding that business logic in an external client application.
In this step, you’ll now insert valid data and see if the inserts will be successful. Run the following commands:
- INSERT INTO employees (job_title_id, first_name, last_name, phone) VALUES (2, 'PETER', 'SMITH', '55555');
- INSERT INTO employees (job_title_id, first_name, last_name, phone) VALUES (1, 'JOHN', 'DOE', '11111');
- INSERT INTO employees (job_title_id, first_name, last_name, phone) VALUES (2, 'STEVE', 'KIM', '66666');
- INSERT INTO employees (job_title_id, first_name, last_name, phone) VALUES (3, 'MARY', 'ROE', '22222');
- INSERT INTO employees (job_title_id, first_name, last_name, phone) VALUES (2, 'JANE', 'MIKE', '33333');
As you’re inserting valid job_title_ids
, the INSERT
statements will now succeed. After executing each INSERT
command, you’ll receive the following output:
OutputQuery OK, 1 row affected (0.00 sec)
...
By now, you’ll notice that implementing referential integrity is a useful approach for validating data and preventing the entry of non-existent records when working with interlinked tables. As well, by using foreign keys, you’re creating an optimized database that allows you to query interlinked data in an efficient manner.
For instance, to retrieve all staff members’ records with the position names spelled out, run the following JOIN
statement against the employees
and job_titles
tables.
- SELECT
- employee_id,
- employees.job_title_id,
- job_titles.job_title_name,
- first_name,
- last_name,
- phone
- FROM employees
- LEFT JOIN job_titles
- ON employees.job_title_id = job_titles.job_title_id;
In the output below, each employee’s information is now displayed alongside their associated role/position:
Output+-------------+--------------+--------------------+------------+-----------+-------+
| employee_id | job_title_id | job_title_name | first_name | last_name | phone |
+-------------+--------------+--------------------+------------+-----------+-------+
| 5 | 1 | BRANCH MANAGER | JOHN | DOE | 11111 |
| 4 | 2 | CLERK | PETER | SMITH | 55555 |
| 6 | 2 | CLERK | STEVE | KIM | 66666 |
| 8 | 2 | CLERK | JANE | MIKE | 33333 |
| 7 | 3 | LEVEL 1 SUPERVISOR | MARY | ROE | 22222 |
+-------------+--------------+--------------------+------------+-----------+-------+
5 rows in set (0.00 sec)
As you can see in the output above, you have one BRANCH MANAGER
, three of CLERK
, and one LEVEL 1 SUPERVISOR
.
Foreign keys are also great when it comes to preventing the deletion of parent records that are already referenced by a foreign key in a linked child table. Here are a few real-life examples where you can apply this:
In an e-commerce website, you can prevent accidental deletion of customer’s details from a customers
table when you’ve active orders for the customer in the sales
table.
In a library system, you can prevent deleting a student from a registers
table when the student has associated records in the issued_books
table.
In a bank, you can use the foreign keys approach to avoid deleting records from the savings_accounts
table when a client has already made some deposits/withdrawals in the savings_accounts_transactions
table.
Similarly, you can attempt to delete data in your table. On your command-line terminal, delete a single position from the job_titles
table:
- DELETE FROM job_titles
- WHERE job_title_id = 1 ;
Since you’ve already inserted a record in the employees
table with a title of a BRANCH MANAGER
, the DELETE
statement will fail and display the following error:
OutputERROR 1451 (23000): Cannot delete or update a parent row: a foreign key constraint fails (`company_db`.`employees`, CONSTRAINT `employees_ibfk_1` FOREIGN KEY (`job_title_id`) REFERENCES `job_titles` (`job_title_id`))
Again, add a new role into the job_titles
table:
- INSERT INTO job_titles (job_title_name) VALUES ('CEO');
You should receive the following output once you’ve executed the command successfully.
OutputQuery OK, 1 row affected (0.00 sec)
...
Again, query the job_titles
table to check the job_title_id
of the new position:
- SELECT
- job_title_id,
- job_title_name
- FROM job_titles;
You should now see a list of all available positions listed as shown below. The CEO
role has a job_title_id
of 4
:
Output+--------------+--------------------+
| job_title_id | job_title_name |
+--------------+--------------------+
| 1 | BRANCH MANAGER |
| 2 | CLERK |
| 3 | LEVEL 1 SUPERVISOR |
| 4 | CEO |
+--------------+--------------------+
4 rows in set (0.00 sec)
You now have 4 rows in the table. Next, delete the new role with job_title_id
of 4
before entering any associated record into the employees
table.
- DELETE FROM job_titles
- WHERE job_title_id = 4 ;
The DELETE
statement should now succeed.
OutputQuery OK, 1 row affected (0.00 sec)
After completing all the above tests without any errors, it is now clear that your foreign keys are working as expected.
In this guide, you’ve set up a sample database with interlinked tables and practiced the use of referential integrity in a relational database management system. You’ve seen how foreign keys are important in validating and preventing the deletion of data that would otherwise put the database in an inconsistent state. Use the knowledge in this guide towards your next database project to take advantage of foreign keys.
For more practice with your MySQL database, check out these tutorials:
Thanks for learning with the DigitalOcean Community. Check out our offerings for compute, storage, networking, and managed databases.
This textbox defaults to using Markdown to format your answer.
You can type !ref in this text area to quickly search our full set of tutorials, documentation & marketplace offerings and insert the link!