Programming languages typically feature conditional statements, which are commands that perform a specified action until a certain condition is met. A common conditional statement is the if, then, else
statement, which generally follows this logic:
if condition=true
then action A
else action B
The logic of this statement translates into the following language: “If condition is true, then perform action A. Otherwise (else), perform action B.”
CASE
expressions are a feature in Structured Query Language (SQL) that allow you to apply similar logic to database queries and set conditions on how you want to return or display the values in your result set.
In this tutorial, you’ll learn how to use the CASE
expression to set conditions on your data using WHEN
, THEN
, ELSE
, and END
keywords.
To complete this tutorial, you will need:
sudo
administrative privileges and firewall enabled. Follow our Initial Server Setup with Ubuntu 20.04 to get started.Note: Please note that many relational database management systems use their own unique implementations of SQL. Although the commands outlined in this tutorial will work on most RDBMSs, the exact syntax or output may differ if you test them on a system other than MySQL.
To practice using CASE
expressions in this tutorial, you’ll need a database and table loaded with sample data. If you do not have one ready to insert, you can read the following Connecting to MySQL and Setting up a Sample Database section to learn how to create a database and table. This tutorial will refer to this sample database and table throughout.
If your SQL database runs on a remote server, SSH into your server from your local machine:
- ssh sammy@your_server_ip
Next, open the MySQL prompt, replacing sammy
with your MySQL user account information:
- mysql -u sammy -p
Create a database named caseDB
:
- CREATE DATABASE caseDB;
If the database was created successfully, you’ll receive the following output:
OutputQuery OK, 1 row affected (0.01 sec)
To select the caseDB
database run the following USE
statement:
- USE caseDB;
OutputDatabase changed
After selecting the database, create a table within it. For this tutorial’s examples, we’ll create a table that holds data on the ten best-selling albums of all time. This table will hold the following six columns:
music_id
: displays the values of the int
data type and will serve as the table’s primary key, meaning each value in this column will function as a unique identifier for its respective row.artist_name
: stores each artist(s) name using the varchar
data type with a maximum of 30 characters.album_name
: uses the varchar
data type, again at a maximum of 30 characters to hold the names for each album.release_date
: tracks the release date for each album using the DATE
data type, which uses the YYYY-MM-DD
date format.genre_type
: displays the genre classification for each album using the varchar
data type with a maximum of 25 characters.copies_sold
: uses the decimal
data type to store the total number of album copies sold by the millions. This column specifies a precision of four with a scale of one, meaning values in this column can have four digits, with one of those digits being to the right of the decimal point.Create a table named top_albums
that contains each of these columns by running the following CREATE TABLE
command:
- CREATE TABLE top_albums (
- music_id int,
- artist_name varchar(30),
- album_name varchar(30),
- release_date DATE,
- genre_type varchar(25),
- copies_sold decimal(4,1),
- PRIMARY KEY (music_id)
- );
Next insert some sample data into the empty table:
- INSERT INTO top_albums
- (music_id, artist_name, album_name, release_date, genre_type, copies_sold)
- VALUES
- (1, 'Michael Jackson', 'Thriller', '1982-11-30', 'Pop', 49.2),
- (2, 'Eagles', 'Hotel California', '1976-12-08', 'Soft Rock', 31.5),
- (3, 'Pink Floyd', 'The Dark Side of the Moon', '1973-03-01', 'Progressive Rock', 21.7),
- (4, 'Shania Twain', 'Come On Over', '1997-11-04', 'Country', 29.6),
- (5, 'AC/DC', 'Back in Black', '1980-07-25', 'Hard Rock', 29.5),
- (6, 'Whitney Houston', 'The Bodyguard', '1992-11-25', 'R&B', 32.4),
- (7, 'Fleetwood Mac', 'Rumours', '1977-02-04', 'Soft Rock', 27.9),
- (8, 'Meat Loaf', 'Bat Out of Hell', '1977-10-11', 'Hard Rock', 21.7),
- (9, 'Eagles', 'Their Greatest Hits 1971-1975', '1976-02-17', 'Country Rock', 41.2),
- (10, 'Bee Gees', 'Saturday Night Fever', '1977-11-15', 'Disco', 21.6);
OutputQuery OK, 10 rows affected (0.01 sec)
Records: 10 Duplicates: 0 Warnings: 0
Once you’ve inserted the data, you’re ready to start using CASE
expressions in SQL.
CASE
expressions allow you to set conditions for your data and use similar logic to if-then
statements to search your data, compare the values, and evaluate whether they match as “true” to the conditions you set. Here’s an example of the general syntax for a CASE
expression:
. . .
CASE
WHEN condition_1 THEN outcome_1
WHEN condition_2 THEN outcome_2
WHEN condition_3 THEN outcome_3
ELSE else_outcome
END
. . .
Depending on how many conditions you want to set for your data, you’ll also include the following keywords within a CASE
expression:
WHEN
: this keyword evaluates and compares the data values you have in your table against the conditions or criteria you’ve set. WHEN
is comparable to if
in a typical if-then-else
statement.THEN
: this keyword filters through each condition you may have set if a particular value does not meet the criteria.ELSE
: if the data value does not meet any of the conditions you’ve set after going through each WHEN
and THEN
statement, then this keyword can be used to specify the final condition it can be categorized under.END
: to successfully run the CASE
expression and set your conditions, you must end with the END
keyword.With this understanding of CASE
expression structure and syntax, you’re ready to begin practicing with the sample data.
Imagine you’re a DJ preparing a setlist for your eccentric Aunt Carol’s 65th birthday celebration. You know her taste is hard to pin down, so you decide to do some research on the top ten selling albums of all time to inform some of your musical decisions.
First, review the list you’ve compiled in the top_albums
table by running SELECT
and the *
symbol to view all the data from each column:
- SELECT * FROM top_albums;
Output+----------+-----------------+-------------------------------+--------------+------------------+-------------+
| music_id | artist_name | album_name | release_date | genre_type | copies_sold |
+----------+-----------------+-------------------------------+--------------+------------------+-------------+
| 1 | Michael Jackson | Thriller | 1982-11-30 | Pop | 49.2 |
| 2 | Eagles | Hotel California | 1976-12-08 | Soft Rock | 31.5 |
| 3 | Pink Floyd | The Dark Side of the Moon | 1973-03-01 | Progressive Rock | 21.7 |
| 4 | Shania Twain | Come On Over | 1997-11-04 | Country | 29.6 |
| 5 | AC/DC | Back in Black | 1980-07-25 | Hard Rock | 29.5 |
| 6 | Whitney Houston | The Bodyguard | 1992-11-25 | R&B | 32.4 |
| 7 | Fleetwood Mac | Rumours | 1977-02-04 | Soft Rock | 27.9 |
| 8 | Meat Loaf | Bat Out of Hell | 1977-10-11 | Hard Rock | 21.7 |
| 9 | Eagles | Their Greatest Hits 1971-1975 | 1976-02-17 | Country Rock | 41.2 |
| 10 | Bee Gees | Saturday Night Fever | 1977-11-15 | Disco | 21.6 |
+----------+-----------------+-------------------------------+--------------+------------------+-------------+
10 rows in set (0.00 sec)
Since Aunt Carol was born in 1957, she enjoyed a lot of the hits from the seventies and eighties in her younger days. You know she’s a huge fan of pop, soft rock, and disco, so you want to rank those as the highest priority on your setlist.
You can do this by using the CASE
expression to set a condition of “High Priority” for those particular genres by querying for those data values under the genre_type
column. The following query does this, and creates an alias for the resulting column created by the CASE
expression, naming it priority
. This query also includes the artist_name
, album_name
, and release_date
for more context. Don’t forget to use the END
keyword to complete your full CASE
expression:
- SELECT artist_name, album_name, release_date,
- CASE WHEN genre_type = 'Pop' THEN 'High Priority'
- WHEN genre_type = 'Soft Rock' THEN 'High Priority'
- WHEN genre_type = 'Disco' THEN 'High Priority'
- END AS priority
- FROM top_albums;
Output+-----------------+-------------------------------+--------------+---------------+
| artist_name | album_name | release_date | priority |
+-----------------+-------------------------------+--------------+---------------+
| Michael Jackson | Thriller | 1982-11-30 | High Priority |
| Eagles | Hotel California | 1976-12-08 | High Priority |
| Pink Floyd | The Dark Side of the Moon | 1973-03-01 | NULL |
| Shania Twain | Come On Over | 1997-11-04 | NULL |
| AC/DC | Back in Black | 1980-07-25 | NULL |
| Whitney Houston | The Bodyguard | 1992-11-25 | NULL |
| Fleetwood Mac | Rumours | 1977-02-04 | High Priority |
| Meat Loaf | Bat Out of Hell | 1977-10-11 | NULL |
| Eagles | Their Greatest Hits 1971-1975 | 1976-02-17 | NULL |
| Bee Gees | Saturday Night Fever | 1977-11-15 | High Priority |
+-----------------+-------------------------------+--------------+---------------+
10 rows in set (0.00 sec)
Even though this output reflects the conditions you set for those High Priority
genre types, since you left out the ELSE
keyword, this results in unknown or missing data values known as NULL
values. While the ELSE
keyword may not be necessary if your data values meet all the conditions you’ve set in the CASE
expression, it’s useful for any residual data so it can be properly categorized under a single condition.
For this next query, write the same CASE
expression, but this time set a condition with the ELSE
keyword. In the following example, the ELSE
argument labels any non-high priority data values for genre_type
as “Maybe”:
- SELECT artist_name, album_name, release_date,
- CASE WHEN genre_type = 'Pop' THEN 'High Priority'
- WHEN genre_type = 'Soft Rock' THEN 'High Priority'
- WHEN genre_type = 'Disco' THEN 'High Priority'
- ELSE 'Maybe'
- END AS priority
- FROM top_albums;
[sceondary_label Output]
+-----------------+-------------------------------+--------------+---------------+
| artist_name | album_name | release_date | priority |
+-----------------+-------------------------------+--------------+---------------+
| Michael Jackson | Thriller | 1982-11-30 | High Priority |
| Eagles | Hotel California | 1976-12-08 | High Priority |
| Pink Floyd | The Dark Side of the Moon | 1973-03-01 | Maybe |
| Shania Twain | Come On Over | 1997-11-04 | Maybe |
| AC/DC | Back in Black | 1980-07-25 | Maybe |
| Whitney Houston | The Bodyguard | 1992-11-25 | Maybe |
| Fleetwood Mac | Rumours | 1977-02-04 | High Priority |
| Meat Loaf | Bat Out of Hell | 1977-10-11 | Maybe |
| Eagles | Their Greatest Hits 1971-1975 | 1976-02-17 | Maybe |
| Bee Gees | Saturday Night Fever | 1977-11-15 | High Priority |
+-----------------+-------------------------------+--------------+---------------+
10 rows in set (0.00 sec)
This output is now much more representative of the conditions you’ve set for those albums with the highest priority and those without it. Even though this helps prioritize the top four albums — Thriller
, Hotel California
, Rumours
, and Saturday Night Fever
— you’re convinced there needs to be more variety on this setlist. But you’ll have to persuade Aunt Carol of this as well.
You decide to perform a small experiment and ask Aunt Carol to broaden her musical palette and listen to the remaining albums. You don’t provide any context about the albums, and instruct her to score them truthfully as "Mellow, “Fun”, or “Boring.” Once she’s done, she hands you a handwritten list with her scores. You now have the information you need to set the conditions for your query as follows:
- SELECT artist_name, album_name, release_date,
- CASE WHEN genre_type = 'Hard Rock' THEN 'Boring'
- WHEN genre_type = 'Country Rock' THEN 'Mellow'
- WHEN genre_type = 'Progressive Rock' THEN 'Fun'
- WHEN genre_type = 'Country' THEN 'Fun'
- WHEN genre_type = 'R&B' THEN 'Boring'
- ELSE 'High Priority'
- END AS score
- FROM top_albums;
Output
+-----------------+-------------------------------+--------------+---------------+
| artist_name | album_name | release_date | score |
+-----------------+-------------------------------+--------------+---------------+
| Michael Jackson | Thriller | 1982-11-30 | High Priority |
| Eagles | Hotel California | 1976-12-08 | High Priority |
| Pink Floyd | The Dark Side of the Moon | 1973-03-01 | Fun |
| Shania Twain | Come On Over | 1997-11-04 | Fun |
| AC/DC | Back in Black | 1980-07-25 | Boring |
| Whitney Houston | The Bodyguard | 1992-11-25 | Boring |
| Fleetwood Mac | Rumours | 1977-02-04 | High Priority |
| Meat Loaf | Bat Out of Hell | 1977-10-11 | Boring |
| Eagles | Their Greatest Hits 1971-1975 | 1976-02-17 | Mellow |
| Bee Gees | Saturday Night Fever | 1977-11-15 | High Priority |
+-----------------+-------------------------------+--------------+---------------+
10 rows in set (0.00 sec)
Based on this output, Aunt Carol seems open to new sounds, and you’re pleasantly surprised by her score for Pink Floyd. But you’re a little disappointed in her lack of interest in the excellent tunes of AC/DC, Meat Loaf, and Whitney Houston.
Aunt Carol may be more flexible if you can show her that some albums are objectively more popular than others, so you decide to bring in some numbers to sway the decision. The fact is that these are the top ten albums because they’ve sold millions of copies to fans over the decades. Therefore, for this next query, you’ll create a new CASE
expression that sets a score based on the numerical data from copies_sold
of albums that have been sold thus far.
You’ll use the CASE
expression to set conditions for albums selling at least 35 million copies as “best”, those with 25 million as “great”, those with 20 million as “good”, and anything less than that as “mediocre” as in the following example:
- SELECT artist_name, album_name, release_date, CASE WHEN copies_sold >35.0 THEN 'best'
- WHEN copies_sold >25.0 THEN 'great'
- WHEN copies_sold >20.0 THEN 'good'
- ELSE 'mediocre' END AS score FROM top_albums;
Output+-----------------+-------------------------------+--------------+-------+
| artist_name | album_name | release_date | score |
+-----------------+-------------------------------+--------------+-------+
| Michael Jackson | Thriller | 1982-11-30 | best |
| Eagles | Hotel California | 1976-12-08 | great |
| Pink Floyd | The Dark Side of the Moon | 1973-03-01 | good |
| Shania Twain | Come On Over | 1997-11-04 | great |
| AC/DC | Back in Black | 1980-07-25 | great |
| Whitney Houston | The Bodyguard | 1992-11-25 | great |
| Fleetwood Mac | Rumours | 1977-02-04 | great |
| Meat Loaf | Bat Out of Hell | 1977-10-11 | good |
| Eagles | Their Greatest Hits 1971-1975 | 1976-02-17 | best |
| Bee Gees | Saturday Night Fever | 1977-11-15 | good |
+-----------------+-------------------------------+--------------+-------+
10 rows in set (0.00 sec)
Based on this output, no album was scored as “mediocre” since they’ve each sold more than 20 million copies. However, there are some albums that stand out among the rest based on the scores. Now you can provide Aunt Carol with solid evidence for playing AC/DC or Whitney Houston since their albums sold more than 25 million copies, making them two of the greatest musical works out there.
Now you have an understanding of how to use the CASE
expression to set conditions for various purposes and with character and numerical data values. Also, how CASE
uses the if-then
logic to compare those values and generate the responses based on your desired conditions.
Understanding how to use the CASE
expression can help narrow down your data to whatever conditions you set. Whether you want to set different priorities for certain values or score them based on criteria from popular opinion or numbers, it’s flexible to your needs. If you want to learn about other ways you can manipulate data values in your result sets, check out our guide on CAST
functions and concatenation expressions.
Thanks for learning with the DigitalOcean Community. Check out our offerings for compute, storage, networking, and managed databases.
Structured Query Language — commonly known as SQL — is a language used to define, control, manipulate, and query data held in a relational database. SQL has been widely adopted since it was first developed in the 1970s, and today it’s the predominant language used to manage relational database management systems.
Ideal for managing structured data (data that can fit neatly into an existing data model), SQL is an essential tool for developers and system administrators in a wide variety of contexts. Also, because of its maturity and prevalence, candidates with SQL experience are highly sought after for jobs across a number of industries.
This series is intended to help you get started with using SQL. It includes a mix of conceptual articles and tutorials which provide introductions to various SQL concepts and practices. You can also use the entries in this series for reference while you continue to hone your skills with SQL.
Note: Please be aware that the tutorials in this series use MySQL in examples, but many RDBMSs use their own unique implementations of SQL. Although the commands outlined in this tutorial will work on most RDBMSs, the exact syntax or output may differ if you test them on a system other than MySQL.
This textbox defaults to using Markdown to format your answer.
You can type !ref in this text area to quickly search our full set of tutorials, documentation & marketplace offerings and insert the link!