In Structured Query Language (SQL) statements, WHERE
clauses limit what rows the given operation will affect. They do this by defining specific criteria, referred to as search conditions, that each row must meet in order for it to be impacted by the operation.
This guide will go over the general syntax used in WHERE
clauses. It will also outline how to combine multiple search condition predicates in a single WHERE
clause to filter data with more granularity, as well as how to use the NOT
operator to exclude, rather than include, rows that meet a given search condition.
Although this guide will exclusively use SELECT
statements in its examples, the concepts explained here can be used in a number of SQL operations. In fact, WHERE
clauses are critical components of UPDATE
and DELETE
operations.
In order to follow this guide, you will need a computer running some type of relational database management system (RDBMS) that uses SQL. The instructions and examples in this guide were validated using the following environment:
WHERE
clauses. We encourage you to go through the following Connecting to MySQL and Setting up a Sample Database section for details on how to connect to a MySQL server and create the testing database used in examples throughout this guide.Note: Please note that many RDBMSs use their own unique implementations of SQL. Although the commands outlined in this tutorial will work on most RDBMSs, the exact syntax or output may differ if you test them on a system other than MySQL.
If your SQL database system runs on a remote server, SSH into your server from your local machine:
- ssh sammy@your_server_ip
Then open up the MySQL server prompt, replacing sammy
with the name of your MySQL user account:
- mysql -u sammy -p
From the prompt, create a database named where_db
:
- CREATE DATABASE where_db;
If the database was created successfully, you’ll receive output like this:
OutputQuery OK, 1 row affected (0.01 sec)
To select the where_db
database, run the following USE
statement:
- USE where_db;
OutputDatabase changed
After selecting where_db
, create a table within it.
To follow along with the examples used in this guide, imagine that you run a golf league at a local golf course. You decide to track information about the individual performances of the league’s players at the outings they attend. To do so, you decide to store the information in an SQL database.
You decide that this table will need six columns:
name
: each of the golfers’ names, expressed using the varchar
data type with a maximum of 20 charactersrounds_played
: the total number of complete rounds each golfer has played, expressed with the int
data typebest
: each golfer’s best, or lowest, score for an individual outing, also expressed as an int
worst
: each golfer’s worst, or highest, score for an individual outing, again expressed as an int
average
: an approximate average of each golfers’ scores over the rounds they’ve played. This column will hold values of the decimal
type, limited to a maximum of 4 digits with one of those digits to the right of the decimal pointwins
: the number of rounds in which each golfer had the lowest score of everyone in the playing group, expressed using the int
typeRun the following CREATE TABLE
statement to create a table named golfers
that has these six columns:
- CREATE TABLE golfers (
- name varchar(20),
- rounds_played int,
- best int,
- worst int,
- average decimal (4,1),
- wins int
- );
Then load the golfers
table with some sample data. Run the following INSERT INTO
operation to add seven rows of data representing seven of the league’s golfers:
- INSERT INTO golfers
- VALUES
- ('George', 22, 68, 103, 84.6, 3),
- ('Pat', 25, 65, 74, 68.7, 9),
- ('Grady', 11, 78, 118, 97.6, 0),
- ('Diane', 23, 70, 92, 78.8, 1),
- ('Calvin', NULL, 63, 76, 68.5, 7),
- ('Rose', NULL, 69, 84, 76.7, 4),
- ('Raymond', 18, 67, 92, 81.3, 1);
Notice that two of these rows’ rounds_played
values are NULL
. For the purposes of this tutorial, assume that these golfers haven’t reported how many rounds they’ve played and thus those values are recorded as NULL
.
You may also notice that each golfer’s best
value is less than their worst
. This is because, in common golf rules, a golfer’s score is determined by the number of strokes it takes for them to get their ball into each hole in the course, with the winner being the person with the fewest total strokes. Thus, unlike most other sports, a golfer’s best score will be lower than their worst.
With that, you’re ready to follow the rest of the guide and begin learning how to use WHERE
clauses in SQL.
WHERE
clausesIn SQL, a statement is any operation sent to the database system that will perform some sort of task, like creating a table, inserting or deleting data, or changing the structure of a column or table. SQL statements are made up of various clauses, which consist of certain keywords and the information that they require.
As mentioned in the introduction, WHERE
clauses allow you to filter out certain rows of data from being impacted by an SQL operation. In queries, the WHERE
clause comes after the FROM
clause, as in the following example:
- SELECT columns_to_query
- FROM table_to_query
- WHERE search_condition;
Following the WHERE
keyword is a search condition. A search condition is a set of one or more predicates, or expressions that can evaluate one or more value expressions and return a result of either “true,” “false,” or “unknown.” Note that in cases where a search condition only contains a single predicate, the terms “search condition” and “predicate” are synonymous.
Predicates in a WHERE
clause search condition can take many forms, but they typically follow this syntax:
. . .
WHERE column_name OPERATOR value_expression
. . .
In SQL, a value expression — also sometimes referred to as a scalar expression — is any expression that will return a single value. A value expression can be a literal value, like a string or numeric value, a mathematical expression. Most often, though, at least one of the value expressions in a WHERE
clause’s search condition will be a column name.
When running SQL queries that contain a WHERE
clause, the database management system will apply the search condition to every row in the logical table defined by the FROM
clause. It will then return only the rows for which every predicate in the search condition evaluates to “true.”
To illustrate this idea, run the following query. This will return every value from the golfers
table’s name
column:
- SELECT name
- FROM golfers
- WHERE (2 + 2) = 4;
This query includes a WHERE
clause, but instead of specifying a column name it uses (2 + 2)
as the first value expression and tests whether it’s equal to the second value expression, 4
. Because (2 + 2)
is always equal to 4
, this search condition evaluates to “true” for every row. Consequently, every row gets returned in the result set:
Output+---------+
| name |
+---------+
| George |
| Pat |
| Grady |
| Diane |
| Calvin |
| Rose |
| Raymond |
+---------+
7 rows in set (0.01 sec)
This WHERE
clause isn’t very useful, since it always evaluates to “true” and always returns every row in the table. As mentioned previously, you will typically use at least one column name as a value expression in WHERE
clause search conditions. When running queries, the database system will apply the search condition to each row individually in sequence. By providing a column name as a value expression in a search condition, you’re telling the DBMS to use each row’s value from that column as the value expression for that row’s iteration of the search condition.
The following query’s WHERE
clause applies a more exclusive search condition to each row than the previous example. It will return the name
and wins
values from any row whose wins
column value is equal to 1
:
- SELECT name, wins
- FROM golfers
- WHERE wins = 1;
Only two golfers have won exactly one round, so the query only returns those two rows:
Output+---------+------+
| name | wins |
+---------+------+
| Diane | 1 |
| Raymond | 1 |
+---------+------+
2 rows in set (0.01 sec)
The previous examples use the equals sign (=
) to test whether two value expressions are equivalent, but the operator you use depends on what type of predicate you want to use to filter your result sets.
The SQL standard defines 18 types of predicates, although not all of them are included in every SQL implementation. Here are five of the most commonly used predicate types, as well as a brief explanation of each one and the operators they use:
Comparison predicates use a comparison operator to compare one value (in queries, typically values in a specified column) with another. The six comparison operators are:
=
: tests whether the two values are equal- SELECT name
- FROM golfers
- WHERE name = 'George';
Output+--------+
| name |
+--------+
| George |
+--------+
1 row in set (0.00 sec)
<>
: tests whether two values are not equal- SELECT name, wins
- FROM golfers
- WHERE wins <> 1;
Output+--------+------+
| name | wins |
+--------+------+
| George | 3 |
| Pat | 9 |
| Grady | 0 |
| Calvin | 7 |
| Rose | 4 |
+--------+------+
5 rows in set (0.00 sec)
<
: tests whether the first value is less than the second- SELECT name, wins
- FROM golfers
- WHERE wins < 1;
Output+-------+------+
| name | wins |
+-------+------+
| Grady | 0 |
+-------+------+
1 row in set (0.00 sec)
>
: tests whether the first value is greater than the second- SELECT name, wins
- FROM golfers
- WHERE wins > 1;
Output+--------+------+
| name | wins |
+--------+------+
| George | 3 |
| Pat | 9 |
| Calvin | 7 |
| Rose | 4 |
+--------+------+
4 rows in set (0.00 sec)
<=
: tests whether the first value is less than or equal to the second- SELECT name, wins
- FROM golfers
- WHERE wins <= 1;
Output+---------+------+
| name | wins |
+---------+------+
| Grady | 0 |
| Diane | 1 |
| Raymond | 1 |
+---------+------+
3 rows in set (0.00 sec)
>=
: tests whether the first value is greater than or equal to the second- SELECT name, wins
- FROM golfers
- WHERE wins >= 1;
Output+---------+------+
| name | wins |
+---------+------+
| George | 3 |
| Pat | 9 |
| Diane | 1 |
| Calvin | 7 |
| Rose | 4 |
| Raymond | 1 |
+---------+------+
6 rows in set (0.00 sec)
Predicates that use the IS NULL
operator test whether values in a given column are Null. If so, the predicate evaluates to “true” and the row is included in the result set:
- SELECT name, rounds_played
- FROM golfers
- WHERE rounds_played IS NULL;
Output+--------+---------------+
| name | rounds_played |
+--------+---------------+
| Calvin | NULL |
| Rose | NULL |
+--------+---------------+
2 rows in set (0.00 sec)
Range predicates use the BETWEEN
operator to test whether the specified column values fall between two value expressions:
- SELECT name, best
- FROM golfers
- WHERE best BETWEEN 67 AND 73;
Output+---------+------+
| name | best |
+---------+------+
| George | 68 |
| Diane | 70 |
| Rose | 69 |
| Raymond | 67 |
+---------+------+
4 rows in set (0.00 sec)
Membership predicates use the IN
operator to test whether a value is a member of a given set:
- SELECT name, best
- FROM golfers
- WHERE best IN (65, 67, 69, 71);
Output+---------+------+
| name | best |
+---------+------+
| Pat | 65 |
| Rose | 69 |
| Raymond | 67 |
+---------+------+
3 rows in set (0.00 sec)
Pattern matching predicates use the LIKE
operator to test whether a value matches a string pattern that contains one or more wildcard characters, also known as wildcards. SQL defines two wildcards, %
and _
:
_
: an underscore represents a single unknown character- SELECT name, rounds_played
- FROM golfers
- WHERE rounds_played LIKE '2_';
Output+--------+---------------+
| name | rounds_played |
+--------+---------------+
| George | 22 |
| Pat | 25 |
| Diane | 23 |
+--------+---------------+
3 rows in set (0.00 sec)
%
: a percentage sign represents zero or more unknown characters- SELECT name, rounds_played
- FROM golfers
- WHERE name LIKE 'G%';
Output+--------+---------------+
| name | rounds_played |
+--------+---------------+
| George | 22 |
| Grady | 11 |
+--------+---------------+
2 rows in set (0.00 sec)
It’s beyond the scope of this tutorial to go into each of these predicate types in greater detail. If you’d like to learn more about them, though, we encourage you to check out the following guides:
BETWEEN
and IN
Operators in SQLAND
and OR
There may be times when you need more granularly filtered results than a WHERE
clause with a single search condition predicate can provide. On the other hand, there could also be times when rows that meet one of several search conditions would be acceptable in a result set. In cases like these, you can write WHERE
clauses that include multiple predicates with the AND
or OR
operators, respectively.
To get started with these operators, run the following query which returns values from the golfers
table’s name
, best
, worst
, and average
columns. Its WHERE
clause includes two predicates, separated by AND
:
- SELECT name, best, worst, average
- FROM golfers
- WHERE best < 70 AND worst < 96;
The first predicate tests whether each row’s best
value is less than 70, while the second tests whether each row’s worst
value is less than 96. If either test evaluates to “false” for a row, that row will not get returned in the result set:
Output+---------+------+-------+---------+
| name | best | worst | average |
+---------+------+-------+---------+
| Pat | 65 | 74 | 68.7 |
| Calvin | 63 | 76 | 68.5 |
| Rose | 69 | 84 | 76.7 |
| Raymond | 67 | 92 | 81.3 |
+---------+------+-------+---------+
4 rows in set (0.00 sec)
Next, run the following query. This is identical to the previous example, except that it separates the two predicates with the OR
operator instead of AND
:
- SELECT name, best, worst, average
- FROM golfers
- WHERE best < 70 OR worst < 96;
Because only one of the predicates must evaluate to “true” for a row to be returned, this result set includes two more rows than the previous example:
Output+---------+------+-------+---------+
| name | best | worst | average |
+---------+------+-------+---------+
| George | 68 | 103 | 84.6 |
| Pat | 65 | 74 | 68.7 |
| Diane | 70 | 92 | 78.8 |
| Calvin | 63 | 76 | 68.5 |
| Rose | 69 | 84 | 76.7 |
| Raymond | 67 | 92 | 81.3 |
+---------+------+-------+---------+
6 rows in set (0.00 sec)
You can include as many predicates as you’d like in a single WHERE
clause as long as you combine them with the correct syntax. As your search conditions get more complex, though, it can become difficult to predict what data they will filter.
It’s important to note that database systems generally give precedence to AND
operators. This means that any predicates separated by an AND
operator (or operators in the case of more than two predicates) are treated as a single, isolated search condition that gets tested before any other predicates in a WHERE
clause.
To illustrate, run the following query, which returns values from the name
, average
, worst
, and rounds_played
columns for any row that meets the search conditions defined in the WHERE
clause:
- SELECT name, average, worst, rounds_played
- FROM golfers
- WHERE average < 85 OR worst < 95 AND rounds_played BETWEEN 19 AND 23;
This query first tests whether the predicates separated by the AND
operator — worst < 95
and rounds_played BETWEEN 19 AND 23
— both evaluate to “true” for the row in the current iteration. If so, then that row will appear in the result set. But if either evaluates to “false”, the query will then check whether the current row’s average
value is less than 85. If so, the row will be returned:
Output+---------+---------+-------+---------------+
| name | average | worst | rounds_played |
+---------+---------+-------+---------------+
| George | 84.6 | 103 | 22 |
| Pat | 68.7 | 74 | 25 |
| Diane | 78.8 | 92 | 23 |
| Calvin | 68.5 | 76 | NULL |
| Rose | 76.7 | 84 | NULL |
| Raymond | 81.3 | 92 | 18 |
+---------+---------+-------+---------------+
6 rows in set (0.00 sec)
You can prioritize a set two or more predicates by wrapping them in parentheses. The following example is identical to the previous one, but it wraps the average < 85
and worst < 95
predicates, separated by an OR
operator, in parentheses:
- SELECT name, average, worst, rounds_played
- FROM golfers
- WHERE (average < 85 OR worst < 95) AND rounds_played BETWEEN 19 AND 23;
Because the first two predicates are surrounded by parentheses, the subsequent AND
operator treats them as a discrete search condition that must evaluate to “true”. If both of these predicates — average < 85
and worst < 95
— evaluate to “false”, then the entire search condition evaluates to “false” and the query immediately drops the row from the result set before moving on to evaluate the next one.
However, if either of these first two predicates evaluate to “true”, the query then tests whether the given golfer’s rounds_played
value is between 19 and 23. If so, that row gets returned in the result set:
Output+--------+---------+-------+---------------+
| name | average | worst | rounds_played |
+--------+---------+-------+---------------+
| George | 84.6 | 103 | 22 |
| Diane | 78.8 | 92 | 23 |
+--------+---------+-------+---------------+
2 rows in set (0.00 sec)
As this output indicates, by prioritizing sets of predicates and wrapping them in parentheses, otherwise identical queries can return significantly different result sets.
Although it isn’t always necessary to do so, it’s recommended that you always include parentheses when combining more than two predicates in a single search condition. Doing so can help make queries more readable and easier to understand.
NOT
All of this guide’s examples so far have focused on how to write queries with WHERE
clauses that only include rows that meet the specified search conditions in their result set. However, you can write queries that exclude specific rows by including the NOT
operator in your WHERE
clauses.
Range, membership, and pattern matching predicates clauses that include the NOT
operator generally follow this syntax:
- . . .
- WHERE column_name NOT OPERATOR value_expression
- . . .
To illustrate, run the following query. This will return values from the golfers
table’s name
column, but the NOT
operator in its WHERE
clause will cause the DBMS to exclude any rows that match the wildcard pattern:
- SELECT name
- FROM golfers
- WHERE name NOT LIKE 'R%';
Output+--------+
| name |
+--------+
| George |
| Pat |
| Grady |
| Diane |
| Calvin |
+--------+
5 rows in set (0.00 sec)
Things get a little different when adding the NOT
operator to IS NULL
predicates. In such cases you place the NOT
between IS
and NULL
, as in the following example. This query returns the name
and rounds_played
values of every golfer whose rounds_played
value isn’t Null:
- SELECT name, rounds_played
- FROM golfers
- WHERE rounds_played IS NOT NULL;
Output+---------+---------------+
| name | rounds_played |
+---------+---------------+
| George | 22 |
| Pat | 25 |
| Grady | 11 |
| Diane | 23 |
| Raymond | 18 |
+---------+---------------+
5 rows in set (0.00 sec)
You can also place the NOT
operator immediately after the WHERE
keyword. This is useful if you’re excluding rows based on whether they meet multiple search conditions, as in this example query which returns the golfers’ name
, average
, best
, and wins
values:
- SELECT name, average, best, wins
- FROM golfers
- WHERE NOT (average < 80 AND best < 70) OR wins = 9;
Output+---------+---------+------+------+
| name | average | best | wins |
+---------+---------+------+------+
| George | 84.6 | 68 | 3 |
| Pat | 68.7 | 65 | 9 |
| Grady | 97.6 | 78 | 0 |
| Diane | 78.8 | 70 | 1 |
| Raymond | 81.3 | 67 | 1 |
+---------+---------+------+------+
5 rows in set (0.00 sec)
Take note of this result set’s second row. Pat’s average
score is less than 80 and her best
score is less than 70. However, her row is still included in the result set, as the NOT
operator only negates the search condition wrapped in parentheses.
Recall that when you wrap multiple predicates separated by AND
or OR
in parentheses, SQL will prioritize those predicates and treat them as a single isolated search condition. Because of this, the NOT
operator only excludes rows based on the first two predicates, average < 80
and best < 70
. But it includes rows based on the third predicate, wins = 9
.
You can rewrite the query to exclude rows based on the third predicate along with the first two by wrapping all three in parentheses, like this:
- SELECT name, average, best, wins
- FROM golfers
- WHERE NOT ((average < 80 AND best < 70) OR wins = 9);
Output+---------+---------+------+------+
| name | average | best | wins |
+---------+---------+------+------+
| George | 84.6 | 68 | 3 |
| Grady | 97.6 | 78 | 0 |
| Diane | 78.8 | 70 | 1 |
| Raymond | 81.3 | 67 | 1 |
+---------+---------+------+------+
4 rows in set (0.00 sec)
Depending on its SQL implementation, your database system may consider a query’s syntax invalid if you include NOT
before a comparison operator. As an example, try running this query:
- SELECT name
- FROM golfers
- WHERE name NOT = 'Grady';
On MySQL and its derivatives, this will cause an error:
OutputERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '= 'Grady'' at line 1
The reason for this error is that the NOT
operator generally isn’t used with comparison operators (=
, <>
, <
, <=
, >
, and >=
), since you can achieve the opposite effect of one comparison operator by replacing it with another that would return the rows that the first would exclude. For example, you can replace the equivalence operator (=
) with the inequivalence operator (<>
).
By reading this guide, you learned how to write WHERE
clauses so a query only returns rows that meet a specified condition. You also learned how to combine multiple predicates and search conditions in a single query, as well as how to use the NOT
keyword to exclude information from result sets.
While the commands shown here should work on most relational databases, be aware that every SQL database uses its own unique implementation of the language. You should consult your DBMS’s official documentation for a more complete description of each command and their full sets of options.
If you’d like to learn more about working with SQL, we encourage you to check out the other tutorials in this series on How To Use SQL.
Thanks for learning with the DigitalOcean Community. Check out our offerings for compute, storage, networking, and managed databases.
Structured Query Language — commonly known as SQL — is a language used to define, control, manipulate, and query data held in a relational database. SQL has been widely adopted since it was first developed in the 1970s, and today it’s the predominant language used to manage relational database management systems.
Ideal for managing structured data (data that can fit neatly into an existing data model), SQL is an essential tool for developers and system administrators in a wide variety of contexts. Also, because of its maturity and prevalence, candidates with SQL experience are highly sought after for jobs across a number of industries.
This series is intended to help you get started with using SQL. It includes a mix of conceptual articles and tutorials which provide introductions to various SQL concepts and practices. You can also use the entries in this series for reference while you continue to hone your skills with SQL.
Note: Please be aware that the tutorials in this series use MySQL in examples, but many RDBMSs use their own unique implementations of SQL. Although the commands outlined in this tutorial will work on most RDBMSs, the exact syntax or output may differ if you test them on a system other than MySQL.
This textbox defaults to using Markdown to format your answer.
You can type !ref in this text area to quickly search our full set of tutorials, documentation & marketplace offerings and insert the link!