One of the most fundamental parts of working with databases is the practice of retrieving information about the data held within them. In relational database management systems, any operation used to retrieve information from a table is referred to as a query.
In this guide, we will discuss the syntax of queries in Structured Query Language (SQL) as well as some of their more commonly used functions and operators.
In order to follow this guide, you will need a computer running some type of relational database management system (RDBMS) that uses SQL. The instructions and examples in this guide were validated using the following environment:
Note: Please note that many RDBMSs use their own unique implementations of SQL. Although the commands outlined in this tutorial will work on most RDBMSs, the exact syntax or output may differ if you test them on a system other than MySQL.
If your SQL database system runs on a remote server, SSH into your server from your local machine:
- ssh sammy@your_server_ip
Then open up the MySQL server prompt, replacing sammy
with the name of your MySQL user account:
- mysql -u sammy -p
From the prompt, create a database named queries_db
:
- CREATE DATABASE queries_db;
If the database was created successfully, you’ll receive output like this:
OutputQuery OK, 1 row affected (0.01 sec)
To select the queries_db
database, run the following USE
statement:
- USE queries_db;
OutputDatabase changed
After selecting queries_db
, create a few tables within it.
To follow along with the examples used in this guide, imagine that you run a public parks cleanup initiative in New York City. The program is made up of volunteers who commit to cleaning up a city park near their homes by regularly picking up litter. Upon joining the initiative, these volunteers each set a goal of how many trash bags of litter they’d like to pick up each week. You decide to store information about the volunteers’ goals in an SQL database with a table that has five columns:
vol_id
: each volunteer’s identification number, expressed with the int
data type. This column will serve as the table’s primary key, meaning that each value will function as a unique identifier for its respective row. Because every value in a primary key must be unique, this column will also have a UNIQUE
constraint applied to itname
: each volunteer’s name, expressed using the varchar
data type with a maximum of 20 characterspark
: the name of the park where each volunteer will pick up litter, expressed using the varchar
data type with a maximum of 20 characters. Note that multiple volunteers can clean up litter in the same parkweekly_goal
: each volunteer’s goal for how many bags of litter they’d like to pick up in a week, expressed with the int
typemax_bags
: each volunteer’s personal record for the most bags of litter they picked up in a single week, expressed as an int
Run the following CREATE TABLE
statement to create a table named volunteers
that has these five columns:
- CREATE TABLE volunteers (
- vol_id int UNIQUE,
- name varchar(20),
- park varchar(30),
- weekly_goal int,
- max_bags int,
- PRIMARY KEY (vol_id)
- );
Then load the volunteers
table with some sample data. Run the following INSERT INTO
operation to add seven rows of data representing seven of the program’s volunteers:
- INSERT INTO volunteers
- VALUES
- (1, 'Gladys', 'Prospect Park', 3, 5),
- (2, 'Catherine', 'Central Park', 2, 2),
- (3, 'Georgeanna', 'Central Park', 2, 1),
- (4, 'Wanda', 'Van Cortland Park', 1, 1),
- (5, 'Ann', 'Prospect Park', 2, 7),
- (6, 'Juanita', 'Riverside Park', 1, 4),
- (7, 'Georgia', 'Prospect Park', 1, 3);
With that, you’re ready to follow the rest of the guide and begin learning how to create queries in SQL.
SELECT
and FROM
ClausesIn SQL, a statement is any operation sent to the database system that will perform some sort of task, like creating a table, inserting or deleting data, or changing the structure of a column or table. A query is an SQL statement that retrieves information about data held in a database.
On its own, a query will not change any existing data held in a table. It will only return the information about the data which the author of the query explicitly requests. The information returned by a given query is referred to as its result set. Result sets typically consist of one or more columns from a specified table, and each column returned in a result set can hold one or more rows of information.
Here’s the general syntax of an SQL query:
- SELECT columns_to_return
- FROM table_to_query;
SQL statements are made up of various clauses, which consist of certain keywords and the information that these keywords require. At a minimum, SQL queries only require you to include two clauses: the SELECT
and FROM
clauses.
Note: In this example syntax, both clauses are written on their own line. However, any SQL statement can alternatively be written on a single line, like this:
- SELECT columns_to_return FROM table_to_query;
This guide will follow the common SQL style convention of separating statements onto multiple lines so each line contains only one clause. This aimed to make each example more readable and understandable, but be aware that as long as you don’t include any syntax errors you can write any query on a single line or on as many lines as you’d like.
Every SQL query begins with a SELECT
clause, leading some to refer to queries generally as SELECT
statements. After the SELECT
keyword comes a list of whatever columns you want returned in the result set. These columns are drawn from the table specified in the FROM
clause.
In SQL queries, the order of execution begins with the FROM
clause. This can be confusing since the SELECT
clause is written before the FROM
clause, but keep in mind that the RDBMS must first know the full working data set to be queried before it starts retrieving information from it. It can be helpful to think of queries as SELECT
-ing the specified columns FROM
the specified table. Lastly, it’s important to note that every SQL statement must end with a semicolon (;
).
As an example, run the following query. This will retrieve the name
column from the volunteers
table:
- SELECT name
- FROM volunteers;
Here’s this query’s result set:
Output+------------+
| name |
+------------+
| Gladys |
| Catherine |
| Georgeanna |
| Wanda |
| Ann |
| Juanita |
| Georgia |
+------------+
7 rows in set (0.00 sec)
Even though this operation looked at the entire volunteers
table, it only returns the specified column, name
.
You can retrieve information from multiple columns by separating each one’s name with a comma, as in the following query. This will return the vol_id
, name
, and park
columns from the volunteers
table:
- SELECT park, name, vol_id
- FROM volunteers;
Output+-------------------+------------+--------+
| park | name | vol_id |
+-------------------+------------+--------+
| Prospect Park | Gladys | 1 |
| Central Park | Catherine | 2 |
| Central Park | Georgeanna | 3 |
| Van Cortland Park | Wanda | 4 |
| Prospect Park | Ann | 5 |
| Riverside Park | Juanita | 6 |
| Prospect Park | Georgia | 7 |
+-------------------+------------+--------+
7 rows in set (0.00 sec)
Notice that this result set returns the park
column first, followed by the name
column and then vol_id
. SQL databases will generally return columns in whatever order they’re listed in the SELECT
clause.
There may be times when you want to retrieve every column from a table. Rather than writing out the name of every column in your query, you can instead enter an asterisk (*
). In SQL, this is shorthand for “every column.”
The following query will return every column from the volunteers
table:
- SELECT *
- FROM volunteers;
Output+--------+------------+-------------------+-------------+----------+
| vol_id | name | park | weekly_goal | max_bags |
+--------+------------+-------------------+-------------+----------+
| 1 | Gladys | Prospect Park | 3 | 5 |
| 2 | Catherine | Central Park | 2 | 2 |
| 3 | Georgeanna | Central Park | 2 | 1 |
| 4 | Wanda | Van Cortland Park | 1 | 1 |
| 5 | Ann | Prospect Park | 2 | 7 |
| 6 | Juanita | Riverside Park | 1 | 4 |
| 7 | Georgia | Prospect Park | 1 | 3 |
+--------+------------+-------------------+-------------+----------+
7 rows in set (0.00 sec)
Notice how this result set’s columns are listed in the same order in which they were defined in the CREATE TABLE
statement from the previous Connecting to MySQL and Setting up a Sample Database section. This is how most relational database systems will order columns in the result set when running a query that uses an asterisk in place of individual column names.
Be aware that you can retrieve information from multiple tables in the same query with the JOIN
keyword. We encourage you to follow our guide on How To Use Joins in SQL for details on how to do this.
DISTINCT
By default, RDBMSs will return every value from a column returned by a query, including duplicate values.
As an example, run the following query. This will return the values from the volunteers
table’s park
column:
- SELECT park
- FROM volunteers;
Output+-------------------+
| park |
+-------------------+
| Prospect Park |
| Central Park |
| Central Park |
| Van Cortland Park |
| Prospect Park |
| Riverside Park |
| Prospect Park |
+-------------------+
7 rows in set (0.00 sec)
Notice how this result set includes two duplicated values: Prospect Park
and Central Park
. This makes sense, since multiple volunteers can clean up litter in the same park. There may be times, though, when you only want to know what unique values are held in a column. You can issue queries that remove duplicate values by following SELECT
with the DISTINCT
keyword.
The following query will return every unique value in the parks
column, removing any duplicates. It’s identical to the previous query except that it includes the DISTINCT
keyword:
- SELECT DISTINCT park
- FROM volunteers;
Output+-------------------+
| park |
+-------------------+
| Prospect Park |
| Central Park |
| Van Cortland Park |
| Riverside Park |
+-------------------+
4 rows in set (0.00 sec)
This query’s result set has three fewer rows than that of the previous one, since it removed one of the Central Park
values and two of the Prospect Park
values.
Note that SQL treats every row of a result set as an individual record, and DISTINCT
will only eliminate duplicates if multiple rows share identical values in each column
To illustrate this, issue the following query that includes the DISTINCT
keyword but returns both the name
and park
columns:
- SELECT DISTINCT name, park
- FROM volunteers;
Output+------------+-------------------+
| name | park |
+------------+-------------------+
| Gladys | Prospect Park |
| Catherine | Central Park |
| Georgeanna | Central Park |
| Wanda | Van Cortland Park |
| Ann | Prospect Park |
| Juanita | Riverside Park |
| Georgia | Prospect Park |
+------------+-------------------+
7 rows in set (0.00 sec)
The duplicate values in the park
column — three occurrences of Prospect Park
and two of Central Park
— appear in this result set, even though the query included the DISTINCT
keyword. Although individual columns in a result set may contain duplicate values, an entire row must be an exact duplicate of another for it to be removed by DISTINCT
. In this case, every value in the name
column is unique so DISTINCT
doesn’t remove any rows when that column is specified in the SELECT
clause.
WHERE
clausesThere may be times when you want to retrieve more granular information from tables in your database. You can filter out certain rows by including a WHERE
clause in your query after the FROM
clause, as in:
- SELECT columns_to_return
- FROM table_to_query
- WHERE search_condition;
Following the WHERE
keyword in this example syntax is a search condition, which is what actually determines which rows get filtered out from the result set. A search condition is a set of one or more predicates, or expressions that can evaluate one or more value expressions. In SQL, a value expression — also sometimes referred to as a scalar expression — is any expression that will return a single value. A value expression can be a literal value (like a string or numeric value), a mathematical expression, or a column name.
Predicates in a WHERE
clause search condition can take many forms, but they typically follow this syntax:
. . .
WHERE value expression OPERATOR value_expression
. . .
After the WHERE
keyword, you provide a value expression followed by one of several special SQL operators used to evaluate the column’s values against the value expression (or value expressions) that comes after the operator. There are several such operators available in SQL and this guide will briefly highlight a few of them later in this section, but for illustration purposes it will focus only on one of the most commonly used operators: the equals sign (=
). This operator tests whether two value expressions are equivalent.
Predicates always return a result of either “true,” “false,” or “unknown.” When running SQL queries that contain a WHERE
clause, the DBMS will apply the search condition sequentially to every row in the table defined in the FROM
clause. It will only return the rows for which every predicate in the search condition evaluates to “true.”
To illustrate this idea, run the following SELECT
statement. This query returns values from the volunteers
table’s name
column. Instead of evaluating values from one of the table’s columns, however, this WHERE
clause tests whether two value expressions — (2 + 2)
and 4
— are equivalent:
- SELECT name
- FROM volunteers
- WHERE (2 + 2) = 4;
Because (2 + 2)
is always equal to 4
, this search condition evaluates to “true” for every row in the table. Consequently, every row’s name
value gets returned in the result set:
Output+------------+
| name |
+------------+
| Gladys |
| Catherine |
| Georgeanna |
| Wanda |
| Ann |
| Juanita |
| Georgia |
+------------+
7 rows in set (0.00 sec)
Because this search condition always returns a result of “true,” it isn’t very useful. You may as well not include a WHERE
clause at all, since SELECT name FROM volunteers;
will produce the same result set.
Rather than comparing two literal values like this, you’ll typically use a column name as one of the value expressions in a WHERE
clause’s search condition. By doing so, you’re telling the database management system to use each row’s value from that column as a value expression for that row’s iteration of the search condition.
The following query’s WHERE
clause applies a more exclusive search condition to each row. It will return the name
and max_bags
values from any row whose max_bags
value is equal to 4
:
- SELECT name, max_bags
- FROM volunteers
- WHERE max_bags = 4;
Only one volunteer’s max_bags
value is exactly equal to 4
, so the query only returns that volunteer’s record:
Output+---------+----------+
| name | max_bags |
+---------+----------+
| Juanita | 4 |
+---------+----------+
1 row in set (0.00 sec)
You can also evaluate character string values in search condition predicates. The following query returns the vol_id
and name
values of every row whose name
value is equal to 'Wanda'
:
- SELECT vol_id, name
- FROM volunteers
- WHERE name = 'Wanda';
Because there’s only one volunteer named Wanda
, the query only returns the information from that row:
Output+--------+-------+
| vol_id | name |
+--------+-------+
| 4 | Wanda |
+--------+-------+
1 row in set (0.00 sec)
To reiterate, this section’s examples all use the same search condition operator — the equals sign — to filter data. However, there are a number of other types of operators that allow you to write a variety of predicates, offering a high level of control over what information your queries return.
The SQL standard defines 18 different types of predicates, though not all of them are supported on every RDBMS. Here are five of the most commonly used search condition predicate types and the operators they use:
Comparison: Comparison predicates compare one value expression with another; in queries, it’s almost always the case that at least one of these value expressions is the name of a column. The six comparison operators are:
=
: tests whether the two values are equivalent<>
: tests whether two values are not equivalent<
: tests whether the first value is less than the second>
: tests whether the first value is greater than the second<=
: tests whether the first value is less than or equal to the second>=
: tests whether the first value is greater than or equal to the secondNull: Predicates that use the IS NULL
operator test whether values in a given column are Null
Range: Range predicates use the BETWEEN
operator to test whether one value expression falls between two others
Membership: This type of predicate uses the IN
operator to test whether a value is a member of a given set
Pattern Match: Pattern matching predicates use the LIKE
operator to test whether a value matches a string pattern containing wildcard values
It’s beyond the scope of this tutorial to go into each of these predicate types in greater detail. If you’d like to learn more about them, though, we encourage you to check out the following guides:
BETWEEN
and IN
Operators in SQLTo learn more about WHERE
clauses generally, please see our guide on How To Use WHERE Clauses in SQL.
ORDER BY
Sometimes queries will return information in ways that may not be intuitive, or may not suit your particular needs. You can sort query results by appending an ORDER BY
clause to the end of your query statement.
Here’s the general syntax of a query with an ORDER BY
clause:
- SELECT columns_to_return
- FROM table_to_query
- ORDER BY column_name;
To illustrate how this works, say you wanted to know which of your volunteers has the highest max_bags
value. You could run the following query which returns the name
and max_bags
values from the volunteers
table:
- SELECT name, max_bags
- FROM volunteers;
However, this query sorts the result set in the order in which each row was added:
Output+------------+----------+
| name | max_bags |
+------------+----------+
| Gladys | 5 |
| Catherine | 2 |
| Georgeanna | 1 |
| Wanda | 1 |
| Ann | 7 |
| Juanita | 4 |
| Georgia | 3 |
+------------+----------+
7 rows in set (0.00 sec)
For a relatively small data set like this, the order of a result set isn’t that important and you could just scan this result set’s max_bags
values to find the highest one. However, this can quickly become tedious when working with larger amounts of data.
Instead, you could run the same query but add an ORDER BY
clause that sorts the result set based each row’s max_bags
value:
- SELECT name, max_bags
- FROM volunteers
- ORDER BY max_bags;
Output+------------+----------+
| name | max_bags |
+------------+----------+
| Georgeanna | 1 |
| Wanda | 1 |
| Catherine | 2 |
| Georgia | 3 |
| Juanita | 4 |
| Gladys | 5 |
| Ann | 7 |
+------------+----------+
7 rows in set (0.00 sec)
As this output indicates, the default behavior of SQL queries that include an ORDER BY
clause is to sort the specified column’s values in ascending (increasing) order. You can change this behavior and sort them in descending order by appending the DESC
keyword to the ORDER BY
clause:
- SELECT name, max_bags
- FROM volunteers
- ORDER BY max_bags DESC;
Output+------------+----------+
| name | max_bags |
+------------+----------+
| Ann | 7 |
| Gladys | 5 |
| Juanita | 4 |
| Georgia | 3 |
| Catherine | 2 |
| Georgeanna | 1 |
| Wanda | 1 |
+------------+----------+
7 rows in set (0.00 sec)
By reading this guide, you learned how to write basic queries, as well as filter and sort query result sets. While the commands shown here should work on most relational databases, be aware that every SQL database uses its own unique implementation of the language. You should consult your DBMS’s official documentation for a more complete description of each command and their full sets of options.
If you’d like to learn more about working with SQL, we encourage you to check out the other tutorials in this series on How To Use SQL.
Thanks for learning with the DigitalOcean Community. Check out our offerings for compute, storage, networking, and managed databases.
Structured Query Language — commonly known as SQL — is a language used to define, control, manipulate, and query data held in a relational database. SQL has been widely adopted since it was first developed in the 1970s, and today it’s the predominant language used to manage relational database management systems.
Ideal for managing structured data (data that can fit neatly into an existing data model), SQL is an essential tool for developers and system administrators in a wide variety of contexts. Also, because of its maturity and prevalence, candidates with SQL experience are highly sought after for jobs across a number of industries.
This series is intended to help you get started with using SQL. It includes a mix of conceptual articles and tutorials which provide introductions to various SQL concepts and practices. You can also use the entries in this series for reference while you continue to hone your skills with SQL.
Note: Please be aware that the tutorials in this series use MySQL in examples, but many RDBMSs use their own unique implementations of SQL. Although the commands outlined in this tutorial will work on most RDBMSs, the exact syntax or output may differ if you test them on a system other than MySQL.
This textbox defaults to using Markdown to format your answer.
You can type !ref in this text area to quickly search our full set of tutorials, documentation & marketplace offerings and insert the link!