Structured Query Language (SQL)
Structured Query Language (SQL) is a domain-specific language used to manage data, especially in a relational database management system (RDBMS). It is particularly useful in handling structured data, i.e., data incorporating relations among entities and variables.
— Wikipedia
NOTE
Follow-up information mostly added from Ben Forta’s SQL book 1 and quick-SQL-cheatsheet 2. In examples, I use sqlite databases from w3school DB and Ben Forta’s book.
SQL is a language of structured queries, created to interaction with databases.
SQL is almost universal language, mostly all databases support it (with some
dialect specific differences for non ANSI SQL
features).
SQL is relatively simple but powerful language, it’s consist of limited set of English keywords. It’s possible to make very complex operations on databases with it.
Never use SQL reserved keywords as table or column name.
Query language used to manage data in a relational RDBMS (database).
CRUD term is short name of Create, Read, Update, Delete operations (queries) of data, based on specific criteria.
High-level overview of SQL language:
SQL overview.excalidrawSQL components diagram
DDL, DQL, DML, DCL and TCL Commands
DDL is short name of Data Definition Language, which deals with database schemas and descriptions, of how the data should reside in the database. DDL example:
DQL is short name of Data Query Language which used for performing queries on the data within schema objects. The purpose of the DQL Command is to get some schema relation based on the query passed to it. SELECT statement is used to retrieve data from the database.
DML is short name of Data Manipulation Language which deals with data manipulation, and includes most common SQL statements such INSERT, UPDATE, DELETE etc., and it is used to store, modify, delete and update data in database. Example:
DCL is short name of Data Control Language which includes commands such as GRANT, and mostly concerned with rights, permissions and other controls of the database system. DCL example:
The Transaction Control Language (TCL) is used in conjunction with the Data Manipulation Language to control the processing and exposure of changes, in other words control the transactions.
How to create the people table with the following fields (sqlite):
The SQL SELECT Statement
SELECT
is used to extract data from one or multiple tables, usually with
specific conditions.
Select CustomerName
, CategoryName
from customers and categories (no extra
filtering and join).
CustomerName CategoryName
------------------------------------ --------------
Alfreds Futterkiste Beverages
Alfreds Futterkiste Condiments
Alfreds Futterkiste Confections
Select unique CustomerName
from customers
(filters away duplicate values and
returns rows of specified column).
CustomerName
------------------------------------
Alfreds Futterkiste
Ana Trujillo Emparedados y helados
Antonio Moreno Taquería
For operators that occur at the same precedence level within an expression, evaluation proceeds left to right, with the exception that assignments evaluate right to left.
Select CustomerID
, CustomerName
from users where CustomerId
!= 3 and 1 < id
< 10
AND
has precedence over OR
CustomerID CustomerName
---------- ----------------------------------
2 Ana Trujillo Emparedados y helados
4 Around the Horn
5 Berglunds snabbköp
Lists the suppliers
with a product
price less than 20, use EXISTS
clause.
ProductID ProductName SupplierID CategoryID Unit Price
1 Chais 1 1 10 boxes x 20 bags 18
SupplierID SupplierName ContactName Address City PostalCode Country
1 Exotic Liquid Charlotte Cooper 49 Gilbert St. London EC1 4SD UK
SupplierName
---------------------------------
Exotic Liquid
New Orleans Cajun Delights
Tokyo Traders
The EXISTS operator is used to test for the existence of any record in a subquery. The EXISTS operator returns TRUE if the subquery returns one or more records.
ORDER BY: used to sort the result-set in ascending or descending order.
Select OrderDetailID
, OrderID
, from order_details
, sort by OrderID
in
ascending order and ProductID
in descending order.
OrderDetailID OrderID
------------- -------
3 10248
2 10248
1 10248
The LIMIT clause is used to place an upper bound on the number of rows returned
by the entire SELECT statement.
How to select the 10-20
rows from orders
?
Some databases support SELECT TOP
clause, but not all.
I’ll use LIMIT
clause instead.
OrderID CustomerID EmployeeID OrderDate ShipperID
------- ---------- ---------- ---------- ---------
10258 20 1 1996-07-17 1
10259 13 4 1996-07-18 3
10260 55 4 1996-07-19 1
LIKE
operator used in a WHERE clause to search for a specific pattern in a
column.
How to find all customers
, where CustomerName
contains an
and
their address start with “a” and are at least 3 characters in length, and
address starting have a
in second position.
CustomerName City
----------------------- --------------
Antonio Moreno Taquería México D.F.
Island Trading Cowes
Lehmanns Marktstand Frankfurt a.M.
%
(percent sign) is a wildcard character that represents zero, one, or
multiple characters. _
(underscore) is a wildcard character that represents a
single character.
LIKE
‘a%’ (find any values that start with “a”)LIKE
‘%a’ (find any values that end with “a”)LIKE
‘%or%’ (find any values that have “or” in any position)LIKE
‘_r%’ (find any values that have “r” in the second position)LIKE
‘a_%_%’ (find any values that start with “a” and are at least 3 characters in length)LIKE
‘[a-c]%’ (find any values starting with “a”, “b”, or “c”
IN
operator allows you to specify multiple values in a WHERE
clause
essentially the IN operator is shorthand for multiple OR conditions.
Search for all customers
where their country is one of the following:
- Germany
- France
- UK
SELECT
column_namesFROM
table_nameWHERE
column_nameIN
(value1, value2, …);SELECT
column_namesFROM
table_nameWHERE
column_nameIN
(SELECT STATEMENT
);
BETWEEN operator selects values within a given range inclusive (begin and end
values are included).
Selects all orders with an OrderDate
between '1996-07-04'
and
'1996-07-31'
:
OrderID CustomerID EmployeeID OrderDate ShipperID
------- ---------- ---------- ---------- ---------
10248 90 5 1996-07-04 3
10249 81 6 1996-07-05 1
10250 34 4 1996-07-08 2
SELECT
*FROM
ProductsWHERE
(column_nameBETWEEN
value1AND
value2)AND NOT
column_name2IN
(value3, value4);
If a field in a table is optional, it is possible to insert a new record or
update a record without adding a value to this field. Then, the field will be
saved with a NULL
value.
How to list all customers
with a NULL
value in the Address
field:
no rows in result set
SELECT
*FROM
table_nameWHERE
column_nameIS NULL
;SELECT
*FROM
table_nameWHERE
column_nameIS NOT NULL
;
AS
aliases are used to assign a alias (temporary name) to a table or column.
An alias only exists for the duration of the query.
Create an alias named “Address” that combine four columns (Address, PostalCode,
City and Country):
SELECT CustomerName, Address + ', ' + PostalCode + ' ' + City + ', ' + Country AS Address
FROM Customers;
To get the SQL statement above to work in MySQL use CONCAT()
SELECT
column_nameAS
alias_nameFROM
table_name;SELECT
column_nameFROM
table_nameAS
alias_name;SELECT
column_nameAS
alias_name1, column_name2AS
alias_name2;SELECT
column_name1, column_name2 + ‘, ‘ + column_name3AS
alias_name;
UNION
set operator used to combine the result-set of two or more SELECT
statements. Each SELECT
statement within UNION
must have the same number of
columns. The columns must have similar data types. The columns in each SELECT
statement must also be in the same order. UNION
operator only selects distinct
values, UNION ALL
will allow duplicates.
Return the cities (duplicate values also) from both the "Customers"
and the
"Suppliers"
table:
SELECT
columns_namesFROM
table1UNION SELECT
column_nameFROM
table2;
INTERSECT set operator which is used to return the records that two SELECT
statements have in common. Generally used the same way as UNION (both
queries has same columns).
Returns a list of cities that have both customers and suppliers.
SELECT
columns_namesFROM
table1INTERSECT SELECT
column_nameFROM
table2;
EXCEPT set operator used to return all the records in the first SELECT statement
that are not found in the second SELECT statement. Generally used the same way
as UNION.
Select CustomerID
from customers without any orders:
SELECT
columns_namesFROM
table1EXCEPT SELECT
column_nameFROM
table2;
The ANY
and ALL
operators allow you to perform a comparison between a single
column value and a range of other values.
ANY means that the condition will be true if the operation is true for any of
the values in the range. The ANY
operator:
- returns a
boolean
value as a result - returns
TRUE
if ANY of the subquery values meet the conditionANY
means that the condition will be true if the operation is true for any of the values in the range.
List theProductName
if it findsANY
records in theOrderDetails
table hasQuantity
equal to 10 (this will return TRUE because theQuantity
column has some values of 10). SQLite does not support theANY
operator.
The ALL
operator:
- returns a boolean value as a result
- returns TRUE if
ALL
of the subquery values meet the condition - is used with
SELECT
,WHERE
andHAVING
statementsALL
means that the condition will be true only if the operation is true for all values in the range. Lists theProductName
if ALL the records in theOrderDetails
table hasQuantity
equal to 10. This will of course return FALSE because the Quantity column has many different values (not only the value of 10). SQLite doesn’t supportALL
in theWHERE
clause.
Lists ALL
the product names, including duplicates (using ALL
).
ALL
is the default, and most people write just SELECT
instead of SELECT ALL
.
The GROUP BY
statement groups rows that have the same values into summary
rows, like “find the number of customers in each country”. The GROUP BY
statement is often used with aggregate functions (COUNT()
, MAX()
, MIN()
,
SUM()
, AVG()
) to group the result-set by one or more columns.
List the number of orders (Orders.OrderID
) sent by each shipper
(Shippers.ShipperName
):
HAVING
this clause was added to SQL because the WHERE keyword could not be
used with aggregate functions.
How lists the number of customers in each country, sorted high to low
(Only
include countries with more than 5 customers)?
WITH
often used for retrieving hierarchical data or re-using temp result set
several times in a query. Also referred to as "Common Table Expression"
WITH RECURSIVE
cte AS
(
SELECT
c0.* FROM
categories AS
c0 WHERE
id = 1 # Starting point
UNION ALL
SELECT
c1.* FROM
categories AS
c1 JOIN
cte ON
c1.parent_category_id = cte.id
)
SELECT
*
FROM
cte
Data Modification Queries
INSERT INTO
used to insert new records/rows in a table
How to insert multiple rows of data with INSERT
?
We use the INSERT INTO
statement, but with multiple values:
UPDATE
used to modify the existing records in a table.
How to update the ContactName
to "Juan"
for all customers
records where
country is "Mexico"
:
DELETE
used to delete existing records/rows in a table
Recommended using it with WHERE
clause.
How to delete the customer "Alfreds Futterkiste"
from the "Customers"
table:
Reporting Queries
COUNT returns the number of occurrences.
Use the COUNT()
function and the GROUP BY
clause, to return the number
of records for each category in the Products
table:
MIN()
and MAX()
returns the smallest/largest value of the selected column.
Use the MIN()
function and the GROUP BY
clause, to return the smallest price
for each category in the Products
table:
AVG()
returns the average value of a numeric column.
List all records with a higher price than average, we can use the AVG()
function in a sub query:
SUM()
returns the total sum of a numeric column.
Use the SUM()
function and the GROUP BY
clause, to return the Quantity
for
each OrderID
in the OrderDetails
table:
Join Queries
SQL Join Queries
The joined table will contain all records from both the tables and fill in NULL values for missing matches on either side.
**INNER JOIN returns records that have matching value in both tables.
Create the following SQL statement (that contains a INNER JOIN
), that selects
records that have matching values in orders
and customers
tables,
CustomerID
is less than 10:
SELECT
column_namesFROM
table1INNER JOIN
table2ON
table1.column_name=table2.column_name;SELECT
table1.column_name1, table2.column_name2, table3.column_name3FROM
((table1INNER JOIN
table2ON
relationship)INNER JOIN
table3ON
relationship);
LEFT (OUTER) JOIN
returns all records from the left table (table1
), and the
matched records from the right table (table2
).
Select all customers
(CustomerName
), and any orders
(OrderID
) they might
have:
RIGHT (OUTER) JOIN
returns all records from the right table (table2
), and
the matched records from the left table (table1
).
The result is 0 records from the left side, if there is no match.
Return all employees
(LastName
, FirstName
), and any orders (OrderID
)
they might have placed:
SELECT Orders.OrderID, Employees.LastName, Employees.FirstName
FROM Orders
RIGHT JOIN Employees ON Orders.EmployeeID = Employees.EmployeeID
ORDER BY Orders.OrderID;
FULL (OUTER)
JOIN returns all records when there is a match in either left or right table
Selects all customers
(CustomerName
), and all orders (OrderID
):
Self JOIN a regular join, but the table is joined with itself, as if the table
were two tables, temporarily renaming at least one table in the SQL statement.
Match customers
(CustomerName
) that are from the same city
(City
):
CARTESIAN JOIN: returns the Cartesian product of the sets of records from the
two or more joined tables. It’s essentially a multiplication operation between
the rows of the involved tables.
Create a new result set that contains every possible combination of a customer
(CustomerName
) and order
(OrderId
).
View Queries
What is a view?
Creates a view that shows all customers
from Brazil:
In SQL, a view is a virtual table based on the result-set of an SQL statement.
Is it possible to overwrite a view?
Yes, by using CREATE OR REPLACE VIEW
:
How to drop the “Brazil Customers” view:
Altering Table Queries
How to add a "Email"
column to the "Customers"
table?
How to delete the "Email"
column from the "Customers"
table?
How to rename the "City"
column to "Location"
?
How to change the data type of the "City"
column to "varchar(100)"
in
Customers
table?
Creating Table Query
Create a table called "Persons"
that contains five columns: PersonID
,
LastName
, FirstName
, Address
, and City
:
Security
Methods to avoid SQL injection?
- First refer to language/library documentation, how to use placeholder for user’s input. Avoiding SQL injection risk - The Go Programming Language
- Filter query elements from user’s input (or need just avoid this) with elements allowlist.