SQL > Advanced SQL > WITH

Sometimes you are faced with a complex question that cannot easily be answered via a single SQL statement.

You could attempt to arrive at the answer in a single pass by using several inline views or multiple subqueries, but that will likely make your SQL difficult to understand, and some SQL varieties, such as Hive SQL, do not allow more than one subquery.

A second approach is to break the complexity down into several different steps. One way to do this is to create several tables, with each table storing the results of each step. This is often a good strategy, as this makes debugging easier, and following how the code is executed becomes more straightforward.

However, the undesirable side effect is that you need to remember to drop these tables after you are done with your analysis; otherwise these tables are left inside the database, thus creating database management issues.

The alternative to creating multiple tables is to use the WITH clause in SQL.

Syntax

The basic syntax for the WITH clause is as follows:

WITH <query_name_1> AS (
SELECT Statement 1
)
Main SELECT Statement

The usage of WITH clause is very similar to creating tables. When you create a table, you give it a name. Similarly, when you use the WITH clause, you also give it a name, and this name essentially acts like a table name in the main SQL statement.

Because WITH does not create a table or a view, the object associated with the WITH statement disappears after the main SQL statement is executed, and there is nothing to clean up.

You can also have multiple WITH clauses. The syntax is as follows:

WITH <query_name_1> AS (
SELECT Statement 1
), <query_name_2> AS (
SELECT Statement 2
),
..
<query_name_n> AS (
SELECT Statement N
)
Main SELECT Statement

To have multiple WITH clauses, you do not need to specify WITH multiple times. Rather, after the first WITH clause is completed, add a comma, then you can specify the next clause by starting with <query_name> followed by AS. There is no comma between the final WITH clause and the main SQL query.

Examples

We use the following table in our examples.

Table Store_Sales

 Store_Name  Sales 
 Los Angeles  15000 
 San Diego  2500 
 New York  3000 
 Boston  7000 

Example 1: Use With on a SELECT statement

Let's say we want to list all the stores that have more sales more than the average. To do that, we can use the following WITH statement:

WITH t1 AS (
SELECT AVG(Sales) AVG_SALES FROM Store_Sales
)
SELECT a1.* FROM Store_Sales a1, t1
WHERE a1.Sales > t1.AVG_SALES;

Result:

Store_Name Sales
Los Angeles 15000
Boston 7000

The WITH statement calculates the average sales amount (which is $6,875), then the main SQL query simply returns all rows where the value of the Sales column is larger than this average amount.

An equivalent SQL using inline view would be:

SELECT a1.* FROM Store_Sales a1,
(SELECT AVG(Sales) AVG_SALES FROM Store_Sales) t1
WHERE a1.Sales > t1.AVG_SALES;

Similarly, this can be achieved via the following SQL using subquery:

SELECT a1.* FROM Store_Sales a1
WHERE a1.Sales >
(SELECT AVG(Sales) AVG_SALES FROM Store_Sales);

Note the subquery version does not work in Hive SQL due to how Hive SQL supports subqueries.

Example 2: Use WITH on a CREATE TABLE statement

We can also use WITH together with a CREATE TABLE statement. Let's say we want to create a table using the WITH clause in the previous example, we would type in,

CREATE TABLE Above_Average_Sales
AS
WITH t1 AS (
SELECT AVG(Sales) AVG_SALES FROM Store_Sales
)
SELECT a1.* FROM Store_Sales a1, t1
WHERE a1.Sales > t1.AVG_SALES;

The WITH portion of the query comes after CREATE TABLE AS. Do not put the WITH clause at the beginning. The query below will give an error:

WITH t1 AS (
SELECT AVG(Sales) AVG_SALES FROM Store_Sales
)
CREATE TABLE Above_Average_Sales
AS
SELECT a1.* FROM Store_Sales a1, t1
WHERE a1.Sales > t1.AVG_SALES;

Next: SQL EXISTS

This page was last updated on October 12, 2024.




Copyright © 2024   1keydata.com   All Rights Reserved     Privacy Policy     About   Contact