Understanding SQL Server Common Table Expression - CTE

·

12 min read

Common Table Expressions shortly called CTE or WITH clauses are named subqueries returning the data set. With CTE, we can break a long query into smaller parts making it simpler to understand and readable, and also supporting recursive implementation where traversal of hierarchical models is needed. In this post, we will be exploring the basics of CTE with examples.

Syntax

The most basic syntax of CTE is as follows,

WITH expression_name  ( column_names )  AS (
  CTE_query -- inner query
) --- cte defination

SELECT
  ...
FROM expression_name -- outer query

expression_name: it is an identifier for the CTE. The same expression_name is not allowed in the case of nested or recursive CTE. Any reference to the identifier in the statement uses CTE.

column_names: its list of names for the columns or attributes of CTE is optional. And the number of column_names should match the number of columns in the result set of the CTE_query.

CTE_query: denotes a SELECT statement to populate the common table expression with the result set. Also known as an inner query. In the rest part of the post, we will also refer to it as an inner query for simplicity. If more than one CTE_query statement is mentioned then, it must be joined using set operators namely UNION ALL, UNION, EXCEPT, and INTERSECT This is the most basic version of the CTE definition and its select outer query. We need to provide a name for CTE and define the CTE within the parentheses. After the parentheses, we can select from the CTE as we do with tables.

To illustrate CTE with examples, let's create a table with some data. In the table, we will be having our employees, departments, and department_expenses. As the names suggest, the employees' table holds records of the employee in a company, the departments' table holds the data of all the departments in a company, and department_expenses records the expenditure made by each department.

drop table if exists departments;
create table departments(
    id int identity(1,1) primary key,
    name varchar(40),
    description varchar(100),
    petty_cash_alloted decimal(6,2)
);

drop table if exists employees;
create table employees(
    id int identity(1,1) primary key,
    first_name varchar(30),
    last_name varchar(30),
    department_id int,
    joining_date date
);

drop table if exists department_expenses;
create table department_expenses(
    id int identity(1,1) primary key,
    employee_id int,
    department_id int,
    expense_amount decimal(6,2),
    expense_date date
);

insert into departments(name, petty_cash_alloted) values ('accounting', 324.50), ('human resources', 1000), ('technical support', 780), ('cleaning staff', 890.50), ('legal', 770.60);

insert into employees (first_name, last_name, department_id, joining_date) values ('Kaitlynn', 'Epelett', 5, '2021-03-24');
insert into employees (first_name, last_name, department_id, joining_date) values ('Bernhard', 'Cuttell', 5, '2021-03-27');
insert into employees (first_name, last_name, department_id, joining_date) values ('Zitella', 'Griniov', 4, '2021-03-25');
insert into employees (first_name, last_name, department_id, joining_date) values ('Janelle', 'Raxworthy', 3, '2021-03-12');
insert into employees (first_name, last_name, department_id, joining_date) values ('Isabeau', 'Handsheart', 1, '2021-03-06');
insert into employees (first_name, last_name, department_id, joining_date) values ('Rowland', 'Jewess', 3, '2021-03-02');
insert into employees (first_name, last_name, department_id, joining_date) values ('Yevette', 'Featenby', 1, '2021-03-27');
insert into employees (first_name, last_name, department_id, joining_date) values ('Cammy', 'Silverton', 3, '2021-03-03');
insert into employees (first_name, last_name, department_id, joining_date) values ('Derward', 'Joules', 4, '2021-03-30');
insert into employees (first_name, last_name, department_id, joining_date) values ('Neilla', 'Hincham', 2, '2021-03-25');
insert into employees (first_name, last_name, department_id, joining_date) values ('Seumas', 'Cribbins', 4, '2021-03-13');
insert into employees (first_name, last_name, department_id, joining_date) values ('Star', 'Bladon', 4, '2021-03-07');
insert into employees (first_name, last_name, department_id, joining_date) values ('Britta', 'Waddilow', 5, '2021-03-15');
insert into employees (first_name, last_name, department_id, joining_date) values ('Beth', 'Burch', 2, '2021-03-16');
insert into employees (first_name, last_name, department_id, joining_date) values ('Essa', 'Gebhardt', 2, '2021-03-22');

insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (4, 3, 134.51, '2021-04-20');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (8, 3, 71.18, '2021-03-19');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (11, 4, 756.75, '2021-03-04');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (8, 3, 84.88, '2021-03-11');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (10, 2, 115.41, '2021-03-18');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (10, 2, 56.15, '2021-03-04');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (12, 4, 771.54, '2021-04-27');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (8, 3, 65.9, '2021-04-26');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (5, 1, 281.28, '2021-04-20');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (10, 2, 72.35, '2021-03-09');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (13, 5, 62.28, '2021-04-12');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (7, 1, 137.1, '2021-04-13');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (2, 5, 61.91, '2021-05-01');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (3, 4, 389.08, '2021-04-27');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (11, 4, 46.78, '2021-03-30');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (6, 3, 81.24, '2021-03-29');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (5, 1, 177.91, '2021-04-26');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (6, 3, 114.13, '2021-03-12');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (4, 3, 146.22, '2021-04-29');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (7, 1, 144.58, '2021-03-11');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (3, 4, 150.95, '2021-04-19');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (10, 2, 122.52, '2021-03-12');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (9, 4, 86.35, '2021-04-28');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (6, 3, 148.5, '2021-03-29');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (11, 4, 442.45, '2021-03-21');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (4, 3, 93.41, '2021-03-18');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (11, 4, 51.72, '2021-03-23');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (7, 1, 138.73, '2021-04-10');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (14, 2, 46.89, '2021-03-13');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (11, 4, 551.32, '2021-04-01');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (8, 3, 62.44, '2021-04-28');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (6, 3, 95.08, '2021-04-13');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (14, 2, 56.33, '2021-03-28');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (13, 5, 146.43, '2021-04-08');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (1, 5, 43.98, '2021-03-12');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (2, 5, 124.71, '2021-03-03');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (13, 5, 41.12, '2021-04-14');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (11, 4, 102.71, '2021-04-22');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (6, 3, 94.0, '2021-03-29');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (14, 2, 46.05, '2021-04-08');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (14, 2, 95.15, '2021-04-04');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (8, 3, 81.07, '2021-03-26');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (7, 1, 112.0, '2021-03-23');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (7, 1, 147.27, '2021-04-18');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (8, 3, 47.96, '2021-05-01');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (4, 3, 129.75, '2021-03-05');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (10, 2, 95.6, '2021-03-19');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (8, 3, 98.88, '2021-03-13');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (10, 2, 91.24, '2021-05-01');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (10, 2, 47.64, '2021-03-14');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (3, 4, 773.39, '2021-04-22');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (10, 2, 115.93, '2021-04-25');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (13, 5, 86.04, '2021-03-12');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (4, 3, 53.06, '2021-03-07');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (2, 5, 141.01, '2021-04-02');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (6, 3, 41.77, '2021-04-05');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (1, 5, 63.59, '2021-04-12');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (10, 2, 71.02, '2021-04-29');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (10, 2, 49.68, '2021-04-18');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (12, 4, 909.59, '2021-04-06');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (9, 4, 954.25, '2021-04-25');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (13, 5, 121.98, '2021-03-03');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (8, 3, 108.31, '2021-03-11');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (4, 3, 96.56, '2021-03-04');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (15, 3, 140.92, '2021-03-19');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (7, 1, 239.05, '2021-03-20');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (14, 2, 72.51, '2021-03-27');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (9, 4, 591.58, '2021-03-25');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (12, 4, 646.08, '2021-04-13');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (12, 4, 116.68, '2021-04-22');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (12, 4, 109.93, '2021-04-16');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (1, 5, 42.47, '2021-04-07');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (8, 3, 146.0, '2021-03-19');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (12, 4, 94.43, '2021-04-20');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (9, 4, 152.56, '2021-03-31');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (5, 1, 60.98, '2021-03-23');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (4, 3, 78.66, '2021-03-12');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (15, 1, 152.08, '2021-03-28');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (13, 5, 58.16, '2021-03-24');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (3, 4, 62.87, '2021-03-09');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (4, 3, 79.71, '2021-04-19');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (11, 4, 249.22, '2021-03-10');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (3, 4, 378.52, '2021-04-23');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (12, 4, 142.84, '2021-03-10');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (2, 5, 63.81, '2021-04-18');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (12, 4, 106.42, '2021-03-12');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (13, 5, 82.64, '2021-04-19');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (3, 4, 120.39, '2021-04-02');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (7, 1, 244.59, '2021-04-07');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (1, 5, 50.45, '2021-03-10');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (1, 5, 44.8, '2021-03-28');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (6, 3, 104.68, '2021-04-05');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (2, 5, 136.04, '2021-03-05');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (3, 4, 113.32, '2021-03-04');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (11, 4, 128.12, '2021-03-31');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (9, 4, 576.22, '2021-04-27');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (1, 5, 127.25, '2021-03-22');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (10, 2, 129.69, '2021-03-12');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (2, 5, 116.48, '2021-03-24');
insert into department_expenses (employee_id, department_id, expense_amount, expense_date) values (7, 1, 144.76, '2021-04-02');

The above script will create a table and populate with data for the examples in this post.

Basic CTE Example

We wanted to return the department name as department_name, total expenses of the departments as total_expenses for each department

with department_expenditures as (
    select
        d.name,
        sum(de.expense_amount) as total_expenses
    from departments d
    join department_expenses de
        on d.id = de.department_id
    group by d.name
)

select * from department_expenditures;

basic cte example.PNG

With the definition of department_expenditure CTE as name and CTE query inside the parenthesis, we return the required data set i.e. name of each department and their total expenses. This is the most basic example of CTE implementation.

CTE with columns

As mentioned earlier the alternative syntax is a CTE definition with a column list before the parenthesis of the CTE query. The only change is that an extra pair of parenthesis is added before the parathesis of the CTE inner query part. The columns list mentioned are the columns we want our CTE definition to have and we need to construct our CTE query based on these columns we are returning. The column name part is optional and not mandatory, as we did in the above illustration. But we need to use it when we have to return aggregate or analytic function values, we want to use it in another nested or recursive CTE, or even when we want to refer to the values of these functions in the outer query part. We can always use an alias as did in the previous example snippet and not use a column list in the CTE definition, either way, is fine. We will update the above code snippets, to use a column list with a CTE definition.

with department_expenditures (department_name, total_expenses) as (
    select
        d.name,
        sum(de.expense_amount)
    from departments d
    join department_expenses de
        on d.id = de.department_id
    group by d.name
)

select * from department_expenditures;

Now let's modify the return required a little, we only wanted to return those departments who had made total expenses more than 75 percent of their allotted petty cash amount. So we wanted to return those departments that made total expenses more than 75 percent of their allotted petty cash amount.

with department_expenditures (department_name, total_expenses) as (
    select
        d.name,
        sum(de.expense_amount)
    from departments d
    join department_expenses de
        on d.id = de.department_id
    group by d.name, d.petty_cash_alloted
    having sum(de.expense_amount) > d.petty_cash_alloted * 0.75
)

select * from department_expenditures;

The above query will return those department names and their total expenses whose total expenses have crossed more than 75 percent of the allotted petty cash to them. In the above case, all department names and their amount will be returned as all departments have crossed the limit of expenditure.

Multiple CTEs

These CTE are separated by commas and commas should not be given at the end of the CTE. The WITH keyword will be mentioned only once during the start of the CTE definition. This comma-separated multiple CTE makes sense when we have to refer to the return of one CTE in another CTE and so on. For multiple CTE we need to do set-based operations like UNION, UNION ALL, etc. Let's see an example. We wanted to show the list of employees who had made expenses of more than 1.5 times the allotted petty cash amount of both the accounting and cleaning departments.

with accounting_department_expenditure (first_name, last_name, department_name, total_expenses) as (
    select
        e.first_name,
        e.last_name,
        d.name,
        sum(de.expense_amount)
    from employees e
    join departments d
        on e.department_id = d.id
    join department_expenses de
        on de.employee_id = e.id
        and de.department_id = d.id
    where d.name = 'accounting'
    group by e.first_name, e.last_name, d.name, d.petty_cash_alloted
    having sum(de.expense_amount) > 1.5 * d.petty_cash_alloted
), cleaning_department_expenditure (first_name, last_name, department_name, total_expenses) as (
    select
        e.first_name,
        e.last_name,
        d.name,
        sum(de.expense_amount)
    from employees e
    join departments d
        on e.department_id = d.id
    join department_expenses de
        on de.employee_id = e.id
        and de.department_id = d.id
    where d.name = 'cleaning staff'
    group by e.first_name, e.last_name, d.name, d.petty_cash_alloted
    having sum(de.expense_amount) > 1.5 * d.petty_cash_alloted
)

select  
    first_name, 
    last_name, 
    department_name, 
    total_expenses
from accounting_department_expenditure
union all
select
    first_name, 
    last_name, 
    department_name, 
    total_expenses
from cleaning_department_expenditure;

multple cte example.PNG

After defining both CTEs we will show the result of those, together with the SET operator in the outer query, here we are using UNION ALL. This will output by combining the output of the above two CTE definitions as in the image above. Do remember that it is not possible to create correlated subqueries with CTEs.

Summary

  1. CTEs are named temporary sets of rows similar to subqueries.

  2. CTEs are placed at the start of the query. They are introduced with WITH.

  3. You can skip column definitions if you provide aliases for columns containing aggregates and other function results.

  4. Not possible to create correlated subqueries with CTEs.