SQL Window Function OVER() Clause

As mentioned in the previous post, let's first create a dummy dataset for the demonstration of examples and better understanding. Below is the link to download the script file for the dummy dataset. It has been added to Github Gist gist.github.com/kiransabne04/6a042eae3982b0..

As we know, the OVER() clause is used to define a set of rows on which the Window functions will operate and when the brackets are left empty, it sets the Window to all the rows in the query. We will demonstrate the basic and simple use case of the OVER() clause below with a simple aggregate function.

With the tables and data created above, the below query helps us to find the total salary of all the employees along with the individual salary for each employee.

SELECT
    first_name,
    last_name,
    salary,
    SUM(salary) OVER() as total_salary
FROM employees;

first.PNG

Here ‘SUM(salary) OVER()’ is the important part. Obviously, SUM(salary) means the sum of salary. But the OVER() part here is empty which denotes that, “for all the rows in query result” find the salary sum. The output of the above query shows that we haven’t grouped rows based on any column, still, we got the total_sum of salary in the result set along with the individual row for each employee with their first_name, last_name, and salary. Similarly, to get the average the salary, we can use AVG() with OVER() as below,

SELECT
    first_name,
    last_name,
    salary,
    SUM(salary) OVER() as total_salary,
    AVG(salary) OVER() as avg_salary
FROM employees;

second.PNG

Computation with OVER():

Since we know that OVER() defines the row sets for the operation of Window functions, we can compare the current row with the aggregate.

SELECT
    first_name,
    last_name,
    salary,
    AVG(salary) OVER() as avg_salary,
    salary - AVG(salary) OVER() as difference_in_salary 
FROM employees;

third.PNG

In the above query, the last column of the select query shows the difference between an employee’s current salary and the average salary.

Now, let's find the part of each employee's salary against the total salary, I.e. ratio of each employee’s salary to the total salary in the result-set.

SELECT
    first_name,
    last_name,
    salary,
    SUM(salary) OVER() as total_salary,
    CAST(salary as NUMERIC(10,2)) / SUM(salary) OVER() as salary_ratio 
FROM employees;

fourth.PNG

In the above query, we divided each employee's salary with the total salary of all the employees in the query set. This gives the ratio of the salary of each employee against the total salary.

OVER() with COUNT():

Now, Let's see an example with COUNT(),

SELECT
    first_name,
    last_name,
    salary,
    SUM(salary) OVER() as total_salary,
    CAST(salary as NUMERIC(10,2)) / SUM(salary) OVER() as salary_ratio,
    COUNT(id) OVER() as count
FROM employees;

fifth.PNG

In the above query, we show the first_name, last_name, salary, total_salary, and salary_ratio as well as the number of all employees. This demonstrates the usage of COUNT() with OVER().

Now, we want to find out the maximum salary and the difference between the employee’s salary and the maximum salary along with the first_name, and last_name, salary of the employees.

SELECT
    first_name,
    last_name,
    salary,
    MAX(salary) OVER() as maximum_salary,
    MAX(salary) OVER() - salary as difference_salary
FROM employees;

sixth.PNG

In the last column of the above query, we got the difference in salary between the employee’s salary and the maximum salary of the query set.

OVER() with WHERE Clause:

Like normal queries, we can add the WHERE clause to the query with the OVER(). Just in the previous part, we got the difference in salary between the employee’s salary and the maximum salary of the query set, but now we wanted to the difference in the salary only for department_id 1. Check the below example,

SELECT
    first_name,
    last_name,
    salary,
    MAX(salary) OVER() as maximum_salary,
    MAX(salary) OVER() - salary as difference_salary
FROM employees 
WHERE department_id = 1;

seventh.PNG

In this query, we only calculated the difference between the salary of the employee of department_id 1 and the maximum salary of department_id 1. Remember that, Window functions work only on the rows returned by the query. In the above query, while taking the difference of the salary, we didn’t get the maximum_salary based on all the employees of the company, but we only got the salary and difference salary of each department employed in that department and not for the whole company. The point to note is that Window functions are always executed after the WHERE clause, so they operate on whatever they got returned as a result. And we cannot use the Window function in the Where clause. It won’t work, because we get Circular Dependency.