SQL Server Window Ranking Functions
In the previous post, we had seen the implementations of the Analytical Window Functions. In this post, we will be looking after the implementations and working of the Ranking Functions. The Ranking Window Functions will help compute the outputs like listing the trending topics, highest selling product, worst feedbacks, best and worst performers, etc.
Up until now, we have clearly understood that Window refers to the set of rows in the result sets on which the function does computations.
Note: We are using the same tables and datasets as a dummy for our example implementations, the link to the script file is in the second post of this series.
To implement Ranking Functions, we will use the other part of the OVER() clause and that is the ORDER BY part.
The general syntax of the OVER() clause is as follows OVER( [PARTITION BY partition_list ] [ ORDER BY order_list ] ) Now, we will work with the ORDER BY part of the clause. This ORDER BY part specifies the order in which rows needed to be sorted and ranked or numbered. We will demonstrate RANK(), DENSE_RANK(),ROW_NUMBER() Window Functions.
RANK()
RANK() is the widely used Window Function and the basic syntax is as follows, RANK() OVER(ORDER BY column_list)
The RANK() function returns the rank of each row as a number and this number is determined based on the sorting sequence and criteria mentioned in the ORDER BY part of the OVER() clause.
Let’s start with some basic implementations. We will rank the employees based on the salary, that is higher the salary better the rank and the lower the salary lower the rank.
SELECT
first_name,
last_name,
salary,
department_id,
RANK() OVER(ORDER BY salary DESC) as ranking
FROM employees;
In the above query output, we can see the query result set has been sorted based on salary without any partitioning or other criteria. The pink colored part is the ranking column, computed using the RANK() Function. And the rows with blue part are the rows having the same ranking due to their same salary. Hence, the rank number of the next employee is 31 as it skipped 30.
DENSE_RANK()
As we saw in the previous output, there is a gap in numbering due to the same salary of the two employees. If we want to avoid those gaps in numbering then we can use DENSE_RANK() function.
Let's again number employees based on their salary, this time with DENSE_RANK().
SELECT
first_name,
last_name,
salary,
department_id,
DENSE_RANK() OVER(ORDER BY salary DESC) as dense_ranking
FROM employees;
In the above output, we can see that the ranking numbers don’t have any gaps in its spite, still has the same salary value for that two employees. So, DENSE_RANK() gives a “dense” rank without any gaps in numbering.
ROW_NUMBER()
This Ranking Function, returns each row with its unique rank number, even if the row has the same value it will get consecutive numbers. Example is below
SELECT
first_name,
last_name,
salary,
department_id,
ROW_NUMBER() OVER(ORDER BY salary DESC) as row_num
FROM employees;
Here in the above output set, the pink part is the ROW_NUMBER() ranking, and we can see that even with the same salary value, the ranking numbers are unique and consecutive.
Multiple ORDER BY columns
Its to be noted that in the ORDER BY part of the OVER() clause in the Ranking Function if we want to sort the dataset in ascending order, you can do so by either mentioning ASC after the column name or just don’t mention it after writing column names. It is to be decided for each column separately in case of multiple columns sorting criteria. The multi-column example is given below
SELECT
first_name,
last_name,
salary,
department_id,
RANK() OVER(ORDER BY salary DESC, department_id) as ranking
FROM employees;
As can be seen in the above output, the rows are sorted by salary in descending order and then sorted by department_id in ascending order.
Ranking and ORDER BY
Yes, we can use our regular order with the Ranking Functions. The Ranking function returns the rank numbering for the order and criteria provided within the OVER() clause. It is independent of the outer ORDER BY.
NTILE(x)
We will be understanding the last Ranking Function and it is NTILE(x). NTILE(x) function distributes the rows into a set of numbers of groups, specified by x in brackets. Below is a demonstration for better understanding.
SELECT
first_name,
last_name,
salary,
department_id,
NTILE(4) OVER(ORDER BY salary DESC) as grouping_number
FROM employees;
We grouped the employees into 4 grouping sets sorted based on their salary in descending order. The first set is those whose salary can be considered as Higher. The second set is those whose salary is lower than the first part, but still can be considered as Lower-Higher salary. The third set is those whose salary can be considered as Average salary and the Fourth set are those whose salary is lower than other sets.
Note, if the number of rows is not divisible by the number of groups, some groups will have one more element than others and larger groups will be listed first.
Select Nth rows
In this part, we will see a practical demonstration and use of Ranking Functions to return rows only satisfying certain rank conditions. To implement this, we will take the help of CTE(Common Table Expression).
Now, Suppose we want to output employees getting the highest salary in each department. In this case, we have to rank number of each employee’s
WITH ranking_cte AS (
SELECT
first_name,
last_name,
salary,
department_id,
RANK() OVER(PARTITION BY department_id ORDER BY salary DESC) as ranking
FROM employees
)
SELECT * FROM ranking_cte where ranking = 1 ;
In the above output, we can see that we have only returned all those employees who had the highest salary in their departments. We have to use the RANK() function with OVER() clause with PARTITION BY part to make partitioning or grouping rows based on the department as we wanted output for each department, and in ORDER BY part we gave salary column which we want to sort and arrange it in descending order, so highest salary employee will receive a rank number of 1. So in the outer query for the CTE, we only choose rows whose ranking is equal to 1, to list out the highest-paid employees in each department.
In the next post series, we will see ways to set the Window Frame and understand RANGE and ROWS with FRAME setup.