Data Science with PostgreSQL – Ranking Window Functions

This is 3rd post in my series featuring Window Function in PostgreSQL. In this post, I’ll explain how to use Ranking window functions – that we can use to calculate various aggregations such as Row Number, Rank, and Dense Rank within each window or partition.

Super Simple test dataset

Before we begin, let’s create a test table and insert sample records to test our queries. Here we have an Orders tables with columns like order id, order date, customer name, City and Amount.

--DROP TABLE Orders
CREATE TABLE IF NOT EXISTS Orders
(
    order_id INT,
    order_date DATE,
    customer_name VARCHAR(250),
    city VARCHAR(100),  
    order_amount INT
)
 
-- TRUNCATE TABLE Orders
INSERT INTO Orders
VALUES  (101,'2022-02-01','David Smith','Charleston',20000),
        (102,'2022-02-02','David Jones','Savannah',30000),
        (103,'2022-02-03','John Smith','Seattle',6000),
        (104,'2022-02-04','Michael Smith','Charleston',16000),
        (105,'2022-02-05','David Williams','Seattle',8000),
        (106,'2022-02-06','Paum Smith','Charleston',26000),
        (107,'2022-02-10','Andrew Smith','Savannah',16000),
        (108,'2022-02-11','David Brown','Savannah',3000),
        (109,'2022-02-20','Robert Smith','Seattle',2000),
        (110,'2022-02-25','Peter Smith','Charleston',600);

Types of Ranking window functions

ROW_NUMBER() –  Assign a unique row number to each record.

RANK() – Give a unique rank to each record based on a specified value.

DENSE_RANK() – Identical to Rank(), excepts it ranks each record without any gaps.

Using Ranking Window Functions

I. ROW_NUMBER() – with PARTITION BY

ROW_NUMBER() does just what it sounds like—displays the number of a given row. It starts from 1 and numbers the rows according to the ORDER BY part of the window statement. ROW_NUMBER() does not require you to specify a variable within the parentheses.

SELECT customer_name AS "Name",
       city AS "City",
       ROW_NUMBER() OVER(ORDER BY order_date) 
             AS "Sort by Order-Date"
FROM   orders; 
II. ROW_NUMBER() – without PARTITION BY

Using the PARTITION BY clause will allow you to begin counting 1 again in each partition. The following query starts the count customer order over again for each city.

SELECT customer_name AS "Name",
       city AS "City",
       ROW_NUMBER() OVER(PARTITION BY city) 
               AS "Order Count by City"
FROM   orders;  

ROW_NUMBER() does just what it sounds like—displays the number of a given row. It starts from 1 and numbers the rows according to the ORDER BY part of the window statement. So

III. RANK() & DENSE_RANK()

RANK() is slightly different from ROW_NUMBER().  If you order by order_amount, chances are there might be rows with same order amount however with different customer name and city. So using RANK() on amount column will give such rows same rank. In addition, you can also use DENSE_RANK() instead of RANK() based on your use case. Here’s how they are different:

SELECT customer_name AS "Name",
       city AS "City",
       order_amount AS "Order-Amount",
       RANK() OVER(ORDER BY order_amount DESC) AS "RANK()",
       DENSE_RANK() OVER(ORDER BY order_amount DESC) AS "DENSE-RANK()"
FROM   orders; 
  • RANK() would give the identical rows a rank of 4, then skip ranks 5, so the next result would be 6.
  • DENSE_RANK() would still give all the identical rows a rank of 4, but the following row would be 5—no ranks would be skipped.

Note: Ranking window function does not require you to specify a variable within the parentheses:

As a practice exercise, learn how to use PERCENT_RANK() and NTILE() and then try those functions for yourself. To read additional details about postgres window functions, please refer to the PostgreSQL Window Function Docs.

In the next post, I’ll cover next category of Value Window Functions.

This is 3rd article in four-part series on Window Functions in PostgreSQL
Enjoy what you’re reading?

Join my mailing list, and I’ll let you know whenever I write another post. No spam, I promise! 🧑🏻‍💻

Author: Varun Dhawan

Hello dear reader, I'm a Product Manager @Microsoft based in MN, US (beautiful land of 10,000 lakes). I am perpetually curious and always willing to learn and engineer systems that can help solve complex problems using data. When I am not engineering or blogging, you’ll find me cooking and spending time with my family. Varun.Dhawan@gmail.com

2 thoughts on “Data Science with PostgreSQL – Ranking Window Functions”

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: