Data Science with PostgreSQL – Ranking Window Functions

This is 3rd post in my series featuring Window Function in PostgreSQL. In this post, I’ll explain how to use Ranking window functions – that we can use to calculate various aggregations such as Row Number, Rank, and Dense Rank within each window or partition.

Super Simple test dataset

Before we begin, let’s create a test table and insert sample records to test our queries. Here we have an Orders tables with columns like order id, order date, customer name, City and Amount.

--DROP TABLE Orders
CREATE TABLE IF NOT EXISTS Orders
(
    order_id INT,
    order_date DATE,
    customer_name VARCHAR(250),
    city VARCHAR(100),  
    order_amount INT
)
 
-- TRUNCATE TABLE Orders
INSERT INTO Orders
VALUES  (101,'2022-02-01','David Smith','Charleston',20000),
        (102,'2022-02-02','David Jones','Savannah',30000),
        (103,'2022-02-03','John Smith','Seattle',6000),
        (104,'2022-02-04','Michael Smith','Charleston',16000),
        (105,'2022-02-05','David Williams','Seattle',8000),
        (106,'2022-02-06','Paum Smith','Charleston',26000),
        (107,'2022-02-10','Andrew Smith','Savannah',16000),
        (108,'2022-02-11','David Brown','Savannah',3000),
        (109,'2022-02-20','Robert Smith','Seattle',2000),
        (110,'2022-02-25','Peter Smith','Charleston',600);

Types of Ranking window functions

ROW_NUMBER() – Assign a unique row number to each record.
RANK() – Give a unique rank to each record based on a specified value.
DENSE_RANK() – Identical to Rank(), excepts it ranks each record without any gaps.

Using Ranking Window Functions

I. ROW_NUMBER() – with PARTITION BY

ROW_NUMBER() does just what it sounds like—displays the number of a given row. It starts from 1 and numbers the rows according to the ORDER BY part of the window statement. ROW_NUMBER() does not require you to specify a variable within the parentheses.

SELECT customer_name AS "Name",
       city AS "City",
       ROW_NUMBER() OVER(ORDER BY order_date) 
             AS "Sort by Order-Date"
FROM   orders;

II. ROW_NUMBER() – without PARTITION BY

Using the PARTITION BY clause will allow you to begin counting 1 again in each partition. The following query starts the count customer order over again for each city.

SELECT customer_name AS "Name",
       city AS "City",
       ROW_NUMBER() OVER(PARTITION BY city) 
               AS "Order Count by City"
FROM   orders;

ROW_NUMBER() does just what it sounds like—displays the number of a given row. It starts from 1 and numbers the rows according to the ORDER BY part of the window statement. So

III. RANK() & DENSE_RANK()

RANK() is slightly different from ROW_NUMBER(). If you order by order_amount, chances are there might be rows with same order amount however with different customer name and city. So using RANK() on amount column will give such rows same rank. In addition, you can also use DENSE_RANK() instead of RANK() based on your use case. Here’s how they are different:

SELECT customer_name AS "Name",
       city AS "City",
       order_amount AS "Order-Amount",
       RANK() OVER(ORDER BY order_amount DESC) AS "RANK()",
       DENSE_RANK() OVER(ORDER BY order_amount DESC) AS "DENSE-RANK()"
FROM   orders;

RANK() would give the identical rows a rank of 4, then skip ranks 5, so the next result would be 6.
DENSE_RANK() would still give all the identical rows a rank of 4, but the following row would be 5—no ranks would be skipped.

Note: Ranking window function does not require you to specify a variable within the parentheses:

As a practice exercise, learn how to use PERCENT_RANK() and NTILE() and then try those functions for yourself. To read additional details about postgres window functions, please refer to the PostgreSQL Window Function Docs.

In the next post, I’ll cover next category of Value Window Functions.

This is 3rd article in four-part series on Window Functions in PostgreSQL

Enjoy what you’re reading?

Join my mailing list, and I’ll let you know whenever I write another post. No spam, I promise! 🧑🏻‍💻

Author: Varun Dhawan

I’m Varun. I used to be a Software Engineer building data applications for large corporations like McKinsey and Target. Now, I’m a Product Manager at Microsoft, making Azure PostgreSQL the go-to platform for running mission-critical workloads (and no, I’m not obsessing over every little detail… I swear). When I’m not working, you can find me blogging at data-nerd.blog, where I help fellow data enthusiasts master PostgreSQL, sharpen their coding skills, and navigate their careers with confidence. And if there’s one thing you absolutely need to know about me, it’s that…I'm apparently a great cook—just don’t ask why I’m the only one who eats my food. View all posts by Varun Dhawan

Data Science with PostgreSQL – Ranking Window Functions

Super Simple test dataset

Types of Ranking window functions

Using Ranking Window Functions

I. ROW_NUMBER() – with PARTITION BY

II. ROW_NUMBER() – without PARTITION BY

III. RANK() & DENSE_RANK()

Enjoy what you’re reading?

Author: Varun Dhawan

2 thoughts on “Data Science with PostgreSQL – Ranking Window Functions”

Leave a comment Cancel reply

Super Simple test dataset

Types of Ranking window functions

Using Ranking Window Functions

I. ROW_NUMBER() – with PARTITION BY

II. ROW_NUMBER() – without PARTITION BY

III. RANK() & DENSE_RANK()

Enjoy what you’re reading?

Share this:

Related

Author: Varun Dhawan

2 thoughts on “Data Science with PostgreSQL – Ranking Window Functions”

Leave a comment Cancel reply