Data Science with PostgreSQL – Using the Window frame_clause

This is my 5th (and final) post on Window Function in PostgreSQL series. In previous posts on this topic, I have covered window function basics, using aggregate window functionsranking window function and value window functions. While you’re here, I’ll recommend to you check the previous posts on this topic.

In this article, we’ll learn a new concept frame_clause. Let’s jump right in!

What’s a window frame?

Windows function works on a group of rows called partitions (or window). A PARTITION BY clause divides the rows into partitions, so we can choose the specific rows to be included in a computation. A FRAME is a subset of this “current partition” and is defied using a frame clause.

Syntax for frame_clause

ROWS BETWEEN <starting_row> AND <ending_row>

Below are the possible values that can be specified for <starting_row> and the <ending_row> bounds of a window frame:

  • UNBOUNDED PRECEDING — all rows before the current row in the partition, i.e. the first row of the partition
  • [N] PRECEDING — No. of rows before the current row
  • CURRENT ROW — the current row
  • [M] FOLLOWING— No. of rows after the current row
  • UNBOUNDED FOLLOWING — all rows after the current row in the partition, i.e. the last row of the partition

Let’s see this using sample dataset of a car dealership group. Here, we have a table car_sales_by_month that stores car sales information by months.

CREATE TABLE IF NOT EXISTS car_sales_by_month
(
	year INT,
	month INT,
	brand VARCHAR(250),
	model VARCHAR(250),
	type VARCHAR(250),
	quantity INT,
	sales INT
);
INSERT INTO car_sales_by_month VALUES 
	(2022, 1 ,'Toyota','RAV4','Suv',50,1500000),
	(2022, 1 ,'Toyota','Corolla','Sedan',10,2010000),
	(2022, 1 ,'Honda','CRV','Suv',30,8000000),
	(2022, 2 ,'Honda','CRV','Suv',40,33000000),
	(2022, 2 ,'Toyota','RAV4','Suv',30,1000000),
	(2022, 2 ,'Toyota','Corolla','Sedan',20,2050000),
	(2022, 3 ,'Honda','Civic','Sedan',60,10000000),
	(2022, 3 ,'Honda','NSX','Hybrid ',25,2004000),
	(2022, 3 ,'Toyota','Corolla','Sedan',30,1080000),
	(2022, 4 ,'Honda','Civic','Sedan',60,30000000),
	(2022, 4 ,'Honda','NSX','Hybrid ',25,2004000),
	(2022, 4 ,'Toyota','Corolla','Sedan',35,4520000);
Scenario 1. Query to report yearly revenue by make of the car.
SELECT brand,
       SUM(sales) AS "Total sales $"
FROM   car_sales_by_month
	GROUP BY brand;
Scenario 2. Query to report monthly sales revenue changes for a specific car model.
SELECT brand,
       model,
       month,
       sales AS current_month_sales,
       LAG(sales) OVER ( ORDER BY month) AS previous_month_sales,
       sales - LAG (sales) OVER (ORDER BY month) AS difference
FROM car_sales_by_month
WHERE year = 2022
  AND model = 'Corolla'
Scenario 3.  Query to report the total sales for current month and previous month, alongwith the maximum sales in any individual month throughout the year.
SELECT brand,
       model,
       month,
       sales AS current_month_sales,
       LAG(sales) OVER (PARTITION BY brand, model ORDER BY month) AS prev_month,
       MAX(sales) OVER (PARTITION BY brand, model ORDER BY month ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS max_year_sales
FROM car_sales_by_month
WHERE year = 2022

💡 One worthy note is that anytime that you specify an ORDER BY clause, PostgreSQL will automatically set the default window as ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW.

Conclusion

The real magic of window functions is actually the frame i.e. subset of records over which the window function works. Here we just learned that a window frame depends on the current row and is defined by over clause.

If you have followed through all my posts, you now have seen most of the common Window functions/patterns. Great Job! Remember, window functions often get discussed during data engineer / scientist job interviews. So if you want to deepen your knowledge, I’d recommend reading PostgreSQL docs. And I hope the details and examples shared in this series will help in acing your next interview. All the best 👍

Enjoy what you’re reading?

Join my mailing list, and I’ll let you know whenever I write another post. No spam, I promise! 🧑🏻‍💻

Author: Varun Dhawan

Hello dear reader, I'm a DevOps Engineer based in MN, US (beautiful land of 10,000 lakes). I am perpetually curious and always willing to learn and engineer systems that can help solve complex problems using data. When I am not engineering or blogging, you’ll find me cooking and spending time with my family. Varun.Dhawan@gmail.com

One thought on “Data Science with PostgreSQL – Using the Window frame_clause”

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: