Data Science with PostgreSQL – Aggregate Window Functions

Here we go, after weeks for procrastination finally the 2nd post in my series featuring Window Function in PostgreSQL. In this post, I’ll explain how to use Aggregate window functions – that we can use to calculate various aggregations such as average, counts, minimum / maximum values, and sum within each window or partition.

Super Simple test dataset

Before we begin, let’s create a test table and insert sample records to test our queries. Here we have an Orders tables with columns like order id, order date, customer name, City and Amount.

--Create test table
CREATE TABLE IF NOT EXISTS Orders
(
	order_id INT,
	order_date DATE,
	customer_name VARCHAR(250),
	city VARCHAR(100),	
	order_amount INT
)

-- Insert sample rows
INSERT INTO Orders
VALUES	(101,'2022-02-01','David Smith','Charleston',20000),
  		(102,'2022-02-02','David Jones','Savannah',30000),
  		(103,'2022-02-03','John Smith','Seattle',6000),
  		(104,'2022-02-04','Michael Smith','Charleston',16000),
  		(105,'2022-02-05','David Williams','Seattle',8000),
  		(106,'2022-02-06','Paum Smith','Charleston',26000),
  		(107,'2022-02-10','Andrew Smith','Savannah',16000),
  		(108,'2022-02-11','David Brown','Savannah',3000),
  		(109,'2022-02-20','Robert Smith','Seattle',2000),
  		(110,'2022-02-25','Peter Smith','Charleston',600),
  		(111,'2022-02-26','Smith Ross','Savannah',600),
  		(112,'2022-02-27','Mike Pearl','Seattle',600);

Types of Aggregate window functions

1. AVG() – Computes the average of a given list of elements or buckets

2. COUNT() – Computes the total number of elements in a given list of elements or buckets

3. MAX() – Outputs maximum value in a given list of elements or buckets

4. MIN() – Outputs minimum value in a given list of elements or buckets

5. SUM() – Computes the total number of elements in a given list of elements or buckets

Using Aggregate Window Functions

-- AGGREGATE WINDOW FUNCTIONS 
SELECT city,
       customer_name,
       order_amount,
       AVG(order_amount)
         OVER(
           partition BY city) AS "average_order_amount",
       COUNT(order_amount)
         OVER(
           partition BY city) AS "number_of_customers",
       MIN(order_amount)
         OVER(
           partition BY city) AS "minimum_order_amount",
       MAX(order_amount)
         OVER(
           partition BY city) AS "minimum_order_amount",
       SUM(order_amount)
         OVER(
           partition BY city) AS "sum_order_amount"
FROM   orders; 

The key thing to note is this statement OVER(partition BY city) that tells the query to GROUP BY CITY (i.e. Charleston, Savannah etc.) and calculate the Average for the order_amount.

Easy to grasp right! So we’re getting used to this concept of PARTITION BY. In the next post, I’ll cover next category of Ranking Window Functions. To read additional details about postgres window functions, please refer to the PostgreSQL Window Function Docs.

This is 2nd article in four-part series on Window Functions in PostgreSQL
Enjoy what you’re reading?

Join my mailing list, and I’ll let you know whenever I write another post. No spam, I promise! 🧑🏻‍💻

Author: Varun Dhawan

Hello dear reader, I'm a Product Manager @Microsoft based in MN, US (beautiful land of 10,000 lakes). I am perpetually curious and always willing to learn and engineer systems that can help solve complex problems using data. When I am not engineering or blogging, you’ll find me cooking and spending time with my family. Varun.Dhawan@gmail.com

3 thoughts on “Data Science with PostgreSQL – Aggregate Window Functions”

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: