Why Averages suck and what make Percentiles great

Average(mean), median, mode are core statistical concepts, that are often applied in software engineering. Whether you are new to programming or have multi years of computer science experience, you’ve likely at some point in time will have used these statistical functions, say to calculate system resource utilization, or network traffic or a website latency. In my current role, my team is responsible for running a telemetry platform to help dev teams measure application performance. We do this by collecting point-in-time data points referred as metrics.

A common use case for metrics is to tell the application latency (i.e. amount of time it took between a user action and web app response to that action). For examples, amount of time it took you to click on twitter photo and till it finally showed up on your device screen. So if you have this metrics collected at regular intervals (say every 1s), you can simply average it over a period of time like an hour or a day to calculate latency. Simple right!

Well, it might not be that simple. Averages are bad in this case, Let me explain why.

Disclaimer: By no means I claim to be expert on statistics,  so please correct me if I’m wrong. 😀

Why do averages suck?

Consider this:

  • You have a code to measures how long a user waits for an image to render.
  • You collected 5 data points over a period of time: 3s, 5s, 7s, 4s, and 2s.
  • If you average them, you get 3.5 (i.e. ((3+5+7+4+2) / 5))
  • However, 3.5 seconds is NOT representative of your actual users experience. From this data, it’s clear some of your users are having a very fast experience (less than 3 seconds ✅), and some are having a very slow experience (greater than 7 seconds ❌). But none of them are having a mathematically average experience. This isn’t helping

Are Percentiles better in this case? Yes!

Percentiles is a value on a scale of 100 that indicates the percent of a distribution that is equal to or below it. For example, the 95th percentile is the value which is greater than 95% of the all observed values. Coming back to our app latency scenario, instead of calculating the average of all observed data points, we calculate Percentile50 or Percentile90.

P50 – 50th Percentile

  • Sort the data points in ascending order: 2s, 3s, 4s, 6s, 7s.
  • You get P50 by throwing out the bottom 50% of the points and looking at the first point that remains: 6s

P90 – 90th Percentile

  • Sort the data points in ascending order: 2s, 3s, 4s, 6s, 7s.
  • You get P90 by throwing out the bottom 90% of the points and looking at the first point which remains: 7s

Using percentiles has these advantages:

  1. Percentiles aren’t skewed by outliers like averages are.
  2. Every percentile data point is an actual user experience, unlike averages.

You can plot percentiles on a time series graph just like averages. And you can also setup threshold alerts on them. So say if P90 is greater than 5 seconds (i.e. 90% of observed values have greater than 5s latency) you can be alerted. Below is a spreadsheet to explain Percentile.

As you might have noticed, when you use percentile based metrics, you get a much better sense for reality.

Some interesting facts about Percentile

  • Percentile are commonly referred: p99 (or P99, or P₉₉) means “99th percentile”, p50 means “50th percentile”…you get the drift.
  • P50 is same as the median (mid-point of a distribution)
  • And Percentile are NOT Percentage!


Now armed with some basic knowledge about percentiles, hopefully you’ll start seeing your metrics in whole different way.

PostgreSQL Table Partitioning Part III – Partition Elimination

Understanding Partition Elimination in PostgreSQL 11

This is Part-III for my series on Postgres Table partitioning. I’ll encourage you to also read Part-I and II

PostgreSQL Table Partitioning Part I – Implementation Using Inheritance

PostgreSQL Table Partitioning Part II – Declarative Partitioning

In this post, lets compare the READ performance between partitioned and an un-partitioned table. Before that lets first review our table schema and data distribution.

Un-Partitioned Data Set
-- Data Distribution un-partitioned tbl
SELECT logdate, COUNT (logdate)
FROM measurement_np
GROUP BY logdate;

Partitioned Data Set
-- Data Distribution partitioned tbl
SELECT logdate, COUNT (logdate)
FROM measurement
GROUP BY logdate;

Comparing Query Plan (EXPLAIN ANALYZE) and Partition Elimination

1st Execution – on cold cache
-- Un-Partitioned tbl
Index Scan using measurement_np_indx_logdate on measurement_np (cost=0.44..416.13 rows=10672 width=16) (actual time=0.031..40.625 rows=10000 loops=1)
Index Cond: (logdate = '2006-04-11'::date)

-- Partitioned tbl
Append (cost=0.43..4.46 rows=1 width=16) (actual time=0.028..2.316 rows=10000 loops=1)
-> Index Scan using measurement_y2006m04_logdate_idx on measurement_y2006m04 (cost=0.43..4.45 rows=1 width=16) (actual time=0.028..1.813 rows=10000 loops=1)
Index Cond: (logdate = '2006-04-11'::date)
2nd Execution – on hot cache
-- Un-Partitioned tbl
Index Scan using measurement_np_indx_logdate on measurement_np (cost=0.44..337.14 rows=8671 width=16) (actual time=0.040..1.750 rows=10000 loops=1)
Index Cond: (logdate = '2006-04-11'::date)

-- Partitioned tbl
Append (cost=0.43..405.61 rows=9519 width=16) (actual time=0.022..1.942 rows=10000 loops=1)
-> Index Scan using measurement_y2006m04_logdate_idx on measurement_y2006m04 (cost=0.43..358.01 rows=9519 width=16) (actual time=0.021..1.426 rows=10000 loops=1)
Index Cond: (logdate = '2006-04-11'::date)

On in both attempts (cold and hot cache), the data retrieval from  ‘partitioned’ tables was faster than the un-partitioned table.

PostgreSQL Table Partitioning Part II – Declarative Partitioning

Starting Postgres 10.x and onward, it is now possible to create declarative partitions.

In my previous post ‘postgresql-table-partitioning-part-i-implementation-using-inheritance‘, I discussed about implementing Partitioning in PostgreSQL using ‘Inheritance’. Up until PostgreSQL 9, it was only way to partition tables in PostgreSQL. It was simple to implement, however had some limitations like:

  • Row INSERT does not automatically propagate data to a child tables (aka partition), instead it uses explicit ‘BEFORE INSERT’ trigger, making them slower
  • INDEXES and constraints have to be separately created on child tables
  • Significant manual work is required to create and maintain child tables ranges

‘Declarative’ partitioning released with Postgres 10 does not have these limitations and requires much less manual work to manage partitions.

Let’s see the implementation of ‘Declarative’ partitioning with example:

— Step 1.
Create a partitioned table using the PARTITION BY clause,                              — which includes the partitioning method (RANGE in this example) and the list of column(s) to use as the partition key

CREATE TABLE measurement (
 	city_id int not null,
 	logdate date not null,
 	peaktemp int,
 	unitsales int

— Step 2.
— Create Index on parent table
— Note: creation of seperate indexes on parent table is not required

CREATE INDEX measurement_indx_logdate ON measurement (logdate);

— Step 3.
— Create Default partition

CREATE TABLE measurement_default PARTITION OF measurement DEFAULT;

— Create partitions with exclusive range and dfeualt partition (catch-all for out of range values)

CREATE TABLE measurement_y2006m02 PARTITION OF measurement 
             FOR VALUES FROM ('2006-02-01') TO ('2006-03-01');
CREATE TABLE measurement_y2006m03 PARTITION OF measurement 
             FOR VALUES FROM ('2006-03-01') TO ('2006-04-01');
CREATE TABLE measurement_y2006m04 PARTITION OF measurement 
             FOR VALUES FROM ('2006-04-01') TO ('2006-05-01');
CREATE TABLE measurement_y2007m11 PARTITION OF measurement 
             FOR VALUES FROM ('2007-11-01') TO ('2007-12-01');

Let’s review our schema now

— Step 4
— Insert sample rows

    FOR i in 1..10 loop
      INSERT INTO measurement VALUES (1,'2006-02-07',1,1);
    end loop;
  end $;

    FOR i in 1..1000000 loop
      INSERT INTO measurement VALUES (1,'2006-03-06',1,1);
    end loop;
  end $;

    FOR i in 1..1000000 loop
      INSERT INTO measurement VALUES (1,'2006-04-09',1,1);
    end loop;
  end $;

    FOR i in 1..1000000 loop
      INSERT INTO measurement VALUES (1,'2007-11-11',1,1);
    end loop;
  end $;

--Optional Step
ANALYZE measurement;

–Step 5
–Test partition elimination

  SELECT * FROM measurement
    WHERE logdate = '2007-11-11';

Let’s look at the Query Plan

As you can see, the ‘Declarative’ partitioning is much more intuitive and requires less manual steps in declaring  partitions compares to inheritance.

thanks for reading!

PostgreSQL Table Partitioning Part I – Implementation Using Inheritance

In earlier PostgreSQL versions, it was not possible to declare table partitions syntactically. Partitioning can be implemented using table inheritance. The inheritance approach involves creating a single parent table and multiple child tables (aka. Partitions) to hold data in each partition range.

In this post, I’ll discuss the implementation of table partitions using inheritance. However before proceeding, let’s first understand why do we need partitioning?

Why Partition?

The simple answer is to improve the scalability and manageability of large data sets and tables that have varying access patterns.

Typically, you create tables to store information about an entity, such as customers or sales, and each table has attributes that describe only that entity. While a single table for each entity is the easiest to design and understand, these tables are not necessarily optimized for performance, scalability, and manageability, particularly as the table grows larger.

How can partitioning help? 

  1. When tables and indexes become very large, partitioning can help by partitioning the data into smaller and more manageable sections.
  2. It allows you to speed up loading and archiving of data, so that you can perform maintenance operations on individual partitions instead of the whole table, and this in turn improves the query performance.

There is a ton of information published on partitioning, But if you new to partitioning in PostgreSQL, below are some great examples:

Now that we have some insight what table partitioning is, let’s do a real partitioning  using below scripts:

How to Partition a Table?

— Step 1.

id int,
col_a varchar,
col_b varchar);
--Child Table 1
CREATE TABLE range1() INHERITS (parent);
--Child Table 2
CREATE TABLE range2() INHERITS (parent);
--Child Table 3
CREATE TABLE range3() INHERITS (parent);

Let’s review the schema now

— Step 2.

CREATE OR REPLACE FUNCTION partition_parent() RETURNS trigger as $$
    IF (new.id < 10) THEN
         INSERT INTO range1 VALUES (new.*) ;
    ELSEIF (new.id >= 10 AND NEW.id < 20 ) then
         INSERT INTO range2 VALUES (new.*) ;
    ELSEIF (new.id >= 20 AND NEW.id < 30 ) then
         INSERT INTO range3 VALUES (new.*) ;
         RAISE EXCEPTION 'out of range';

$$ language plpgsql;

— Step 3.

CREATE TRIGGER partition_parent_trigger
        FOR EACH ROW EXECUTE PROCEDURE partition_parent();for each row execute PROCEDURE partition_parent();

— Step 4.

     FOR i in 1..29 LOOP
          INSERT INTO parent(id, col_a, col_b) VALUES (i, 'a', 'b');
     END LOOP;
END $$;

— Optional Step (stats update)

ANALYZE parent;
-- Step 5.

SELECT * from parent
   WHERE id = 5;

Query Plan 

In the next Post in this series, I’ll discuss the new ‘Declarative Partitioning‘ implementation

thanks for reading! 

Learning beyond SQL…PostgreSQL – Indexes

It’s been some time since I’ve blogged, even though I’ve been reading a lot all this while. One of the reason is that I couldn’t find enough compelling topics to write and share. Microsoft has been moving too fast with their SQL Server releases (2012, 2014, 2016….and we’re talking  Linux beta right now) and I’ve always been catching up.

However, between all this, something has changed. Due to an ever-growing buzz around Open Source, I haltingly started looking into PostgreSQL.  Truth be told, I’m starting from ground zero (so nothing to loose) and will be writing on topics that might sound too simple for some of you, nevertheless you may still find ’em helpful.

So starting with Indexes in PostgreSQL

PostgreSQL offers several index types:

  • B-tree
  • Hash
  • GiST  and GIN

Each index type uses a different algorithm that is best suited to different types of queries. In this post we’ll talk about B-tree indices.

Why Index?

  • Speed up data retrievals
  • Indexes reference data locations, explicitly, for the indexed column, consequently reducing data retrieval time
  • Without indices, SQL performs sequential table scans in search for data (applies to SELECT and DMLs)
  • B-tree index is sorted in ascending order by default


#Create Index Syntax
CREATE INDEX name ON table USING btree (coulmn);

#Check of the existing indices on  a table (as they can also be created implicitly by PRIMARY or UNIQUE key definition)
SELECT * FROM pg_indexes WHERE schemaname = 'public';
SELECT * FROM pg_stat_all_indexes WHERE schemaname NOT IN ('pg_catalog','pg_toast');

#Query a table without a filter condition and get query plan using EXPLAIN
SELECT * FROM public.film;

Seq Scan on film  (cost=0.00..127.00 rows=2000 width=384)


#Query a table with a filter condition and get query plan using EXPLAIN
SELECT title AS Name, release_year AS year 
FROM public.film
WHERE title in ('Clones Pinocchio','Vanilla Day');

Index Scan using idx_title on film (cost=0.28..21.63 rows=4 width=19)
Index Cond: ((title)::text = ANY (‘{“Clones Pinocchio”,”Vanilla Day”}’::text[]))


Here, after specifying the WHERE condition Postgres Planner (aka Optimizer) decided to choose an index, instead of sequentially scanning the table. Postgres is able to find the targeted rows in an index, and then fetch them from disk selectively.

It’s a very wide topic, so I’ll write more about indexes. However to sum it up, PostgreSQL provides lot of flexibility with B-tree indexes so they can be optimized to suit your requirement and keep your queries snappy.

Identify queries that consume a large amount of log space in SQL Server

One of regular issues DBA’s get are about the T-log growth. Situations, wherein one “bad”  or “poorly-designed” query can eat up entire T-log space, bring the free space to zero and then bring your application down. The cause and remedy of most these issue is discussed in this KB # 317375(I’m  big fan on Microsoft KB’s).

While the KB discussed about the causes and approaches to deal with high T-log growth situations, it also hints about how we can ‘proactively’ find the queries that are consuming your T-log space at any given moment using DMV’s. Taking cue from this, I have written a below T-SQL Code:

Identify queries consuming large T-log space:

Use <your_database_name>
-- Description: get queries consuming maximum T-log space
-- Source: based on KB317375
-- Author: varun.dhawan@gmail.com                                               

SELECT dtst.session_id AS 'spid'
, CAST(DB_NAME(dtdt.database_id) AS VARCHAR(20)) AS 'database'
, der.command
, SUBSTRING(st.text, ( der.statement_start_offset / 2 ) + 1
, ((CASE der.statement_end_offset 
ELSE der.statement_end_offset 
END - der.statement_start_offset)/2)+1) AS statement_text
+ QUOTENAME(OBJECT_SCHEMA_NAME(st.objectid, st.dbid)) 
+ N'.' 
+ QUOTENAME(OBJECT_NAME(st.objectid, st.dbid)), '') AS command_text
, der.wait_type
, der.wait_time
, dtdt.database_transaction_log_bytes_used / 1024.0 / 1024.0 AS 'mb used'
, dtdt.database_transaction_log_bytes_used_system / 1024.0 / 1024.0 AS 'mb used system'
, dtdt.database_transaction_log_bytes_reserved / 1024.0 / 1024.0 AS 'mb reserved'
, dtdt.database_transaction_log_bytes_reserved_system / 1024.0 / 1024.0 AS 'mb reserved system'
, dtdt.database_transaction_log_record_count AS 'record count' 
FROM sys.dm_tran_database_transactions dtdt 
JOIN sys.dm_tran_session_transactions dtst 
ON dtdt.transaction_id = dtst.transaction_id 
JOIN sys.dm_exec_requests der 
CROSS apply sys.Dm_exec_sql_text(der.sql_handle) AS st 
ON dtst.session_id = der.session_id 
Hope that this will help you too!

Disclaimer: Everything here, is my personal opinion and is not read or approved by my employer before it is posted. No warranties or other guarantees will be offered as to the quality of the opinions or anything else offered here.