Recursive SQL for querying hierarchical data: Part 2 – Levels and Ancestors

Ok, so you’ve stored hierarchical data in a relational database, and written recursive CTEs to query the data and find relationships. Now the application team wants to query hierarchical levels and print the complete ancestry tree. Time to deep dive into some advance CTE constructs.

This is the second post of the series about the Recursive SQL for querying hierarchical data started in the previous post . If you haven’t read it already, I recommend reading it to understand the key concepts:

  1. What is a hierarchical data?
  2. How to store hierarchical data in a relational database?
  3. And how to query hierarchical data using:
    • Self-Joins
    • Common Table Expressions (CTE)

In this post, we’ll discuss the advanced scenarios like displaying hierarchical levels and printing the “ancestry tree”. Let’s dive in…

Note: I’ll be re-using the employee dataset we created in the earlier post. Below is our hierarchical model and how it’s stored inside a database table.

Hierarchical model
     CEO -->  1
/ \
/ \
/ \
/ \
CIO --> 2 3 <-- COO
/ \ / \
/ \ / \
4 5 6 7
/ \ / \
/ \ / \
8 9 10 11
\
12
Relational model
employee_id full_name title manager_id
1 Smith Ross ceo NULL
2 Mike Pearl cio 1
3 Green Field coo 1
4 Dewane Paul svp 2
5 Matts Sh assistant 2
6 Plank Pto assistant 3
7 kayling Metcalfe svp 3
8 blaze Mills dir 4
9 clare Glover vp 4
10 jonas Henderson dir 7
11 scarlet Kelly dir 7
12 frank Climo assistant 9

Scenario 1. Show hierarchical level

This can done by adding the starting level as a constant in the non-recursive branch. In the recursive branch you simply add 1 to the level:

WITH RECURSIVE ctename AS (
      SELECT employee_id
      		 , full_name
      		 , 0 AS Level
      FROM employees
      WHERE employee_id = 1
   UNION ALL
      SELECT e.employee_id, 
      		 e.full_name,
             ctename.level + 1
      FROM employees e
         JOIN ctename ON e.manager_id = ctename.employee_id
)
SELECT * FROM ctename;
employee_idfull_namelevel
1Smith Ross0
2Mike Pearl1
3Green Field1
4Dewane Paul2
5Matts Sh2
6Plank Pto2
7kayling Metcalfe2
8blaze Mills3
9clare Glover3
10jonas Henderson3
11scarlet Kelly3
12frank Climo4
Employee Table Ordered by Level

Scenario 2. Display Ancestor Tree

Another frequent requirement is to collect all the ancestors in a “hierarchy”

WITH RECURSIVE ctename AS (
      SELECT employee_id
            , '(' || upper(title) || ' ) ' || full_name AS hierarchy
      FROM employees
      WHERE employee_id = 1
   UNION ALL
      SELECT e.employee_id
             , ctename.hierarchy || ' -> ' || '(' || upper(e.title) || ') ' ||e.full_name
      FROM employees e
         JOIN ctename ON e.manager_id = ctename.employee_id
)
SELECT hierarchy AS "Org Hierarchy Chart" FROM ctename;
ORG CHART – ANCESTOR TREE
(CEO ) Smith Ross
(CEO ) Smith Ross -> (CIO) Mike Pearl
(CEO ) Smith Ross -> (COO) Green Field
(CEO ) Smith Ross -> (CIO) Mike Pearl -> (SVP) Dewane Paul
(CEO ) Smith Ross -> (CIO) Mike Pearl -> (ASSISTANT) Matts Sh
(CEO ) Smith Ross -> (COO) Green Field -> (ASSISTANT) Plank Pto
(CEO ) Smith Ross -> (COO) Green Field -> (SVP) kayling Metcalfe
(CEO ) Smith Ross -> (CIO) Mike Pearl -> (SVP) Dewane Paul -> (DIR) blaze Mills
(CEO ) Smith Ross -> (CIO) Mike Pearl -> (SVP) Dewane Paul -> (VP) clare Glover
(CEO ) Smith Ross -> (COO) Green Field -> (SVP) kayling Metcalfe -> (DIR) jonas Henderson
(CEO ) Smith Ross -> (COO) Green Field -> (SVP) kayling Metcalfe -> (DIR) scarlet Kelly
(CEO ) Smith Ross -> (CIO) Mike Pearl -> (SVP) Dewane Paul -> (VP) clare Glover -> (ASSISTANT) frank Climo
Ancestry tree

Conclusion – Do more within the database

We learned simple way to traverse through deep hierarchical data using CTEs, without having to write any line of application code to do the recursion. And this is the ultimate application performance tuning tip i.e. to do more within the database. PostgreSQL is already optimized for data storage, retrieval, and complex operations such as aggregates, JOINs, etc. So have your application code deal with displaying data and allow PostgreSQL to do computationally heavy tasks such as data manipulation and aggregations.

As always, thanks for reading!

Like what I write? Join my mailing list, and I’ll let you know whenever I write another post. No spam, I promise! 👨‍💻

Author: Varun Dhawan

Hello dear reader, I'm a Product Manager @Microsoft based in MN, US (beautiful land of 10,000 lakes). I am perpetually curious and always willing to learn and engineer systems that can help solve complex problems using data. When I am not engineering or blogging, you’ll find me cooking and spending time with my family. Varun.Dhawan@gmail.com

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: