My journey from DBA to DevOps

I’ve been working as a database engineer for over a decade, engineering enterprise data platforms. During the beginning of my career in early 2000, I chose the ‘safe’ path of being a DBA believing that relational systems being universal containers for storing critical data will never change. I started learning relational technologies like Oracle, MS-SQL, and eventually also learned Open Source systems including PostgreSQL & MySQL.

However contrary to my belief, enterprise technology did change a lot. Some things however remained constant like developers swinging by my cube to request a new database, performing DB refresh or tuning  a slow running query (my favorite). Sometimes these teams were frustrated with me when their app went down or when they couldn’t access their database. At times, my Dev and QA friends would come by my desk to learn what I was doing and even after my attempt to explain them how I am solving their production issue, they would walk away puzzled. I have grown up in this role realizing that a when things go wrong, the first one to be blamed is DBA (i.e. Default Blame Acceptor)

This was my life for long and as you might have guessed, it wasn’t very satisfying; professionally 😉

It was a vicious cycle of configuring environments, routine data refreshes, resolving outages, and late-night upgrades. The work which once gave meaning to my career, lately made me ask myself, “Is this it for a DBA?” , “Is it the time to change?”

The journey ahead…

I then started having discussions with my leaders, determined to find an answer to “What’s the future of my role?”.

During these discussions, I heard terms like Agile, Scrum, Continuous Delivery and DevOps. Since the last one was repeated more often than others, I started researching on “Skillset for DevOps”.

Initially I was all excited to learn a new skill set, maybe a new tool, however after multiple days of googling, I couldn’t find any specific Skill Set definition. During a follow-up discussion with my mentor, I learned something interesting,

“Think of DevOps as not a specific skill set, instead it’s a way of doing something.”

Wait, so all it means is Dev and Ops working together, that’s sounds weird! So now I have to learn all languages used by developers…C++, Java, Python…ugh? While I’ve been coding on PowerShell & SQL for infra automation, I never considered myself as a developer. Hmm…maybe that’s what needed to change.

Fast forward – Learning Go

In the past couple of years, there is a rise of a new programming language Go lang (or simply GO). Most developers in their code will have to interact with the database at some point in their project, and often that means working with PostgreSQL. I particularly got interested in learning Go’s integration with PostgreSQL. There are multiple Go libraries that allow Go PostgreSQL integration productive and fun. Excited about my journey and want to share whatever I’ve learned recently…so in the next few posts I’ll blog about PostgreSQL integration with Go.

Disclaimer: By NO means I claim to be a Go expert and I am not going to teach Go lang. There’s a lot of great online courses and material for that. I am only going to share my learnings on Postgres integration with Go.

The Phoenix Project – Book Review

A Novel about IT, DevOps, and Helping Your Business Win

Some people are lucky to find books that changes their life. While I’m yet to find my life-changing novel, I do come across books that make me think, question my beliefs and push me to learn. The Project Phoenix is that book for me.

Written by Gene Kim, George Spafford, and Kevin Behr, the book is about a large company’s transformation into a DevOps culture. Transformation driven not just to look cool, but as a necessity for the survival of the company.

The synopsis is simple. Bill, the protagonist, is the Director of Midrange Operations at Parts Unlimited, a US-based $4 billion per year manufacturing and retail company. Bill is swiftly pulled into the spotlight by the CEO and persuaded hoodwinked into taking up the post as VP of IT Operations. It soon becomes clear that among standard responsibilities, Bill and his team are responsible for making the launch of the risky doomed Project Phoenix a success. Project Phoenix not only seems to be hugely overscoped for its ambitious – and imminent – timelines, but it also faces enormous pressure elsewhere.

The characters and situations in the book are stereotypical, however that’s not a criticism. The intent is clearly for us to identify with the characters and events, to relate your workplace with the story . So here are my key learnings “spoiler-free

DevOps is a collaborative working relationship between Development and IT Operations 🤝

Outcome of this collaboration is fast flow of planned work, while increasing the reliability, stability of the production environment.

3️ Ways principle 📜

The First way – focuses on maximizing flow of work from left-to-right starting from business to development to IT operations to the end user.

The Second way – focuses on increasing the feedback loop from right to left. The focus is not only on getting feedback but also on how fast we can get the feedback in order to make necessary corrections/improvement quickly.

The Third way – The third way is all about developing and fostering a culture of continuous experimentation and learning.

Speed to Deliver is the key 🚀

Technology is life blood of all business today. It’s imperative that all business should strive to bring their applications to market more quickly so they don’t miss any opportunities and easily adjust to the market standards. To achieve these objectives, organizations must adopt the right DevOps practices in their software development processes to reduce time to market.

Overall, book does a great job at explaining all these ideas with examples and linking them together. It’s a super fun and easy read, and I would definitely recommend you.

Connect to PostgreSQL in VS Code

VS Code has a rich extension API that let you add languages, debuggers, and tools to your installation to support easy development. PostgreSQL extension allows following:

  • Connect to PostgreSQL instances
  • View object DDL with ‘Go to Definition’ and ‘Peek Definition’
  • Write queries with IntelliSense
  • Run queries and save results as JSON, csv, or Excel

Download link: https://marketplace.visualstudio.com/items?itemName=ms-ossdata.vscode-postgresql

Using VS Code PostgreSQL extension

  1. Open the Command Palette Ctrl + Shift + P  (On mac use  ⌘ + Shift + P)

a

  1. Search and select PostgreSQL: New Query

  2. In the command palette, select Create Connection Profile. Follow the prompts to enter your Postgres instance hostname, database, username, and password.

a

 

 

You are now connected to your Postgres database. You can confirm this via the Status Bar (the ribbon at the bottom of the VS Code window). It will show your connected hostname, database, and user.

Now, let’s try to query database.

  1. Type a query ex. SELECT * FROM pg_stat_activity;

  2. Right-click, select Execute Query / keyboard shortcut [⌘M ⌘R] and the results will show in a new window.

6. You can also save the query results as JSONCSV or Excel.

So now, you can seamlessly code for PostgreSQL from Microsoft VS Code without switching screens, leverage powerful intellisense and execute queries.

Enjoy Coding!

IMP NOTE: Result windows from queries won´t show up again after being closed. This is bug with current version and is being worked by dev team. Workaround is either to keep the result window Open Or close / re-open the VS code window.

PostgreSQL Table Partitioning Part II – Declarative Partitioning

Starting Postgres 10.x and onward, it is now possible to create declarative partitions.

In my previous post ‘postgresql-table-partitioning-part-i-implementation-using-inheritance‘, I discussed about implementing Partitioning in PostgreSQL using ‘Inheritance’. Up until PostgreSQL 9, it was only way to partition tables in PostgreSQL. It was simple to implement, however had some limitations like:

  • Row INSERT does not automatically propagate data to a child tables (aka partition), instead it uses explicit ‘BEFORE INSERT’ trigger, making them slower
  • INDEXES and constraints have to be separately created on child tables
  • Significant manual work is required to create and maintain child tables ranges

‘Declarative’ partitioning released with Postgres 10 does not have these limitations and requires much less manual work to manage partitions.

Let’s see the implementation of ‘Declarative’ partitioning with example:

— Step 1.
Create a partitioned table using the PARTITION BY clause,                              — which includes the partitioning method (RANGE in this example) and the list of column(s) to use as the partition key

CREATE TABLE measurement (
 	city_id int not null,
 	logdate date not null,
 	peaktemp int,
 	unitsales int
) PARTITION BY RANGE (logdate);

— Step 2.
— Create Index on parent table
— Note: creation of seperate indexes on parent table is not required

CREATE INDEX measurement_indx_logdate ON measurement (logdate);

— Step 3.
— Create Default partition

CREATE TABLE measurement_default PARTITION OF measurement DEFAULT;

— Create partitions with exclusive range and dfeualt partition (catch-all for out of range values)

CREATE TABLE measurement_y2006m02 PARTITION OF measurement 
             FOR VALUES FROM ('2006-02-01') TO ('2006-03-01');
CREATE TABLE measurement_y2006m03 PARTITION OF measurement 
             FOR VALUES FROM ('2006-03-01') TO ('2006-04-01');
CREATE TABLE measurement_y2006m04 PARTITION OF measurement 
             FOR VALUES FROM ('2006-04-01') TO ('2006-05-01');
CREATE TABLE measurement_y2007m11 PARTITION OF measurement 
             FOR VALUES FROM ('2007-11-01') TO ('2007-12-01');

Let’s review our schema now

— Step 4
— Insert sample rows

DO $DECLARE
  BEGIN
    FOR i in 1..10 loop
      INSERT INTO measurement VALUES (1,'2006-02-07',1,1);
    end loop;
  end $;

DO $DECLARE
  BEGIN
    FOR i in 1..1000000 loop
      INSERT INTO measurement VALUES (1,'2006-03-06',1,1);
    end loop;
  end $;

DO $DECLARE
  BEGIN
    FOR i in 1..1000000 loop
      INSERT INTO measurement VALUES (1,'2006-04-09',1,1);
    end loop;
  end $;

DO $DECLARE
  BEGIN
    FOR i in 1..1000000 loop
      INSERT INTO measurement VALUES (1,'2007-11-11',1,1);
    end loop;
  end $;

--Optional Step
ANALYZE measurement;

–Step 5
–Test partition elimination

  SELECT * FROM measurement
    WHERE logdate = '2007-11-11';

Let’s look at the Query Plan

As you can see, the ‘Declarative’ partitioning is much more intuitive and requires less manual steps in declaring  partitions compares to inheritance.

thanks for reading!

PostgreSQL Table Partitioning Part I – Implementation Using Inheritance

In earlier PostgreSQL versions, it was not possible to declare table partitions syntactically. Partitioning can be implemented using table inheritance. The inheritance approach involves creating a single parent table and multiple child tables (aka. Partitions) to hold data in each partition range.

In this post, I’ll discuss the implementation of table partitions using inheritance. However before proceeding, let’s first understand why do we need partitioning?

Why Partition?

The simple answer is to improve the scalability and manageability of large data sets and tables that have varying access patterns.

Typically, you create tables to store information about an entity, such as customers or sales, and each table has attributes that describe only that entity. While a single table for each entity is the easiest to design and understand, these tables are not necessarily optimized for performance, scalability, and manageability, particularly as the table grows larger.

How can partitioning help? 

  1. When tables and indexes become very large, partitioning can help by partitioning the data into smaller and more manageable sections.
  2. It allows you to speed up loading and archiving of data, so that you can perform maintenance operations on individual partitions instead of the whole table, and this in turn improves the query performance.

There is a ton of information published on partitioning, But if you new to partitioning in PostgreSQL, below are some great examples:

Now that we have some insight what table partitioning is, let’s do a real partitioning  using below scripts:

How to Partition a Table?

— Step 1.
— CREATE PARENT & CHILD TABLES

CREATE TABLE parent (
id int,
col_a varchar,
col_b varchar);
--Child Table 1
CREATE TABLE range1() INHERITS (parent);
--Child Table 2
CREATE TABLE range2() INHERITS (parent);
--Child Table 3
CREATE TABLE range3() INHERITS (parent);

Let’s review the schema now

— Step 2.
— CREATE PARTITION FUNCTION THAT WILL HOLD LOGIC OF ‘ON-INSERT’ TRIGGER

CREATE OR REPLACE FUNCTION partition_parent() RETURNS trigger as $$
  BEGIN
    IF (new.id < 10) THEN
         INSERT INTO range1 VALUES (new.*) ;
    ELSEIF (new.id >= 10 AND NEW.id < 20 ) then
         INSERT INTO range2 VALUES (new.*) ;
    ELSEIF (new.id >= 20 AND NEW.id < 30 ) then
         INSERT INTO range3 VALUES (new.*) ;
    ELSE
         RAISE EXCEPTION 'out of range';
END IF;

RETURN NULL;
END;
$$ language plpgsql;

— Step 3.
— CREATE TRIGGER AND ATTACH IT TO THE PARENT TABLE TO BE PARTITIONED

CREATE TRIGGER partition_parent_trigger
    BEFORE INSERT ON parent
        FOR EACH ROW EXECUTE PROCEDURE partition_parent();for each row execute PROCEDURE partition_parent();

— Step 4.
— INSERT SAMPLE ROWS

DO $$DECLARE
  BEGIN
     FOR i in 1..29 LOOP
          INSERT INTO parent(id, col_a, col_b) VALUES (i, 'a', 'b');
     END LOOP;
END $$;

— Optional Step (stats update)

ANALYZE parent;
-- Step 5.
-- TEST PARTITION ELIMINATION (PRUNING)

EXPLAIN ANALYZE
SELECT * from parent
   WHERE id = 5;

Query Plan 

In the next Post in this series, I’ll discuss the new ‘Declarative Partitioning‘ implementation

thanks for reading! 

Hello world….once again

Hello World !

This is continuation to my SQL Server blog on MSDN. With this blog post, I intend to help fellow database engineers community using PostgreSQL / SQL Server as a Database Platform, with occasional other stuff thrown in related to other RDBMS platforms and their respective integration issues. Thanks to my wife who’d been pestering me to blog for a year or so.

I’ve spent more than a decade working on various different RDBMS platforms including PostgreSQL, SQL Server, Oracle, MySQL. My current focus area is PostgreSQL and it’s implementation in a large enterprise landscape overcoming challenges of scale, manageability and automation.

Thanks for your time reading and I hope all my effort will help (or at least entertain) you at many levels. I would really appreciate if you could let me know what you think about it, good or bad. I always appreciate feedback 🙂

Sincere Thanks!

Varun