PostgreSQL – how to UPSERT (Update or Insert into a table)

Most modern-day relational database systems use SQL MERGE (also called UPSERT) statements to INSERT new records or UPDATE existing records if a matching row already exists. UPSERT is a combination of Insert and Update, driven by a “PRIMARY KEY” on the table.

Why Upsert?

Say, we have a use case to insert all records from a text file sent by an external app every day into our database table that has a PRIMARY KEY defined. During the insert, if a duplicate record is found, entire insert batch will fail with “duplicate key violation error“. Annoying, right?!

Postgres 9.5 (and later) has introduced UPSERT (INSERT ON CONFLICT) operation, that can allow either updating the data for duplicate row in or just silently skipping the duplicate row, without any error.

Here, let me explain this using coffee shop scenario – coffee makes almost everything easy 😛

Coffee Shop Scenario ☕

I own a coffee shop, and I want to keep a record of drink preferences for all my customers. There’s one problem though, the preferences change over time. So how can I record all drink preference of for my customer in one place, without creating a duplicate row in database. Until now, to solve this, this would require writing a complex stored procedure in SQL, PL/pgSQL, C, Python, etc. In Postgres 9.5 (and later), this could be easily achieved using single line INSERT ON CONFLICT statement. Let’s see how…

Step 1. Create a test table

	id serial PRIMARY KEY,
	preference VARCHAR NOT NULL);

myCoffee table consists of three columns: id, name, and preference. The name column has a unique constraint to guarantee the uniqueness of customer names.

Step 2. INSERT sample rows

    myCoffee (name, preference)
    ('Tom', 'Cappuccino'),
    ('Joe', 'Espresso'),
    ('Varun', 'Frappé');

Step 3. UPSERT in action

Now suppose Joe visits my shop again on a hot summer day and asks for an ‘Iced Tea’ instead of his preferred drink. We have few options here:

Options A. Serve Joe the drink and skip updating his “new” drink preference (we already have his data in our table).
The below INSERT statement has ON CONFLICT DO NOTHING. This will ensure that if a row already exists, it will be skipped. Else, a new row will be inserted.

INSERT INTO myCoffee (NAME, preference)
	VALUES('Joe','Iced Tea') 
	ON CONFLICT (name) 

Options B. Or else, we can register ‘Iced Tea’ as his new drink preference using ON CONFLICT DO UPDATE as in the below statement

INSERT INTO myCoffee (name, preference)
	VALUES('Joe','Iced Tea is my 2nd favorite') 
	ON CONFLICT (name) 
	DO UPDATE SET preference = EXCLUDED.preference || ';' || myCoffee.preference;
2JoeIced Tea is my 2nd favorite; Espresso


So, we just learned that UPSERT is a combination of two different SQL statements UPDATE and INSERT. It works on a simple rule that if a new row being inserted does not have any duplicate then insert it, else if there are duplicate rows then either skip insert or update the new column value.

Reference – ON CONFLICT Clause 

UPSERT is a helpful feature to handle records that require to be updated frequently. I hope y’all found this post useful.

Like what I write? Please join my mailing list, and I’ll let you know whenever I write another post. No spam, I promise!  👨‍💻

Author: Varun Dhawan

Hello dear reader, I'm a Product Manager @Microsoft based in MN, US (beautiful land of 10,000 lakes). I am perpetually curious and always willing to learn and engineer systems that can help solve complex problems using data. When I am not engineering or blogging, you’ll find me cooking and spending time with my family.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: