PostgreSQL Advanced Pattern Matching – Beyond LIKE Operator

Whenever you are dealing with pattern matching in PostgreSQL, it is often required to write queries to match string patterns. The conventional wisdom to do pattern match was using the LIKE operator, which is not very effective for many reasons. PostgreSQL now offers more advanced pattern matching options such as SIMILAR TO expressions, POSIX regular expressions, and ANY operator. In this article, we’ll compare the LIKE operator with these other advanced pattern matching features.

1. LIKE operator

The LIKE expression returns true if the string matches the supplied pattern. For example, here is a query to find users with a “gmail.com” email account. This query uses the wildcard % to find users with emails that end with “@gmail.com”.

SELECT *
FROM   employees
 WHERE  email LIKE '%@gmail.com'; 
first_name last_name email
Steven K steven.k@gmail.com
Neena K neena.k@gmail.com

Now, what if, you have to find users with both “@outlook.com” and “@gmail.com”. Here we can use an OR condition to match against either one of these patterns.

SELECT *
FROM   employees
 WHERE  email LIKE '%@gmail.com'
     OR email LIKE '%@outlook.com'; 
first_name last_name email
Steven K steven.k@gmail.com
Neena K neena.k@gmail.com
David A david.a@outlook.com
Valli P valli.p@outlook.com

Say there are 20 distinct email patterns to match. The more patterns that we match, we add more OR conditions, and the query gets more complicated (and less readable). This is where SIMILAR TO operator comes in handy.

2. SIMILAR TO operator

SIMILAR TO operator lets you provide multiple patterns to match separated by a pipe operator. It is similar to LIKE, except that it interprets the pattern using the SQL standard’s definition of a regular expression. Also like LIKESIMILAR TO uses _ and % as wildcard characters denoting any single character and any string, respectively.

SELECT *
FROM employees
 WHERE email SIMILAR TO '%@gmail.com|%@outlook.com';
first_name last_name email
Steven K steven.k@gmail.com
Neena K neena.k@gmail.com
David A david.a@outlook.com
Valli P valli.p@outlook.com

Using SIMILAR TO you can match against multiple patterns and keep the query simple.

3. Introducing – REGEXP operator

Another way to match against multiple patterns is using the regular expression, or REGEXP. It is well known across many programming languages such as C, C++, .NET, Java that supports regexp. It is largely based on the powerful UNIX grep search tools.

SELECT *
FROM employees
 WHERE email ~ '@gmail\.com$|@outlook\.com$';
first_name last_name email
Steven K steven.k@gmail.com
Neena K neena.k@gmail.com
David A david.a@outlook.com
Valli P valli.p@outlook.com

When using REGEXP you need to take be a bit more cautious. A period “.” will match anything, so to match the period “.” in gmail.com or outlook.com, you need to add the escape character “\.“.

4. ANY to match against array of patterns

PostgreSQL has an ANY operator that is used to compare a scalar value with a set of values returned by a subquery. And in this case, the sub-query is an array for patterns to match.

SELECT *
FROM employees
 WHERE email ~ ANY(ARRAY['@gmail\.com$', '@outlook\.com$']);
first_name last_name email
Steven K steven.k@gmail.com
Neena K neena.k@gmail.com
David A david.a@outlook.com
Valli P valli.p@outlook.com

Here, this query uses the ANY operator to match against an array of email domain. If the email matches any of the patterns, the condition will be true. This approach is easier to work with from a host language such as Python.

Conclusion

  • PostgreSQL has a complete, and extremely tunable regular expression engine built right in.
  • Compared to LIKE, regular expressions offers much more flexibility and performance for pattern matching.
  • Also, everything you learn about PostgreSQL regular expressions is transferable to other programming environments. Regex is everywhere.
  • Finally, To learn more about pattern matching using regular expressions in PostgreSQL, I would recommend you follow the official tutorial.
Enjoy what you’re reading?

Join my mailing list, and I’ll let you know whenever I write another post. No spam, I promise! 🧑🏼‍💻

Unknown's avatar

Author: Varun Dhawan

I’m Varun. I used to be a Software Engineer building data applications for large corporations like McKinsey and Target. Now, I’m a Product Manager at Microsoft, making Azure PostgreSQL the go-to platform for running mission-critical workloads (and no, I’m not obsessing over every little detail… I swear). When I’m not working, you can find me blogging at data-nerd.blog, where I help fellow data enthusiasts master PostgreSQL, sharpen their coding skills, and navigate their careers with confidence. And if there’s one thing you absolutely need to know about me, it’s that…I'm apparently a great cook—just don’t ask why I’m the only one who eats my food.

Leave a comment