
Whenever you are dealing with pattern matching in PostgreSQL, it is often required to write queries to match string patterns. The conventional wisdom to do pattern match was using the LIKE
operator, which is not very effective for many reasons. PostgreSQL now offers more advanced pattern matching options such as SIMILAR TO
expressions, POSIX regular
expressions, and ANY
operator. In this article, we’ll compare the LIKE operator with these other advanced pattern matching features.
1. LIKE
operator
The LIKE
expression returns true if the string
matches the supplied pattern
. For example, here is a query to find users with a “gmail.com” email account. This query uses the wildcard %
to find users with emails that end with “@gmail.com”.
SELECT *
FROM employees
WHERE email LIKE '%@gmail.com';
first_name | last_name | |
---|---|---|
Steven | K | steven.k@gmail.com |
Neena | K | neena.k@gmail.com |
Now, what if, you have to find users with both “@outlook.com” and “@gmail.com”. Here we can use an OR
condition to match against either one of these patterns.
SELECT *
FROM employees
WHERE email LIKE '%@gmail.com'
OR email LIKE '%@outlook.com';
first_name | last_name | |
---|---|---|
Steven | K | steven.k@gmail.com |
Neena | K | neena.k@gmail.com |
David | A | david.a@outlook.com |
Valli | P | valli.p@outlook.com |
Say there are 20 distinct email patterns to match. The more patterns that we match, we add more OR conditions, and the query gets more complicated (and less readable). This is where SIMILAR TO
operator comes in handy.
2. SIMILAR TO
operator
SIMILAR TO
operator lets you provide multiple patterns to match separated by a pipe operator. It is similar to LIKE
, except that it interprets the pattern using the SQL standard’s definition of a regular expression. Also like LIKE
, SIMILAR TO
uses _
and %
as wildcard characters denoting any single character and any string, respectively.
SELECT *
FROM employees
WHERE email SIMILAR TO '%@gmail.com|%@outlook.com';
first_name | last_name | |
---|---|---|
Steven | K | steven.k@gmail.com |
Neena | K | neena.k@gmail.com |
David | A | david.a@outlook.com |
Valli | P | valli.p@outlook.com |
Using SIMILAR TO
you can match against multiple patterns and keep the query simple.
3. Introducing – REGEXP
operator
Another way to match against multiple patterns is using the regular expression, or REGEXP
. It is well known across many programming languages such as C, C++, .NET, Java that supports regexp. It is largely based on the powerful UNIX grep search tools.
SELECT *
FROM employees
WHERE email ~ '@gmail\.com$|@outlook\.com$';
first_name | last_name | |
---|---|---|
Steven | K | steven.k@gmail.com |
Neena | K | neena.k@gmail.com |
David | A | david.a@outlook.com |
Valli | P | valli.p@outlook.com |
When using REGEXP
you need to take be a bit more cautious. A period “.
” will match anything, so to match the period “.
” in gmail.com
or outlook.com
, you need to add the escape character “\.
“.
4. ANY
to match against array of patterns
PostgreSQL has an ANY
operator that is used to compare a scalar value with a set of values returned by a subquery. And in this case, the sub-query is an array for patterns to match.
SELECT *
FROM employees
WHERE email ~ ANY(ARRAY['@gmail\.com$', '@outlook\.com$']);
first_name | last_name | |
---|---|---|
Steven | K | steven.k@gmail.com |
Neena | K | neena.k@gmail.com |
David | A | david.a@outlook.com |
Valli | P | valli.p@outlook.com |
Here, this query uses the ANY
operator to match against an array of email domain. If the email matches any of the patterns, the condition will be true. This approach is easier to work with from a host language such as Python.
Conclusion
- PostgreSQL has a complete, and extremely tunable regular expression engine built right in.
- Compared to LIKE, regular expressions offers much more flexibility and performance for pattern matching.
- Also, everything you learn about PostgreSQL regular expressions is transferable to other programming environments. Regex is everywhere.
- Finally, To learn more about pattern matching using regular expressions in PostgreSQL, I would recommend you follow the official tutorial.
Enjoy what you’re reading?
Join my mailing list, and I’ll let you know whenever I write another post. No spam, I promise! 🧑🏼💻