MySQL REGEXP: Search Based On Regular Expressions
Summary: in this tutorial, you will learn how to use the MySQL REGEXP operator to perform complex searches based on regular expressions.
Introduction to regular expressions
A regular expression is a special string that describes a search pattern. It is a powerful tool that gives you a concise and flexible way to identify strings of text e.g., characters, and words, based on patterns.
For example, you can use regular expressions to search for email, IP address, phone number, social security number, or anything that has a specific pattern.
A regular expression uses its own syntax that can be interpreted by a regular expression processor. A regular expression is widely used in almost platforms from programming languages to databases including MySQL.
The advantage of using regular expression is that you are not limited to search for a string based on a fixed pattern with the percent sign (%) and underscore (_) in the LIKE
operator. The regular expressions have more meta-characters to construct flexible patterns.
The disadvantage of using regular expression is that it is quite difficult to understand and maintain such a complicated pattern. Therefore, you should describe the meaning of the regular expression in the comment of the SQL statement. In addition, the speed of data retrieval, in some cases, is decreased if you use complex patterns in a regular expression.
The abbreviation of regular expressions is regex or regexp.
MySQL REGEXP operator
MySQL adapts the regular expression implemented by Henry Spencer. MySQL allows you to match pattern right in the SQL statements by using REGEXP
operator.
The following illustrates the syntax of the REGEXP
operator in the WHERE
clause:
SELECT
column_list
FROM
table_name
WHERE
string_column REGEXP pattern;
This statement performs a pattern match of a string_column
against a pattern
.
If a value in the string_column
matches the pattern
, the expression in the WHERE
clause returns true, otherwise, it returns false.
If either string_column
or pattern
is NULL
, the result is NULL
.
In addition to the REGEXP
operator, you can use the RLIKE
operator, which is the synonym of the REGEXP
operator.
The negation form of the REGEXP
operator is NOT REGEXP
.
MySQL REGEXP examples
Suppose you want to find all products whose last names start with characters A, B, or C. You can use a regular expression in the following SELECT
statement:
SELECT
productname
FROM
products
WHERE
productname REGEXP '^(A|B|C)'
ORDER BY productname;
The pattern allows you to find the product whose name begins with A, B, or C.
- The character ^ means to match from the beginning of the string.
- The character | means to search for alternatives if one fails to match.
The following table illustrates some commonly used metacharacters and constructs in a regular expression.
Metacharacter | Behavior |
---|---|
^ | matches the position at the beginning of the searched string |
$ | matches the position at the end of the searched string |
. | matches any single character |
[…] | matches any character specified inside the square brackets |
[^…] | matches any character not specified inside the square brackets |
p1|p2 | matches any of the patterns p1 or p2 |
* | matches the preceding character zero or more times |
+ | matches preceding character one or more times |
{n} | matches n number of instances of the preceding character |
{m,n} | matches from m to n number of instances of the preceding character |
To find products whose names start with the character a
, you use the metacharacter '^'
to match at the beginning of the name:
SELECT
productname
FROM
products
WHERE
productname REGEXP '^a';
If you want the REGEXP
operator to compare strings in a case-sensitive fashion, you can use the BINARY operator to castcast a string to a binary string.
Because MySQL compares binary strings byte by byte rather than character by character. This allows the string comparison to be case-sensitive.
For example, the following statement matches the only uppercase "C"
at the beginning of the product name.
SELECT
productname
FROM
products
WHERE
productname REGEXP BINARY '^C';
To find the product whose name ends with f, you use 'f$'
to match the end of a string.
SELECT
productname
FROM
products
WHERE
productname REGEXP 'f$'
To find the product whose name contains the word "ford"
, you use the following query:
SELECT
productname
FROM
products
WHERE
productname REGEXP 'ford';
To find the product whose name contains exactly 10 characters, you use ‘^'
and ‘$
to match the beginning and end of the product name, and repeat {10}
times of any character ‘.'
in between as shown in the following query:
SELECT
productname
FROM
products
WHERE
productname REGEXP '^.{10}$';
In this tutorial, you have learned how to query data using the MySQL REGEXP operator with regular expressions.
0 Comments
CAN FEEDBACK
Emoji