JavaScript Alternation (OR)


This chapter covers another feature of JavaScript regular expressions: alternation. In plain terms, alternation means OR.

It is written with the vertical bar character | in a regular expression.

Let’s consider an example where it is necessary to find the following languages: HTML, PHP, CSS, Java, or JavaScript. The matching regexp looks as follows: html|php|css|java(script)?.

Here it is in use:

let regexp = /html|php|css|java(script)?/gi;

let str = "First learn HTML, then CSS, then JavaScript";

console.log(str.match(regexp)); // ['HTML', 'CSS', 'JavaScript']

Square brackets do something similar: they let you choose among multiple characters. For example, pr[ae]y matches pray or prey. However, there is a difference between square brackets and alternation. Square brackets allow only characters or character sets, while alternation works with any expressions. For instance, the regular expression C|D|E matches any one of the expressions C, D, or E.
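The difference can be sketched as follows. The sample strings and the variable names charClass and alternation are made up for illustration; they are not part of the original example.

```javascript
// Square brackets choose a single character at one position.
const charClass = /pr[ae]y/g;
console.log("pray prey proy".match(charClass)); // [ 'pray', 'prey' ]

// Alternation chooses between whole expressions of any length,
// here two words and a run of digits.
const alternation = /cat|horse|\d+/g;
console.log("1 cat, 12 horses".match(alternation)); // [ '1', 'cat', '12', 'horse' ]
```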

To apply alternation to only part of a pattern, enclose that part in parentheses:

  • I use Java|JavaScript matches either I use Java or JavaScript.
  • I use (Java|JavaScript) matches either I use Java or I use JavaScript.
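A short sketch of the two patterns from the list, tested against sample sentences that are assumptions chosen for illustration. The names ungrouped and grouped are likewise hypothetical.

```javascript
// Without parentheses the alternatives are "I use Java" and "JavaScript".
const ungrouped = /I use Java|JavaScript/;
// With parentheses only the language name alternates.
const grouped = /I use (Java|JavaScript)/;

// "JavaScript" on its own satisfies the second alternative of the ungrouped pattern.
console.log(ungrouped.test("JavaScript is fun")); // true
// The grouped pattern always requires the "I use " prefix.
console.log(grouped.test("JavaScript is fun")); // false
console.log(grouped.test("I use JavaScript")); // true
```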

Regexp for Time: example

Imagine you need a regular expression that finds a time in the hh:mm form (for example, 11:00). A simple \d\d:\d\d is too permissive: it accepts 26:99 as a time, because 26 and 99 both match \d\d even though the time is not valid.
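The problem with the simple pattern can be seen in a quick check. The variable name naiveTime and the input string are illustrative assumptions.

```javascript
// Any two digits, a colon, any two digits: no range checking at all.
const naiveTime = /\d\d:\d\d/g;

// Both the valid and the invalid time are returned.
console.log("11:00 26:99".match(naiveTime)); // [ '11:00', '26:99' ]
```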

To build a better pattern, match more carefully, starting with the hours. If the first digit is 0 or 1, the second digit can be any digit, which gives [01]\d. If the first digit is 2, the second digit must be in [0-3], which gives 2[0-3]. No other first digit is allowed.

Writing both of the variants in a regexp with alternation will look like this: [01]\d|2[0-3].
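As a rough check of the hours pattern alone, the sketch below tests a few two-character samples. The ^ and $ anchors and the surrounding parentheses are an addition made here so that the whole string has to match; they are not part of the pattern above. The names hoursOnly and hourSamples are hypothetical.

```javascript
// Anchors (^ and $) are added for this test only, so that each sample
// must consist of exactly one valid hour value.
const hoursOnly = /^([01]\d|2[0-3])$/;

const hourSamples = ["00", "09", "19", "23", "24", "30", "7"];
console.log(hourSamples.filter((s) => hoursOnly.test(s))); // [ '00', '09', '19', '23' ]
```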

The minutes should be from 00 to 59. In regexp terms, that is [0-5]\d. After joining hours and minutes, the pattern looks as follows: [01]\d|2[0-3]:[0-5]\d.

This produces a result, but with a problem. The alternation is between [01]\d and 2[0-3]:[0-5]\d. In other words, the minutes are attached only to the second alternation variant. It looks like this:

[01]\d | 2[0-3]:[0-5]\d

So the pattern searches for either [01]\d or 2[0-3]:[0-5]\d, which is not correct. The alternation must be limited to the hours part of the regexp, so that it allows [01]\d or 2[0-3] and nothing more.
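Running the ungrouped pattern on a sample string makes the problem visible. The variable name brokenTime is hypothetical, and the input string is chosen for illustration.

```javascript
// Alternatives are [01]\d (two digits, no minutes) and 2[0-3]:[0-5]\d.
const brokenTime = /[01]\d|2[0-3]:[0-5]\d/g;

// Times starting with 0 or 1 are split into bare two-digit pieces;
// only times starting with 2 keep their minutes.
console.log("00:00 11:11 23:59 26:99 1:2".match(brokenTime));
// [ '00', '00', '11', '11', '23:59' ]
```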

The final and correct variant will look like this:

let regexp = /([01]\d|2[0-3]):[0-5]\d/g;

console.log("00:00 11:11 23:59 26:99 1:2".match(regexp)); // ['00:00', '11:11', '23:59']

So, in this example, the “hours” part is enclosed in parentheses.
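A side effect of the parentheses is that they also capture the hours. The sketch below, which assumes a made-up input string and the hypothetical names singleTime and found, calls match without the g flag so that the captured group is included in the result.

```javascript
// Same pattern as above, but without the g flag.
const singleTime = /([01]\d|2[0-3]):[0-5]\d/;

const found = "Meet at 23:59".match(singleTime);
console.log(found[0]); // '23:59' (the full match)
console.log(found[1]); // '23' (the hours captured by the parentheses)
```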
