RegEx, short for "regular expression", is a sequence of characters that defines a pattern used to search, find, replace, or validate text.
RegEx is language agnostic, although some features are not available in every language. You can test out a RegEx in an online RegEx tester before putting it in your code.
We will look at some examples:
RegEx: /dog/
Text: "The dog is big dog"
Regular expressions start with a forward slash /, followed by the pattern we want to match, and another forward slash / to end it (this is the literal syntax used by JavaScript, among other languages). In this first example, we want to match a simple pattern: "dog". Without any flags, our RegEx matches only the first occurrence of the pattern, which is the "dog" right after "The".
Sometimes, we want our RegEx to highlight not only the first pattern, but all the patterns that can be found in the string. This is where expression flags are introduced. Expression flags are always introduced after the last forward slash. We will use the global expression flag to identify all patterns.
RegEx: /dog/g
Text: "The dog is big dog"
With the global flag, our pattern identifies every occurrence of "dog" in this string, not just the first one.
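To see the difference between the two examples above, here is a minimal JavaScript sketch. It assumes the built-in `String.prototype.match` method; the variable names are only for illustration.

```javascript
const sentence = "The dog is big dog";

// Without a flag, match() stops at the first occurrence
// and reports where it was found.
const firstDog = sentence.match(/dog/);
console.log(firstDog[0], firstDog.index); // "dog" found at index 4

// With the g flag, match() returns every occurrence.
const allDogs = sentence.match(/dog/g);
console.log(allDogs.length); // 2
```

Note that with the g flag, `match` returns only the matched strings, not their positions.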
Here are some of the expression flags that can be used:
g - The global flag, identifies all occurrences of the pattern in the text
i - The case-insensitive flag, identifies a pattern regardless of case (either lower case or upper case)
m - The multiline flag, makes ^ and $ match at the beginning and end of each line instead of only the beginning and end of the whole text
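The sketch below tries the i and m flags in JavaScript. The sample strings are made up for illustration. In JavaScript, the m flag works by letting ^ and $ match at every line break.

```javascript
// i: ignore case.
const strictCase = /dog/.test("The DOG is big");   // false
const ignoreCase = /dog/i.test("The DOG is big");  // true

// m: ^ and $ also match at the start and end of each line.
const twoLines = "cat\ndog";
const withoutM = /^dog$/.test(twoLines);  // false, "dog" is not the whole text
const withM = /^dog$/m.test(twoLines);    // true, "dog" is a whole line

// Flags can be combined after the last slash.
const everyCase = "Dog dog DOG".match(/dog/gi); // ["Dog", "dog", "DOG"]

console.log(strictCase, ignoreCase, withoutM, withM, everyCase);
```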
RegEx: /\d/ or /[0-9]/
Text: "12343"
There are two ways to match numbers. The first one, \d, is a pattern that matches any single decimal digit. The d is case sensitive (an uppercase \D matches any character that is not a digit), and the backward slash escapes the character d so that it stands for a digit instead of the literal letter "d". The second one, [0-9], also matches any single decimal digit, but in a different way: it starts and ends with a square bracket and gives a range of characters. We will touch more on square brackets soon. Here, the range identifies a digit from 0 to 9. We can narrow the range, too: [4-9] identifies a digit from 4 to 9, so in "12343" it matches the 4 instead of the 1.
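As a quick check of the digit patterns, here is a small JavaScript sketch run against "12343". None of the patterns use the g flag, so each one reports only its first match.

```javascript
const digits = "12343";

// \d and [0-9] are equivalent: both match the first digit.
const withShorthand = digits.match(/\d/)[0];   // "1"
const withRange = digits.match(/[0-9]/)[0];    // "1"

// [4-9] skips 1, 2 and 3 and stops at the first digit in range.
const fourToNine = digits.match(/[4-9]/)[0];   // "4"

// Uppercase \D means "not a digit", so nothing matches here.
const nonDigit = digits.match(/\D/);           // null

console.log(withShorthand, withRange, fourToNine, nonDigit);
```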
If we want to identify a run of consecutive digits, we will have to use some commands.
RegEx: /\d+/ or /[0-9]+/
Text: "12343"
Adding + to our pattern will highlight the whole number, because + means match 1 or more of the preceding pattern. Commands can help a lot when writing RegEx. I will display a command cheat sheet below.
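A short JavaScript sketch of the + command follows. The second sample string is invented for illustration; it shows how + combines with the g flag.

```javascript
// + keeps consuming digits, so the whole run is one match.
const wholeNumber = "12343".match(/\d+/)[0];            // "12343"

// Combined with g, each separate run of digits is its own match.
const eachNumber = "order 66, aisle 7".match(/\d+/g);   // ["66", "7"]

console.log(wholeNumber, eachNumber);
```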
Commands cheat sheet
Command | Description | Pattern | Matches |
---|---|---|---|
. | Matches any single character | c.c | cac, cbc, cgc, cdc, cfc |
* | Matches 0 or more of the preceding token | cat\d* | cat, cat24, cat56 |
+ | Matches 1 or more of the preceding token | cat\d+ | cat24, cat56 |
? | Matches 0 or 1 of the preceding token | cat\d? | cat, cat5 |
^ | Matches beginning of the text | ^cat | catdaddy, cat mummy |
$ | Matches end of the text | os$ | los, lagos |
[...] | Matches any single character in the brackets | [0-9xyz] | x, y, 1 |
[^...] | Matches any single character not in the brackets | [^0-9xyz] | a, b, c |
[a-z] | Matches any character within that range | [0-9] | 0, 1, 2, 3 |
{a} | Matches the preceding token exactly a times | cat{2} | catt |
{a,} | Matches the preceding token a or more times | cat{2,} | catt, catttt |
{a,b} | Matches the preceding token between a and b times | cat{1,2} | cat, catt |
The | character is the OR operator: it matches either the pattern on its left or the one on its right, so cat|dog matches cat or dog.
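To try a few of these commands, here is a JavaScript sketch. `firstMatch` is a hypothetical helper written for this example, not a built-in. Keep in mind that a command such as {2} applies only to the single token right before it, so in cat{2} it repeats the t, not the whole word.

```javascript
// Hypothetical helper: returns the first matched substring, or null.
const firstMatch = (pattern, text) => {
  const found = text.match(pattern);
  return found ? found[0] : null;
};

const results = {
  star: firstMatch(/cat\d*/, "cat"),          // "cat", zero digits is fine
  plusNone: firstMatch(/cat\d+/, "cat"),      // null, + needs at least one digit
  plus: firstMatch(/cat\d+/, "cat24"),        // "cat24"
  optional: firstMatch(/cat\d?/, "cat56"),    // "cat5", at most one digit
  start: firstMatch(/^cat/, "catdaddy"),      // "cat"
  end: firstMatch(/os$/, "lagos"),            // "os"
  negated: firstMatch(/[^0-9xyz]/, "x1a"),    // "a"
  dot: firstMatch(/c.c/, "cac"),              // "cac"
  repeatT: firstMatch(/cat{2}/, "catt"),      // "catt", {2} repeats the t
  repeatWord: firstMatch(/cat{2}/, "catcat"), // null, there is no "catt" here
  either: firstMatch(/cat|dog/, "hotdog"),    // "dog"
};

console.log(results);
```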
With the commands above, we can create more complex RegEx patterns:
Alphanumeric texts
RegEx: /[0-9A-Za-z]+/
Text: "bensoN123"
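Here is a small JavaScript sketch of the alphanumeric pattern. The anchored variant with ^ and $ is an addition for this example, assuming we want to validate the whole text rather than find an alphanumeric run inside it.

```javascript
const alphanumeric = /[0-9A-Za-z]+/;

// The pattern finds the first run of letters and digits.
const wholeText = "bensoN123".match(alphanumeric)[0];      // "bensoN123"
const firstRun = "hi, bensoN123!".match(alphanumeric)[0];  // "hi"

// Anchoring with ^ and $ requires the entire text to be alphanumeric.
const strictAlphanumeric = /^[0-9A-Za-z]+$/;
const accepted = strictAlphanumeric.test("bensoN123");     // true
const rejected = strictAlphanumeric.test("benso N123");    // false, contains a space

console.log(wholeText, firstRun, accepted, rejected);
```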
Twitter Handle
Twitter handles contain numbers, letters, and underscores, but start with @.
RegEx: /^@[0-9A-Za-z_]+/
Text: "@bensoN123"
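The handle pattern can be tried out like this in JavaScript. The extra sample strings are made up. Since the pattern has no $ at the end, it only checks how the text starts.

```javascript
const handlePattern = /^@[0-9A-Za-z_]+/;

const validHandle = handlePattern.test("@bensoN123");  // true
const missingAt = handlePattern.test("bensoN123");     // false, no leading @
const onlyAt = handlePattern.test("@");                // false, + needs one character

// Without a trailing $, extra characters after the handle are not rejected;
// the match simply stops before them.
const trailing = "@benson!".match(handlePattern)[0];   // "@benson"

console.log(validHandle, missingAt, onlyAt, trailing);
```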
Simple Email Validation
RegEx: /^[\w]+@[\w]+\.[a-zA-Z]{2,4}$/
Text: "wole@mail.co"
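Below is a JavaScript sketch of this simple email check. It assumes the dot before the domain ending is escaped as \. so that it matches a literal period; an unescaped . would accept any character in that position. The sample addresses other than "wole@mail.co" are invented for illustration.

```javascript
// Assumption: the dot is escaped (\.) so it only matches a real period.
const emailPattern = /^[\w]+@[\w]+\.[a-zA-Z]{2,4}$/;

const simpleAddress = emailPattern.test("wole@mail.co");       // true
const noDot = emailPattern.test("wole@mailxco");               // false, no literal period
const longEnding = emailPattern.test("wole@mail.company");     // false, ending longer than 4
const dottedName = emailPattern.test("first.last@mail.co");    // false, \w does not include "."

console.log(simpleAddress, noDot, longEnding, dottedName);
```

As the last two results show, this stays a simple validation: it rejects some addresses that are valid in practice.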
RegEx can be complicated at first, but the more you work with it, the easier it gets. You can explore more options, like negative lookbehind and positive lookahead. If you have more questions, feel free to reach out to me through my GitHub.