Some useful Regular Expression examples

A regular expression is a special sequence of characters that helps you match or find strings or sets of strings. In other words, it defines a search pattern for text. Regular expressions can be used to search, edit, or manipulate text and data. The usual abbreviation for regular expression is regex.

Below are some useful regex patterns that can be used in any language that supports regex. In some languages you have to wrap the pattern in delimiters, usually forward slashes, as in /regex/.

1. Any string that contains at least one of the characters ‘a’, ‘n’, ‘o’, ‘p’

[anoop]

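[ ] matches any single character listed inside the brackets. Repeating a character, like the second ‘o’ here, has no effect, so [anoop] behaves the same as [anop].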
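
As a quick illustration, here is a minimal Python sketch of the character class above. The helper name contains_any and the sample strings are my own assumptions, not part of the pattern.

```python
import re

# Character class: matches exactly one character that is a, n, o or p.
char_class = re.compile(r"[anoop]")

def contains_any(text):
    """Return True if text contains at least one of a, n, o, p."""
    return char_class.search(text) is not None

print(contains_any("zip"))              # True, because of the 'p'
print(contains_any("sky"))              # False, none of the characters occur
print(char_class.findall("panorama"))   # ['p', 'a', 'n', 'o', 'a', 'a']
```

Each match is a single character, which is why findall returns the characters one by one.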

2. Any string that contains the word “anoop” one or more times

(anoop)

( ) defines a capturing group. The string matches if it contains the exact sequence “anoop” anywhere, and the matched text is captured so it can be read back later.
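
A small Python sketch of the capturing group, assuming the standard re module. The sample sentences are made up for illustration.

```python
import re

# The parentheses capture the literal text "anoop" as group 1.
word = re.compile(r"(anoop)")

m = word.search("hello anoop, welcome")
print(m.group(1))   # anoop
print(m.span(1))    # (6, 11), where the captured text sits in the string

# findall returns the captured group for every occurrence.
print(word.findall("anoop met anoop"))   # ['anoop', 'anoop']

# The letters must be contiguous, so this does not match.
print(word.search("an oop"))             # None
```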

3. Any string that has the word “anoop” at the start of the string/line

^(anoop)

^ placed outside the ( ) marks the beginning of the line.
( ) defines a capturing group. The exact sequence “anoop” must appear right at that position.

4. Any string that has the word “anoop” at the end of the string/line

(anoop)$

( ) defines a capturing group. The exact sequence “anoop” must appear right before the end.
$ placed outside the ( ) marks the end of the line.
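
The two anchors from examples 3 and 4 can be tried side by side in Python. This is only a sketch with made-up sample strings. One Python-specific detail: by default ^ and $ refer to the whole string, and the re.MULTILINE flag makes them apply to every line.

```python
import re

starts_with = re.compile(r"^(anoop)")
ends_with = re.compile(r"(anoop)$")

print(bool(starts_with.search("anoop rai")))   # True
print(bool(starts_with.search("mr anoop")))    # False
print(bool(ends_with.search("mr anoop")))      # True
print(bool(ends_with.search("anoop rai")))     # False

# Without re.MULTILINE, ^ only matches at the very start of the string.
text = "hi\nanoop here\nanoop there"
print(starts_with.findall(text))               # []

# With re.MULTILINE, ^ also matches right after each newline.
per_line = re.compile(r"^(anoop)", re.MULTILINE)
print(per_line.findall(text))                  # ['anoop', 'anoop']
```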

5. Any string that consists only of the characters a-z, A-Z or digits 0-9

^[a-zA-Z0-9]+$

Let’s quickly understand this pattern.

^ matches the beginning of the line.
[ ] defines a set. In our pattern the set contains any letter, upper or lower case (a-z, A-Z), and any digit (0-9).
+ says that the preceding set must occur one or more times.
$ matches the end of the line.
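
Here is a minimal Python sketch of this check. The function name is_alphanumeric is an assumption of mine.

```python
import re

ALNUM_RE = re.compile(r"^[a-zA-Z0-9]+$")

def is_alphanumeric(text):
    """Return True if text is made up only of ASCII letters and digits."""
    return ALNUM_RE.search(text) is not None

print(is_alphanumeric("Anoop123"))    # True
print(is_alphanumeric("anoop rai"))   # False, the space is not in the set
print(is_alphanumeric(""))            # False, + needs at least one character
```

One caveat in Python: $ also matches just before a single trailing newline, so "abc\n" passes this check. Using re.fullmatch without the anchors is one way to be strict about that.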

6. Any character that is not a letter a-z, A-Z or a digit 0-9

[^a-zA-Z0-9]

[ ] defines a set. Here the set lists the letters, upper or lower case (a-z, A-Z), and the digits (0-9).
[^ ] negates the set, so the pattern matches any single character that is not a letter (lower/upper case) or a digit.
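
A common use of the negated set is finding or stripping special characters. The sketch below uses a made-up sample string.

```python
import re

non_alnum = re.compile(r"[^a-zA-Z0-9]")

sample = "anoop_rai@2024!"

# Every character that is neither a letter nor a digit.
print(non_alnum.findall(sample))    # ['_', '@', '!']

# Replace those characters with nothing to keep only letters and digits.
print(non_alnum.sub("", sample))    # anooprai2024

# A purely alphanumeric string contains no match at all.
print(non_alnum.search("Anoop123")) # None
```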

7. Matching the set of words “bat”, “bet”, “bit”, “bot” and “but”

b[aeiou]t

The word starts with ‘b’
The middle character can be ‘a’, ‘e’, ‘i’, ‘o’ or ‘u’
The last character has to be ‘t’
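
To see which words the pattern accepts, here is a short Python sketch. The word list is my own.

```python
import re

bvt = re.compile(r"b[aeiou]t")

words = "bat bet bit bot but bt boot byt"

# Exactly one vowel must sit between 'b' and 't'.
print(bvt.findall(words))             # ['bat', 'bet', 'bit', 'bot', 'but']

# There are no anchors, so the pattern also matches inside longer words.
print(bvt.search("robotic").group())  # bot
```

If only whole words should match, wrapping the pattern in word boundaries (\bb[aeiou]t\b) is one option.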

8. Matching a username

^[a-z0-9_-]{3,16}$

We begin by telling the parser to find the beginning of the string (^), followed by any lowercase letter (a-z), digit (0-9), underscore, or hyphen. Next, {3,16} makes sure that there are at least 3 of those characters, but no more than 16. Finally, we want the end of the string ($).
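
A minimal Python sketch of this username check follows. The function name is_valid_username and the sample usernames are assumptions for illustration.

```python
import re

USERNAME_RE = re.compile(r"^[a-z0-9_-]{3,16}$")

def is_valid_username(name):
    """Return True if name is 3 to 16 lowercase letters, digits, '_' or '-'."""
    return USERNAME_RE.search(name) is not None

print(is_valid_username("anoop_rai"))    # True
print(is_valid_username("my-user_01"))   # True
print(is_valid_username("ab"))           # False, shorter than 3
print(is_valid_username("Anoop"))        # False, uppercase is not in the set
print(is_valid_username("a" * 17))       # False, longer than 16
```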

9. Matching a password

^[a-z0-9_-]{6,18}$

Matching a password is very similar to matching a username. The only difference is that instead of 3 to 16 letters, numbers, underscores, or hyphens, we want 6 to 18 of them {6,18}.
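
The length limits are easiest to see at the boundaries. This is a small Python sketch with a hypothetical helper is_valid_password. Note that the pattern only restricts length and character set; it does not check how strong the password is.

```python
import re

PASSWORD_RE = re.compile(r"^[a-z0-9_-]{6,18}$")

def is_valid_password(password):
    """Return True if password is 6 to 18 lowercase letters, digits, '_' or '-'."""
    return PASSWORD_RE.search(password) is not None

print(is_valid_password("a" * 5))      # False, one character too short
print(is_valid_password("a" * 6))      # True, lower limit
print(is_valid_password("a" * 18))     # True, upper limit
print(is_valid_password("a" * 19))     # False, one character too long
print(is_valid_password("Secret123"))  # False, uppercase is not in the set
```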

10. Matching an Email

^([a-z0-9_\.-]+)@([\da-z\.-]+)\.([a-z\.]{2,6})$

Let’s understand this in pieces.
^([a-z0-9_\.-]+)@ We begin by telling the parser to find the beginning of the string (^). Inside the first group, we match lowercase letters, digits, underscores, dots, or hyphens (at least one occurrence is mandatory). I have escaped the dot because a non-escaped dot means any character. Directly after that, there must be an @ sign.
([\da-z\.-]+)\. Next is the domain name, which must be one or more lowercase letters, digits, dots, or hyphens (underscores are not in this set). Then another (escaped) dot.
([a-z\.]{2,6})$ The extension is two to six lowercase letters or dots. Finally, we want the end of the string ($).
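
The three groups can be read back after a match. Below is a minimal Python sketch; the helper split_email and the addresses are made up for illustration.

```python
import re

# Email pattern copied from example 10.
EMAIL_RE = re.compile(r"^([a-z0-9_\.-]+)@([\da-z\.-]+)\.([a-z\.]{2,6})$")

def split_email(address):
    """Return (user, domain, extension), or None if the pattern does not match."""
    m = EMAIL_RE.search(address)
    return m.groups() if m else None

print(split_email("anoop.rai@example.com"))  # ('anoop.rai', 'example', 'com')
print(split_email("Anoop@example.com"))      # None, uppercase is not allowed
print(split_email("anoop@example"))          # None, no dot and extension
print(split_email("anoop@example.c"))        # None, extension needs 2+ characters
```

The pattern only accepts lowercase input, so lowercasing the address first is one way to handle mixed-case addresses.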

11. Matching a URL

^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$

Let’s understand this in pieces.
^(https?:\/\/)? The first capturing group is entirely optional. It allows the URL to begin with “http://”, “https://”, or neither of them. I have a question mark after the s to allow URLs that have http or https. In order to make this entire group optional, I just added a question mark to the end of it, which says zero or one occurrence.
([\da-z\.-]+)\. One or more digits, lowercase letters, dots, or hyphens, followed by a dot. \d is any digit, short for [0-9]. The + applies to the set, so any number of these characters (at least one) can appear before the final dot. For example, the anoop.rai.noi-da part of anoop.rai.noi-da.com.
([a-z\.]{2,6}) Two to six lowercase letters or dots. For example, in.hi or anrai.
([\/\w \.-]*)*\/?$ Inside this group, we want to match any number of forward slashes, letters, digits, underscores, spaces, dots, or hyphens. \w is a word character, short for [a-zA-Z_0-9]. The star after the group says that the group can be matched as many times as we want. I have used the star instead of the question mark because the star says zero or more, not zero or one. After the group comes an optional trailing forward slash (\/?) and then the end of the string ($). For example, /anoop/rai/buxar.
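
Here is a minimal Python sketch that applies the URL pattern and reads back the first three groups. The helper parse_url and the sample URLs are assumptions of mine.

```python
import re

# URL pattern copied from example 11.
URL_RE = re.compile(
    r"^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$"
)

def parse_url(url):
    """Return (scheme, host, extension), or None if the pattern does not match."""
    m = URL_RE.search(url)
    if m is None:
        return None
    return m.group(1), m.group(2), m.group(3)

print(parse_url("https://www.example.com/anoop/rai/buxar"))
# ('https://', 'www.example', 'com')
print(parse_url("example.com"))   # (None, 'example', 'com'), scheme is optional
print(parse_url("http://"))       # None, the host is missing
print(parse_url("example"))       # None, no dot and extension
```

A word of caution: the nested star in ([\/\w \.-]*)* can make a backtracking engine very slow on long input that almost matches, so this pattern is best kept to short, trusted strings.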

You can try these patterns on

http://www.regexpal.com
