Regular Expression or Regex
Regular expressions are useful for parsing text and then validating or replacing parts of it as needed. A few common uses of regex are:
- email address validation
- number validation
- password validation and strength check
In this article we will explore regex in more detail.
Syntax Variation
There are two ways to write a regex pattern in JavaScript:
- using forward slashes, / /: this is a regex literal. When using slashes, we do not need to wrap the expression in quotes.
- using a string pattern: the pattern is wrapped in quotes, so there is no need to wrap it in / /, and the regex is then created with new RegExp('pattern').
There are subtle differences between these two forms. When we need a character class such as \d (digit only) or \b (word boundary), the string pattern needs an extra \ (for example '\\d'), because the backslash is also the escape character inside a string.
We can get the pattern text back from a regex object using the regex.source property.
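To make the two forms concrete, here is a minimal sketch. The pattern and the sample text are only illustrative.

```javascript
// Two ways to build the same regex (the pattern itself is illustrative).
const literal = /\d+\b/;                   // regex literal: no quotes needed
const fromString = new RegExp('\\d+\\b');  // string pattern: every \ is doubled

console.log(literal.test('order 42'));     // true
console.log(fromString.test('order 42'));  // true

// Both objects hold the same pattern text, available via .source
console.log(literal.source === fromString.source); // true

// Forgetting the extra backslash changes the pattern:
// '\d' inside a string is just 'd', so this regex looks for the letter d.
const broken = new RegExp('\d+');
console.log(broken.source);                // d+
```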
Flags
Flags change how a regex searches. The flags we use most often are:
- /g: global search (find every match, not only the first)
- /i: case-insensitive search
- /m: multiline search (^ and $ match at the start and end of each line)
This is how we write flags in both regex syntaxes:
- in the slash (literal) pattern, write them after the closing /, for example /[a-z0-9]/gi
- in the string pattern, pass them as the second argument, for example new RegExp('[a-z0-9]', 'gi')
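A short sketch of the same flags written in both syntaxes. The sample text and pattern are assumptions chosen so that all three flags make a visible difference.

```javascript
const text = 'Abc\nabc';

// Same pattern and flags, written in both syntaxes
const fromLiteral = /^abc$/gim;
const fromCtor = new RegExp('^abc$', 'gim');

console.log(text.match(fromLiteral)); // [ 'Abc', 'abc' ]
console.log(text.match(fromCtor));    // [ 'Abc', 'abc' ]

// Without flags: case-sensitive, and ^ and $ anchor to the whole string
console.log(text.match(/^abc$/));     // null

// .flags lists the active flags
console.log(fromLiteral.flags);       // gim
```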
Apart from these common flags, there are a few other useful flags, which we cover here.
/d flag
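Do not confuse this with the \d character class.
This flag is useful when we use capture groups: each match gets an indices array holding the start and end index of the whole match and of every capture group.
Note that /d does not depend on the /g flag. We combine the two only when we want the indices of every match, for example with .matchAll().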
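A minimal sketch of the /d flag, assuming a runtime that supports it (for example Node 16 or later). The date pattern and the sample strings are illustrative.

```javascript
// d flag only, no g: indices are still available on the match
const dateRe = /(?<year>\d{4})-(?<month>\d{2})/d;
const hit = dateRe.exec('released 2024-06');

console.log(hit[0]);                    // 2024-06
console.log(hit.indices[0]);            // [ 9, 16 ]  whole match
console.log(hit.indices[1]);            // [ 9, 13 ]  first capture group
console.log(hit.indices.groups.month);  // [ 14, 16 ] named group

// d together with g: indices for every match via matchAll
const all = [...'a1 b2'.matchAll(/[a-z](\d)/dg)].map((m) => m.indices[1]);
console.log(all);                       // [ [ 1, 2 ], [ 4, 5 ] ]
```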
/y flag
This flag gives us a positional search. Normally we cannot tell a regex to match at one specific position, for example right after a previous match.
For example, I want to capture all the properties of a CSS declaration block: first search for the opening bracket {, then search for each property and value, and so on. This is awkward because a plain regex always starts searching from the start of the string.
Here the sticky flag /y comes in handy: we set the lastIndex property of the regex, and the match is attempted only at that index.
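The CSS idea above can be sketched like this. The CSS string and the declaration pattern are simplified assumptions, not a real CSS parser.

```javascript
const css = 'a{color:red;margin:0}';

// 1. Find the opening bracket with an ordinary search
const start = css.indexOf('{') + 1;

// 2. Read "property:value" pairs, each one starting exactly at lastIndex
const declaration = /([a-z-]+):([^;}]+);?/y;
declaration.lastIndex = start;

const props = [];
let pair;
while ((pair = declaration.exec(css)) !== null) {
  props.push([pair[1], pair[2]]);
}
console.log(props); // [ [ 'color', 'red' ], [ 'margin', '0' ] ]

// A sticky regex does not scan ahead: it matches only at lastIndex
const sticky = /red/y;
sticky.lastIndex = 0;
const atZero = sticky.test(css);   // false: 'red' is not at index 0
sticky.lastIndex = 8;
const atEight = sticky.test(css);  // true: 'red' starts at index 8
console.log(atZero, atEight);      // false true
```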
/u flag
This is the Unicode flag. It makes the regex work on whole Unicode code points, which matters when the string contains a smiley or other Unicode characters.
Also, when we use the \p character class, the /u flag is mandatory.
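A small sketch of what /u changes. The sample characters are written as escapes so the snippet does not depend on file encoding.

```javascript
const smiley = '\u{1F600}'; // 😀

// Without u the regex sees two UTF-16 code units, so "." is not the whole emoji
console.log(/^.$/.test(smiley));          // false
// With u the emoji is treated as one character (code point)
console.log(/^.$/u.test(smiley));         // true

// \p{...} works as a Unicode property class only together with u
console.log(/\p{Lu}/u.test('\u00C9'));    // true:  É is an uppercase letter
console.log(/\p{Lu}/u.test('\u00E9'));    // false: é is lowercase
```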
\p is a very useful character class, so let us talk about character classes next.
Character Class
The most common character classes are:
- \d matches a digit
- \w matches a word character (letter, digit, or underscore)
- \s matches white space
- \p{...} matches a predefined Unicode property class; the /u flag is mandatory when we use this
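Here is a quick sketch of these classes on one assumed sample string. Note that \w covers only ASCII word characters, which is where \p{L} with the /u flag helps.

```javascript
const sample = 'Room 42, na\u00EFve caf\u00E9'; // "Room 42, naïve café"

console.log(sample.match(/\d+/g));      // [ '42' ]
console.log(sample.match(/\w+/g));      // [ 'Room', '42', 'na', 've', 'caf' ]
console.log(sample.split(/\s+/));       // [ 'Room', '42,', 'naïve', 'café' ]
console.log(sample.match(/\p{L}+/gu));  // [ 'Room', 'naïve', 'café' ]
```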
RegEx return
Traditionally we use a while loop with regex.exec() to get all matches, but ES2020 introduced a new method, .matchAll(), which is easier to work with.
with while loop
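A minimal sketch of the while-loop approach. The sample sentence (ten words and a closing period) and the helper name collectMatches are assumptions made for illustration.

```javascript
// Assumed sample: ten words and a closing period
const sentence = 'The quick brown fox jumps over the lazy sleeping dog.';

// Hypothetical helper: collect every match of a global regex with a while loop
function collectMatches(regex, str) {
  const found = [];
  let m;
  regex.lastIndex = 0;
  while ((m = regex.exec(str)) !== null) {
    found.push(m[0]);
    // An empty match does not move lastIndex; advance it to avoid an endless loop
    if (m[0] === '') regex.lastIndex++;
  }
  return found;
}

const withStar = collectMatches(/\w*/g, sentence);
console.log(withStar.length);       // 21
console.log(withStar.slice(0, 4));  // [ 'The', '', 'quick', '' ]
```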
Note: the above returns Array(21), that is, every word plus empty strings as matches. Why?
Because we are using *, which means 0 or more times, so the pattern can also match an empty string.
To match only words, change * to +; now it returns Array(10).
with matchAll
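A sketch of the same search with .matchAll(), using the + quantifier and an assumed ten-word sample sentence.

```javascript
// Assumed sample: ten words and a closing period
const sampleText = 'The quick brown fox jumps over the lazy sleeping dog.';

// matchAll needs the g flag and returns an iterator; spread it into an array
const matches = [...sampleText.matchAll(/\w+/g)];

console.log(matches.length);                        // 10
console.log(matches.map((m) => m[0]).slice(0, 3));  // [ 'The', 'quick', 'brown' ]
console.log(matches[1].index);                      // 4
```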
Capture Groups and Named Group
Capture groups are useful when you want to capture two or more parts of a match. Named groups help because, in a plain array, it is hard to tell which entry matches what.
For example, I want to find both the browser and its version in a user agent string.
navigator.userAgent gives the output below:
"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36"
which gives the result below.
The result above is copied from the console, which shows each entry as Array(4).
The actual structure is a mixed array (PHP calls it an associative array). In JavaScript it is an ordinary array that also carries extra named properties such as index and groups.
You access those extra properties by name, for example with dot notation, not by numeric index.
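To see that structure, here is a minimal sketch. The regex is an assumption (product name, a slash, then the major version), and the user agent is stored in a constant so the snippet also runs outside a browser.

```javascript
// Stand-in for navigator.userAgent
const ua = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36';

// Assumed pattern: product name, a slash, then the major version
const entries = [...ua.matchAll(/([A-Za-z]+)\/(\d+)/g)];
const first = entries[0];

console.log(entries.length);                // 4
console.log(Array.isArray(first));          // true: still a real array
console.log(first[0], first[1], first[2]);  // Mozilla/5 Mozilla 5
console.log(first.index);                   // 0
console.log(first['index']);                // 0 (bracket notation with the name also works)
console.log(first.input === ua);            // true
```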
Now we have all the results, but we want each entry to label its parts separately, for example the browser name and its version.
We can achieve this using a named capture group, (?<name>...). Let's change the regex and look at the result: each entry now has its values under the groups key.
Now you can use .matchAll() and read each value from the groups property separately. We also get the position of each match in the index property.
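A sketch of named groups with .matchAll(). The group names name and version and the pattern are assumptions, and the user agent is again stored in a constant.

```javascript
// Stand-in for navigator.userAgent
const ua = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36';

// Assumed pattern with two named groups: name and version (major part only)
const productRe = /(?<name>[A-Za-z]+)\/(?<version>\d+)/g;

const rows = [];
for (const match of ua.matchAll(productRe)) {
  const { name, version } = match.groups;
  rows.push(`${match.index} ${name} ${version}`);
}
console.log(rows);
// [ '0 Mozilla 5', '48 AppleWebKit 537', '87 Chrome 126', '104 Safari 537' ]
```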
We are still not getting the complete version, including minor and patch: we get 5, but the value we need is 5.0.
So modify the regex again and create one group that covers the complete version.
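One possible sketch, assuming the version group accepts a major number followed by at most two dotted parts. The pattern is an illustration, not necessarily the regex used originally.

```javascript
// Stand-in for navigator.userAgent
const ua = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36';

// Assumed pattern: version = digits, then up to two more ".digits" parts
const productRe = /(?<name>[A-Za-z]+)\/(?<version>\d+(?:\.\d+){0,2})/g;

const found = [...ua.matchAll(productRe)].map(
  (m) => `${m.groups.name} ${m.groups.version}`
);
console.log(found);
// [ 'Mozilla 5.0', 'AppleWebKit 537.36', 'Chrome 126.0.0', 'Safari 537.36' ]
```

Notice that the Chrome version is cut short here, because the group allows only two dots.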
Task: this works fine when the version has up to 2 dots, but for Chrome we have 126.0.0.0 and we want to capture it completely. What would be the proper regex?