Regular Expressions cheat sheet



Column 1

Positive Lookbehind

A positive lookbehind assertion succeeds only if the text immediately before the current position matches a given pattern. It's denoted by (?<=...), where ... is the required preceding pattern. The preceding text is checked but is not included in the match.

/(?<=prefix)mainpattern/
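A minimal sketch of a lookbehind in use, assuming a JavaScript engine with lookbehind support (ES2018 or later). The sample string and variable names are illustrative only.

```javascript
// Match digits only when a dollar sign comes immediately before them.
// The '$' is checked by the lookbehind but is not part of the match.
const priceText = 'Cost: $42, weight: 17kg';
const dollarAmounts = priceText.match(/(?<=\$)\d+/g);
console.log(dollarAmounts); // [ '42' ]
```

The '17' is skipped because it is preceded by a space rather than '$'.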

Negative Lookahead

Negative lookahead is an assertion that succeeds only if the text immediately after the current position does not match a given pattern. It's denoted by (?!pattern). This is useful for excluding certain patterns from the match.

/\b(?!un)\w+/g matches words that do not start with 'un'
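As a rough illustration (the word list and names are made up for the example), a negative lookahead placed at a word boundary can filter out words by their prefix:

```javascript
// Keep only the words that do NOT begin with "un".
const wordList = 'happy unhappy tidy untidy';
const notUnWords = wordList.match(/\b(?!un)\w+/g);
console.log(notUnWords); // [ 'happy', 'tidy' ]
```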

(?=pattern) - Positive lookahead assertion syntax and usage

Positive lookahead is a zero-width assertion that succeeds only if the text immediately after the current position matches the specified pattern. The text checked by the lookahead is not included in the match result. This is useful for testing a condition without making it part of the match.

/test(?= language)/ matches 'test' in 'test language' but not in 'test suite'
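One common use of lookahead is validation: checking a condition at the start of the input without consuming it. The sketch below is a hypothetical rule (at least 8 word characters, one of them a digit), not a recommended password policy.

```javascript
// (?=.*\d) looks ahead for a digit anywhere in the input, consuming nothing.
// \w{8,} then matches the actual characters.
const hasDigitAndLength = /^(?=.*\d)\w{8,}$/;
console.log(hasDigitAndLength.test('passw0rd1')); // true
console.log(hasDigitAndLength.test('password'));  // false (no digit)
```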

Positive Lookahead

Positive lookahead is an assertion that lets the preceding pattern match only if it is followed by another specific pattern. It does not consume any characters or include the lookahead text in the match result. Positive lookahead is denoted using (?=pattern) syntax.

/\d+(?=px)/ matches '12' in '12px' but nothing in '12em'

(?<=pattern) - Positive lookbehind assertion syntax and usage

A positive lookbehind assertion checks whether the text immediately preceding the current position matches 'pattern'. The match succeeds only if it does, but 'pattern' itself is not included in the matched result.

/(?<=abc)def/g

(?!pattern) - Negative lookahead assertion syntax and usage

The (?!pattern) construct is a negative lookahead assertion. It succeeds at a position only if the text that follows does not match the specified pattern. It consumes no characters, so the text it inspects is not part of the match.

/\b(?!un)[a-z]+\b/ matches 'tidy' but not 'untidy'

Negative Lookbehind

Negative lookbehind is a type of assertion that matches a group only if it is not preceded by another specific pattern. It's denoted by the syntax (?<!pattern). This can be useful for excluding certain patterns from matching.

/(?<!\d)abc/ will match 'abc' in 'x abc', but not in '123abc'
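The same check can be run in JavaScript (ES2018 or later is assumed for lookbehind support; the names are illustrative):

```javascript
// 'abc' must not be immediately preceded by a digit.
const notAfterDigit = /(?<!\d)abc/;
const lookbehindResults = ['x abc', '123abc'].map((s) => notAfterDigit.test(s));
console.log(lookbehindResults); // [ true, false ]
```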

(?<!pattern) - Negative lookbehind assertion syntax and usage

The negative lookbehind assertion, (?<!pattern), is used to match a specific position in the string that is not preceded by a certain pattern. It does not consume any characters or include the matched text in the result. This can be useful for finding matches based on what precedes them without including those preceding characters in the match.

/(?<!abc)def/g

Grouping with parentheses

Parentheses are used to group subexpressions together and capture the matched substring. They can be used for applying quantifiers, alternation, or capturing groups. To create a non-capturing group, use (?:...). Backreferences to captured groups can be made using \1 through \9 in some regex flavors.

/(ab)+/

Capturing Groups

Capturing groups are used to capture and extract specific parts of a match. They are defined by enclosing the pattern in parentheses, which creates a numbered capturing group. The captured content can be referenced using backreferences or accessed programmatically after the match is found.

/(\d{3})-(\d{2})-(\d{4})/g
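A short sketch of reading the captured parts programmatically. The input string and variable names are invented for the example; the g flag is left off so that match() returns the groups.

```javascript
const idText = '123-45-6789';
const idMatch = idText.match(/(\d{3})-(\d{2})-(\d{4})/);
// Index 0 holds the whole match; indexes 1..3 hold the groups in order.
const [wholeId, firstPart, secondPart, thirdPart] = idMatch;
console.log(wholeId);                          // 123-45-6789
console.log(firstPart, secondPart, thirdPart); // 123 45 6789
```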

Non-capturing groups

Non-capturing groups are used to group a subpattern without capturing the matched text. They do not create back references and can be useful for improving performance in cases where you don't need to capture the grouped elements.

/(?:regex)/

Nested Grouping

Nested grouping in regular expressions allows for creating subgroups within a larger group. This can be useful when capturing multiple levels of information or applying quantifiers to the entire nested group.

/(ab(cd)ef)/
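Nested groups are numbered by the position of their opening parenthesis, which the following sketch (with an illustrative input) makes visible:

```javascript
const nestedMatch = 'abcdef'.match(/(ab(cd)ef)/);
console.log(nestedMatch[1]); // abcdef  (outer group opens first)
console.log(nestedMatch[2]); // cd      (inner group opens second)
```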

Conditional Grouping

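The '?' quantifier makes the preceding character or group optional, matching zero or one occurrence of it. To make several characters optional together, wrap them in a group, as in (?:un)?happy. Note that this is optional matching rather than a true conditional; the (?(condition)yes|no) construct found in some flavors is not supported in JavaScript.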

/colou?r/ matches both 'color' and 'colour'

Named Capturing Groups

Named capturing groups allow you to assign a name to the subpattern within parentheses. This makes it easier to reference and extract specific parts of the matched pattern. To create a named capturing group, use the syntax (?<name>pattern). Named captured groups are useful for organizing complex regex patterns and improving code readability.

/(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/
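In JavaScript (ES2018 or later), named groups are exposed on the match's groups property. A minimal sketch with a made-up date string:

```javascript
const dateMatch = '2024-03-15'.match(/(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/);
// Each named group becomes a property of dateMatch.groups.
const { year, month, day } = dateMatch.groups;
console.log(year, month, day); // 2024 03 15
```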

Backreferences to captured group

In regex, backreferences allow you to match the same text as previously matched by a capturing group. They are written \1 for the first capture group, \2 for the second, and so on. In JavaScript replacement strings, the corresponding syntax is $1, $2, and so on.

/(\w)\1/ // This pattern matches two consecutive identical word characters
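The sketch below uses an arbitrary sample phrase to show a backreference inside the pattern, and the $1 form that JavaScript uses inside a replacement string:

```javascript
const phrase = 'balloon keeper';
// \1 must repeat whatever group 1 just captured.
const doubledLetters = phrase.match(/(\w)\1/g);
console.log(doubledLetters); // [ 'll', 'oo', 'ee' ]

// In the replacement string the group is referenced as $1.
const collapsed = phrase.replace(/(\w)\1/g, '$1');
console.log(collapsed); // balon keper
```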

Incremental numbering of back references

When using multiple capturing groups in a regular expression, numbered backreferences can be used to refer to these captured groups. Groups are numbered by position, from left to right. For example, \1 refers to the first captured group, \2 refers to the second one, and so on.

/(\w+)-(\d+)-\1-(\d+)/

^ (caret)

The caret symbol ^ is used in regex to match the start of a string. It asserts that the following pattern must appear at the beginning of the text being searched.

/^hello/ matches 'hello world' but not 'world hello'

$ (dollar sign)

The dollar sign matches the end of the input string. With the m (multiline) flag, it also matches at the end of each line.

/pattern$/

\B (non-word boundary)

The \B metacharacter matches a position that is not a word boundary. It asserts the opposite of \b, meaning it matches any position within the input where there isn't a word boundary.

/\Btest/ will match 'attest' but not 'test'
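A quick check of the behaviour described above (the test strings are illustrative):

```javascript
const nonBoundary = /\Btest/;
console.log(nonBoundary.test('attest')); // true: 'test' starts inside a word
console.log(nonBoundary.test('test'));   // false: 'test' starts at a word boundary
console.log(nonBoundary.test('a test')); // false: a space precedes 'test'
```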

(?!...) negative lookahead

Negative lookahead is a type of assertion in regular expressions that specifies a pattern not to be found ahead. It's useful for matching patterns only when they are not followed by another specific pattern. The syntax for negative lookahead is (?!pattern). For example, the regex 'foo(?!bar)' will match occurrences of 'foo' only if it's not followed by 'bar'. Negative lookaheads do not consume characters in the string.

/foo(?!bar)/ matches 'foo' in 'foobaz' but not in 'foobar'

(?=...) positive lookahead

Positive lookahead is a zero-width assertion that lets the search pattern match only if it's followed by another pattern. The text checked by the lookahead is not included in the result. This allows for more complex matching conditions without including the trailing characters in the match.

/\bword\b(?= pattern\b)/g

(?<!...) negative lookbehind

Negative lookbehind is a type of assertion that matches a specific pattern only if it's not preceded by another specified pattern. It allows you to define what should not precede the matching text. This can be useful for excluding certain patterns from your match.

/(?<!abc)def/ - Matches 'def' only if it's NOT preceded by 'abc'

(?<=...) positive lookbehind

Positive lookbehind is a regex construct that matches a group of characters only if they are preceded by another specific pattern. It does not include the preceding pattern in the match result.

/(?<=prefix)target/

Column 2

Anchors (^ and $)

^ matches the start of a string, while $ matches the end. They are used to specify where in a line or string you want to match.

/^hello/ - This regex will only match if 'hello' is at the beginning of the input.

Quantifiers (*, +, ?, {n}, {n,m})

Quantifiers are used to specify the number of occurrences a character or group can have.
The asterisk (*) matches zero or more occurrences.
The plus sign (+) matches one or more occurrences.
The question mark (?) matches zero or one occurrence.
{n} specifies exactly n occurrences and {n,m} specifies at least n but no more than m occurrences.

/a*/
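To compare the quantifiers side by side, the sketch below tests each one against a few strings of 'a' characters. The anchors ^ and $ are added so the whole string must match; the table and variable names are made up for the example.

```javascript
const quantifierInputs = ['', 'a', 'aa', 'aaa'];
const quantifierPatterns = {
  'a*': /^a*$/,         // zero or more
  'a+': /^a+$/,         // one or more
  'a?': /^a?$/,         // zero or one
  'a{2}': /^a{2}$/,     // exactly two
  'a{2,3}': /^a{2,3}$/, // two to three
};

// For each quantifier, list the inputs it accepts.
const quantifierTable = {};
for (const [label, re] of Object.entries(quantifierPatterns)) {
  quantifierTable[label] = quantifierInputs.filter((s) => re.test(s));
}
console.log(quantifierTable['a*']);     // [ '', 'a', 'aa', 'aaa' ]
console.log(quantifierTable['a+']);     // [ 'a', 'aa', 'aaa' ]
console.log(quantifierTable['a?']);     // [ '', 'a' ]
console.log(quantifierTable['a{2}']);   // [ 'aa' ]
console.log(quantifierTable['a{2,3}']); // [ 'aa', 'aaa' ]
```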

Alternation (|)

The alternation operator | allows for matching either the pattern before or after it. It is used to create multiple possible matches within a single regular expression.

/cat|dog/ will match 'cat' in 'the cat is here', and will also match 'dog' in 'my dog loves treats'.

Character classes ([abc], [a-z], [^0-9])

Character classes allow you to define a set of characters that can match at a particular position in the input. For example, [abc] matches 'a', 'b', or 'c'. A range can be defined using hyphen notation like [a-z]. The caret (^) inside square brackets negates the character class; for instance, [^0-9] matches any non-digit.

/[A-Za-z]/

Backreferences (\n)

In regular expressions, the backreference \n (where n is a group number such as 1 or 2) matches the same text that was previously captured by a group. It allows you to reuse part of the matched text in the regex pattern. The number after the backslash corresponds to the capturing group's index (1-based). For example, /(\w)\1/ will match two consecutive identical word characters.

/(\w)\1/

Assertions (?=..., ?!...)

Assertions are used to define a positive or negative lookahead in the regex pattern. The syntax for positive lookahead is (?=...) and it asserts that the subpattern must be present at this position, while not consuming any characters. Negative lookahead has the syntax (?!...) and checks if the given pattern does not match after current position.

/\b(?=re)\w+/g matches whole words that start with 're'

Escaping metasymbols (\(, \d, \w)

To match a metacharacter such as ( ) . * + or ? literally instead of with its special meaning, put a backslash before it. For example, to match a literal opening parenthesis '(', write '\('. Note that \d and \w work the other way around: the backslash gives the ordinary letters d and w a special meaning (digit and word character). To match a literal backslash followed by 'd', write '\\d'.

/^\(123\)-456-7890$/
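When the literal text comes from a variable, escaping by hand is error-prone. The helper below is a hypothetical utility (the name escapeRegExp is an assumption, not a built-in used here) that backslash-escapes every metacharacter before the string is passed to the RegExp constructor.

```javascript
// Hypothetical helper: prefix each regex metacharacter with a backslash.
// '$&' in the replacement stands for the matched character itself.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

const phonePattern = new RegExp('^' + escapeRegExp('(123)-456-7890') + '$');
console.log(phonePattern.source);                 // ^\(123\)-456-7890$
console.log(phonePattern.test('(123)-456-7890')); // true
console.log(phonePattern.test('123-456-7890'));   // false
```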

Grouping and capturing ((...), (?:...))

Using parentheses in regular expressions allows for grouping of characters or subexpressions. This can be used to apply quantifiers, alternation, or capture specific parts of the matched pattern. To create a non-capturing group, use '?:' within the opening parenthesis.

/(ab)+/g

Using the pipe symbol (|) to indicate alternation

The pipe symbol (|) is used in regular expressions to specify alternatives. It allows matching of multiple patterns, such as 'cat|dog' which matches either 'cat' or 'dog'. Alternatives are tried from left to right, and the first one that matches at a given position is used.

/(cat|dog)/

/(green|red) apple/ matches 'green apple' or 'red apple'

This regex pattern will match the string 'green apple' or 'red apple'. The '|' symbol acts as an OR operator, allowing either of the specified options to be matched. This is useful for finding multiple variations of a particular pattern in text.

/(green|red) apple/

Matching either of two patterns using OR operator

The | (pipe) symbol is used as the OR operator in regex. It allows you to match one pattern or another. For example, 'cat|dog' will match either 'cat' or 'dog'. Parentheses can be used for grouping multiple items together when using the OR operator.

/(cat|dog)/

/cat|dog/

This regular expression matches either the word 'cat' or the word 'dog'. The '|' character acts as an OR operator in regex, allowing for multiple possible matches.

// Example usage in JavaScript
const pattern = /cat|dog/;
console.log(pattern.test('I have a cat')); // true

Differences between greedy and lazy quantifiers with alternation

Greedy quantifiers match as much of the string as possible, while lazy (or non-greedy) quantifiers match as little as possible. Alternation allows for matching multiple patterns, trying each one in order.

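/<.+?>/ matches '<a>' in '<a><b>', whereas /<.+>/ matches '<a><b>'. Laziness only shortens where a match ends; it does not move where it starts, so /a+?b/ and /a+b/ both match 'aaab' in 'aaab'.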
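The difference is easiest to see on input with more than one possible end point. A small sketch with a made-up tag string:

```javascript
const tagText = '<a><b>';
// Greedy: .+ grabs as much as possible, then backs off to the last '>'.
const greedyMatch = tagText.match(/<.+>/)[0];
// Lazy: .+? grabs as little as possible, stopping at the first '>'.
const lazyMatch = tagText.match(/<.+?>/)[0];
console.log(greedyMatch); // <a><b>
console.log(lazyMatch);   // <a>
```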

Nested Alternations

When dealing with more complex matching scenarios, nested alternations can be used to create multiple levels of choices for pattern matching. This allows for greater flexibility in defining specific patterns within a larger search pattern.

/(ab(c|d)|ef(g|h))/g

Using non-capturing groups

When working with multiple alternatives, using non-capturing groups (?:) can help avoid capturing unnecessary data and improve performance. Non-capturing groups are useful when you need to group alternatives but don't want the matched result to be stored in a capture group.

/(?:regex1|regex2)/

Impact on capturing groups when using alternation

When using alternation, each branch of the pattern defines its own set of capturing groups. This means that if a group is not used in one branch but is used in another, it may capture different content or remain empty depending on which branch matched. It's important to be aware of this behavior and consider how it impacts the overall matching and extraction process.

/(foo|bar)\d+(baz)?/
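A minimal sketch of how groups behave under alternation, assuming JavaScript's String.prototype.match and single-backslash escapes in the regex literal. The input strings are invented for illustration:

```javascript
// Groups are numbered left to right across the whole pattern.
const pattern = /(foo|bar)\d+(baz)?/;

const withBaz = 'foo42baz'.match(pattern);
const withoutBaz = 'bar7'.match(pattern);
console.log(withBaz[1], withBaz[2]);       // foo baz
console.log(withoutBaz[1], withoutBaz[2]); // bar undefined

// A group inside a branch that did not match is also left undefined.
const branches = 'b'.match(/(a)|(b)/);
console.log(branches[1], branches[2]);     // undefined b
```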

Column 3

g - Global search (matches all occurrences)

The 'g' flag is used to perform a global search, which means it matches all occurrences of the pattern in a string. Without this flag, only the first match would be returned.

/pattern/g
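To illustrate the difference the flag makes, here is a small sketch assuming JavaScript's match and matchAll methods; the sample text is hypothetical:

```javascript
const text = 'one cat, two cats';

// Without g: only the first match, returned with its index.
const first = text.match(/cat/);
// With g: every match, as an array of strings.
const all = text.match(/cat/g);
// matchAll (requires g) yields each match with its position.
const positions = [...text.matchAll(/cat/g)].map((m) => m.index);

console.log(first.index, all, positions); // 4 ["cat", "cat"] [4, 13]
```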

s - Dotall mode (. matches newline characters as well)

In dotall mode, the period (.) metacharacter will match any character including newlines. This is useful when you want to search for patterns that span multiple lines.

/pattern/s
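A quick sketch of the effect, assuming a JavaScript engine that supports the s flag (ES2018 or later); the input is illustrative:

```javascript
const text = 'start\nend';

const withoutS = /start.end/.test(text); // false: . stops at the newline
const withS = /start.end/s.test(text);   // true: . also matches \n
console.log(withoutS, withS);
```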

m - Multiline mode

In multiline mode, the caret (^) and dollar sign ($) match the start and end of each line within a string. This is useful when working with multi-line strings where you want to apply these anchors on individual lines rather than the entire string.

/^start/m
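The per-line anchoring can be sketched like this, assuming JavaScript and a made-up three-line string:

```javascript
const lines = 'alpha\nstart here\nstart again';

// Without m, ^ only matches at the very beginning of the input.
const withoutM = lines.match(/^start/g);
// With m, ^ also matches right after each line break.
const withM = lines.match(/^start/gm);

console.log(withoutM, withM); // null ["start", "start"]
```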

u - Unicode support

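The 'u' flag makes the pattern operate on Unicode code points instead of UTF-16 code units. It enables \u{...} code point escapes and \p{...} Unicode property escapes, and combined with 'i' it uses Unicode case folding. This lets a regex treat characters outside the Basic Multilingual Plane, such as emoji, as single characters. Shorthand classes such as \d and \w are not widened to all Unicode digits or letters by this flag.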

/pattern/u
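A small sketch of what changes with the flag, assuming a JavaScript engine with Unicode property escape support (ES2018 or later). The emoji is written as an escape so the example does not depend on file encoding:

```javascript
const emoji = '\u{1F600}'; // one code point, two UTF-16 code units

const withoutU = /^.$/.test(emoji);          // false: . sees two code units
const withU = /^.$/u.test(emoji);            // true: . sees one code point
const byEscape = /\u{1F600}/u.test(emoji);   // true: code point escape
const upper = /\p{Lu}/u.test('\u00C9');      // true: É is an uppercase letter
console.log(withoutU, withU, byEscape, upper);
```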

i - Case-insensitive matching

When the 'i' flag is used in a regex pattern, it allows for case-insensitive matching. This means that uppercase and lowercase letters will be treated as equal when searching for patterns.

/hello/i.test('Hello') // returns true

x – Verbose mode

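Verbose mode allows whitespace and comments within the pattern for readability: unescaped whitespace is ignored, and text from # to the end of the line is treated as a comment. This can make complex patterns easier to understand. The 'x' flag is available in flavors such as Perl, PCRE, and Python (re.VERBOSE); JavaScript's built-in RegExp does not support it.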

/^          # Start of line
\d{3}       # Match exactly three digits
[ -]?       # Match a space or hyphen, which is optional
\d{4}       # Match exactly four digits
$           # End of line
/x
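JavaScript's RegExp has no x flag at the time of writing, so one possible workaround is to assemble the pattern from commented string pieces. This is only a sketch of that idea, not a true verbose mode; `verbosePattern` is a hypothetical name:

```javascript
// Each piece is a string, so backslashes must be doubled here.
const verbosePattern = new RegExp([
  '^',       // start of input
  '\\d{3}',  // exactly three digits
  '[ -]?',   // optional space or hyphen
  '\\d{4}',  // exactly four digits
  '$',       // end of input
].join(''));

const hyphenated = verbosePattern.test('555-1234'); // true
const plain = verbosePattern.test('5551234');       // true
const tooShort = verbosePattern.test('55-1234');    // false
console.log(hyphenated, plain, tooShort);
```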

A-Z,0-9,-_ etc. : Character Classes/Range Modifiers

Character classes allow you to match any one character from a set of characters. For example, [a-z] matches any lowercase letter and [0-9] matches any digit. Additionally, the hyphen (-) is used to specify a range within square brackets such as [A-Za-z], which will match all uppercase and lowercase letters.

/[a-zA-Z0-9_-]/

y – Sticky Mode

The 'y' flag indicates sticky mode: the match must start exactly at the position given by the regex's lastIndex property, and the engine does not scan ahead for a later match. After a successful match, lastIndex moves to the end of the match; after a failure it resets to 0. This is useful for tokenizers and parsers that consume a long piece of text one piece at a time from a known position.

/pattern/y
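The role of the current position can be sketched as follows, assuming JavaScript's lastIndex property on the regex object; the input string is illustrative:

```javascript
const sticky = /foo/y;
const text = 'foo bar foo';

sticky.lastIndex = 0;
const atStart = sticky.test(text); // true: "foo" starts at index 0, lastIndex becomes 3
const atThree = sticky.test(text); // false: no "foo" at index 3, lastIndex resets to 0

sticky.lastIndex = 8;
const atEight = sticky.test(text); // true: the second "foo" starts at index 8
console.log(atStart, atThree, atEight);
```

Unlike the g flag, a sticky regex never skips ahead to find a later match.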

Character Classes

Character classes allow you to match any one character from a set of characters. They are defined using square brackets, for example, [aeiou] matches any vowel. You can also use ranges within the brackets like [a-z] which matches any lowercase letter.

/[0-9]/g

\w - Matches a word character (alphanumeric or underscore)

The \w metacharacter matches any ASCII alphanumeric character, which includes the letters a to z and A to Z and the digits 0 to 9. It also includes the underscore (_) symbol. This is equivalent to [a-zA-Z0-9_]. The \w does not match spaces or punctuation.

/\w+/g
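A brief sketch of what counts as a word character, assuming JavaScript without the u or i flags; the sample string is invented:

```javascript
// Hyphens, commas, spaces and & are not word characters, so they split the runs.
const words = 'snake_case, kebab-case & x2'.match(/\w+/g);
console.log(words); // ["snake_case", "kebab", "case", "x2"]

// Accented letters fall outside [a-zA-Z0-9_].
const accented = /\w/.test('\u00E9'); // false for é
console.log(accented);
```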

\d - Matches a digit (0-9)

The \d metacharacter matches any single digit from 0 to 9. It is commonly used for validating and extracting numerical data in strings.

/\d+/g

. – matches any single character except for line terminators such as \n

The dot (.) is a wildcard metacharacter that can represent any single character in regular expressions. It does not match line terminators, such as newline characters (\n), unless the 's' flag is set. This allows the dot to be used for matching specific patterns within strings without crossing over multiple lines.

/a.c/ - The pattern will match 'abc', 'axc', or 'a-c' but not 'a\nc'

\s – matches whitespace characters

The \s metacharacter is used to match any whitespace character, including space, tab, newline, and more. It can be useful for finding or replacing spaces in a string.

/\s+/g
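Two common uses, sketched here under the assumption of JavaScript string methods and a made-up messy input:

```javascript
const messy = '  a \t b\n\nc ';

// Split on any run of whitespace after trimming the ends.
const tokens = messy.trim().split(/\s+/);
// Collapse every run of whitespace into a single space.
const collapsed = messy.trim().replace(/\s+/g, ' ');

console.log(tokens, collapsed); // ["a", "b", "c"] "a b c"
```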

[^abc] - Matches any single character not in the brackets

The regex [^abc] will match any single character that is not 'a', 'b', or 'c'. This can be useful when you want to exclude specific characters from a search pattern. The caret (^) inside the square brackets negates the matching of characters within them.

/[^abc]/g
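A minimal sketch of the negated class in use, assuming JavaScript; the input strings are illustrative:

```javascript
// Remove every character that is not a, b, or c.
const stripped = 'aXbYc'.replace(/[^abc]/g, '');
console.log(stripped); // "abc"

// Unlike the dot, a negated class also matches line breaks.
const matchesNewline = /[^abc]/.test('\n'); // true
console.log(matchesNewline);
```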

[abc] - Matches any single character within the brackets

The [ ] construct matches a single character out of several possible characters. For example, the pattern '[abc]' will match either 'a', 'b', or 'c'. A single class consumes only one character, so it cannot match 'ab' or 'ba' as a unit. This is also known as a character class.

/[abc]/

[0-9]+

This regex pattern matches one or more digits. It is commonly used to find and extract numerical values from text data.

/[0-9]+/g

[a-zA-Z]

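This regex pattern matches any single ASCII letter, lowercase or uppercase. Anchored and combined with +, as in the example, it is commonly used to validate input that should only contain alphabetic characters. Accented and non-Latin letters are not included.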

/^[a-zA-Z]+$/
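The validation use can be sketched like this, assuming JavaScript; `lettersOnly` is a hypothetical name and the inputs are made up:

```javascript
// ^ and $ force the whole input to consist of letters.
const lettersOnly = /^[a-zA-Z]+$/;

const plain = lettersOnly.test('Hello');       // true
const withDigit = lettersOnly.test('Hello1');  // false: digit is not in the class
const empty = lettersOnly.test('');            // false: + needs at least one letter
const accented = lettersOnly.test('Zo\u00EB'); // false: ë is outside a-z and A-Z
console.log(plain, withDigit, empty, accented);
```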

Wildcard: .

The dot (.) is a wildcard character that matches any single character except for line terminators. It can be used to represent any letter, digit, whitespace or symbol in the search pattern.

/a.c/ would match 'abc', 'axc', but not 'ac' or 'abbc'

Zero or more: *

The asterisk (*) in regex matches zero or more occurrences of the preceding element. It allows for flexibility when searching for patterns that may repeat multiple times, including not at all.

/a*b/ - This pattern will match 'b', 'ab', and even 'aaab'.

{n}: Exactly n times

The {n} quantifier matches the preceding element exactly n times. For example, 'a{3}' will match the string 'aaa' but not 'aa'. It can be used with any character or group of characters.

/a{2}/g

Optional: ?

The question mark (?) indicates that the preceding character in the regular expression is optional, meaning it can appear zero or one time. It allows for flexibility when matching patterns.

/colou?r/ matches both 'color' and 'colour'

{n,m}: Between n and m times

This quantifier specifies that the preceding character or group must occur at least n times, but no more than m times. It is used to define a range of occurrences for the pattern.

/a{2,4}/ matches 'aa', 'aaa' and 'aaaa'

One or more: +

+ is a quantifier that matches one or more occurrences of the preceding element. It ensures that there is at least one occurrence, but it can match multiple occurrences as well.

/a+/ will match 'a', 'aa', 'aaa', and so on in a given string.

{n,}: At least n times

The {n,} quantifier matches the preceding element at least n times. It allows you to specify a minimum number of occurrences for the pattern it follows.

/a{2,}/ will match 'aa', 'aaa', 'aaaa', or any longer run of 'a', but not just one 'a'
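The counted quantifiers can be compared side by side. This sketch assumes JavaScript and uses \b word boundaries so that each run of 'a' must match as a whole; the sample string is invented:

```javascript
const sample = 'a aa aaa aaaaa';

const exactlyTwo = sample.match(/\ba{2}\b/g);  // {n}: exactly 2
const twoToFour = sample.match(/\ba{2,4}\b/g); // {n,m}: between 2 and 4
const twoOrMore = sample.match(/\ba{2,}\b/g);  // {n,}: 2 or more

console.log(exactlyTwo); // ["aa"]
console.log(twoToFour);  // ["aa", "aaa"]
console.log(twoOrMore);  // ["aa", "aaa", "aaaaa"]
```

Without the boundaries, /a{2}/ would also match the first two characters of any longer run.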

Greedy vs. Lazy Quantifiers

In regular expressions, greedy quantifiers match as much of the string as possible, while lazy (or non-greedy) quantifiers match as little as possible. A lazy quantifier is denoted by adding a '?' after the quantifier (*?, +?, ??, {n,m}?). A greedy quantifier first consumes as many characters as it can and gives them back by backtracking only if the rest of the pattern fails to match.

In 'aabab', /a.*b/ matches 'aabab', but /a.*?b/ matches only 'aab'
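A short sketch of the same contrast, assuming JavaScript; the HTML-like string is a hypothetical input chosen because greedy matching often surprises people there:

```javascript
const input = 'aabab';
const greedyShort = input.match(/a.*b/)[0]; // "aabab": runs to the last b
const lazyShort = input.match(/a.*?b/)[0];  // "aab": stops at the first b it can

const html = '<b>one</b> and <b>two</b>';
const greedyTag = html.match(/<b>.*<\/b>/)[0]; // spans both elements
const lazyTag = html.match(/<b>.*?<\/b>/)[0];  // "<b>one</b>"
console.log(greedyShort, lazyShort, greedyTag, lazyTag);
```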