Regular Expressions Cheatsheet for Developers

Regular expressions are one of the most powerful tools in a developer's toolkit, and one of the most avoided. This practical cheatsheet covers character classes, quantifiers, anchors, groups, lookarounds, and flags, with real-world examples in Python, JavaScript, and grep.

Character Classes

Pattern	Matches
.	Any character except newline
\d	Any digit (0-9)
\D	Any non-digit
\w	Word character (a-z, A-Z, 0-9, _)
\W	Non-word character
\s	Whitespace (space, tab, newline)
\S	Non-whitespace
[aeiou]	Any character in the set
[^aeiou]	Any character NOT in the set
[a-z]	Any lowercase letter
[A-Za-z0-9]	Alphanumeric character
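
As a quick illustration of the classes above, here is a minimal Python sketch. The sample strings are made up for the example; only the standard re module is assumed.

```python
import re

# Made-up sample text for illustration
sample = "Order #42 shipped to Ana_7 on 2024-05-29"

digits = re.findall(r"\d+", sample)            # runs of digits
words = re.findall(r"\w+", sample)             # runs of word characters (letters, digits, _)
vowels = re.findall(r"[aeiou]", "regex")       # characters in the set
non_vowels = re.findall(r"[^aeiou]", "regex")  # characters NOT in the set

print(digits)      # ['42', '7', '2024', '05', '29']
print(words)       # ['Order', '42', 'shipped', 'to', 'Ana_7', 'on', '2024', '05', '29']
print(vowels)      # ['e', 'e']
print(non_vowels)  # ['r', 'g', 'x']
```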

Quantifiers

Pattern	Meaning
*	Zero or more (greedy)
+	One or more (greedy)
?	Zero or one (optional)
{3}	Exactly 3 times
{2,5}	Between 2 and 5 times
{3,}	3 or more times
*?	Zero or more (lazy / non-greedy)
+?	One or more (lazy)

Greedy quantifiers match as much as possible while still letting the overall pattern succeed. Lazy quantifiers (a quantifier followed by ?) match as little as possible. The difference matters when the delimiter you are matching up to appears more than once in the input.
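
The difference is easiest to see side by side. The following is a small sketch using a made-up string with two bold tags; the tag names and text are illustrative only.

```python
import re

# Made-up input with a repeating segment
text = "<b>one</b> and <b>two</b>"

# Greedy: .* runs to the LAST </b> it can reach
greedy = re.findall(r"<b>.*</b>", text)

# Lazy: .*? stops at the FIRST </b> it can reach
lazy = re.findall(r"<b>.*?</b>", text)

print(greedy)  # ['<b>one</b> and <b>two</b>']
print(lazy)    # ['<b>one</b>', '<b>two</b>']
```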

Anchors and Boundaries

Pattern	Meaning
^	Start of string (or line in multiline mode)
$	End of string (or line in multiline mode)
\b	Word boundary
\B	Non-word boundary
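\A	Start of string, unaffected by multiline mode (Python; not supported in JavaScript)
\Z	End of string, unaffected by multiline mode (Python; not supported in JavaScript)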
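
To make the anchors concrete, this sketch reports the start offset of each match in a made-up three-line string. The helper name starts is hypothetical and exists only for this example.

```python
import re

# Made-up three-line input: "cat", "concat cat", "catalog"
text = "cat\nconcat cat\ncatalog"

def starts(pattern, flags=0):
    """Hypothetical helper: return the start offset of every match."""
    return [m.start() for m in re.finditer(pattern, text, flags)]

string_start = starts(r"^cat")                  # ^ matches only at the start of the string
line_starts = starts(r"^cat", re.MULTILINE)     # ^ also matches after each newline
whole_words = starts(r"\bcat\b")                # "cat" as a standalone word
inside_word = starts(r"\Bcat")                  # "cat" preceded by another word character
absolute_start = starts(r"\Acat", re.MULTILINE) # \A ignores multiline mode

print(string_start)    # [0]
print(line_starts)     # [0, 15]
print(whole_words)     # [0, 11]
print(inside_word)     # [7]
print(absolute_start)  # [0]
```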

Groups and Alternation

Pattern	Meaning
(abc)	Capturing group
(?:abc)	Non-capturing group
(?P<name>abc)	Named capturing group (Python)
(?<name>abc)	Named capturing group (JavaScript)
cat|dog	Alternation (cat OR dog)
\1	Backreference to group 1
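
The group types above behave differently in practice. Here is a short Python sketch, using made-up inputs, that shows named and numbered groups, the effect of capturing on re.findall, and a backreference that detects a doubled word.

```python
import re

# Named and numbered groups in one pattern
m = re.match(r"(?P<user>\w+)@(\w+)\.com", "ana@example.com")
user = m.group("user")   # named group
domain = m.group(2)      # second group, by number

# With a capturing group, findall returns the group; without, the whole match
captured = re.findall(r"(cat|dog)s?", "cats and dog")
uncaptured = re.findall(r"(?:cat|dog)s?", "cats and dog")

# Backreference: \1 must repeat exactly what group 1 matched
deduped = re.sub(r"\b(\w+) \1\b", r"\1", "this is is a test")

print(user, domain)  # ana example
print(captured)      # ['cat', 'dog']
print(uncaptured)    # ['cats', 'dog']
print(deduped)       # this is a test
```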

Lookaheads and Lookbehinds

Pattern	Meaning
(?=abc)	Positive lookahead: followed by "abc"
(?!abc)	Negative lookahead: NOT followed by "abc"
(?<=abc)	Positive lookbehind: preceded by "abc"
(?<!abc)	Negative lookbehind: NOT preceded by "abc"
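
Lookarounds check context without consuming it, so the surrounding text never ends up in the match. A minimal sketch with a made-up price list:

```python
import re

# Made-up sample text
prices = "apple $30, pear 45 EUR, plum $7"

# Positive lookbehind: digits preceded by "$" (the "$" is not part of the match)
dollars = re.findall(r"(?<=\$)\d+", prices)

# Positive lookahead: digits followed by " EUR"
euros = re.findall(r"\d+(?= EUR)", prices)

# Negative lookbehind: whole numbers NOT preceded by "$"
not_dollars = re.findall(r"(?<!\$)\b\d+", prices)

# Negative lookahead: whole numbers NOT followed by " EUR"
not_euros = re.findall(r"\b\d+\b(?! EUR)", prices)

print(dollars)      # ['30', '7']
print(euros)        # ['45']
print(not_dollars)  # ['45']
print(not_euros)    # ['30', '7']
```

Note that Python's re module only accepts lookbehind patterns that match a fixed number of characters, so something like (?<=\$+) raises an error there.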

Real-World Examples

Validate an Email Address

# Python
import re
pattern = r'^[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}$'
print(bool(re.match(pattern, 'user@example.com')))  # True
print(bool(re.match(pattern, 'not-an-email')))      # False

Extract All URLs from HTML

# Python
import re
html = '<a href="https://example.com">link</a> <a href="https://other.org">other</a>'
urls = re.findall(r'href="(https?://[^"]+)"', html)
print(urls)  # ['https://example.com', 'https://other.org']

Match ISO Date Format

// JavaScript
const dateRegex = /^(\d{4})-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$/;
console.log(dateRegex.test('2024-05-29'));  // true
console.log(dateRegex.test('2024-13-01')); // false (month 13)

Replace All Whitespace Runs with a Single Space

# Python
import re
clean = re.sub(r'\s+', ' ', '  hello   world  ').strip()
print(clean)  # 'hello world'

Extract Named Groups (Python)

import re
log_line = '2024-05-29 14:32:11 ERROR Failed to connect to database'
pattern = r'(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) (?P<level>\w+) (?P<message>.+)'
m = re.match(pattern, log_line)
print(m.group('level'))    # 'ERROR'
print(m.group('message'))  # 'Failed to connect to database'

grep with Extended Regex

# Match lines containing an IPv4-style address (octet ranges are not checked)
grep -E '([0-9]{1,3}\.){3}[0-9]{1,3}' access.log

# Match lines containing ERROR or WARN
grep -E 'ERROR|WARN' app.log

# Print only the matched text (-o) instead of the whole line
grep -oE '"[A-Z]+ /[^"]+"' access.log

Flags / Modifiers

Python flag	JS flag	Effect
re.IGNORECASE (re.I)	i	Case-insensitive matching
re.MULTILINE (re.M)	m	^ and $ match line starts/ends
re.DOTALL (re.S)	s	. matches newlines too
re.VERBOSE (re.X)	(none)	Allow whitespace and comments in pattern
(none; use re.findall or re.finditer)	g	Global: find all matches (not just the first)
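
The Python flags in the table can be tried out as follows. This is a small sketch on a made-up log snippet; flags are combined with the | operator.

```python
import re

# Made-up three-line log snippet
text = "Error: disk full\nerror: retrying\nOK"

# re.IGNORECASE: match regardless of case
any_case = re.findall(r"error", text, re.IGNORECASE)

# re.MULTILINE: ^ matches at the start of every line
first_words = re.findall(r"^\w+", text, re.MULTILINE)

# re.DOTALL: . also matches the newline between the two lines
without_dotall = re.search(r"full.error", text)
with_dotall = re.search(r"full.error", text, re.DOTALL)

# Flags combine with |
combined = re.findall(r"^error", text, re.IGNORECASE | re.MULTILINE)

print(any_case)                 # ['Error', 'error']
print(first_words)              # ['Error', 'error', 'OK']
print(without_dotall)           # None
print(with_dotall is not None)  # True
print(combined)                 # ['Error', 'error']
```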

Tips for Writing Readable Regex

Long regex patterns quickly become hard to maintain. Python's re.VERBOSE mode lets you spread a pattern over several lines and annotate it with comments; unescaped whitespace in the pattern is ignored:

import re
pattern = re.compile(r'''
  ^                      # Start of string
  (\d{4})                # Year (4 digits)
  -                      # Literal hyphen
  (0[1-9]|1[0-2])        # Month (01-12)
  -                      # Literal hyphen
  (0[1-9]|[12]\d|3[01])  # Day (01-31)
  $                      # End of string
''', re.VERBOSE)

print(bool(pattern.match('2024-05-29')))  # True