Regex Syntax Reference
ripgrep supports two regex engines that you can switch between at the command line.
The default engine is the Rust regex crate. The alternative is
PCRE2, enabled with -P.
Default engine: Rust regex
The Rust regex crate guarantees search time that is linear in the size of the
input, so no pattern can trigger catastrophic backtracking. The trade-off is that
features which depend on backtracking (look-arounds, backreferences) are not supported.
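To see the linear-time guarantee in practice, the sketch below gives the default engine a nested-quantifier pattern, a classic worst case for backtracking engines. The sample input and variable names are made up for illustration.

```shell
# A run of 'a' characters followed by '!'. The trailing '!' makes the overall
# match fail, which is the case that causes exponential work when backtracking.
input='aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa!'

# -q suppresses output; exit status 0 means a match, 1 means no match.
printf '%s\n' "$input" | rg -q '^(a+)+$'
no_match_status=$?

# The same pattern on a line that does match.
printf 'aaaa\n' | rg -q '^(a+)+$'
match_status=$?

echo "failing input: $no_match_status, matching input: $match_status"
```

Both searches should return immediately, however long the run of `a` characters is.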
Supported syntax
| Syntax | Meaning | Example |
|---|---|---|
| . | Any character except newline | f.o matches foo, f1o |
| * | 0 or more of previous | fo* matches f, fo, foo |
| + | 1 or more of previous | fo+ matches fo, foo |
| ? | 0 or 1 of previous | colou?r matches color, colour |
| {n} | Exactly n repetitions | \d{4} matches 4 digits |
| {n,m} | Between n and m repetitions | \w{2,5} matches 2 to 5 word characters |
| ^ | Start of line | ^fn matches fn at line start |
| $ | End of line | error$ matches error at end |
| \b | Word boundary | \bfoo\b matches foo but not foobar |
| [abc] | Character class | [aeiou] matches any vowel |
| [^abc] | Negated character class | [^\d] matches non-digit |
| [a-z] | Character range | [a-z] matches lowercase letter |
| (abc) | Capture group | (foo\|bar) |
| (?:abc) | Non-capturing group | (?:foo\|bar) |
| a\|b | Alternation | cat\|dog |
| \d | Digit (Unicode-aware) | \d+ matches 42 or ٤٢ |
| \w | Word character | \w+ matches identifiers |
| \s | Whitespace | \s+ matches spaces, tabs |
| \D, \W, \S | Negated class | \D matches non-digit |
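A few rows of the table can be tried against piped input. This is a minimal sketch; the sample text and variable names are assumptions chosen for illustration.

```shell
sample='color
colour
colr
foobar
foo bar
order 2024'

# ? makes the preceding "u" optional: matches "color" and "colour", not "colr".
optional=$(printf '%s\n' "$sample" | rg 'colou?r')

# \b requires a word boundary: matches "foo bar" but not "foobar".
bounded=$(printf '%s\n' "$sample" | rg '\bfoo\b')

# -o prints only the matched text: the four digits in "order 2024".
digits=$(printf '%s\n' "$sample" | rg -o '\d{4}')

printf '%s\n' "$optional" "$bounded" "$digits"
```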
Not supported in the default engine
- Look-ahead ((?=...), (?!...)): use -P
- Look-behind ((?<=...), (?<!...)): use -P
- Backreferences (\1, \k<name>): use -P
- Possessive quantifiers (*+, ++): use -P
- Atomic groups ((?>...)): use -P
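When one of these constructs is passed to the default engine, rg reports a parse error and exits with status 2. The sketch below shows that, plus one possible workaround for a simple look-behind using a capture group with -o and -r. The workaround is not equivalent to look-behind in general, and the sample text is an assumption.

```shell
sample='def main():
x = 1'

# Look-behind is rejected by the default engine: rg prints a parse error
# on stderr (silenced here) and exits with status 2.
printf '%s\n' "$sample" | rg '(?<=def )\w+' 2>/dev/null
lookbehind_status=$?
echo "exit status without -P: $lookbehind_status"

# Workaround without -P: match the prefix too, capture the wanted part,
# and print only capture group 1 (-o prints the match, -r rewrites it).
names=$(printf '%s\n' "$sample" | rg -o 'def (\w+)' -r '$1')
echo "$names"
```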
PCRE2 engine (-P / --pcre2)
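PCRE2 is a widely used Perl-compatible regex library (it backs the regex functions
in PHP, among others), and its syntax is close to what Perl and Python's re module
accept. Enable it with rg -P 'pattern'. PCRE2 support is an optional build-time
feature; if your ripgrep binary lacks it, one way to get it is to build from source with
cargo install ripgrep --features pcre2.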
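Because PCRE2 may be missing from a given build, a script can check for it before relying on -P. A minimal sketch, assuming only the --pcre2-version flag; the variable names and sample input are illustrative.

```shell
# --pcre2-version exits with a non-zero status when PCRE2 was not compiled in.
if rg --pcre2-version >/dev/null 2>&1; then
  pcre2_available=yes
  # Safe to use -P here: only the line where "foo" is followed by "bar" matches.
  lookahead_lines=$(printf 'foobar\nfoobaz\n' | rg -P 'foo(?=bar)')
  echo "$lookahead_lines"
else
  pcre2_available=no
  echo "this rg build has no PCRE2 support" >&2
fi
```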
PCRE2-only features
# Look-ahead: match "foo" only when followed by "bar"
rg -P 'foo(?=bar)'
# Negative look-ahead: match "foo" NOT followed by "bar"
rg -P 'foo(?!bar)'
# Look-behind: match "bar" only preceded by "foo"
rg -P '(?<=foo)bar'
# Negative look-behind
rg -P '(?<!foo)bar'
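# Backreference: find a repeated word (the outer \b keeps "the then" from matching)
rg -P '\b(\w+) \1\b'
# Named capture group (also accepted by the default engine, so -P is optional here)
rg -P '(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})'
Hybrid mode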
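Use --auto-hybrid-regex to let ripgrep choose the engine per pattern: it tries the
default engine first and switches to PCRE2 only when the pattern needs features the
default engine lacks. This avoids the PCRE2 overhead for simple patterns. Newer
ripgrep releases spell this --engine auto and mark --auto-hybrid-regex as deprecated.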
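The sketch below runs the same flag on a simple pattern and on a look-behind. The look-behind case is guarded, since it only works on builds that include PCRE2. The sample text and variable names are assumptions.

```shell
sample='def main():
x = 1'

# A simple pattern compiles with the default engine, so PCRE2 is not needed.
simple=$(printf '%s\n' "$sample" | rg --auto-hybrid-regex -o 'def \w+')
echo "$simple"

# A look-behind needs PCRE2; only try it when this build includes PCRE2.
if rg --pcre2-version >/dev/null 2>&1; then
  lookbehind=$(printf '%s\n' "$sample" | rg --auto-hybrid-regex -o '(?<=def )\w+')
  echo "$lookbehind"
fi
```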
rg --auto-hybrid-regex '(?<=def )\w+'
Unicode support
ripgrep has Unicode support enabled by default in both engines. \d,
\w, and \s match their Unicode equivalents, not just ASCII.
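The difference between Unicode and ASCII-only classes shows up with non-ASCII digits. A minimal sketch with the default engine; the sample lines are illustrative.

```shell
sample='42
٤٢
abc'

# Unicode-aware \d matches both ASCII digits and Arabic-Indic digits.
unicode_digits=$(printf '%s\n' "$sample" | rg '^\d+$')

# (?-u) turns Unicode mode off, so \d matches only the ASCII digits 0-9.
ascii_digits=$(printf '%s\n' "$sample" | rg '(?-u)^\d+$')

echo "unicode: $unicode_digits"
echo "ascii:   $ascii_digits"
```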
# \p{L} — any Unicode letter
rg '\p{L}+'
# \p{Han} — CJK Han characters
rg '\p{Han}'
# \p{Ll} — lowercase letters
rg '\p{Ll}{3,}'
# \p{N} — numeric characters (includes Arabic-Indic digits, etc.)
rg '\p{N}+'
# Disable Unicode for ASCII-only matching (faster on ASCII corpora)
rg '(?-u)\w+'
Common Unicode categories
| Pattern | Matches |
|---|---|
| \p{L} | Any letter (all scripts) |
| \p{Lu} | Uppercase letter |
| \p{Ll} | Lowercase letter |
| \p{N} | Any number |
| \p{Nd} | Decimal digit (0–9 in any script) |
| \p{P} | Punctuation |
| \p{Z} | Separator (space, line sep, paragraph sep) |
| \p{Latin} | Latin script characters |
| \p{Han} | CJK unified ideographs |
| \p{Arabic} | Arabic script |
| \p{Cyrillic} | Cyrillic script |
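To check what a category or script class matches, one option is to run it with -o against a short mixed-script string. The sample string and variable names below are assumptions for illustration.

```shell
sample='Hello мир 世界 123'

# -o prints only the matched text, which shows what each class picks out.
upper=$(printf '%s\n' "$sample" | rg -o '\p{Lu}')             # uppercase letters
cyrillic=$(printf '%s\n' "$sample" | rg -o '\p{Cyrillic}+')   # Cyrillic script
han=$(printf '%s\n' "$sample" | rg -o '\p{Han}+')             # Han script
decimal=$(printf '%s\n' "$sample" | rg -o '\p{Nd}+')          # decimal digits

printf '%s\n' "$upper" "$cyrillic" "$han" "$decimal"
```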
Common patterns cheatsheet
# Email address (simplified)
rg '[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}'
# URL
rg 'https?://[^\s"<>]+'
# IPv4 address
rg '\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b'
# Hex color (exactly 3 or 6 hex digits)
rg '#(?:[0-9a-fA-F]{3}){1,2}\b'
# ISO 8601 date
rg '\d{4}-\d{2}-\d{2}'
# Version number (semver)
rg '\bv?\d+\.\d+\.\d+\b'
# TODO/FIXME comment
rg '(TODO|FIXME|HACK|XXX):'
# Function definition (Rust)
rg 'pub fn \w+'
# Import statement (Python)
rg '^(import|from) \w'
# JSON key-value (approximate)
rg '"[^"]+": "[^"]*"'
Performance tips
- Anchor patterns: ^pattern or pattern$ can reduce the search space.
- Prefer literals: foo is faster than [f][o][o]. ripgrep can use SIMD for literal prefixes.
- Use -F for literals: if you are not using regex features, -F (fixed strings) skips regex parsing entirely.
- Use -w for words: -w foo can be faster than \bfoo\b.
- Avoid unnecessary .*: foo.*bar forces ripgrep to scan the rest of each line after foo; where practical, search for foo alone and review the results manually.
- Use the default engine: PCRE2 is powerful but slower. Only switch with -P when you genuinely need it.
- Disable Unicode when not needed: (?-u)\w+ uses ASCII-only matching and can be faster on ASCII-heavy corpora.
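The -F and -w tips also change what matches, not only how fast the search runs. A minimal sketch of that difference; the sample input and variable names are illustrative.

```shell
sample='a.b
axb'

# As a regex, "." matches any character, so both lines match.
regex_hits=$(printf '%s\n' "$sample" | rg 'a.b')

# With -F the pattern is a literal string, so only the line "a.b" matches.
fixed_hits=$(printf '%s\n' "$sample" | rg -F 'a.b')

# -w matches "foo" only as a whole word, so "foobar" is skipped.
word_hits=$(printf 'foo\nfoobar\n' | rg -w 'foo')

printf '%s\n' "$regex_hits" "$fixed_hits" "$word_hits"
```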