WikiPlus

Regex Lookaheads and Lookbehinds Explained

Lookaheads and lookbehinds are among the most powerful and least understood features of regular expressions. They let you assert that a pattern is or is not followed (or preceded) by another pattern — without including those surrounding characters in the match. This enables sophisticated text operations that would be impossible or extremely verbose without them: matching a word only when it appears before a currency symbol, extracting values only when preceded by a specific key, or splitting on a delimiter only when it is not inside quotes. This article demystifies both constructs with practical, testable examples.

What Are Lookaheads? Positive and Negative

A lookahead is a zero-width assertion that checks the characters immediately following the current position in the string. It does not consume characters: whether the lookahead succeeds or fails, the engine's position in the string stays where it was before the lookahead.

Positive lookahead syntax: (?=pattern). This asserts that pattern must follow the current position for the overall match to succeed. Example: \d+(?= dollars) matches one or more digits only when followed by ' dollars'. In the string 'I have 50 dollars and 30 euros', this matches '50' but not '30'.

Negative lookahead syntax: (?!pattern). This asserts that pattern must NOT follow the current position. Example: \d+(?! dollars) matches digits not followed by ' dollars'. Applied to the same string, it matches '30', but it also matches the '5' of '50': the engine backtracks from '50' to '5', and '5' is followed by '0', not by ' dollars'. To reject '50' entirely, also forbid a following digit: \d+(?!\d| dollars) matches only '30'.

Lookaheads are commonly used in password validation to assert the presence of required character types anywhere in the string: (?=.*[A-Z]) asserts that at least one uppercase letter appears somewhere in the remaining string. Stacking multiple lookaheads at the start of a pattern creates an AND condition: (?=.*[A-Z])(?=.*\d) requires both an uppercase letter and a digit.

In JavaScript, all modern engines support both positive and negative lookaheads. You can test them live in the WikiPlus Regex Tester by pasting patterns with (?=...) or (?!...) and watching which parts of your test string are matched.
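The lookahead examples above can be checked with a short script. This is a minimal sketch: the sample strings and the eight-character minimum in the password pattern are assumptions added for illustration, not part of the original patterns.

```javascript
const text = 'I have 50 dollars and 30 euros';

// Positive lookahead: digits immediately followed by ' dollars'.
const followedByDollars = text.match(/\d+(?= dollars)/g);
console.log(followedByDollars); // [ '50' ]

// Negative lookahead, naive form: backtracking lets '5' match,
// because '5' is followed by '0' rather than ' dollars'.
const naiveNegative = text.match(/\d+(?! dollars)/g);
console.log(naiveNegative); // [ '5', '30' ]

// Stricter form: also forbid a following digit so a number
// cannot be split by backtracking.
const strictNegative = text.match(/\d+(?!\d| dollars)/g);
console.log(strictNegative); // [ '30' ]

// Stacked lookaheads act as an AND condition.
// The .{8,} length rule is an assumption for this example.
const passwordPattern = /^(?=.*[A-Z])(?=.*\d).{8,}$/;
console.log(passwordPattern.test('Passw0rdX')); // true
console.log(passwordPattern.test('password1')); // false (no uppercase letter)
console.log(passwordPattern.test('PASSWORD'));  // false (no digit)
```

The naive negative lookahead is worth running once by hand, since the partial match on '5' is easy to miss when reading the pattern.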

What Are Lookbehinds? Positive and Negative

A lookbehind is the mirror image of a lookahead: it asserts a condition on the characters immediately before the current position. Like lookaheads, lookbehinds are zero-width, so they do not consume characters.

Positive lookbehind syntax: (?<=pattern). Asserts that pattern must precede the current position. Example: (?<=\$)\d+\.\d{2} matches a decimal number only when preceded by a dollar sign. In '€12.50 and $19.99', this matches '19.99' but not '12.50'.

Negative lookbehind syntax: (?<!pattern). Asserts that pattern must NOT precede the current position. Example: (?<!\d)\d{4}(?!\d) matches a four-digit sequence that is not part of a longer number. This is useful for matching four-digit years while excluding the digit runs inside longer sequences like account numbers.

JavaScript added lookbehind support in ES2018. Lookbehinds are supported in V8 (Chrome and Node.js) and all other modern JavaScript engines. If you need to support very old environments, check compatibility tables. The WikiPlus tester uses your browser's JS engine, so the lookbehind results you see in the tester reflect that engine; results in Chrome will match Node.js, since both use V8.

One JavaScript-specific quirk: lookbehinds in JavaScript are evaluated right-to-left (matching backwards), which affects how backtracking and capture groups work inside them. In most practical cases this makes no difference, but it is what allows variable-length lookbehinds (using * or + inside them) in JavaScript. Other engines differ: Python's built-in re module, for example, requires fixed-width lookbehinds and rejects patterns that use * or + inside them.
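Both lookbehind examples can be tried directly in any engine with ES2018 support. In this sketch the second sample string (with the account number) is made up for illustration.

```javascript
// Positive lookbehind: a decimal amount preceded by a dollar sign.
const prices = '€12.50 and $19.99';
const dollarAmounts = prices.match(/(?<=\$)\d+\.\d{2}/g);
console.log(dollarAmounts); // [ '19.99' ]

// Negative lookbehind plus negative lookahead: exactly four digits,
// not embedded in a longer run of digits.
const record = 'Founded 1998, account 123456789, renewed 2024';
const years = record.match(/(?<!\d)\d{4}(?!\d)/g);
console.log(years); // [ '1998', '2024' ]
```

Neither match includes the '$' or the surrounding punctuation, because the assertions only inspect neighbouring characters without consuming them.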

Practical Use Cases for Lookaround Assertions

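Lookaround assertions shine in situations where you need to match based on context without including that context in the match.

Extract values from key-value pairs: (?<=name: )[^\n]+ matches the value after 'name: ' without including the key in the match. This can be cleaner than reading group 2 from (name: )([^\n]+).

Split on commas not inside quotes: this is hard to express with a plain pattern. A common approximation uses a lookahead to require an even number of double quotes between the comma and the end of the string: ,(?=(?:[^"]*"[^"]*")*[^"]*$). It assumes quotes are balanced and never escaped. For anything more complex, it is more reliable to match the quoted strings first and handle them separately.

Extract text between HTML tags: (?<=<[^>]*>)[^<]+(?=<\/[^>]*>) extracts text nodes that sit between a tag and a closing tag without including the tags themselves in the match. Note that this relies on a variable-length lookbehind, which JavaScript supports but many other engines do not.

Match setting names that are not commented out: in a configuration file, (?<=^[ \t]*)\w+ with the m (multiline) flag matches the first word on each line, allowing leading indentation. Commented-out lines are skipped because '#' is not a word character, and a word that comes after '#' is no longer preceded only by whitespace. A negative lookbehind such as (?<!\s*#) placed directly after ^ does not help here, because at the start of a line it inspects the end of the previous line.

Enforce a username format that excludes reserved names: a regex cannot tell you whether a username is already taken (that needs an asynchronous check), but a negative lookahead can enforce the format. ^(?!admin|root|system)[a-zA-Z]\w{2,19}$ matches a username of 3 to 20 characters that starts with a letter and does not begin with a reserved name. It also rejects longer names such as 'administrator'; to reject only the exact reserved names, use (?!(?:admin|root|system)$) instead.

Replace the space before a currency code without affecting the number: (?<=\d) (?=EUR) matches only the space between a digit and 'EUR', so a replacement changes that space and leaves the number and the code untouched. Without the literal space, (?<=\d)(?= EUR) is a zero-width match at the position after the digit, so a replacement inserts text there instead of replacing the space.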
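A few of these use cases, sketched as runnable code. The sample inputs are invented for illustration. The comma-splitting pattern is a common approximation that assumes balanced, unescaped double quotes, and the config pattern is a variant that allows indentation and requires the m flag.

```javascript
// Key-value extraction: the key is asserted, not matched.
const profile = 'name: Ada Lovelace\nrole: engineer';
const nameValue = profile.match(/(?<=name: )[^\n]+/)[0];
console.log(nameValue); // 'Ada Lovelace'

// Split on commas that are followed by an even number of double quotes,
// i.e. commas that are not inside a quoted field.
const row = 'a,"b,c",d';
const fields = row.split(/,(?=(?:[^"]*"[^"]*")*[^"]*$)/);
console.log(fields); // [ 'a', '"b,c"', 'd' ]

// First word on each line, allowing indentation; commented lines are skipped.
const config = 'port=8080\n# debug=true\n  host=localhost';
const settingNames = config.match(/(?<=^[ \t]*)\w+/gm);
console.log(settingNames); // [ 'port', 'host' ]

// Username format with reserved prefixes rejected by a negative lookahead.
const usernamePattern = /^(?!admin|root|system)[a-zA-Z]\w{2,19}$/;
console.log(usernamePattern.test('alice_01')); // true
console.log(usernamePattern.test('admin42'));  // false (reserved prefix)
console.log(usernamePattern.test('al'));       // false (too short)

// Replace only the space between a digit and 'EUR' with a non-breaking space.
const amount = '25 EUR'.replace(/(?<=\d) (?=EUR)/g, '\u00A0');
console.log(amount === '25\u00A0EUR'); // true
```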

Combining Lookaheads and Lookbehinds

The full power of lookaround assertions emerges when you combine them in a single pattern. A pattern can have both a lookbehind and a lookahead, effectively matching text that is sandwiched between two contexts.

Example: (?<=<b>)[^<]+(?=<\/b>) matches the text content inside bold tags (preceded by <b> and followed by </b>) without including the tags in the match result. For simple, non-nested markup this is a clean way to extract content from specific elements without a full HTML parser.

Example: (?<=Price: \$)\d+\.\d{2}(?= USD) matches a decimal price only when preceded by 'Price: $' and followed by ' USD'. This is more precise than matching any decimal number anywhere in the document.

You can also combine lookaheads and lookbehinds with capture groups. The pattern (?<=key=')[^']+(?=') matches the value between single quotes after 'key=', while the capture group version (?<=key=')([^']+)(?=') additionally exposes that value as group 1. Wrapping the whole pattern in an extra pair of parentheses would shift the inner group to group 2.

When testing combined lookarounds in the regex tester, enable the global flag to see all matches across your test string. Pay attention to which characters are included in the match highlight: lookaround content should never appear highlighted, since it is zero-width. If you see more text highlighted than you expect, check whether your lookaround syntax is correct.

Performance note: complex nested lookaheads can be slow on large inputs because the engine may evaluate them at every position in the string. For performance-critical code, anchor your pattern where possible and ensure the non-lookaround part of the pattern fails fast on non-matching positions.
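The combined patterns can be exercised with the global flag and matchAll. The input strings below are assumptions made up for this sketch.

```javascript
// Text inside <b>...</b>, tags excluded from the match.
const html = '<p>Use <b>bold</b> and <b>strong</b> text.</p>';
const boldText = html.match(/(?<=<b>)[^<]+(?=<\/b>)/g);
console.log(boldText); // [ 'bold', 'strong' ]

// Prices only when preceded by 'Price: $' and followed by ' USD'.
const invoice = 'Price: $19.99 USD, Shipping: $4.50 USD, Price: $7.00 USD';
const usdPrices = invoice.match(/(?<=Price: \$)\d+\.\d{2}(?= USD)/g);
console.log(usdPrices); // [ '19.99', '7.00' ]

// Lookarounds combined with a capture group: the value is group 1.
const attrs = "key='alpha' other='beta' key='gamma'";
const keyValues = [...attrs.matchAll(/(?<=key=')([^']+)(?=')/g)].map((m) => m[1]);
console.log(keyValues); // [ 'alpha', 'gamma' ]
```

In the last example the full match and group 1 are identical, which is why the capture group is optional unless other code expects a group index.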

Frequently Asked Questions

Are lookbehinds supported in all JavaScript environments?
Lookbehinds are part of ES2018 and are supported in all modern JavaScript environments: Chrome 62+, Firefox 78+, Safari 16.4+, Node.js 8.10+, and Edge 79+. They are not supported in Internet Explorer. Because a lookbehind is regex syntax rather than a missing function, it generally cannot be polyfilled, and common transpilers do not rewrite it. If you need to support IE11 or very old mobile browsers, avoid lookbehinds and use a capture group instead. The WikiPlus Regex Tester runs in your current browser, so if lookbehinds work in the tester, they will work in any environment running the same or a newer version of that browser engine.
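One way to handle older environments is to detect support at runtime and fall back to a capture group. This is a sketch under assumptions: the function names supportsLookbehind and extractDollarAmounts are hypothetical helpers, not part of any library. The patterns are built with the RegExp constructor so that an unsupported lookbehind throws when the function runs instead of breaking the whole script at parse time.

```javascript
// Hypothetical helper: returns true when the engine accepts lookbehind syntax.
function supportsLookbehind() {
  try {
    new RegExp('(?<=a)b');
    return true;
  } catch (err) {
    return false; // SyntaxError in engines without lookbehind
  }
}

// Hypothetical helper: extracts amounts that follow a dollar sign.
function extractDollarAmounts(text) {
  if (supportsLookbehind()) {
    return text.match(new RegExp('(?<=\\$)\\d+\\.\\d{2}', 'g')) || [];
  }
  // Fallback: consume the '$' and read the amount from group 1.
  const amounts = [];
  const pattern = /\$(\d+\.\d{2})/g;
  let m;
  while ((m = pattern.exec(text)) !== null) {
    amounts.push(m[1]);
  }
  return amounts;
}

console.log(extractDollarAmounts('€12.50 and $19.99')); // [ '19.99' ]
```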
What is the difference between a lookahead and a capture group?
A capture group (pattern) both matches and captures the text, including it in the match result and making it available in the groups array. A lookahead (?=pattern) only asserts that the pattern is present — it does not consume characters and is not included in the match. Use a capture group when you need the surrounding text in your result. Use a lookahead when the surrounding text is a condition for matching but should not appear in the output — for example, when you want to extract a number but only when followed by a specific unit.
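The difference is easiest to see by comparing the two results side by side. The sample string is an assumption for illustration.

```javascript
const reading = 'Weight: 42 kg';

// Capture group: the unit is consumed, so it is part of the full match.
const withGroup = reading.match(/(\d+) kg/);
console.log(withGroup[0]); // '42 kg'
console.log(withGroup[1]); // '42'

// Lookahead: the unit is only asserted, so the full match is the number.
const withLookahead = reading.match(/\d+(?= kg)/);
console.log(withLookahead[0]); // '42'
```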
Can lookaheads contain quantifiers?
Yes. Lookaheads can contain any valid regex pattern, including quantifiers. (?=.*[A-Z]) uses .* inside a lookahead to assert that an uppercase letter appears anywhere in the remaining string. Variable-length patterns inside lookaheads are fully supported in JavaScript. Inside lookbehinds, variable-length patterns (using * or +) are also supported in JavaScript, both in V8 and in other ES2018-compliant engines, unlike some other regex flavors (Python's re module, for example, requires fixed-length lookbehinds). Always test lookbehind patterns with variable-length content in the tester to confirm behavior.
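A short sketch of a quantifier inside a lookbehind, using a made-up input line. The \s* lets the amount of whitespace after the label vary, which a fixed-width lookbehind could not express.

```javascript
// Variable-length lookbehind: any amount of whitespace after 'total:'.
const line = 'total:    1250 items';
const total = line.match(/(?<=total:\s*)\d+/);
console.log(total[0]); // '1250'

// Quantifier inside a lookahead: a digit must appear somewhere later.
console.log(/^(?=.*\d)\w+$/.test('abc123')); // true
console.log(/^(?=.*\d)\w+$/.test('abcdef')); // false
```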