WikiPlus

Regex in Python vs JavaScript: Key Differences

Python and JavaScript are two of the most widely used languages for web development and data processing, and both ship with capable built-in regular expression engines. Despite sharing most of their syntax, the two engines differ in ways that trip up developers who switch between them or who write patterns intended to run in both environments. This article maps the key differences, from flag names to lookbehind support to module APIs, so you can write patterns with confidence in either language.

Syntax Similarities and Core Differences

Both Python and JavaScript regex engines descend from the Perl regex tradition and share the vast majority of their syntax: character classes, quantifiers, anchors, alternation, and capture groups are written the same way. A pattern like ^(\d{4})-(\d{2})-(\d{2})$ behaves the same in both languages on ordinary ASCII input. The primary syntactic differences:

1. Verbose mode. Python's re module supports re.VERBOSE (also written re.X), which allows whitespace and comments inside patterns. This makes complex patterns much more readable. JavaScript has no built-in verbose mode; the common workaround is to split the pattern into string parts and concatenate them before passing the result to new RegExp().

2. Possessive quantifiers and atomic groups. Python's standard re module supports possessive quantifiers (\d++, \w++) and atomic groups ((?>...)) from Python 3.11 onward; on older versions they are available only through the third-party regex module. JavaScript supports neither natively, though workarounds exist.

3. String versus literal syntax. JavaScript regex literals are enclosed in forward slashes (/pattern/flags). Python passes patterns as strings to re functions or compiles them with re.compile(). Python's raw strings (r'...') are strongly recommended to avoid double-escaping backslashes: r'\d+' is equivalent to JavaScript's /\d+/.

4. Unicode support. Python 3 strings are Unicode by default, and \w, \d, and similar classes match Unicode characters by default. In JavaScript, \w and \d are ASCII-only with or without the u flag: \d matches only the digits 0-9, while Python's \d matches decimal digits from any script.
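
As a small illustration of the verbose-mode and raw-string points, the sketch below writes the date pattern from this section twice in Python, once with re.VERBOSE and once compactly. The variable names and the sample date are made up for the example.

```python
import re

# Verbose pattern: whitespace and "#" comments inside the pattern are ignored.
DATE_VERBOSE = re.compile(
    r"""
    ^(\d{4})   # year
    -(\d{2})   # month
    -(\d{2})$  # day
    """,
    re.VERBOSE,
)

# The same pattern written compactly.
# JavaScript would write it as the literal /^(\d{4})-(\d{2})-(\d{2})$/.
DATE_COMPACT = re.compile(r"^(\d{4})-(\d{2})-(\d{2})$")

sample = "2024-03-15"  # illustrative input
verbose_groups = DATE_VERBOSE.match(sample).groups()
compact_groups = DATE_COMPACT.match(sample).groups()

# Raw strings avoid double escaping: both literals spell the same pattern.
same_pattern = r"\d+" == "\\d+"
```

Both compiled patterns should capture the same three groups, which is a quick way to check that adding comments did not change what the pattern matches.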

Flag Equivalents Between Python and JavaScript

The flag systems in Python and JavaScript use different names, and sometimes different semantics, for similar behaviors.

Case-insensitive:
Python: re.IGNORECASE or re.I
JavaScript: i flag (/pattern/i)

Global (find all matches):
Python: findall() and finditer() always find all matches; there is no separate flag
JavaScript: g flag (/pattern/g); matchAll() returns every match, but it requires a regex with the g flag and throws a TypeError without it

Multiline (^ and $ match at line boundaries):
Python: re.MULTILINE or re.M
JavaScript: m flag

DotAll (dot matches newlines):
Python: re.DOTALL or re.S
JavaScript: s flag (ES2018)

Verbose/extended:
Python: re.VERBOSE or re.X
JavaScript: no native equivalent

Unicode:
Python: Python 3 string patterns are Unicode by default; re.UNICODE (re.U) is redundant and largely historical
JavaScript: u flag (enables full Unicode support; \p{} property escapes need it or the newer v flag)

Flags are combined in Python with the | operator: re.I | re.M | re.S. In JavaScript, they are combined by concatenating the flag letters: /pattern/ims.

A practical translation tip: when porting a Python pattern to JavaScript, check every flag in the Python code and find its JavaScript equivalent. If the Python code uses re.DOTALL and the JavaScript version omits the s flag, the JavaScript pattern will stop matching wherever the dot needs to cross a newline.
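
A short sketch of combining flags on the Python side, with the rough JavaScript counterparts noted in comments. The sample text is invented for the example.

```python
import re

text = "Hello\nWORLD\nhello again"  # illustrative three-line input

# re.I | re.M: case-insensitive, and ^ matches at the start of every line.
# Rough JavaScript counterpart: text.match(/^hello/gim)
line_starts = re.findall(r"^hello", text, re.I | re.M)

# Without re.S the dot stops at a newline; with it the dot crosses lines.
# Rough JavaScript counterpart for the second call: /Hello.*WORLD/s
without_dotall = re.search(r"Hello.*WORLD", text)
with_dotall = re.search(r"Hello.*WORLD", text, re.S)
```

Here the search without re.S finds nothing, which is the same failure you would see in JavaScript after forgetting the s flag during a port.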

Named Groups, Back-References, and the re API

Named capture groups use slightly different syntax in the two languages.

Python named group syntax: (?P<name>pattern)
JavaScript named group syntax: (?<name>pattern) (ES2018)

Both let you access captured text by name after a match. In Python: match.group('name'). In JavaScript: match.groups.name.

Back-references to named groups also differ:

Python: (?P=name) refers back to the text matched by group 'name'
JavaScript: \k<name> (ES2018)

The Python re module's API provides several functions worth knowing if you are coming from JavaScript:

re.match(pattern, string): attempts a match only at the beginning of the string (like /^pattern/.exec(string))
re.search(pattern, string): finds the first match anywhere in the string (like /pattern/.exec(string) without ^)
re.findall(pattern, string): returns a list of all matches as strings, or as group contents when the pattern has capture groups; the closest JavaScript analogue is string.match(/pattern/g) for group-free patterns, or [...string.matchAll(/pattern/g)] when you need full match objects
re.sub(pattern, repl, string): replaces every match, like string.replace(/pattern/g, repl); group references in the replacement are written \1 or \g<name> in Python but $1 or $<name> in JavaScript
re.split(pattern, string): splits on the pattern, like string.split(/pattern/)

One Python-specific feature: re.compile() returns a compiled pattern object that can be reused. In JavaScript, regex literals are also compiled and can be reused, but a stored regex with the g flag keeps state in lastIndex between calls to exec() and test(), which is a common pitfall (see the FAQ in the JavaScript article in this series).
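
The following sketch exercises the Python side of the comparisons above: named groups, a named back-reference, and the main re functions. All patterns and input strings are illustrative.

```python
import re

# Named groups: Python spells them (?P<name>...); JavaScript uses (?<name>...).
date_re = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})")
m = date_re.search("released 2024-03")
year = m.group("year")  # JavaScript: match.groups.year
as_dict = m.groupdict()

# Named back-reference: (?P=word) must repeat whatever group 'word' matched.
doubled = re.search(r"\b(?P<word>\w+) (?P=word)\b", "it was was fine")

# re.match only tries the start of the string; re.search scans all of it.
at_start = re.match(r"\d+", "abc 123")
anywhere = re.search(r"\d+", "abc 123")

# findall returns plain strings when the pattern has no capture groups.
numbers = re.findall(r"\d+", "a1 b22 c333")

# re.sub replaces every match by default (like replace() with the g flag).
masked = re.sub(r"\d", "#", "a1 b22")
```

Note that re.match returning None for "abc 123" is the behavior that most often surprises developers used to JavaScript's exec(), which scans the whole string unless the pattern is anchored.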

Lookbehinds and Advanced Features

Lookbehind support is one of the most significant practical differences between the two engines.

Python's re module: supports lookbehinds, but only fixed-length ones. (?<=\d{4}) is valid; (?<=\d+) is not, because \d+ has variable length. This is a real limitation when porting from JavaScript to Python.

JavaScript: supports variable-length lookbehinds since ES2018. (?<=\d+) works in V8-based environments such as Chrome and Node.js. This is a genuine advantage of the JavaScript engine for some patterns.

Python's regex module (third-party, install with pip install regex): supports variable-length lookbehinds, possessive quantifiers, atomic groups, and Unicode properties (\p{Letter}, etc.). If you need these features in Python, this module is the solution.

Unicode property escapes:
JavaScript (with u flag): \p{Letter}, \p{Decimal_Number}, \p{Script=Greek}, etc.
Python re module: does not support \p{} syntax
Python regex module: supports \p{} syntax

Atomic groups and possessive quantifiers: JavaScript lacks both. Python's re module supports them from Python 3.11 onward; on earlier versions, the regex module adds them.

Conditional patterns ((?(id)yes|no)): Python's re module supports conditional patterns; JavaScript does not.

For cross-platform patterns intended to run in both Python and JavaScript, stick to the common subset: named groups written as (?<name>) in JS and (?P<name>) in Python (or numbered groups), no possessive quantifiers or atomic groups, no conditional patterns, fixed-length lookbehinds only, and no \p{} property escapes unless you use the Python regex module.
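
To make the lookbehind restriction and conditional patterns concrete, here is a minimal sketch using only the standard re module. The patterns and test strings are invented for illustration.

```python
import re

# Fixed-length lookbehind: accepted by the standard re module.
price = re.search(r"(?<=\$)\d+", "cost: $42")

# Variable-length lookbehind: re rejects it when the pattern is compiled.
try:
    re.compile(r"(?<=\d+)px")
    variable_ok = True
except re.error:
    variable_ok = False

# Conditional pattern: the closing ">" is required only if "<" was captured.
bracketed = re.compile(r"^(<)?\w+(?(1)>)$")
cond_results = [bool(bracketed.match(s)) for s in ("<tag>", "tag", "<tag")]
```

The same variable-length lookbehind compiles without complaint in an ES2018 JavaScript engine, so an error like this one is a typical first sign that a pattern was ported from JavaScript.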

Frequently Asked Questions

Can I use the same regex pattern in Python and JavaScript?
The core syntax is identical and most patterns are directly portable, with two adjustments. First, named group syntax differs: Python uses (?P<name>...) while JavaScript uses (?<name>...). Second, be cautious with features that one engine does not support: Python's re module does not support variable-length lookbehinds or \p{} property escapes, and JavaScript does not support conditional patterns. For patterns that only use the common subset (character classes, quantifiers, anchors, capture groups, flags), cross-language portability is straightforward.
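
The named-group adjustment is mechanical enough to script. The sketch below is a hypothetical helper (to_js_named_groups is not part of any library) that rewrites the Python spelling into the JavaScript one by plain text substitution. It does not parse the pattern, so treat it as a starting point and not a complete translator.

```python
import re

def to_js_named_groups(pattern: str) -> str:
    """Naively rewrite Python named-group syntax into the JavaScript spelling.

    Hypothetical helper for illustration only: it substitutes text without
    parsing the pattern, so an escaped literal parenthesis followed by ?P<
    would be rewritten incorrectly.
    """
    # (?P=name)  ->  \k<name>
    pattern = re.sub(r"\(\?P=(\w+)\)", r"\\k<\1>", pattern)
    # (?P<name>  ->  (?<name>
    return pattern.replace("(?P<", "(?<")

python_pattern = r"(?P<word>\w+) (?P=word)"
js_pattern = to_js_named_groups(python_pattern)
```

The result is the pattern source only; flags still have to be translated separately, and any feature outside the common subset needs manual attention.
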
Why does \w match different characters in Python and JavaScript?
In Python 3, \w is Unicode-aware by default and matches any Unicode word character, including letters and digits from non-Latin scripts. In JavaScript without the u flag, \w is ASCII-only: [a-zA-Z0-9_]. With the u flag, JavaScript's \w still matches only ASCII word characters; Unicode-aware word matching requires an explicit class built from property escapes, for example [\p{L}\p{N}_], with the u flag (or the v flag in newer environments). If you are processing international text, be explicit about which character set you intend to match rather than relying on \w behavior.
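
The difference is easy to see from the Python side. This sketch uses the re.ASCII flag, which this article does not otherwise cover, to restrict \w and \d to ASCII and so approximate JavaScript's behavior. The sample text is written with escapes so the code points are unambiguous.

```python
import re

# "naive" with a diaeresis, "cafe" with an acute accent, two CJK characters,
# and ARABIC-INDIC DIGIT THREE (U+0663).
text = "na\u00efve caf\u00e9 \u6771\u4eac \u0663"

# Default matching in Python 3: \w covers letters and digits from any script.
unicode_words = re.findall(r"\w+", text)

# re.ASCII restricts \w and \d to ASCII, similar to JavaScript's \w and \d.
ascii_words = re.findall(r"\w+", text, re.ASCII)

# \d also matches non-ASCII decimal digits unless re.ASCII is set.
unicode_digits = re.findall(r"\d", text)
ascii_digits = re.findall(r"\d", text, re.ASCII)
```

With re.ASCII the accented words are split at the non-ASCII letters, which is the kind of result a JavaScript \w+ would produce on the same input.
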
Which language has better regex performance?
Performance depends heavily on the specific pattern and input. For most practical patterns, both engines are fast enough that the difference is negligible. JavaScript's V8 engine uses JIT compilation for common patterns, making simple patterns very fast. Python's re module is implemented in C and is also fast for typical use cases. Both are backtracking engines, so a pattern prone to catastrophic backtracking can slow down dramatically in either one. If regex performance is a bottleneck in a specific application, benchmark with your actual patterns and input data rather than relying on general comparisons.
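
A minimal benchmarking sketch for the Python side, assuming the standard timeit module is acceptable for your purposes. The helper name time_pattern, the pattern, and the sample data are all placeholders; substitute your real patterns and input.

```python
import re
import timeit

def time_pattern(pattern: str, text: str, repeats: int = 200) -> float:
    """Return the average seconds per search for `pattern` over `text`."""
    compiled = re.compile(pattern)  # compile once, outside the timed loop
    total = timeit.timeit(lambda: compiled.search(text), number=repeats)
    return total / repeats

# Hypothetical sample data; replace with representative production input.
sample_text = "id=12345; name=example; " * 100
per_search = time_pattern(r"name=(\w+)", sample_text)
```

Timings from a sketch like this are only comparable within one machine and one interpreter version, so run the candidates you are choosing between in the same session.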