Regular Expression Tester

Regular expressions (regex) are patterns for matching text. This tool analyzes regex syntax, explains pattern components, and provides a test area for trying matches against sample text. It supports JavaScript, PCRE, and Python regex formats.

Specifications

Common Use Cases

Debug why a regex isn't matching expected text
Learn regex syntax by seeing explanations
Test patterns before adding to code
Validate input patterns for form fields
Extract data from structured text (logs, CSV, HTML)

Features

Detect regex format (JavaScript /pattern/flags, PCRE #pattern#, Python r"pattern")
Explain pattern components in plain English
Test against sample text with match highlighting
Single match or find all matches mode
Display capture groups and their contents
Named capture group display
Show match positions and indices
Regex execution time display
Pattern feature badges (anchors, quantifiers, etc.)
Pattern warnings
Library of common patterns (email, URL, phone, etc.)

Examples

Email Pattern

A regex for matching email addresses.

/^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/i
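
As a rough sketch of how this pattern behaves in JavaScript, the snippet below runs it against a few made-up addresses. The sample inputs and variable names are illustrative assumptions and not part of the tool.

```javascript
// Pattern copied from the example above; the sample inputs are made up.
const emailPattern = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/i;

const samples = [
  "user@example.com",                // plain address
  "first.last+tag@mail.example.org", // dots and a plus tag in the local part
  "no-at-sign.example.com",          // missing @
  "user@localhost",                  // host without a dot and top-level domain
];

// RegExp.prototype.test returns true when the whole string fits the pattern,
// because the pattern is anchored with ^ and $.
const emailResults = samples.map((s) => emailPattern.test(s));
console.log(emailResults);
// → [true, true, false, false]
```

The pattern is a pragmatic filter rather than a complete address validator: it rejects hosts without a dot, such as user@localhost.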

URL Pattern

A regex for matching HTTP/HTTPS URLs.

/https?:\/\/[^\s]+/g
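
A minimal sketch of using this pattern to pull URLs out of a sentence in JavaScript. The sample text and variable names are made up for illustration.

```javascript
// Pattern copied from the example above; the sample text is made up.
const urlPattern = /https?:\/\/[^\s]+/g;

const text =
  "Docs at https://example.com/docs and a mirror at http://mirror.example.org/a?b=1 today.";

// With the g flag, String.prototype.match returns every match as an array
// (or null when nothing matches, hence the ?? [] fallback).
const urls = text.match(urlPattern) ?? [];
console.log(urls);
// → ["https://example.com/docs", "http://mirror.example.org/a?b=1"]

// [^\s]+ stops only at whitespace, so trailing punctuation is swallowed.
const trailing = "Visit https://example.com.".match(urlPattern) ?? [];
console.log(trailing);
// → ["https://example.com."]
```

The second result shows a limitation worth knowing: a sentence-ending period becomes part of the match, so the result may need trimming before use.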

Date Pattern

A regex for YYYY-MM-DD dates with capture groups.

/(\d{4})-(\d{2})-(\d{2})/
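
The capture groups can be read back by position, as in this sketch. The sample sentence and variable names are assumptions made for the example.

```javascript
// Pattern copied from the example above; the sample sentence is made up.
const datePattern = /(\d{4})-(\d{2})-(\d{2})/;

const match = "Released on 2024-03-15.".match(datePattern);

// match[0] is the whole match; match[1] to match[3] are the capture groups.
const [full, year, month, day] = match ?? [];
console.log(full, year, month, day);
// → 2024-03-15 2024 03 15

// match.index is the position where the match starts in the input.
console.log(match?.index);
// → 12

// The pattern checks digit counts only, not calendar validity.
const acceptsNonsense = datePattern.test("2024-99-99");
console.log(acceptsNonsense);
// → true
```

Because the pattern accepts any two digits for month and day, a range check on the captured values is still needed when real dates are required.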

Tips

Use online testers to debug complex patterns before production.
Escape special characters (. ^ $ * + ? { } [ ] \ | ( )) with backslash.
Named capture groups (?<name>...) make matches more readable.
In JavaScript, the g flag finds all matches; without it, only the first match is returned.
Be careful with greedy quantifiers (* +); use non-greedy (*? +?) when appropriate.
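
Three of the tips above can be sketched in JavaScript as follows. The helper name escapeRegExp is hypothetical (it is not a built-in), and the sample strings are made up.

```javascript
// Tip: escape metacharacters when a string must be matched literally.
// escapeRegExp is a hypothetical helper written for this example.
function escapeRegExp(text) {
  // Prefix every regex metacharacter with a backslash.
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Unescaped, "+" means "one or more of the preceding character".
const unescaped = new RegExp("1+1=2").test("1+1=2");
// Escaped, "+" is matched as a literal plus sign.
const escaped = new RegExp(escapeRegExp("1+1=2")).test("1+1=2");
console.log(unescaped, escaped);
// → false true

// Tip: named capture groups are read from match.groups by name.
const iso = "2024-03-15".match(/(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/);
const { year, month, day } = iso.groups;
console.log(year, month, day);
// → 2024 03 15

// Tip: without g only the first match is returned; with g, all of them.
const firstDigit = "a1 b2 c3".match(/\d/)[0];
const allDigits = "a1 b2 c3".match(/\d/g);
console.log(firstDigit, allDigits);
// → 1 ["1", "2", "3"]
```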

Understanding Regular Expressions

Regular expressions (regex) are a pattern matching language used across virtually every programming language, text editor, and command-line tool. A regex defines a search pattern using a combination of literal characters and metacharacters that match classes of strings. They are essential for input validation, text extraction, search-and-replace operations, and log analysis.

The core syntax includes character classes ([a-z], \d, \w, \s), quantifiers (*, +, ?, {n,m}), anchors (^, $, \b), alternation (|), and grouping with parentheses. Captured groups extract matched substrings and enable backreferences. Lookaheads (?=...) and lookbehinds (?<=...) match positions without consuming characters, enabling complex assertions about surrounding context.
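
The building blocks above can be combined as in this short JavaScript sketch. The patterns and sample strings are illustrative assumptions chosen for the example.

```javascript
// Character class + quantifier + anchors:
// a string of 3 to 8 lowercase letters and nothing else.
const idPattern = /^[a-z]{3,8}$/;
const idResults = [idPattern.test("regex"), idPattern.test("Regex")];
console.log(idResults);
// → [true, false]

// Alternation inside a group, followed by an optional "s".
const petPattern = /^(cat|dog)s?$/;
const petResults = [petPattern.test("dogs"), petPattern.test("birds")];
console.log(petResults);
// → [true, false]

// Capture group plus backreference: \1 must repeat whatever group 1 matched,
// so this finds a word that appears twice in a row.
const doubledWord = /\b(\w+)\s+\1\b/;
const doubled = "This is is a test".match(doubledWord);
console.log(doubled[0], "|", doubled[1]);
// → is is | is
```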

JavaScript has its own regex flavor with specific flags: g (global), i (case-insensitive), m (multiline), s (dotAll), u (Unicode), and y (sticky). The u flag is increasingly important for handling Unicode text: without it, a pattern works on UTF-16 code units, so characters outside the Basic Multilingual Plane (such as emoji) are treated as two separate units and may not match as expected. Named capture groups (?<name>...) improve readability over numbered groups.
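
A small sketch of what the u flag changes in practice. The sample strings are made up, and the emoji is written as an escape so the snippet does not depend on file encoding.

```javascript
// U+1F600 (grinning face) is stored as two UTF-16 code units.
const face = "\u{1F600}";
console.log(face.length);
// → 2

// Without u, "." matches a single code unit, so one dot cannot span the emoji.
const withoutU = /^.$/.test(face);
// With u, "." matches the whole code point.
const withU = /^.$/u.test(face);
console.log(withoutU, withU);
// → false true

// Unicode property escapes such as \p{Letter} are available only with u.
const letters = /\p{Letter}+/u.exec("h\u00e9llo w\u00f6rld")[0];
console.log(letters === "h\u00e9llo");
// → true

// \w stays ASCII-only even with u, so it stops at the accented letter.
const asciiWord = /\w+/u.exec("h\u00e9llo")[0];
console.log(asciiWord);
// → h
```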

Common regex tasks include email validation, URL parsing, extracting data from structured text (log files, CSV), and replacing patterns in code refactoring. However, regex has limits: it cannot reliably parse arbitrarily nested structures like HTML or JSON (use a proper parser), and overly complex patterns become unreadable and unmaintainable. The joke "now you have two problems" reflects real maintenance challenges with complex regex.

HTML is a context-free language with nested structures that regular expressions fundamentally cannot handle. A regex cannot correctly match opening and closing tags with arbitrary nesting depth. Simple cases may seem to work but break on edge cases involving nested tags, attributes containing >, or comments. For HTML processing, use a DOM parser such as DOMParser in the browser, cheerio for Node.js, or BeautifulSoup for Python.
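
The failure mode can be seen with two made-up HTML fragments. This sketch deliberately shows the regex approach going wrong; it is not a suggested way to process HTML.

```javascript
// Nested tags: a lazy match stops at the FIRST closing tag it finds,
// which belongs to the inner div, so the captured content is unbalanced.
const nested = "<div>outer <div>inner</div> tail</div>";
const lazyNested = nested.match(/<div>(.*?)<\/div>/)[1];
console.log(lazyNested);
// → outer <div>inner

// Sibling tags: a greedy match runs to the LAST closing tag,
// so two separate elements are merged into one capture.
const siblings = "<div>a</div><div>b</div>";
const greedySiblings = siblings.match(/<div>(.*)<\/div>/)[1];
console.log(greedySiblings);
// → a</div><div>b
```

Neither quantifier mode gets both inputs right, because choosing the correct closing tag requires counting nesting depth, which is the job of a parser.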

The difference between .* (greedy) and .*? (lazy) quantifiers is significant for matching behavior. Greedy .* matches as many characters as possible, then backtracks if needed. Lazy .*? matches as few characters as possible, expanding only when the rest of the pattern fails to match. In a string like "<b>one</b><b>two</b>", the greedy pattern <b>.*</b> matches the entire string, while the lazy pattern <b>.*?</b> matches only the first "<b>one</b>". Choosing the wrong mode is one of the most common sources of unexpected regex behavior.
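
A minimal sketch of greedy versus lazy matching, using a made-up sentence in which double quotes act as the delimiters.

```javascript
const sentence = 'say "one" then "two" now';

// Greedy: .* runs to the end of the string, then backtracks to the LAST quote.
const greedyQuote = sentence.match(/".*"/)[0];
console.log(greedyQuote);
// → "one" then "two"

// Lazy: .*? grows one character at a time and stops at the FIRST closing quote.
const lazyQuote = sentence.match(/".*?"/)[0];
console.log(lazyQuote);
// → "one"

// Lazy plus the g flag returns each quoted word separately.
const allQuotes = sentence.match(/".*?"/g);
console.log(allQuotes);
// → ['"one"', '"two"']
```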

In JavaScript, the u flag (/pattern/u) enables Unicode-aware matching. Without it, a pattern operates on UTF-16 code units, so the . metacharacter matches only half of a character outside the Basic Multilingual Plane (such as an emoji). The u flag makes the dot, quantifiers, and character classes treat a surrogate pair as a single character and enables Unicode property escapes like \p{Letter}. Note that \w matches only ASCII word characters with or without the flag; use \p{Letter} to match letters in other scripts.

Lookaheads (?=...) and lookbehinds (?<=...) are zero-width assertions that check for a pattern at a position without including it in the match. Positive lookahead (?=...) asserts what must follow, while negative lookahead (?!...) asserts what must not follow. For example, \d+(?= dollars) matches a number only when it is followed by " dollars", and the match contains only the number itself. Lookbehinds work the same way but check the preceding text: (?<=...) asserts what must come before, and (?<!...) asserts what must not.
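
The lookaround assertions can be tried out with this sketch. The sample sentences are made up, and lookbehind requires a reasonably recent JavaScript engine.

```javascript
const receipt = "I paid 42 dollars and 7 euros.";

// Positive lookahead: digits followed by " dollars"; the match is the digits only.
const dollars = receipt.match(/\d+(?= dollars)/)[0];
console.log(dollars);
// → 42

// Negative lookahead: a whole number NOT followed by " dollars".
// The \b anchors stop the engine from matching just the "4" of "42".
const notDollars = receipt.match(/\b\d+\b(?! dollars)/)[0];
console.log(notDollars);
// → 7

// Positive lookbehind: digits preceded by a literal dollar sign.
const amounts = "Total: $30, tax: $5".match(/(?<=\$)\d+/g);
console.log(amounts);
// → ["30", "5"]
```

In each case the asserted text (" dollars" or "$") is checked but left out of the returned match, which is what zero-width means in practice.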
