regex pattern · ready to copy
Regex for matching HTML tags (open + close + self-closing)
Crude tag matcher for log/code highlighting. NOT for parsing real HTML.
intermediate
javascript / pcre / python4 use cases
The pattern
<\/?[a-zA-Z][\w-]*(?:\s+[\w-]+(?:=(?:"[^"]*"|'[^']*'|[^\s>]+))?)*\s*\/?>
Test cases
| Input | Result |
|---|---|
| <div> | ✓matches |
| <input type="text" required> | ✓matches |
| <br /> | ✓matches |
| </p> | ✓matches |
| <not a tag> | ✗rejects |
| < space-after-bracket> | ✗rejects |
Edge cases & caveats
WARNING: do NOT parse HTML with regex in production. Nested elements, comments (`<!-- -->`), script/style content all break this. Use a proper parser (BeautifulSoup, lxml, parse5). This regex is for syntax highlighting only.
Note: see the warning above. Treat this page as a starting point, not a security control.
Common use cases
- log file syntax highlight
- rough tag extraction from snippets
- code editor tokenizer (toy implementations)
- static analysis pre-pass
Try variations against your data
regexlab is a free in-browser tester with side-by-side match highlighting, group inspector, and named-capture export to JS/Python/PCRE.
Open regexlab
Related
email address · hex color · url slug · iban