regex pattern · ready to copy
Regex for extracting image src from HTML
Pull `src="..."` values out of `<img>` tags (rough extract, not a real parser).
intermediate
javascript / pcre / python · 4 use cases
The pattern
<img\b[^>]*\bsrc\s*=\s*["']([^"']+)["']
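As a rough illustration of how the pattern might be applied, here is a minimal Python sketch. The function name `extract_img_srcs` and the sample markup are made up for this example, and the `re.IGNORECASE` flag is an added assumption (the pattern as written is case-sensitive).

```python
import re

# The pattern from this page, compiled once.
# re.IGNORECASE is an addition for this sketch so that <IMG SRC=...> also matches.
IMG_SRC = re.compile(r"""<img\b[^>]*\bsrc\s*=\s*["']([^"']+)["']""", re.IGNORECASE)


def extract_img_srcs(html: str) -> list[str]:
    """Return the captured src value of every <img> tag the pattern matches."""
    return [m.group(1) for m in IMG_SRC.finditer(html)]


# Hypothetical sample markup, not taken from a real page.
sample = '''
<p>Hello</p>
<img src="/cat.jpg">
<img alt='foo' src="https://cdn.example/bar.png" loading="lazy">
<picture><source src="/ignored.webp"></picture>
'''

print(extract_img_srcs(sample))
# ['/cat.jpg', 'https://cdn.example/bar.png']
```

Capture group 1 holds the URL, which is why the sketch reads `m.group(1)` rather than the whole match.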
Test cases
| Input | Result |
|---|---|
| `<img src="/cat.jpg">` | ✓ matches |
| `<img alt='foo' src="https://cdn.example/bar.png" loading="lazy">` | ✓ matches |
| `<picture><source src="...">` | ✗ rejects |
| `<svg>...</svg>` | ✗ rejects |
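The table above can be checked mechanically. The sketch below runs the four inputs through Python's `re.search` and compares each outcome with the expected column; the names `CASES` and `check_cases` are invented for this example.

```python
import re

PATTERN = re.compile(r"""<img\b[^>]*\bsrc\s*=\s*["']([^"']+)["']""")

# (input, expected to match?) pairs copied from the test-case table.
CASES = [
    ('<img src="/cat.jpg">', True),
    ('''<img alt='foo' src="https://cdn.example/bar.png" loading="lazy">''', True),
    ('<picture><source src="...">', False),
    ('<svg>...</svg>', False),
]


def check_cases() -> list[bool]:
    """Return one flag per case: True when the pattern behaves as the table says."""
    return [
        (PATTERN.search(text) is not None) == expected
        for text, expected in CASES
    ]


results = check_cases()
for (text, expected), ok in zip(CASES, results):
    verdict = "matches" if expected else "rejects"
    print(f"{'ok  ' if ok else 'FAIL'} expected {verdict}: {text}")
```

Adding your own rows to `CASES` is a quick way to see how the pattern behaves on markup from your data before relying on it.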
Edge cases & caveats
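The pattern reads only a `src`-style attribute on `<img>`, so it misses `srcset` (multiple URLs) and CSS `background-image`. Lazy-loaded images are not handled reliably either: `\bsrc` also matches the tail of `data-src`, and because `[^>]*` is greedy, whichever of `src` or `data-src` comes last in the tag is the one captured. Unquoted attribute values are skipped. For real HTML pages, use a parser such as BeautifulSoup.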
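For comparison, here is a hedged sketch of the parser-based route. It uses the standard library's `html.parser` as a stand-in for BeautifulSoup so that it runs without installing anything. The class name `ImageURLCollector`, the helper `collect_image_urls`, and the sample markup are all hypothetical. The `srcset` handling is deliberately naive (split on commas, keep the first token of each candidate) and would mis-split URLs that themselves contain commas.

```python
from html.parser import HTMLParser


class ImageURLCollector(HTMLParser):
    """Collect URLs from src, data-src, and srcset on <img> and <source> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.urls: list[str] = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag not in ("img", "source"):
            return
        for name, value in attrs:
            if not value:
                continue
            if name in ("src", "data-src"):
                self.urls.append(value.strip())
            elif name == "srcset":
                # Naive split: candidates look like "URL [descriptor]", comma-separated.
                for candidate in value.split(","):
                    parts = candidate.split()
                    if parts:
                        self.urls.append(parts[0])


def collect_image_urls(html: str) -> list[str]:
    parser = ImageURLCollector()
    parser.feed(html)
    parser.close()
    return parser.urls


# Hypothetical sample markup with a lazy-loaded image and a <picture> block.
sample = '''
<img src="placeholder.gif" data-src="/real.jpg">
<picture>
  <source srcset="/a-1x.webp 1x, /a-2x.webp 2x">
  <img src="/a.jpg">
</picture>
'''

print(collect_image_urls(sample))
# ['placeholder.gif', '/real.jpg', '/a-1x.webp', '/a-2x.webp', '/a.jpg']
```

This still does not see CSS `background-image`, which lives in stylesheets and `style` attributes rather than in image tags.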
Common use cases
- scraping image lists
- broken-image audit
- CDN migration prep
- asset inventory
Try variations against your data
regexlab is a free in-browser tester with side-by-side match highlighting, group inspector, and named-capture export to JS/Python/PCRE.