Regular expressions for HTML

Regular expressions can be used to find fragments of HTML, which is handy when cleaning up markup or working in a CMS. Here are a few I’ve been looking for.

To find all tags on a page

</?\w+((\s+\w+(\s*=\s*(?:".*?"|'.*?'|[^'">\s]+))?)+\s*|\s*)/?>
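As a quick way to try the tag pattern outside an editor, here is a minimal Python sketch. The function name find_tags and the sample markup are my own, made up for illustration, and the pattern is typed with plain straight quotes. Note that \w+ will not match attribute names containing a hyphen, so treat this as a rough finder rather than a parser.

```python
import re

# The tag pattern from above, typed with straight quotes.
TAG_RE = re.compile(
    r"""</?\w+((\s+\w+(\s*=\s*(?:".*?"|'.*?'|[^'">\s]+))?)+\s*|\s*)/?>"""
)

def find_tags(html):
    # finditer + group(0) gives the whole tag; findall would return
    # the contents of the capture groups instead.
    return [m.group(0) for m in TAG_RE.finditer(html)]

sample = '<div class="a"><p>Hi</p><br/></div>'
print(find_tags(sample))
```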

To find a class, id or style attribute

class="[^"]*["]|id="[^"]*["]|style="[^"]*["]
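A small Python sketch of the attribute pattern in use, assuming the attributes are written with double quotes. The name find_attrs and the sample tag are hypothetical. Since the pattern has no anchoring before id=, it can also hit attributes that merely end in "id", so check the results by eye.

```python
import re

# The attribute pattern from above, typed with straight quotes.
ATTR_RE = re.compile(r'class="[^"]*["]|id="[^"]*["]|style="[^"]*["]')

def find_attrs(html):
    # No capture groups in the pattern, so findall returns whole matches.
    return ATTR_RE.findall(html)

sample = '<div id="main" class="box wide" style="color:red">'
print(find_attrs(sample))
```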

Replacing tags in Dreamweaver

In the following example:

<th>Column Header 1</th>

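We want to wrap the contents of the th tag in a link. Use parentheses to capture the contents in the search pattern:

<th[^>]*>([^<]*)</th>

Then refer to the captured text as $1 in the replacement:

<th><a title="Sort By $1" href="#">$1</a></th>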
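The same capture-and-replace idea can be sketched in Python for anyone not using Dreamweaver. One difference to be aware of: Python's re.sub refers to the first captured group as \1 rather than $1. The function name link_headers is just a label I picked for this example.

```python
import re

# Search pattern: the parentheses capture the text inside the th tag.
TH_RE = re.compile(r'<th[^>]*>([^<]*)</th>')

# Replacement: \1 is Python's spelling of Dreamweaver's $1.
TH_REPLACEMENT = r'<th><a title="Sort By \1" href="#">\1</a></th>'

def link_headers(html):
    # Rewrites every matching th tag, dropping any attributes it had.
    return TH_RE.sub(TH_REPLACEMENT, html)

print(link_headers('<th>Column Header 1</th>'))
```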

Find image tags

<img[^>]*[>]
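For a quick check of the image pattern, a minimal Python sketch follows. The name find_images and the sample markup are assumptions for the example. The pattern simply runs from <img to the next >, so a > inside an attribute value would cut the match short.

```python
import re

# The image pattern from above, unchanged.
IMG_RE = re.compile(r'<img[^>]*[>]')

def find_images(html):
    # Returns each img tag as a string, in document order.
    return IMG_RE.findall(html)

sample = '<p><img src="a.png" alt="A"> text <img src="b.png"/></p>'
print(find_images(sample))
```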

Find links

<a href="[^>]*[>]

or

<a[\s]+[^>]*?href[\s]?=[\s\"\']+(.*?)[\"\']+.*?>([^<]+|.*?)?<\/a>
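The longer link pattern has two capture groups, the href value and the link text, which makes it useful for pulling links out of a page. Here is a rough Python sketch; find_links and the sample HTML are made up for illustration, and links whose text contains nested tags or line breaks may not come out cleanly.

```python
import re

# The longer link pattern from above. Group 1 is the href value,
# group 2 is the text between the opening and closing a tags.
LINK_RE = re.compile(
    r'''<a[\s]+[^>]*?href[\s]?=[\s\"\']+(.*?)[\"\']+.*?>([^<]+|.*?)?<\/a>'''
)

def find_links(html):
    # With two groups, findall returns a list of (href, text) tuples.
    return LINK_RE.findall(html)

sample = ('<a href="http://example.com/">Example</a> and '
          "<a class=\"x\" href='/about'>About us</a>")
for href, text in find_links(sample):
    print(href, text)
```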
