
Regular Expression Negative Lookahead/lookbehind To Exclude Html From Find-and-replace

I have a feature on my site where the search query is highlighted in the search results. However, some of the fields that the site searches through have HTML in them. For exampl

Solution 1:

It's considered bad practice to use regex to parse a complex language like HTML. With sufficient skill and patience, and an advanced regex engine, it may be possible, but the potential pitfalls are huge and the performance is unlikely to be good.

A better solution is to use a DOM parser, such as PHP's built-in DOMDocument class, and apply the highlighting only to text nodes.

A good example of this can be found here in the answer to this related SO question.
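As a rough sketch of the DOM-based approach (not the code from the linked answer), the idea is to walk only the text nodes and wrap matches there, so tag names and attribute values are never touched. The function name highlightWithDom and the hl-root wrapper id are made up for this illustration, and the XML encoding prefix is a common workaround to make loadHTML read the input as UTF-8:

```php
<?php
// Hypothetical helper: wraps case-insensitive matches of $query in <mark>,
// modifying text nodes only.
function highlightWithDom(string $html, string $query): string
{
    if ($query === '') {
        return $html;
    }

    $doc = new DOMDocument();
    libxml_use_internal_errors(true); // keep parser warnings out of the output
    // Wrap the fragment in a known container so it can be found again later.
    $doc->loadHTML('<?xml encoding="UTF-8"><div id="hl-root">' . $html . '</div>');
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $root = $xpath->query('//div[@id="hl-root"]')->item(0);
    $pattern = '/(' . preg_quote($query, '/') . ')/iu';

    // Snapshot the text nodes into an array before changing the tree.
    foreach (iterator_to_array($xpath->query('.//text()', $root)) as $textNode) {
        // With DELIM_CAPTURE, odd indexes hold the matched text.
        $parts = preg_split($pattern, $textNode->nodeValue, -1, PREG_SPLIT_DELIM_CAPTURE);
        if ($parts === false || count($parts) < 2) {
            continue; // no match in this text node
        }
        $fragment = $doc->createDocumentFragment();
        foreach ($parts as $i => $part) {
            if ($part === '') {
                continue;
            }
            if ($i % 2 === 1) {
                $mark = $doc->createElement('mark');
                $mark->appendChild($doc->createTextNode($part));
                $fragment->appendChild($mark);
            } else {
                $fragment->appendChild($doc->createTextNode($part));
            }
        }
        $textNode->parentNode->replaceChild($fragment, $textNode);
    }

    // Serialize only the children of the wrapper, not the wrapper itself.
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo highlightWithDom('<a href="/test">test page</a>', 'test'), PHP_EOL;
```

Because the parser normalizes the markup, the output may differ slightly from the input in whitespace or entity encoding, so it is worth checking against your real data.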

Hope that helps.

Solution 2:

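If you do want to use regular expressions, a simple negative lookahead is all that is required (assuming well-formed markup with no < or > within or between the tags). The lookahead (?![^<>]*>) rejects any match that is followed by a > with no < or > before it, which is exactly the case when the match sits inside a tag: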

$return = preg_replace("/$match(?![^<>]*>)/i", '<mark>$0</mark>', $result);

Any special regular expression characters in $match will need to be properly escaped, for example with preg_quote($match, '/').
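Putting the pattern and the escaping together, a minimal sketch might look like the following. The function name highlightWithRegex and the sample string are made up for illustration, and the same well-formed-markup assumption applies:

```php
<?php
// Hypothetical helper wrapping the one-liner above.
function highlightWithRegex(string $text, string $query): string
{
    if ($query === '') {
        return $text; // an empty pattern would match at every position
    }
    // Escape regex metacharacters and the "/" delimiter in the user's query.
    $pattern = '/' . preg_quote($query, '/') . '(?![^<>]*>)/i';
    // preg_replace returns null on error; fall back to the unmodified text.
    return preg_replace($pattern, '<mark>$0</mark>', $text) ?? $text;
}

$html = '<a href="/test">Test page</a> for testing';
echo highlightWithRegex($html, 'test'), PHP_EOL;
// prints: <a href="/test"><mark>Test</mark> page</a> for <mark>test</mark>ing
```

The occurrence inside the href attribute is skipped because the text after it reaches a > before any <, while the two occurrences in the visible text are wrapped.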
