Lookaround

Lookarounds instruct the regex parser to check for the presence of a pattern of text without including it in the returned result.

Example

Let’s say we want to match each chunk of text in a web page that’s bolded.

We’d start by writing a regex that would look for text within <b> HTML tags.

However, this would only be a partial solution, as we’d get back the HTML tags as part of our match:

<b>This is some bold text</b>

We’d then be tasked with additional processing to strip the tags off before we could use the result.
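As a rough illustration of that extra work, here is a minimal Python sketch using the standard re module. The sample string and the variable names are made up for this example; the naive pattern <b>.+?</b> is just one plausible way to write the first attempt.

```python
import re

# Hypothetical sample input, invented for this illustration.
html = "<p>Some <b>bold text</b> and <b>more bold</b>.</p>"

# Naive pattern: the tags are consumed, so they come back in each match.
with_tags = re.findall(r"<b>.+?</b>", html)
print(with_tags)

# The extra processing step: slice the tags off every match.
stripped = [m[len("<b>"):-len("</b>")] for m in with_tags]
print(stripped)
```

The slicing step is the additional processing described above.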

To avoid this, we can use Lookarounds to identify both the opening and closing tags of the bolded text, without returning the tags in the match.

There are two types of Lookaround: Lookbehind and Lookahead.

Lookbehind looks backwards in the source text from the current position. Lookahead looks forwards in the source text from the current position.

In our example above, the parser would start matching text to return when it got to the upper-case ‘T’, as shown below. This is because at this point the parser would be able to look behind and ‘see’ the opening bold tag.

<b>This is some bold text</b>

The parser would stop matching text after the final lower-case ‘t’, because from there the parser would be able to look ahead and ‘see’ the closing bold tag.

Our full regex would be:

(?<=<b>).+?(?=</b>)

Which breaks down like this:

(?<=    start the lookbehind
<b>     match the opening bold tag
)       end the lookbehind
.+?     match one or more chars lazily
(?=     start the lookahead
</b>    match the closing bold tag
)       end the lookahead
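To try the expression outside a visual tool, here is a minimal sketch in Python's standard re module. The helper name bold_chunks and the sample strings are assumptions made for this illustration; only the pattern itself comes from the text above.

```python
import re

# The pattern from the article: lookbehind for <b>, lazy body, lookahead for </b>.
BOLD_TEXT = re.compile(r"(?<=<b>).+?(?=</b>)")

def bold_chunks(html: str) -> list[str]:
    """Return the text of each bold chunk, without the surrounding tags."""
    return BOLD_TEXT.findall(html)

print(bold_chunks("<b>This is some bold text</b>"))
# ['This is some bold text']

print(bold_chunks("<p>Some <b>bold text</b> and <b>more bold</b>.</p>"))
# ['bold text', 'more bold']
```

Because the lookarounds consume no characters, the tags are checked but never become part of the match, so no stripping step is needed. Note that `.` does not match a newline by default in Python, so this sketch only handles bold text that sits on a single line.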

More on Lookarounds

Lookarounds:

Aren’t restricted to just containing strings of chars to match (like in our HTML tag example). They can contain full regexes (some regex flavors restrict this, most often inside Lookbehinds, but if you use Textpression it will provide you with feedback as you create your regex);

Can be used anywhere in an expression, including inside other Lookarounds;

Can also be negative. In other words, the parser is instructed: ‘Don’t match at this point if the specified pattern of text is present’.
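Two of the points above, plus the flavor restriction, can be sketched in Python's standard re module. The sample strings and variable names are invented for this illustration, and the behaviour shown for variable-width Lookbehinds is specific to Python's re module; other flavors may differ.

```python
import re

# A lookahead can hold a full regex: match words followed by optional spaces and a colon.
keys = re.findall(r"\w+(?=\s*:)", "name : Ada, age: 36")
print(keys)  # ['name', 'age']

# Negative lookahead: match 'foo' only when it is NOT followed by 'bar'.
starts = [m.start() for m in re.finditer(r"foo(?!bar)", "foobar foobaz foo")]
print(starts)  # [7, 14]

# Negative lookbehind: match numbers NOT preceded by a dollar sign.
counts = re.findall(r"(?<!\$)\b\d+", "cost $40 for 3 items")
print(counts)  # ['3']

# Flavor restriction: Python's re rejects a lookbehind of variable width.
try:
    re.compile(r"(?<=<b>\s*)\w+")
    variable_lookbehind_ok = True
except re.error:
    variable_lookbehind_ok = False
print(variable_lookbehind_ok)  # False
```

In each case the Lookaround only tests the surrounding text; the characters it inspects are never part of the returned match.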

Textpression

Here are some of the ways you could view the completed regex in Textpression.

You can paste plain text regex straight into Textpression and get an immediate visualisation as above.

Or you can create regex yourself using the simple drag and drop editor.

If you’d like to see how easy Lookarounds are in Textpression, check out the video.