What’s new with REGEX in Data Studio

Tony Villanova - June 29, 2017

Regular expressions (known as Regex in Google Analytics and Tag Manager) can save time and help you include or exclude data from your reports.

Regex is now supported in Google Data Studio. However, Data Studio uses the RE2 flavor of Regex, which differs from standard Regex in a few ways. RE2 was designed to be faster and more secure.

In this post, I will share a few Regex examples that we have created for Google Data Studio, along with a few differences we have noticed compared to standard Regex.

Example 1: A Data Studio RE2 example

CASE

WHEN REGEXP_MATCH(Landing Page, '^/$') THEN 'Home'

  • This regex tells Data Studio that if the landing page is exactly / then it should be labeled Home.

WHEN REGEXP_MATCH(Landing Page, '(.*)secondarypage(.*)') THEN 'Secondary Page'

  • This regex tells Data Studio that if the landing page has “secondarypage” anywhere in the URL then it should be labeled Secondary Page. REGEXP_MATCH has to match the entire value, so to find text anywhere in the URL you must put .* before and after what you are trying to match, as in this pattern and the ones that follow.

WHEN REGEXP_MATCH(Landing Page, '(.*)blog(.*)') THEN 'Blog'

  • This regex tells Data Studio that if the landing page has “blog” anywhere in the URL then it should be labeled Blog.

WHEN REGEXP_MATCH(Landing Page, '(.*)categorypage(.*)') THEN 'Category Page'

  • This regex tells Data Studio that if the landing page has “categorypage” anywhere in the URL then it should be labeled Category Page.

ELSE 'Other' END

The complete CASE statement that we have created will look like this:

CASE WHEN REGEXP_MATCH(Landing Page, '^/$') THEN 'Home' WHEN REGEXP_MATCH(Landing Page, '(.*)secondarypage(.*)') THEN 'Secondary Page' WHEN REGEXP_MATCH(Landing Page, '(.*)blog(.*)') THEN 'Blog' WHEN REGEXP_MATCH(Landing Page, '(.*)categorypage(.*)') THEN 'Category Page' ELSE 'Other' END
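
If you want to try these patterns before pasting them into Data Studio, the CASE statement can be approximated in a few lines of Python. This is only a rough sketch: Python's re module is not RE2, re.fullmatch is used here as a stand-in for the whole-value matching of REGEXP_MATCH, and the names RULES and classify, as well as the sample paths, are made up for illustration.

```python
import re

# Ordered (pattern, label) pairs mirroring the WHEN clauses of the CASE statement.
RULES = [
    (r"^/$", "Home"),
    (r"(.*)secondarypage(.*)", "Secondary Page"),
    (r"(.*)blog(.*)", "Blog"),
    (r"(.*)categorypage(.*)", "Category Page"),
]


def classify(landing_page: str) -> str:
    """Return the label of the first rule that matches the whole landing page."""
    for pattern, label in RULES:
        # Assumption: re.fullmatch approximates REGEXP_MATCH, which must
        # match the entire value rather than just part of it.
        if re.fullmatch(pattern, landing_page):
            return label
    return "Other"  # plays the role of the ELSE branch


if __name__ == "__main__":
    # Hypothetical landing pages, not taken from a real report.
    for page in ["/", "/blog/post-1", "/secondarypage", "/about"]:
        print(page, "->", classify(page))
```

Because the rules are checked in order, a page that contains both “secondarypage” and “blog” is labeled Secondary Page, just as the first matching WHEN wins in the CASE statement. Dropping the .* from a pattern makes the whole-value match fail for any URL that contains more than that exact text.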

Below is a quick reference guide to a few of the common characters that you will use in both Regex and RE2.

Quick Reference Guide:

Parentheses ()

Groups characters together and/or matches the characters enclosed.

(morevisibility) will match only morevisibility

 

Caret ^

Matches the start of a string.

^Morevisibility matches text that begins with Morevisibility.

 

Dot .

Matches any single character.

go.gle matches google, gobgle and goagle but not gogle.

 

Asterisk *

A * matches the previous character 0 or more times.

Se*n will match Sn, Sen, Seen, Seeen, Seeeeen, etc.

A dot followed by an * (.*) matches any sequence of characters.

se.* matches seen, sean and sewn.

 

Pipe |

A | (pipe) means OR.

Analytics|Google Analytics means Analytics or Google Analytics

 

Dollar Sign $

$ marks the end of a string, so the text must end with the characters that come before it.

Analytics$ would match Google Analytics but not Analytics Google.

 

Question Mark ?

A ? means the previous character is optional: it may appear zero or one time.

Goo?gle would match Google or Gogle.
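
The entries in this guide can be checked with a short Python script. Treat it as a sketch under a few assumptions: Python's re module stands in for RE2 (the two agree on these simple patterns), the helper name matches is made up, re.fullmatch is used for whole-value checks, and re.search is used for the ^ and $ examples, where the pattern describes only part of the text.

```python
import re


def matches(pattern: str, text: str) -> bool:
    """True if the pattern matches the entire text (whole-value match)."""
    return re.fullmatch(pattern, text) is not None


# Dot: exactly one character of any kind.
print(matches(r"go.gle", "google"))  # True
print(matches(r"go.gle", "gogle"))   # False

# Asterisk: the previous character zero or more times.
print(matches(r"Se*n", "Sn"))        # True
print(matches(r"Se*n", "Seeen"))     # True
print(matches(r"se.*", "sewn"))      # True

# Pipe: either alternative.
print(matches(r"Analytics|Google Analytics", "Google Analytics"))  # True

# Question mark: the previous character is optional.
print(matches(r"Goo?gle", "Google"))  # True
print(matches(r"Goo?gle", "Gogle"))   # True
print(matches(r"Go?gle", "Google"))   # False, only the single o is optional

# Caret and dollar sign, checked with a partial (search) match.
print(bool(re.search(r"^Morevisibility", "Morevisibility blog")))  # True
print(bool(re.search(r"Analytics$", "Google Analytics")))          # True
print(bool(re.search(r"Analytics$", "Analytics Google")))          # False
```

Note that Go?gle matches Gogle and Ggle but not Google, because the ? applies only to the single character right before it.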
