Regular expressions (known as Regex in Google Analytics and Tag Manager) can save you time and let you include or exclude data from your reports.
Regex is now supported in Google Data Studio. However, there are a few changes to the syntax, because Data Studio uses the RE2 flavor of Regex. RE2 was designed to be faster and more secure.
In this post I will provide you with a few Regex examples that we have created for Google Data Studio and a few differences we have noticed when compared to regular Regex.
Attention all Google Analytics users around the world: you don’t have to be an expert in regular expressions to use filters. Why? Because this post will help you, that’s why!
No long and drawn-out lead-in to the story this time – here are 5 filters that you can create for your Google Analytics profile(s) that will tidy up your data and make you a happier analyst.
1. Excluding your own traffic from reports
Why: Chances are that your own visits to your own web site aren’t racking up that many visits and page views. Nonetheless, you can still permanently keep your own traffic out of your Google Analytics profile(s).
How: First, grab your IP address from whatismyip.com (or ask an IT person). If you have administrative access to your account, click on your account’s name, then click on your web property’s name. Next, click on the filters sub-tab (within the profiles tab), click on “Add Filter”, and do the following:
Method: Create New Filter
Filter Name: Exclude my IP Address
Filter Type: Custom Filter >> Exclude
Filter Field: Visitor IP Address
Filter Pattern: ^192\.168\.25\.25$
Case Sensitive: No
Replace the IP address in the example above with your own IP address, but leave the ^, the $, and the three backslashes in place (just replace the numbers). The backslashes make each dot match a literal dot rather than any character. Click Save, and you’re done!
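If you are curious what the ^, the $, and the backslashes actually buy you, here is a minimal sketch in Python. It assumes Python's re module behaves like Google Analytics' matching for this simple pattern, and the helper name matches() is made up for illustration. It compares a strict pattern (anchored, with escaped dots) against two looser variants.

```python
import re

STRICT = r"^192\.168\.25\.25$"     # anchored, dots escaped
NO_ANCHORS = r"192\.168\.25\.25"   # same pattern without ^ and $
NO_ESCAPES = r"^192.168.25.25$"    # dots left unescaped (each matches any character)

def matches(pattern: str, value: str) -> bool:
    """Hypothetical helper: True when the pattern is found in the value."""
    return re.search(pattern, value) is not None

# The strict pattern matches only the exact address.
print(matches(STRICT, "192.168.25.25"))       # your own IP
print(matches(STRICT, "192.168.25.250"))      # a neighbour's IP

# Without the anchors, the neighbour's address is excluded too.
print(matches(NO_ANCHORS, "192.168.25.250"))

# Without the backslashes, the dots accept any character.
print(matches(NO_ESCAPES, "192x168x25x25"))
```

In practice the anchors do most of the work for a single IP address; the escaped dots simply keep the pattern exact.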
2. Lowercasing your hostnames
Why: A hostname is a domain that has sent you visitor data. In other words, a hostname is a domain where your Google Analytics tracking code is present and that has sent you at least one visit during the date range you’re looking at. If you ever toggle your report dimension by hostname, or switch the viewing table to show hostnames, you could see mixed cases (upper and lower), which leads to many different variations of the same domain name appearing. That also means you need to work on your SEO redirects, but that’s something for another time.
How: Go through the same steps as you did in the last filter to get to the filter creation screen. Once there, do this:
Method: Create New Filter
Filter Name: Lowercase Hostnames
Filter Type: Custom Filter >> Lowercase
Filter Field: Hostname
Click Save, and you’re done! You can also create additional lowercase filters to do the same thing to other pieces of data that may look unsightly (one of them might be the Request URI filter field, which represents everything after the .com part of your URL).
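To picture what the lowercase filter does to your report rows, here is a small Python sketch. The hostname values and the function name count_by_hostname() are invented for illustration; this is not how Google Analytics is implemented, just the same idea on a toy list.

```python
from collections import Counter

# Hypothetical hostname values as they might arrive with each hit.
hits = [
    "www.MyWebsite.com",
    "www.mywebsite.com",
    "WWW.MYWEBSITE.COM",
    "www.mywebsite.com",
]

def count_by_hostname(hostnames, lowercase=False):
    """Count hits per hostname, optionally lowercasing first (like the filter)."""
    if lowercase:
        hostnames = [h.lower() for h in hostnames]
    return Counter(hostnames)

raw = count_by_hostname(hits)                    # mixed cases stay as separate rows
clean = count_by_hostname(hits, lowercase=True)  # everything rolls up into one row

print(len(raw), "rows before the filter")
print(len(clean), "row after the filter:", dict(clean))
```

Without the filter the same domain is split across three rows; with it, all four hits land on a single row.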
3. Search for long, bulky page name; Replace with short, clean page name.
Why: Page names can get long and bulky. There’s probably an important page in your top ten that’s just an eye-sore. How about we shorten it and clean it up some?
How: Follow these filter creation steps – but remember to change the page names to your own, as the following is just an example:
Method: Create New Filter
Filter Name: Search & Replace: Long page with “/john.php”
Filter Type: Custom Filter >> Search and Replace
Filter Field: Request URI
Search String: /your-very-long-and-bulky-page\.php\?id=1234567
Replace String: /john.php
Case Sensitive: No
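Here is a rough Python sketch of what this search and replace does, assuming the search string is treated as a regular expression (which is why the dot and the question mark are escaped with backslashes). The function name rewrite_uri() is hypothetical, and Python's re module is only standing in for Google Analytics here.

```python
import re

# Search string with the dot and question mark escaped so they match literally.
SEARCH = r"/your-very-long-and-bulky-page\.php\?id=1234567"
# The same string without escaping, for comparison.
UNESCAPED = r"/your-very-long-and-bulky-page.php?id=1234567"
REPLACE = "/john.php"

def rewrite_uri(uri: str) -> str:
    """Swap the long page name for the short one (Case Sensitive: No)."""
    return re.sub(SEARCH, REPLACE, uri, flags=re.IGNORECASE)

long_uri = "/your-very-long-and-bulky-page.php?id=1234567"

print(rewrite_uri(long_uri))         # the long page is renamed
print(rewrite_uri("/contact.php"))   # other pages are left alone

# Unescaped, "p?" means "an optional p", so the real URI is never found.
print(re.search(UNESCAPED, long_uri))
```

The last line is the reason to escape the question mark: left bare, it changes the meaning of the pattern and the filter quietly does nothing.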
4. Add the visitor’s browser to the visitor’s operating system
Why: Why not? Google Analytics lets you create some powerful, advanced filters that let you do something cool (and efficient) like adding the visitor’s browser to the operating system that they’re using. This way, you can see a visitor’s browser alongside their operating system without having to apply a secondary dimension (saving your secondary dimension option for something else).
How: Here’s how you do it:
Method: Create New Filter
Filter Name: Operating System + Browser Platform
Filter Type: Custom Filter >> Advanced
Field A -> Extract A: Visitor Operating System Platform -> (.*)
Field B -> Extract B: Visitor Browser Program -> (.*)
Output To -> Constructor: Visitor Operating System Platform -> $A1 – $B1
Field A Required: Yes
Field B Required: No
Override Output Field: Yes
Case Sensitive: No
For Field A and Field B, choose the filter field as described, and then in the blank form field, type in (.*) as shown.
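If the Extract and Constructor fields feel abstract, this Python sketch mimics the idea: (.*) captures the whole value of each field, and the constructor glues the two captures together. The function name build_output() is made up, a missing field is modelled as None, and a plain hyphen is used as the separator; treat it as an illustration rather than Google Analytics' actual logic.

```python
import re

EXTRACT = r"(.*)"  # capture the entire field value, as in Extract A and Extract B

def build_output(field_a, field_b):
    """Combine operating system (Field A) and browser (Field B) into one value."""
    if field_a is None:
        # Field A Required: Yes -> without it the filter does nothing.
        return None
    a1 = re.search(EXTRACT, field_a).group(1)  # plays the role of $A1
    # Field B Required: No -> fall back to an empty string when it is missing.
    b1 = re.search(EXTRACT, field_b).group(1) if field_b is not None else ""
    return f"{a1} - {b1}"  # constructor: $A1 - $B1

print(build_output("Windows", "Chrome"))
print(build_output("Macintosh", None))
print(build_output(None, "Firefox"))
```

The first call produces "Windows - Chrome", which is the combined value you would then see in the Operating System report.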
5. Include your domain (and, ONLY your domain!)
Why: Unfortunately, server caching and having your tracking code outright stolen and placed on someone else’s web site are things that we sometimes have to deal with. So, from time to time, you must write a filter that blocks the collection of data from every domain except your own web site.
How: Create your include filter like this:
Method: Create New Filter
Filter Name: Include my domain
Filter Type: Custom Filter >> Include
Filter Field: Hostname
Filter Pattern: mywebsite\.com$
Case Sensitive: No
Click Save to stop the nefarious ones from sending you irrelevant data!
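As a quick sanity check before you save an include filter, you can try the pattern against a few hostnames. The sketch below does that in Python with an escaped dot; the hostnames and the function name keep_hit() are invented for the example, and Python's re module is assumed to behave like Google Analytics for a pattern this simple.

```python
import re

# Include pattern: hostname must end with "mywebsite.com" (Case Sensitive: No).
INCLUDE = re.compile(r"mywebsite\.com$", re.IGNORECASE)

def keep_hit(hostname: str) -> bool:
    """True when the include filter would keep data from this hostname."""
    return INCLUDE.search(hostname) is not None

print(keep_hit("www.mywebsite.com"))         # your site, kept
print(keep_hit("Shop.MyWebsite.com"))        # a sub-domain in mixed case, kept
print(keep_hit("scraper-site.net"))          # someone else's site, dropped
print(keep_hit("mywebsite.com.copycat.net")) # does not END with your domain, dropped
print(keep_hit("notmywebsite.com"))          # kept, because nothing anchors the start
```

The last case is worth noticing: since the pattern has no ^ at the front, any hostname that merely ends in your domain name gets through. If that worries you, a tighter pattern is a reasonable next step.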
We could write about filters until the next Presidential election, because there is just so much on the topic, and, so many different things that you can do with filters. Even though you can copy the steps outlined in the above 5 filters directly, I still urge you to use caution. Filters are sensitive, temperamental, and must be precise, to say the very least. A poorly-created filter can cause permanent damage, so tread lightly.
What about you? What filters do you like to use? What problems are you experiencing? We’d love to hear your thoughts below!
I just love a good mystery, and to be candid, I love being the one who gets to solve it! Solving mysteries and putting together the proverbial pieces of the puzzle is a critical skill in the field of web analytics. You almost have to like the torture that comes with trying to figure out a problem, in a weird and demented way.
So when my industry colleague Matt asked me on Twitter to help him solve his Google Analytics quandary, I was ready in a nano-second.
You can read the full post here, but essentially, Matt needs to know the best way to “isolate” page data. He has a sub-directory on his web site, which includes pages, and he needs to create a segmented, sliced-up view of that sub-directory so he can see how its pages are performing in relation to the pages in other sub-directories.
Creating a duplicate, filtered profile for all of this sub-directory’s traffic within the same Google Analytics account (using the same website domain) will create your isolated view of only those sub-directory pages. You will only see visits and page views that happened on those sub-directory pages. It’s good for looking at your sub-directory data in a silo, and you can compare the high-level data by using the profile overview screen (assuming you are planning on creating additional filtered profiles for the other sub-directories). You can also download the data offline and mash it up, either via the Google Analytics API or by simply downloading PDF or CSV files.
Creating an advanced segment that displays any pages that match your sub-directory name will show you any visits which included at least one page view on any one of the pages within that sub-directory. This definition (visits, instead of the pages from the previous paragraph) is an important distinction. As commenter Amanda has already astutely observed, you will see other pages appear in your Content report section, because this segment will show you those other pages, as they were a part of the visitors’ sessions that viewed at least one page within your desired sub-directory. You can create an advanced segment for each sub-directory and compare up to three (plus the “All Visits” segment) at the same time, and get an on-the-fly look at your sub-directory data. However, if your date range is long, you may encounter data sampling (not the biggest issue in the world, but something to be aware of).
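The difference between the filtered profile (page-level) and the advanced segment (visit-level) is easier to see on a toy data set. The Python sketch below uses three made-up sessions and two hypothetical functions; it only illustrates the two definitions and is not how Google Analytics computes anything.

```python
# Each inner list is one visit (session) and the pages viewed during it.
sessions = [
    ["/blog/post-1", "/products/widget"],
    ["/products/widget", "/contact"],
    ["/blog/post-2"],
]
SUBDIR = "/blog/"

def filtered_profile_pages(sessions, prefix):
    """Page-level view: keep only page views inside the sub-directory."""
    return [page for visit in sessions for page in visit if page.startswith(prefix)]

def segment_pages(sessions, prefix):
    """Visit-level view: keep every page view from visits that touched the sub-directory."""
    return [
        page
        for visit in sessions
        if any(p.startswith(prefix) for p in visit)
        for page in visit
    ]

print(filtered_profile_pages(sessions, SUBDIR))
print(segment_pages(sessions, SUBDIR))
```

The segment result includes /products/widget, because that page was viewed in a visit that also included a /blog/ page. That is exactly the "other pages" effect Amanda noticed.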
If you create a Custom Report in your main profile, without any advanced segments applied, you will be tailoring an original view of your data. You can combine metrics from different reports, like visits, bounce rate, goal start and goal completion percentage, and revenue / ROI metrics (if you do Ecommerce). You can then match it up with the page dimension, and even set it up so that when you click on a page, the report will show you the keywords, or the source / medium combo, or the visitor country, or whatever drill-down dimension you want to see. Then, if you really want to get fancy, you can apply an advanced segment while you are looking at your custom report to show you visits that have viewed at least one of your desired sub-directory pages, and really get cooking! You can also share a custom report and an advanced segment with any of the other profiles within your account from within the main profile (click on the respective “manage” links).
So, what would I do? I would create a custom report with an advanced segment applied to it. You can also create a filtered profile if you wish, but I would suspect you would not use it as much as you would a custom report / advanced segment combo. I would also insist that your report is meaningful and that you can take action from it (e.g. knowing that a page’s $Index value is a lot lower than the site average would point you in that page’s direction to optimize / refine it). Pick metrics like Bounce Rate, $Index and Goal Conversion Rate that help you understand page performance, and ditch trivial ones like Avg. Time on Site or Exit Percentage.
Hope I helped out Matt and others in a similar situation!