This post describes a method for overwriting the dreaded keyword (not provided) with the keyword you have inferred it actually represents.
Warning! Notice that I used the word ‘overwriting’ above. The method I describe below literally overwrites the referrer which is logged within Google Analytics, similar to how a filter in Google Analytics can modify your data. When you use filters in Google Analytics, Google recommends maintaining one profile which is free of filters. This is because once you have modified your data using filters, there is no way to ‘unmodify’ that data. With this script, however, the referrer will be overwritten for ALL of your profiles, and there is no way to maintain one profile where the referrer is not overwritten. Basically, if you screw this up, you are out of luck. So make sure you fully understand this script before you implement it.
Still with me? Great. Anyone working in the realm of online marketing is well aware of the rising scourge of the (not provided) keyword in Google Analytics. If you’re reading this, I’m going to assume that you already know what I’m talking about. If you do not, I would advise Googling “google analytics (not provided)” and reading through some of the top results.
Many blog posts have been written about how to better analyze the (not provided) data. My personal favorite approach involves breaking down the (not provided) referrals by landing page. Say I’ve got a blog post discussing enterprise app stores. When I go into Google Analytics, I see data like this:
The data shown above is real data taken from a site BEFORE the script has been implemented. The data shows visits to a specific landing page URL, filtered to show only visits referred by organic Google searches. I’ve selected ‘keyword’ as a second dimension.
When Google refers organic searches to this page, the vast majority of the keywords are being logged as (not provided). But for the referrals where the real keyword actually is tracked, ‘enterprise app store’ makes up the bulk of the referrals and all of the remaining keywords are closely related to ‘enterprise app store’. From this data, I can infer that the keywords logged as (not provided) were actually ‘enterprise app store’ (or another closely related keyword). This concept of inferred keywords is described in much greater detail within this well written SEOmoz post.
If the image above had shown 90 referrals from (not provided), 16 referrals from ‘enterprise app store’ and another 15 referrals from ‘application management’, the ability to infer keywords would be much murkier. In an instance like that, the script I’m about to describe should not be applied.
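As a rough way to put a number on "murky", here is a small sketch that computes how much of the known keyword traffic belongs to the single top keyword. The helper name `topKeywordShare` and the shape of the input object are my own assumptions for illustration, not part of the script described in this post. It is only a proxy: it cannot tell you whether the remaining keywords are closely related, so you still have to eyeball the list yourself.

```javascript
// Hypothetical helper (not part of the original script): returns the share
// of visits with a known keyword that belong to the single top keyword.
function topKeywordShare(keywordVisits) {
  var total = 0;
  var top = 0;
  Object.keys(keywordVisits).forEach(function (keyword) {
    // Visits without a real keyword tell us nothing, so skip them.
    if (keyword === "(not provided)") return;
    total += keywordVisits[keyword];
    if (keywordVisits[keyword] > top) top = keywordVisits[keyword];
  });
  return total === 0 ? 0 : top / total;
}

// The murky example from the paragraph above.
var murky = topKeywordShare({
  "(not provided)": 90,
  "enterprise app store": 16,
  "application management": 15
});
// murky is 16 / 31, so barely half of the known keywords agree.
```

A share close to 1 suggests the page is a reasonable candidate. A share near one half, as here, suggests leaving it alone.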
Now, inferring which keywords the (not provided) referrals actually represent is great, but trying to perform regular and comprehensive analysis on these inferred keywords is a pain in the ass. Combining ‘real’ keyword referrals with ‘inferred’ keyword referrals into one data set would likely require extensive use of Excel or Access, which doesn’t work for me, because Excel and Access are my top two reasons for hating Microsoft.
What if we could overwrite the value of (not provided) with the value of the inferred keyword that we think (not provided) most likely represents? We can! I have developed a script that can overwrite (not provided) with the inferred keyword we think would most likely refer a visitor to the landing page in question.
What Happens If I Implement The Script?
I’m glad you asked! Continuing with the example described above, after the script has been implemented, you’ll see data like this:
The data shown above is real data taken from a site AFTER the script has been implemented. The data again shows visits to a specific landing page URL, filtered to show only visits referred by organic Google searches. Notice that the (not provided) referrals have been replaced by additional ‘enterprise app store’ referrals.
And also data like this:
The data shown above is real data taken from a site AFTER the script has been implemented. This data shows instances of a custom event that is triggered every time the script overwrites (not provided). The top row shows that 45 of the 59 ‘enterprise app store’ referrals from the second image are actually instances where (not provided) was overwritten with ‘enterprise app store’.
For those of you with a limited understanding of JavaScript, that’s probably as technical as you want me to get. Here are instructions for installation of the script. Remember, only implement this if you truly 100% understand everything I’ve described so far in this post.
Installation Instructions
- Download the file.
- Open the file inside a text editor.
- At the top of the file, you will see a JavaScript array that is populated with example data. This is the only part of the code that you should need to update. Don’t be scared by the word “array” if you are not a JavaScript coder. An array is just a list of entries, and here each value in an entry is wrapped in double quotes.
- Each line in the code is a single entry in the array. The value in the left brackets is the landing page that you want to overwrite (not provided) data for. The value in the right brackets is the inferred keyword you want to use when completing the overwrite.
- Notice that the final entry in the array does not have a comma at the end of the line of code, while all the other lines do have commas. You must maintain this syntax when adding further lines to the array (the final line of the array must not have the comma). Otherwise, the script will throw an error.
- Add the script into your site (I would recommend using an external JS file, as that will be easier to maintain). The script must be placed in the middle of the standard Google Analytics tracking code, or else it will not work. The script needs to be placed after the initial “_gaq” variable declaration, but before the “_trackPageview” method is called. See below for an example.
- Every time the (not provided) keyword is overwritten, the script also triggers a custom event in Google Analytics which saves the inferred keyword that was used within an Event Category named ‘Cookie Override’. You can use these events to track how many times you are overwriting (not provided). These custom events are set up as ‘non-interaction’ events so that the triggered events will not affect your bounce rate.
- Another way to make sure that everything is working properly is to use an alternate version of the inferred keyword for your overwriting value (e.g. instead of just ‘enterprise app store’, use ‘not provided – enterprise app store’, or whatever you prefer).
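Here is a rough sketch of the array format and the placement described in the steps above. The variable name `inferredKeywords`, the example paths, and the account ID are placeholders I made up for illustration, so the downloaded file may name or lay these out differently.

```javascript
// Hypothetical mapping: [landing page path, inferred keyword].
// Every entry ends with a comma except the last one.
var inferredKeywords = [
  ["/blog/enterprise-app-store", "enterprise app store"],
  ["/blog/mobile-app-management", "mobile application management"]
];

// Standard asynchronous Google Analytics (ga.js) setup.
var _gaq = _gaq || [];
_gaq.push(['_setAccount', 'UA-XXXXX-X']); // placeholder account ID

// The override script goes here: after _gaq has been declared,
// but before the pageview is recorded.

_gaq.push(['_trackPageview']);

// The usual ga.js loader snippet follows, unchanged.
```

The ordering is the important part: anything queued after `_trackPageview` arrives too late to change the referrer recorded for that pageview.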
To again emphasize an earlier point, this method can only be used on landing pages where the vast majority of the recorded referring keywords are highly similar to each other. If your landing page receives many referrals from both ‘enterprise app store’ and ‘app stores for corporations’, overwriting (not provided) with either of these values would skew your results. In this case, if you overwrote all (not provided) referrals on the page to ‘enterprise app store’, you would be improperly overwriting the referrals of many people who had originally searched for ‘app stores for corporations’.
In short, use this script with an abundance of caution.
How does this all work, you ask?
Here’s how:
- Check Referring Page
  This one is pretty self-explanatory. We only want to run the script if the referring page contains “google” in the URL’s hostname. We also only want to run the script if the visitor is on the landing page of their session. The best way to check this is to see if the visitor’s document.referrer includes your website’s hostname. If it does, the visitor arrived from another page on your site, so the script does not proceed.
- Look for the Empty Keyword Reference
  Here, we look for the value of the “q” URL parameter in the referrer. If that value is a blank string or is undefined, we know that Google has not passed a referring keyword, so we proceed with the script. These are the Google referrals which result in (not provided) keywords.
- Make Sure the Visit is Not From AdWords
  Self-explanatory. If the visit originated from AdWords, the script does not proceed. This is checked by looking for “/aclk” in the referrer, which signifies an AdWords visit.
- Loop Over the Array
  Now we loop over the array of values that have been set up at the top of the script. Here we are comparing ‘location.pathname’ (the current URL without the hostname and without any query variables) to the landing page values that have been entered into the array. If ‘location.pathname’ matches one of the landing page values, we run two Google Analytics methods. (Note: if your site has URLs which differ only in the query variables that are passed, you will need to change this portion of the script to look for something like location.href.)
- Run _setReferrerOverride
  This method in Google Analytics allows us to override the user’s referrer (it is covered in the Google Analytics tracking documentation). For the purposes of this script, we run the method and set the referrer to ‘http://www.google.com/search?hl=en&q=’, then we append the inferred keyword associated with the landing page in question. So in some of the examples described earlier in this post, the referrer would be reset to ‘http://www.google.com/search?hl=en&q=enterprise app store’. By changing the referrer to this, we set the Google Analytics medium to ‘organic’, the source to ‘google’ and the keyword to ‘enterprise app store’.
- Trigger Custom Event
  Finally, the script uses standard Google Analytics event tracking to make note of the fact that the overwriting has taken place, saved within an Event Category of ‘Cookie Override’. This line could be stripped out of the script (if you don’t want to trigger all these events) without anything breaking.
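To tie the steps above together, here is a minimal sketch of the same logic written as one function. This is not the downloadable script itself. The function name `buildOverrideCommands`, its parameters, the URL-encoding of the keyword, and the choice of event action and label are all my own assumptions. It returns the commands to queue rather than pushing them directly, which makes the logic easy to try outside a browser.

```javascript
// Illustrative sketch only; every name here is hypothetical.
// Returns the _gaq commands to queue, or [] when no override applies.
function buildOverrideCommands(referrer, siteHostname, pathname, inferredKeywords) {
  // Split the referrer into hostname, path and query string.
  var match = /^https?:\/\/([^\/?#]+)([^?#]*)(\?[^#]*)?/.exec(referrer || "");
  if (!match) return [];
  var refHost = match[1];
  var refQuery = (match[3] || "").replace(/^\?/, "");

  // Step 1: Google referrals only, and not from a page on our own site.
  if (refHost.indexOf("google") === -1) return [];
  if (refHost.indexOf(siteHostname) !== -1) return [];

  // Step 2: proceed only if "q" is missing or blank.
  var q = null;
  var pairs = refQuery.split("&");
  for (var i = 0; i < pairs.length; i++) {
    var parts = pairs[i].split("=");
    if (parts[0] === "q") q = parts.slice(1).join("=");
  }
  if (q) return []; // Google passed a real keyword, so leave it alone.

  // Step 3: skip AdWords clicks.
  if (referrer.indexOf("/aclk") !== -1) return [];

  // Step 4: look for the current page in the mapping.
  for (var j = 0; j < inferredKeywords.length; j++) {
    if (inferredKeywords[j][0] === pathname) {
      var keyword = inferredKeywords[j][1];
      return [
        // Step 5: fake an organic Google referrer carrying the keyword.
        // Encoding the keyword is my assumption; the post shows it unencoded.
        ["_setReferrerOverride",
          "http://www.google.com/search?hl=en&q=" + encodeURIComponent(keyword)],
        // Step 6: non-interaction event (fifth argument true), so the
        // bounce rate is unaffected. Action and label are assumptions.
        ["_trackEvent", "Cookie Override", keyword, pathname, undefined, true]
      ];
    }
  }
  return [];
}

// Example call with made-up values.
var examplePages = [["/blog/enterprise-app-store", "enterprise app store"]];
var commands = buildOverrideCommands(
  "https://www.google.com/url?sa=t&q=&esrc=s",
  "www.example.com",
  "/blog/enterprise-app-store",
  examplePages
);
// commands now holds one _setReferrerOverride entry and one _trackEvent entry.
```

In the browser, you would call this with document.referrer, location.hostname and location.pathname, then push each returned command onto _gaq before _trackPageview is queued.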
That’s it!
Note: I am not a master JavaScript programmer by any means and would appreciate any input to improve this script.
And another note: The script will not work properly if you are utilizing Google Analytics custom variables. I’m not sure if this is because of a bug in Google Analytics, or if it has something to do with Google’s efforts to protect people’s privacy, but either way, ‘_setReferrerOverride’ does not seem to trigger correctly when custom variables are used on a page.