How the Google Removal Tool Can Keep You In The Doghouse – And How It Has Become a Negative SEO Weapon
Author’s note: Three days after submitting this post to SEOmoz, Google lifted the manual penalty. However, the struggles and problems with the reconsideration process are still highly relevant to many webmasters and SEOers. I hope this post will promote positive discussion and change in our industry.
We’ve all read stories from fellow SEOers and webmasters about their attempts to clean up penalties from Google. Some have been successful, while others haven’t.
For us at Business Supply, the process has been a nightmare. This isn’t to say we’re innocent. We have a lot of what you’ll see from penalized sites: comment spam, heavily optimized anchor text, paid networks, and so on. Most of the bad stuff happened prior to my joining the company in May 2011, but some happened on my watch too. As a small company in a gigantic vertical, we took some shortcuts to keep up with the competition and got our hand slapped by the Big G.
And while that hand slapping was justified, it has turned into a full-on paddling. Based on the work we’ve done, which I describe in detail below, the penalty should have been lifted. But it hasn’t, and it’s because Google’s Removal Tool has some major flaws. We can’t start over with another site. We’ve been in business since 1999 and have hundreds of thousands of customers and an established brand. Getting the penalty lifted is our only option.
A little background on our case: We’ve filed 10 reconsideration requests since July, and each one has shown substantially more progress on our part than the last. Here are the stats from our most recent request, submitted on 10/26. We used SEOmoz, Ahrefs, Majestic, and Google Webmaster Tools as the sources of our inbound link data.
- 15,574 links removed (48.7% of all links)
- 12,964 links disavowed (40.6% of all links)
Between removals and disavows, that encompasses 89.4% of links cleaned up. That’s 28,538 out of 31,925 links.
Google, in their own words, requires a “substantial good-faith effort to remove the links, and this effort should result in a decrease in the number of bad links that we see.” I’m not sure how much more substantial it can get than what we’ve done. The spreadsheet we send shows contact information, when we reached out, the current status of each link, and so on. It’s incredibly detailed.
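For what it’s worth, the bookkeeping behind those numbers is easy to reproduce. Below is a minimal sketch of tallying removal and disavow percentages from a tracking sheet exported to CSV; the file name and column names are placeholders, not our actual spreadsheet.

```python
# Hypothetical sketch: tally removal/disavow progress from a link-cleanup CSV.
# The file name and column names are illustrative placeholders.
import csv
from collections import Counter

def summarize(path="link_cleanup.csv"):
    statuses = Counter()
    with open(path, newline="") as f:
        # expected columns (illustrative): url, contact_email, contacted_on, status
        for row in csv.DictReader(f):
            statuses[row["status"].strip().lower()] += 1

    total = sum(statuses.values())
    removed = statuses["removed"]
    disavowed = statuses["disavowed"]
    cleaned = removed + disavowed
    print(f"Removed:   {removed:>6} ({removed / total:.1%})")
    print(f"Disavowed: {disavowed:>6} ({disavowed / total:.1%})")
    print(f"Cleaned:   {cleaned:>6} of {total} ({cleaned / total:.1%})")

if __name__ == "__main__":
    summarize()
```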
Contrary to Matt Cutts’ assertion that Google now sends a sample of the offending links within its notices to webmasters, we haven’t seen anything. We’ve asked for specific examples of links multiple times, but haven’t received anything back from Google (and we’re at the point where we aren’t submitting the reconsideration request form anymore; we’re emailing back and forth with Google directly).
With nearly 49% of our links physically removed, we figured our hard work would have paid off.
But it hasn’t. I think the largest problem with this whole process lies with the Google Removal Tool and how it handles removals. Here are some of its faults:
NO CACHE:
When you want to request removal/re-cache of a page from Google’s search results, you go to the tool, enter the URL, and hit Submit. If Google determines the page is broken or needs to be removed, it allows you to submit the URL to be removed or re-cached.
However, there are times when Google will show a cache of the page and ask you to explain how the page has changed. Usually, I’d do this by finding a unique word that no longer exists on the live page but still appears in the cached version. Enter the appropriate term and you have roughly a 50/50 chance of Google accepting your request and removing or re-caching the page.
Notice I said 50/50, because in the other half of cases, this happens:
Offending page which has a link to Business Supply (according to Google):
There’s no link to Business Supply anywhere on the page. After entering the URL into the Removal Tool, here’s what I get:
Okay, cool. Google wants to be sure that I’m removing or re-caching a page that actually needs it. I understand completely. But when I click on the ‘cached version’ to find my term, I get this:
Google doesn’t have a cache of the page. My link isn’t on the current page; it hasn’t been for months. But Google doesn’t know that, and the one way I have to inform them of the situation is broken. It doesn’t do any good to submit a list of these URLs, either, because I’ve tried it (we have thousands of URLs that don’t have a cache in Google).
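If you want to gauge how widespread the no-cache problem is before burning time in the Removal Tool, you can spot-check a handful of backlink URLs against Google’s cache endpoint. The sketch below is rough and rests on assumptions: that the webcache.googleusercontent.com endpoint returns a non-200 status when no cached copy exists, and that you keep the batch small and slow, since Google throttles automated requests quickly. The example URL is a placeholder.

```python
# Rough sketch: spot-check which backlink URLs have no Google cache.
# Assumes the webcache.googleusercontent.com endpoint returns a non-200
# status when no cached copy exists; keep batches small and slow.
import time
import urllib.error
import urllib.parse
import urllib.request

def has_cache(url):
    cache_url = ("http://webcache.googleusercontent.com/search?q=cache:"
                 + urllib.parse.quote(url, safe=""))
    req = urllib.request.Request(cache_url, headers={"User-Agent": "Mozilla/5.0"})
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.getcode() == 200
    except urllib.error.HTTPError:
        return False  # a 404 here is the "no cache" error page shown above

urls = ["http://example.com/old-page-with-our-link"]  # placeholder backlink URLs
for u in urls:
    print(u, "cached" if has_cache(u) else "NO CACHE")
    time.sleep(10)  # be gentle; this is a spot check, not a crawler
```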
Just as important, how is Google even telling me that the site is linking to me when it doesn’t have the page cached and I can’t find it in the SERPs? See this screenshot:
This is a URL that, based on my WMT “Download Latest Links” report, was discovered by Google on April 4th. This isn’t a new URL. After getting stonewalled by the Removal Tool, I took this URL (and dozens of other URLs over the course of a month) and submitted it via Google’s Submission Tool:
Nothing. I then created an RSS feed with these test links and uploaded it to Feedburner, in the hope that Google would recrawl them. No such luck.
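For anyone who wants to try the same trick, generating such a feed takes only a few lines. This is a minimal sketch rather than our exact script; the feed metadata and URL are placeholders, and it simply writes a bare-bones RSS 2.0 file you can point Feedburner (or any feed service) at.

```python
# Minimal sketch: build a bare-bones RSS 2.0 feed from a list of URLs in the
# hope that a feed service nudges Google into recrawling them.
# The feed metadata and URLs below are placeholders.
from email.utils import formatdate
from xml.sax.saxutils import escape

def build_feed(urls, title="Link cleanup recrawl feed"):
    items = "\n".join(
        f"    <item><title>{escape(u)}</title><link>{escape(u)}</link>"
        f"<pubDate>{formatdate()}</pubDate></item>"
        for u in urls
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<rss version="2.0">\n'
        "  <channel>\n"
        f"    <title>{escape(title)}</title>\n"
        "    <link>http://example.com/</link>\n"
        "    <description>Stale backlink pages we want recrawled</description>\n"
        f"{items}\n"
        "  </channel>\n"
        "</rss>\n"
    )

with open("recrawl.xml", "w", encoding="utf-8") as f:
    f.write(build_feed(["http://example.com/dead-page-with-old-link"]))
```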
If you don’t know the anchor text (or any text) that was previously on the page, you’re out of luck. Sure, you could put in random text, but you’ll just get these denials:
If you happen to correctly enter text that has since been removed, you can still get either of the above denials. And if you are successful, the URL will be listed as ‘removed’ in the Removal Tool, but that doesn’t mean you’ve solved the problem.
Google keeps URL removal requests “for the time until they’re no longer relevant and up to 90 days.” However, if Google doesn’t swing by that particular page and recrawl it within 90 days, it marks the request as “expired,” which means your link is still there and you have to resubmit it. In our case, that constitutes thousands of URL requests we have to remake. That process can’t be done in bulk; it’s a one-at-a-time submission that takes a long time. And you’re limited to 500 or so requests a day.
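To put that cap in perspective, here’s a quick back-of-the-envelope sketch (the counts are hypothetical, not our exact backlog) of how many days of one-at-a-time resubmission a pile of expired requests represents.

```python
# Back-of-the-envelope sketch: days of manual resubmission at roughly
# 500 removal requests per day. The counts below are hypothetical.
import math

def resubmission_days(expired_requests, daily_limit=500):
    """Days of one-at-a-time resubmissions at the tool's rough daily cap."""
    return math.ceil(expired_requests / daily_limit)

for expired in (1_000, 5_000, 10_000):
    print(f"{expired:>6} expired requests -> "
          f"{resubmission_days(expired):>3} days of manual resubmission")
```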
The “No Cache” problem happens with 404 pages and parked pages too.
I specifically wrote to Google about these problems in August, and their response was:
“Regarding links that are no longer live, the linking examples generated in Webmaster Tools are created automatically as part of our crawling pipeline. These links will automatically drop out as example URLs as we detect changes on the page. The same concept goes for links on pages that are now 404 pages – we will detect those automatically over time.”
I gave them this specific example:
It’s a parked domain page. Forget about the fact that Google says they have a ‘detector’ for these types of pages. Let’s just look at it from the Removal Tool perspective.
Now after checking Google’s cache of the page:
The page was discovered by Google on April 3rd. I attempted to get it removed in July. In August, I specifically told Google about this URL in our reconsideration request email and the problem with ones like it. They said it would fall off. It’s November, and the link is on my current Download Latest Links report.
The same thing happens with redirects, but you get the point. I have thousands of URLs like this. I’ve manually checked every one of them. They don’t exist, but Google says they do.
ROBOTS.TXT DENIAL:
This is a major problem with the Removal Tool, and it’s twofold: not only can you not get these links removed, but your competitors can use these types of links to negatively impact you, SEO-wise.
The link:
The page doesn’t exist.
After punching the URL into the tool, I get this:
Woo hoo! Success! I hit the ‘Yes, remove this page’ button and think I’m in business. Only the next day, I find that Google has denied the request. The reason?
Wait, what? Isn’t the page already removed? I know it. Google, based on the two screenshots above, knows it as well. While the example I gave is from a dead site, live sites are also affected by the robots.txt denial. A lot of Web 2.0 site URLs return this particular denial from the Removal Tool: once you get past the first page or something like a ‘top upcoming’ page, Google will hit you with it, presumably because those deeper pages are blocked by the site’s robots.txt and Google won’t recrawl them to confirm anything has changed. Whatever the reason, Google can’t remove links from those deeper pages, and that makes this a very real (and scary) avenue for negative SEO.
There are other problems with the Google Removal Tool, but I’ve just highlighted the two big ones. We know and can see that 48.7% of our links have been removed and 89.4% have either been removed or disavowed. We can only assume that Google, due to the massive problems with its Removal Tool and caching, sees a lot more live links than we do and has decided not to lift the penalty.
In the meantime, the faults in the Removal Tool have made Negative SEO real. Sure, with the new Disavow feature, you can brush away links that are hurting your backlink profile. But that process takes weeks to take effect, and you’re not guaranteed that Google will accept your Disavow recommendations. At first glance, waiting 2-4 weeks for the Disavows to take place doesn’t seem that bad—unless it happens during the most important time of the year for your business.
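For reference, the disavow file itself is just a plain-text list: comment lines starting with #, whole domains as domain:example.com, and individual URLs on their own lines. Here’s a hedged sketch of generating one from the same kind of tracking sheet mentioned earlier; the file names, column names, and the domain-level threshold are illustrative choices on my part, not Google requirements.

```python
# Sketch: build a disavow.txt from tracking-sheet rows marked for disavow.
# File and column names are illustrative. The output format is Google's:
# "#" comment lines, "domain:example.com" for whole domains, bare URLs otherwise.
import csv
from urllib.parse import urlparse

def write_disavow(tracking_csv="link_cleanup.csv", out_path="disavow.txt",
                  domain_threshold=5):
    with open(tracking_csv, newline="") as f:
        urls = [row["url"].strip() for row in csv.DictReader(f)
                if row["status"].strip().lower() == "disavow"]

    by_domain = {}
    for u in urls:
        by_domain.setdefault(urlparse(u).netloc.lower(), []).append(u)

    with open(out_path, "w") as out:
        out.write("# Links we could not get removed despite repeated outreach\n")
        for domain, domain_urls in sorted(by_domain.items()):
            if len(domain_urls) >= domain_threshold:
                out.write(f"domain:{domain}\n")  # disavow the whole domain
            else:
                out.writelines(u + "\n" for u in domain_urls)

write_disavow()
```

Disavowing at the domain level once a handful of spammy URLs show up on the same site saves you from resubmitting the file every time a new page on that domain gets discovered.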
That’s how Google’s Removal Tool has almost killed us. It can do the same to you if someone decides to negatively impact your site with links like these:
Paid networks: We used a paid blog network, and in May we cut the cord. Six months later, our links are still active on the network. After repeated requests to take the links down, we finally gave up. Additionally, after digging, I found evidence of another, inactive paid network going back 3-4 years. Our links were still active on that old network, even though we hadn’t paid it in years. We were able to get about 10% of those links removed by going to individual site owners. The rest, however, are still live.
Because paid networks don’t require any identification, they don’t care if you are who you say you are. You pay, you play. A competitor could easily sign up for a paid network and blast my site with a ton of bad links with anchor text that could do some serious damage. No one would ever know, and I wouldn’t be able to rid myself of the links (at least not until the Disavow tool kicks in).
Comment spam: There are a ton of old blog posts out there with thousands of comments on them. The blog owners aren’t actively maintaining those sites, so the comments continue to pile up. Slap the same anchor text on 1,000 different pages across these sites and you’ll crush your competitor. In many of these cases, Google didn’t have a cache of the page, so we couldn’t do anything about the link.
Tags and inactive Web 2.0 sites: In my experience, Google rarely has a cache of tag pages (e.g., site.com/tag/). So if you can get a link deeper into a site, chances are it won’t be removable via the Removal Tool.
Low-level search directories: I’m not going to give out any names, but I’ve seen a number of directory companies expanding rapidly. New ones pop up in our backlink profile regularly, and it’s difficult to find the time required to contact them for removal and eventually disavow them.
We’re a small company trying to come clean, and Google sure isn’t making it easy for us; we’ve felt the negative impact on sales throughout this entire process. It’s been a reminder that no single source should account for such a large portion of our revenue. That’s unfortunate, because we had enjoyed an error-free relationship with Google for more than 10 years.
We had a stint with the dark side, but we’ve taken responsibility and have been trying to resolve our problems with Google for six months now. We’re ready to move forward with Google, but it’s not clear how.
How many more links do we need to remove? How will we get past these problems with Google’s Removal Tool? It’s really unclear, and I wish I had more answers.