What are the factors Google considers when weighing whether a page is high or low quality, and how can you identify those pages yourself? There’s a laundry list of things to examine to determine which pages make the grade and which don’t, from searcher behavior to page load times to spelling mistakes. Rand covers it all in this episode of Whiteboard Friday.
Video Transcription
Howdy, Moz fans, and welcome to another edition of Whiteboard Friday. This week we’re going to chat about how to figure out if Google thinks a page on a website is potentially low quality and if that could lead us to some optimization options.
We’ve talked about this previously here on Whiteboard Friday, and I’m sure many of you have been following along with the experiments that Britney Muller from Moz has been conducting on removing low-quality pages. You also saw Roy Hinkis from SimilarWeb talk about how they removed low-quality pages from their site and saw an increase in rankings on a bunch of stuff. So many people have been trying this tactic. The challenge is figuring out which pages are actually low quality. What does that constitute?
What constitutes “quality” for Google?
So Google has some ideas about what’s high quality versus low quality. A few of those are pretty obvious ones that we’re familiar with, and some of them may be more intriguing. So…
- Google wants unique content.
- They want to make sure that the value to searchers from that content is actually unique, not that it’s just different words and phrases on the page, but the value provided is actually different. You can check out the Whiteboard Friday on unique value if you have more questions on that.
- They like to see lots of external sources linking editorially to a page. That tells them that the page is probably high quality because it’s reference-worthy.
- They also like to see high-quality pages, not just high-quality sources or domains, linking to this. Those can be internal and external links. So if the high-quality pages on your website link to another page on your site, Google often interprets that as a sign that the linked page is high quality too.
- The page successfully answers the searcher’s query.
This is an intriguing one. So if someone performs a search, let’s say here I type in a search on Google for “pressure washing.” I’ll just write “pressure wash.” This page comes up. Someone clicks on that page, and they stay here and maybe they do go back to Google, but then they perform a completely different search, or they go to a different task, they visit a different website, they go back to their email, whatever it is. That tells Google, great, this page solved the query.
If instead someone performs the search, clicks on a link, gets a low-quality mumbo-jumbo page, clicks back, and chooses a different result, that tells Google that page did not successfully answer that searcher’s query. We call this activity pogo-sticking: you visit this one, it didn’t answer your query, so you go visit another one that does. If this happens a lot, it’s very likely that this result will be moved down and be perceived as low quality by Google.
- The page has got to load fast on any connection.
- They want to see high-quality accessibility with intuitive user experience and design on any device, so mobile, desktop, tablet, laptop.
- They want to see actually grammatically correct and well-spelled content. I know this may come as a surprise, but we’ve actually done some tests and seen that by having poor spelling or bad grammar, we can get featured snippets removed from Google. So you can have a featured snippet, it’s doing great in the SERPs, you change something in there, you mess it up, and Google says, “Wait, no, that no longer qualifies. You are no longer a high-quality answer.” So that tells us that they are analyzing pages for that type of information.
- Non-text content needs to have text alternatives. This is why Google encourages use of the alt attribute. This is why on videos they like transcripts. Here on Whiteboard Friday, as I’m speaking, there’s a transcript down below this video that you can read and get all the content without having to listen to me if you don’t want to or if you can’t for whatever technical or accessibility reasons.
- They also like to see content that is well-organized and easy to consume and understand. They interpret that through a bunch of different things, but some of their machine learning systems can certainly pick that up.
- Then they like to see content that points to additional sources for more information or for follow-up on tasks or to cite sources. So links externally from a page will do that.
This is not an exhaustive list. But these are some of the things that can tell Google high quality versus low quality and start to get them filtering things.
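One item on that list, text alternatives for non-text content, is easy to spot-check with a script. Here is a minimal sketch using only Python’s standard library. The class and function names are made up for illustration, and it assumes you already have a page’s HTML as a string. It flags images with a missing or blank alt attribute for a human to review. It doesn’t decide whether a page is low quality, and a blank alt can be perfectly fine on a purely decorative image.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collects the src of every <img> whose alt attribute is missing or blank."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <img ... />.
        if tag != "img":
            return
        attr_map = dict(attrs)
        alt = attr_map.get("alt")
        # Flag for manual review; this is a hint, not a verdict.
        if alt is None or not alt.strip():
            self.missing.append(attr_map.get("src", "(no src)"))


def images_missing_alt(html):
    """Return the src values of images in `html` that have no usable alt text."""
    finder = MissingAltFinder()
    finder.feed(html)
    return finder.missing
```

You could run `images_missing_alt` over the HTML of each crawled page and add the count of flagged images as one more column in your review spreadsheet.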
How can SEOs & marketers filter pages on sites to ID high vs. low quality?
As a marketer, as an SEO, there’s a process that we can use. We don’t have access to every single one of these components that Google can measure, but we can look at some things that will help us determine this is high quality, this is low quality, maybe I should try deleting or removing this from my site or recreating it if it is low quality.
In general, I’m going to urge you NOT to use things like:
A. Time on site, raw time on site
B. Raw bounce rate
C. Organic visits
D. Assisted conversions
Why not? Because by themselves, all of these can be misleading signals.
So a long time on your website could be because someone’s very engaged with your content. It could also be because someone is immensely frustrated and they cannot find what they need. So they’re going to return to the search result and click something else that quickly answers their query in an accessible fashion. Maybe you have lots of pop-ups and they have to click close on them and it’s hard to find the x-button and they have to scroll down far in your content. So they’re very unhappy with your result.
Bounce rate works similarly. A high bounce rate could be a fine thing if you’re answering a very simple query or if the next step is to go somewhere else or if there is no next step. If I’m just trying to get, “Hey, I need some pressure washing tips for this kind of treated wood, and I need to know whether I’ll remove the treatment if I pressure wash the wood at this level of pressure,” and it turns out no, I’m good. Great. Thank you. I’m all done. I don’t need to visit your website anymore. My bounce rate was very, very high. Maybe you have a bounce rate in the 80s or 90s percent, but you’ve answered the searcher’s query. You’ve done what Google wants. So bounce rate by itself, bad metric.
Same with organic visits. You could have a page that is relatively low quality that receives a good amount of organic traffic for one reason or another, and that could be because it’s still ranking for something or because it ranks for a bunch of long tail stuff, but it is disappointing searchers. This one is a little bit better in the longer term. If you look at this over the course of weeks or months as opposed to just days, you can generally get a better sense, but still, by itself, I don’t love it.
Assisted conversions is a great example. This page might not convert anyone. It may be an opportunity to drop cookies. It might be an opportunity to remarket or retarget to someone or get them to sign up for an email list, but it may not convert directly into whatever goal conversions you’ve got. That doesn’t mean it’s low-quality content.
THESE can be a good start:
So what I’m going to urge you to do is think of these as a combination of metrics. Any time you’re analyzing for low versus high quality, have a combination of metrics approach that you’re applying.
1. That could be a combination of engagement metrics. I’m going to look at…
- Total visits
- External and internal
- I’m going to look at the pages per visit after landing. So if someone gets to the page and then they browse through other pages on the site, that is a good sign. If they browse through very few, not as good a sign, but not to be taken by itself. It needs to be combined with things like time on site and bounce rate and total visits and external visits.
2. You can combine some offsite metrics. So things like…
- External links
- Number of linking root domains
- Page Authority (PA) and your social shares, like Facebook, Twitter, and LinkedIn share counts. Those can also be applicable here. If you see something that’s getting social shares, well, maybe it doesn’t match up with searchers’ needs, but it could still be high-quality content.
3. Search engine metrics. You can look at…
- Indexation by typing a URL directly into the search bar or the browser bar and seeing whether the page is indexed.
- You can also look at things that rank for their own title.
- You can look in Google Search Console and see click-through rates.
- You can look at unique versus duplicate content. So if I type in a URL here and I see multiple pages come back from my site, or if I type in the title of a page that I’ve created and I see multiple URLs come back from my own website, I know that there’s some uniqueness problems there.
4. You are almost definitely going to want to do an actual hand review of a handful of pages.
- Pull pages from subsections or subfolders or subdomains, if you have them, and ask, “Oh, hang on. Does this actually help searchers? Is this content current and up to date? Is it meeting our organization’s standards?”
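To make the “combination of metrics” idea concrete, here is a rough sketch of one way to fold several of the metrics above into a single number per URL. Everything in it is an assumption for illustration: the field names, the choice of metrics, the min-max scaling, and the equal weighting are not something Google or Moz prescribes. Treat the result as a way to sort pages for your hand review, not as a verdict.

```python
from dataclasses import dataclass


@dataclass
class PageMetrics:
    url: str
    total_visits: int          # engagement
    pages_per_visit: float     # engagement
    linking_root_domains: int  # offsite
    social_shares: int         # offsite
    search_ctr: float          # search engine metric, 0.0 to 1.0


# Hypothetical selection of metrics to combine; adjust to what you actually export.
METRIC_FIELDS = (
    "total_visits",
    "pages_per_visit",
    "linking_root_domains",
    "social_shares",
    "search_ctr",
)


def min_max(values):
    """Scale numbers to the 0..1 range; a column with no spread maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def combined_scores(pages):
    """Return {url: score}, the unweighted mean of each page's scaled metrics."""
    if not pages:
        return {}
    columns = [min_max([getattr(p, f) for p in pages]) for f in METRIC_FIELDS]
    return {
        page.url: sum(col[i] for col in columns) / len(columns)
        for i, page in enumerate(pages)
    }
```

Because every metric is scaled relative to the other pages in the same export, the scores only make sense for comparing pages within one site. No single metric can dominate the way raw visit counts otherwise would.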
Make 3 buckets:
Using these combinations of metrics, you can build some buckets. You can do this in a pretty easy way by exporting all your URLs. You could use something like Screaming Frog or Moz’s crawler or DeepCrawl, and you can export all your pages into a spreadsheet with metrics like these, and then you can start to sort and filter. You can create some sort of algorithm, some combination of the metrics that you determine is pretty good at ID’ing things, and you double-check that with your hand review. I’m going to urge you to put them into three kinds of buckets.
I. High importance. So high importance, high-quality content, you’re going to keep that stuff.
II. Needs work. Second is stuff that needs work but is still good enough to stay in the search engines. It’s not awful. It’s not harming your brand, and it’s certainly not what search engines would call low quality and be penalizing you for. It’s just not living up to your expectations or your hopes. That means you can republish it or work on it and improve it.
III. Low quality. These pages really don’t meet the standards that you’ve got here, but don’t just delete them outright. Do some testing. Take a sample set of the worst junk that you put in the low bucket, remove it from your site, make sure you keep a copy, and see if by removing a few hundred or a few thousand of those pages, you see an increase in crawl budget and indexation and rankings and search traffic. If so, you can start to be more liberal with what you’re cutting out of that low-quality bucket and a lot of times see some great results from Google.
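As a sketch of how the three buckets and the removal test might look in a script, assume each URL already has some combined quality score between 0 and 1 (how you compute it is up to you). The thresholds and function names below are placeholders for illustration. You would tune them against your own hand review rather than take them as given.

```python
def assign_bucket(score, keep_at=0.6, low_below=0.3):
    """Map a 0..1 score to one of the three buckets. Thresholds are illustrative."""
    if score >= keep_at:
        return "high"
    if score >= low_below:
        return "needs_work"
    return "low"


def removal_test_sample(scores, sample_size):
    """Return up to `sample_size` low-bucket URLs, worst score first.

    `scores` is a {url: score} mapping. Only these URLs would be removed
    in the first test, so the effect can be measured before cutting more.
    """
    low = [(s, url) for url, s in scores.items() if assign_bucket(s) == "low"]
    low.sort()  # lowest score first; ties broken by URL
    return [url for _, url in low[:sample_size]]
```

Starting with the lowest scorers follows the “worst junk first” advice, and keeping the sample small means you can restore the pages from your saved copy if the test doesn’t show a lift.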
All right, everyone. Hope you’ve enjoyed this edition of Whiteboard Friday, and we’ll see you again next week. Take care.