Headsmacking Tip #13: Don’t Accidentally Block Link Juice with Robots.txt

Ali JalilPour July 3, 2024

A very simple return to the headsmacking series this week (as it’s late here in London and I’ve been up my usual 40+ hours traveling).

We’ve been noticing that a number of websites seeking to block bot access to pages on their domain have been employing robots.txt to do so. While this is certainly a fine practice, the questions we’ve been getting show that there are a few misunderstandings about what blocking Google/Yahoo!/MSN/other search bots with robots.txt does. Here’s a quick breakdown:

Block with Robots.txt – do not attempt to visit the URL, but feel free to keep it in the index & display in the SERPs (see below if this confuses you)
Block with Meta NoIndex – feel free to visit, but don’t put the URL in the index or display in the results
Block by Nofollowing Links – not a reliable way to block a page, as followed links from elsewhere can still put the URL in the index (it’s fine if you don’t want to “waste juice” on the page, but don’t think it will keep bots away or prevent the page from appearing in the SERPs)
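To see the second option from a crawler's point of view, here's a minimal Python sketch that reads the robots meta tag out of a page. The sample HTML, the `RobotsMetaParser` class and the `robots_directives` helper are all hypothetical names made up for illustration; real search engine crawlers are far more involved than this.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives from any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Directives are comma separated, e.g. "noindex, follow"
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def robots_directives(html):
    """Return the set of robots meta directives found in the HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page: the bot is allowed to visit, but asked not to index it.
SAMPLE_PAGE = """
<html><head>
<meta name="robots" content="noindex, follow">
</head><body><a href="/products/">Products</a></body></html>
"""

directives = robots_directives(SAMPLE_PAGE)
keeps_out_of_index = "noindex" in directives
links_still_count = "nofollow" not in directives
```

The point to notice is that the bot has to fetch the page to read this tag at all, which is exactly what a robots.txt disallow prevents.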

Here’s a quick example of a page that’s blocked via robots.txt but appears in Google’s index:

(note that this robots.txt is the same across about.com’s other subdomains, too)

You can see that about.com is clearly disallowing the /library/nosearch/ folder. Yet, here’s what happens when we search Google for URLs in that folder:

Notice that Google has 2,760 pages from that “disallowed” directory. They haven’t crawled these URLs, so they appear as mere address strings (no title, description, etc – since Google can’t see the pages’ content).
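If you want to check how a disallow rule like this is read, here's a rough sketch using Python's standard `urllib.robotparser`. The robots.txt text below is an assumption: only the `/library/nosearch/` path comes from the example above, while the `User-agent: *` line, the `example.com` host and the page names are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content; only the Disallow path is taken from the post.
ROBOTS_TXT = """\
User-agent: *
Disallow: /library/nosearch/
"""


def can_crawl(robots_txt, url, user_agent="Googlebot"):
    """Return True if the rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical URLs on a placeholder host.
blocked = can_crawl(ROBOTS_TXT, "http://example.com/library/nosearch/page.htm")
allowed = can_crawl(ROBOTS_TXT, "http://example.com/library/weekly/page.htm")
```

A `False` result only tells you the bot won't request the page. It says nothing about whether the URL is in the index, which is why those bare address strings can still show up in the results.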

Now think one step further – if you’ve got any number of pages you’re blocking from the search engines’ eyes, those URLs can still accumulate links, accumulate juice and other query-independent ranking factors, but they have no way to “pass it along” since their own links out will never be seen. I’ll illustrate the situation:

There are two real takeaways here:

Conserve link juice by using nofollow when linking to a URL that is robots.txt disallowed
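If you know that disallowed pages have acquired link juice (particularly from external links), consider using meta noindex, follow instead of the robots.txt disallow, so they can pass their link juice on to places on your site that need it. The disallow has to come off for this to work, since a bot that can't visit the page will never see the meta tag.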
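The first takeaway can be turned into a quick audit. Here's a minimal sketch, assuming every link on the page points at the same site (so a single robots.txt applies), that lists followed links aimed at disallowed URLs. The `LinkCollector` class, the `followed_links_to_blocked` helper, the `/private/` folder and the sample markup are all hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.robotparser import RobotFileParser


class LinkCollector(HTMLParser):
    """Collects (href, is_nofollow) pairs for every <a> tag with an href."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href")
        if href:
            rel_values = (attr.get("rel") or "").lower().split()
            self.links.append((href, "nofollow" in rel_values))


def followed_links_to_blocked(html, page_url, robots_txt, user_agent="Googlebot"):
    """Return absolute URLs that are linked without nofollow but disallowed."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())

    collector = LinkCollector()
    collector.feed(html)

    leaks = []
    for href, is_nofollow in collector.links:
        # Assumption: all links resolve to the same host as page_url.
        target = urljoin(page_url, href)
        if not is_nofollow and not rules.can_fetch(user_agent, target):
            leaks.append(target)
    return leaks


# Hypothetical rules and page markup.
ROBOTS_RULES = "User-agent: *\nDisallow: /private/\n"
PAGE_HTML = (
    '<a href="/private/report.html">Report</a>'
    '<a href="/private/admin.html" rel="nofollow">Admin</a>'
    '<a href="/blog/">Blog</a>'
)

leaks = followed_links_to_blocked(PAGE_HTML, "http://example.com/", ROBOTS_RULES)
```

Anything that ends up in `leaks` is a followed link to a page the bots can't visit, so it's a candidate for either a nofollow or for switching the target to meta noindex, follow.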

Looking forward to seeing folks at SMX London tomorrow (and to Will’s and my big showdown on Tuesday, too)!

p.s. Andy Beard covered this topic previously in a solid post – SEO Linking Gotchas Even the Pros Make.
