One of the most overlooked aspects of on-page optimisation among clients (and even many SEO agencies are guilty of this on occasion) is a website's URL structure. It has almost become my first port of call when taking on a new client, as correcting URLs and how they are formed can be one of the quickest wins a site can achieve. A site's URL structure also underpins all future SEO efforts and is fundamental to any level of success you wish to achieve.
Therefore, for both website owners and agencies alike, please refer to my “Common URL Related SEO Mistakes” cheat sheet and maximise both your search engine visibility and usability in one go!
1. Lack Of Keywords
There still appear to be two camps on this matter: those who think using keywords in the URL string is of benefit, and those who don't. I hereby grant you permission to quote me on saying that presently, in Google at least, yes, this does indeed help your SEO efforts.
This is compounded by a recent Google update that saw inner pages ranking more frequently for certain search terms instead of the homepage, which historically ranked for them due to its higher overall authority.
With this in mind, the more weighting and relevance you can give a URL for the search terms you are targeting, the better.
Good URL Example: http://www.jessops.com/compact-system-cameras/Sony/NEX-5-Black-18-55-lens/
Bad URL Example: http://www.jessops.com/online.store/products/77650/show.html
(Sorry to pick on you here Jessops, but I use your site quite regularly and your URLs are a constant bugbear of mine!)
2. Too Many Keywords
While having keywords in your URL string is a positive thing, too many may land you in trouble. Although I haven't come across any firm data sets showing a negative correlation between keyword stuffing in URLs and ranking positions, Google did mention a while ago that its algorithm does in fact look out for this (Matt Cutts alludes to how it can "look spammy" in this webmaster help video). Even if it isn't an overly strong ranking factor, the usability of the URLs is hampered when you begin jamming them full of target phrases.
Good URL Example: http://www.example.com/hotels/north-america/usa/florida/orlando/
Bad URL Example: http://www.example.com/hotels/north-america-hotels/usa-hotels/florida-hotels/orlando-hotels/
This is a relatively benign example and far from the worst I have seen!
3. Semantics and Directory Structure
Using a logical URL structure will help not only users figure out how all the pages relate to each other and how to navigate between categories, but search engine spiders too. An added benefit of well-crafted semantics is that Google can pull these into a SERP and display them in place of a sometimes confusing URL string.
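For example (the URLs below are purely illustrative), a hierarchy that reads naturally from left to right tells both a visitor and a spider exactly where the page sits:
Good URL Example: http://www.example.com/cameras/digital-slr/canon/
Bad URL Example: http://www.example.com/shop/cat2/item889/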
4. Dynamic AJAX Content
Moving onto something a little more technical now.
I have had a few new clients recently who were worried because only their top-level category pages could be found in search engines, meaning that long tail queries were not returning their sub-category pages. Upon further investigation, it turned out that they were using AJAX to generate these pages dynamically via the hash fragment, and so the URLs were not being indexed.
Back in 2009 Google announced that it was making changes to allow these dynamic pages to be indexed. To do so, the exclamation mark token (“!”) needs to be added after the hash (“#”) within an AJAX URL.
Non-Indexable AJAX URL: http://www.example.com/news.html#latest
Indexable AJAX URL: http://www.example.com/news.html#!latest
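Behind the scenes, Google requests a #! URL with the fragment converted into an _escaped_fragment_ query parameter (so /news.html#!latest is fetched as /news.html?_escaped_fragment_=latest) and expects the server to return an HTML snapshot of the dynamic content. As a minimal sketch, assuming an Apache server with mod_rewrite enabled and pre-rendered snapshots stored in a hypothetical /snapshots/ directory, the mapping could look something like this:

RewriteEngine On
# Serve a pre-rendered snapshot whenever Google asks for the escaped fragment,
# e.g. /news.html?_escaped_fragment_=latest -> /snapshots/latest.html
RewriteCond %{QUERY_STRING} ^_escaped_fragment_=(.*)$
RewriteRule ^news\.html$ /snapshots/%1.html? [L]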
5. Canonicalise www and Non-www
Most webmasters nowadays handle this well, although it is usually by accident as their CMS does it for them. However, there are still a handful who forget. It can be forgiven in some cases, though: after toiling for months to make a site look good and do what it is meant to do, checking that both the www and non-www versions are dealt with correctly is far from a developer's mind.
There are two issues here really, both of which have the same answer. The first is that the non-www URL does not point anywhere and returns a 404 'page not found' error. The second is that the non-www URL could render the same as the www version, which would effectively create two exact copies of the same website.
The solution: ensure that the non-www version is 301 redirected to the www version, or vice versa, depending on your personal preference!
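On an Apache server, for instance, a minimal sketch of that redirect in .htaccess (assuming mod_rewrite is enabled and www is your preferred version) would be:

RewriteEngine On
# 301 redirect any request for example.com to the www version
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]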
6. Secure HTTPS Pages
Another potential duplicate content issue that very often goes unnoticed is http and https pages rendering the same content. This usually arises either through sloppy web development using relative URLs within the website or through an automated CMS. It is most common when a user enters a secured area of the site over https and then leaves it for a non-secured http page; however, the navigation retains the https prefix, usually because of relative URL links. This results in https being rendered on every page of the site thereafter.
To combat this, two steps need to be taken. Firstly, all navigation to non-secured pages must be http and not https. This can be achieved either by hard-coding it or by ensuring any relative URLs are removed from secured pages.
Secondly, in non-secured areas, the https versions should be 301 redirected to the correct http versions.
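As a sketch of that second step, again assuming Apache with mod_rewrite and using a hypothetical /checkout/ path as the secured area:

RewriteEngine On
# 301 redirect any https request outside the secured area back to http
RewriteCond %{HTTPS} on
RewriteCond %{REQUEST_URI} !^/checkout/
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]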
7. Category IDs
Many of these points are interconnected, and this one lends itself to the advice already given regarding keyword inclusion and semantics.
Many sites utilise category IDs within their URLs, generated the majority of the time by their CMS. In a nutshell, a load of numbers, letters and symbols in a URL means absolutely nothing to either a human visitor or a search engine spider. To maximise the site's SEO impact, and to meet the advice above on including keywords and logical semantics within the URL, these IDs need to be turned into relevant, descriptive text.
Many CMS platforms have this 'pretty URL' ability built into them. However, if this facility is not available, simply map each of the IDs to a relevant handle such as a product's name or category.
Ugly URL Example: http://www.example.com/product.aspx?ID=11526&IT=5f7d3d
Pretty URL Example: http://www.example.com/dvds/anchorman-the-legend-of-ron-burgundy/
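To illustrate the mapping, here is a sketch assuming Apache with mod_rewrite and a CMS that can look a product up by a hypothetical slug parameter; the visitor only ever sees the pretty URL while the underlying script still does the work:

RewriteEngine On
# Internally rewrite the pretty URL to the real script, without a redirect
RewriteRule ^dvds/([a-z0-9-]+)/$ /product.aspx?slug=$1 [L]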
8. Session IDs
Many ecommerce sites track visitors’ activities, such as adding products to shopping baskets, by appending session IDs to the end of the URLs. These IDs are necessary for visitors to interact with functionality that is user specific; however, they can result in dangerous duplicate content issues. As each ID must be unique to each visitor, this potentially creates an infinite number of duplicated website pages.
For example:
http://www.example.com/buy?id=2f5e2 and http://www.example.com/buy?id=4k3g1 will render individually; however, they are potentially exactly the same page.
The best way to combat this issue is to remove the session IDs from the URL string and replace them with a session cookie. This cookie works in the same manner as the ID but is stored on the user's machine, so it does not affect the URL.
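Once sessions are cookie-based, it is also worth permanently redirecting any old session-ID URLs that are still indexed or bookmarked. A sketch, assuming Apache with mod_rewrite and the ?id= parameter from the example above:

RewriteEngine On
# 301 /buy?id=2f5e2 (or any other session ID) to the clean /buy URL;
# the trailing ? in the substitution strips the query string
RewriteCond %{QUERY_STRING} ^id=[0-9a-z]+$ [NC]
RewriteRule ^buy$ /buy? [R=301,L]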
9. The Trailing Slash Conundrum
This is another duplicate content issue but a very subtle one. Again, many CMS platforms cope with this very well out of the box, but you need to be aware of it just in case.
The duplicate content in this case comes from a website rendering URLs both with and without the trailing slash.
For example:
Both http://www.example.com/category/product and http://www.example.com/category/product/ will render individually; however, they will be exactly the same page.
Correcting the issue is straightforward and can be fixed with a simple 301 redirect rule pointing all pages without a trailing slash to the version with a trailing slash.
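On Apache, for example, a minimal sketch of that rule (the file check stops real files such as images and stylesheets from being given a slash) would be:

RewriteEngine On
# 301 redirect any URL that is not a real file to its trailing-slash version
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*[^/])$ /$1/ [R=301,L]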
10. Index File Rendering
A website will sometimes render both a root directory in the URL and the root appended with the index file (index.html, index.php, index.aspx, etc.). When this happens, each gets treated as an individual page by a search engine, resulting in both being indexed and creating duplicate content.
For example:
http://www.example.com/category/product/ and http://www.example.com/category/product/index.html will render individually; however, they will be exactly the same page.
This is one of the most common oversights I come across on a daily basis, and it is very simple to rectify. Similar to the trailing slash fix, a 301 redirect rule needs to be established to point one to the other. To allow for a greater level of usability, I'd suggest redirecting the version with the index page to the root directory URL without it.
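A sketch of that rule on Apache with mod_rewrite; checking THE_REQUEST ensures only URLs the visitor actually requested are redirected, which avoids a loop when the server serves the index file internally for directory requests:

RewriteEngine On
# 301 /category/product/index.html (or .php/.aspx) to /category/product/
RewriteCond %{THE_REQUEST} /index\.(html|php|aspx) [NC]
RewriteRule ^(.*)index\.(html|php|aspx)$ /$1 [R=301,L]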
BONUS TIP
11. Subdomain URLs
This is not so much a directly SEO-related issue; however, it is one I felt I had to include in this list as it caused me headache after headache with one particular client last year, and I want you to avoid my pain!
I was browsing through Google Analytics and came across one particularly unassuming page of a client's site that was generating vast numbers of page views. Delving into the analytics a little further, after several hours of hair pulling, I discovered that a page on the main domain was named exactly the same as a page on one of their subdomains. Due to the way subdomain tracking in Google Analytics works, it was under the impression that these two very different pages were in fact one and the same.
For example:
http://www.example.com/page-one/ and http://sub.example.com/page-one/ are separate pages displaying different information; however, with the standard Google Analytics subdomain tracking code, all activity from both will be registered as being on the same page.
So to avoid unnecessary hair loss, ensure that all URLs on both the main domain and across all subdomains are unique, or alternatively implement a series of custom filters.
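If you go down the filter route, the classic fix (a sketch from memory, so check the wording against the current Google Analytics interface) is an advanced custom filter that prepends the hostname to the request URI, so the two pages report separately:

Filter Type: Custom Filter > Advanced
Field A -> Extract A: Hostname (.*)
Field B -> Extract B: Request URI (.*)
Output To -> Constructor: Request URI $A1$B1

With this applied, the two pages above would report as www.example.com/page-one/ and sub.example.com/page-one/ respectively.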
About the author: Paul Martin is a Senior SEO Analyst at Epiphany Solutions. With offices in both Leeds and London, Epiphany provides world class digital marketing to both national and international brands.