Many people, whether new to the web or not, have a hard time working out how to structure their site’s information logically for both users and search engines. I’ve found that comparing it to something everyone is familiar with makes even complex aspects of site architecture easy to understand.
Let’s first look at how a standard filing cabinet is organized: You have the individual cabinet, drawers in that cabinet, folders within the drawers, files within the folders, and documents within the files.
There is only one copy of any individual document, and it’s located in a particular spot. There is a very clear navigation path to get to it.
If I want to find the January 2008 Invoice for a client (Amalgamated Glove & Spat), I will go to the cabinet, and perhaps open the drawer marked Client Accounts, find the Amalgamated Glove & Spat folder, look for the Invoices file, and then flip through the documents until I come to the January 2008 invoice I’m looking for (again, there is only one copy of this…I won’t find it anywhere else).
Now, before we get into details, let’s make a quick comparative reference between our filing cabinet’s information architecture, and that of a website. Let’s use Craigslist as an example. We’re not paying attention to their page design or navigation, just their information architecture.
This sort of information architecture provides a simple, straightforward way for your visitors to find individual items or content. It’s the way we all learned to organize information growing up, and the way we naturally want to navigate down through a site. It’s also an excellent architecture style for SEO purposes. It’s shallow, easily crawled, and if you remember to think of individual content items as documents in a filing cabinet, there will never be more than one copy and you’ll avoid duplicate content problems.
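To make the mapping concrete, here is a minimal Python sketch of the cabinet as a nested structure, where each document ends up with exactly one URL-style path. The drawer, folder, and file names are hypothetical slugs based on the invoice example above, not taken from any real site.

```python
# Hypothetical hierarchy mirroring the filing cabinet:
# drawer -> folder -> file -> documents
cabinet = {
    "client-accounts": {
        "amalgamated-glove-and-spat": {
            "invoices": ["january-2008", "february-2008"],
        },
    },
}


def document_paths(tree, prefix=""):
    """Yield one URL-style path per document in the nested structure."""
    if isinstance(tree, dict):
        for name, child in tree.items():
            yield from document_paths(child, f"{prefix}/{name}")
    else:
        # A list holds the documents stored inside one file.
        for doc in tree:
            yield f"{prefix}/{doc}"


def depth(path):
    """Count how many levels a visitor or crawler must descend."""
    return len(path.strip("/").split("/"))


paths = list(document_paths(cabinet))
for p in paths:
    print(depth(p), p)
```

Because every document lives in exactly one place in the structure, it can only ever produce one path, which is the property that keeps duplicate content out.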
Let’s also look briefly at how this basic filing cabinet approach can work for some more complex IA issues.
Subdomains:
Subdomains should be thought of as completely separate filing cabinets within one big room. They may share similar architecture, but they shouldn’t share the same content and, more importantly, if someone points you to one cabinet to find something, they’re indicating that cabinet is the authority…not the other cabinets in the room. Why is this important? It will help you remember that links (i.e., votes or references) to one subdomain may not pass their authority to other subdomains within the Pay-Level Domain (e.g., *.craigslist.org, wherein * is a variable subdomain name).
Those cabinets, their contents, and their authority are isolated from each other and may not be considered in concert with each other. This is why, in most cases, it is better to have one large, well-organized filing cabinet (may I suggest the www variety) than several smaller ones that may prevent users and bots from finding what they want.
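As a rough illustration of the separate-cabinets idea, the Python sketch below tallies inbound links per hostname rather than per domain. The URLs are made-up examples on example.com, and the tally is only a way to picture the analogy; it is not a description of how any search engine actually computes authority.

```python
from collections import Counter
from urllib.parse import urlsplit

# Hypothetical inbound links pointing at several subdomains of one domain.
inbound_links = [
    "https://www.example.com/invoices/",
    "https://blog.example.com/post-1",
    "https://blog.example.com/post-2",
    "https://www.example.com/",
    "https://shop.example.com/gloves",
]


def links_per_host(urls):
    """Count links per hostname, treating each subdomain as its own cabinet."""
    return Counter(urlsplit(url).hostname for url in urls)


counts = links_per_host(inbound_links)
for host, n in counts.most_common():
    print(host, n)
```

Five links point at the domain as a whole, but no single cabinet holds more than two of them. Consolidating everything under one hostname would put all five references in the same place.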
Redirects:
The most organized Administrative Assistant I’ve ever seen in my life was a woman named Corley at a film company I used to work for. She didn’t know it (neither did I at the time), but she used 301 redirects inside her literal, metal filing cabinet. If there was something she found herself looking for in the wrong place, she would stick a Post-It note in there reminding her of the correct location. Anytime you looked for something in her cabinets, you could always find it because if you navigated improperly, you would inevitably find a note pointing you in the right direction. One copy. One. Only. Ever.
Redirect irrelevant, outdated, or misplaced content to the proper spot in your filing cabinet and both your users and the engines will know what qualities and keywords you think it should be associated with.
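Corley’s Post-It notes can be sketched as a simple lookup table. The Python below is a minimal sketch, assuming a hypothetical set of old and new paths; a real site would normally configure this in its web server or framework rather than in hand-rolled code.

```python
# Hypothetical "Post-It notes": wrong location -> where the note points.
REDIRECTS = {
    "/invoices-archive/jan-2008": "/old-invoices/jan-2008",
    "/old-invoices/jan-2008": (
        "/client-accounts/amalgamated-glove-and-spat/invoices/january-2008"
    ),
}


def resolve(path, redirects=REDIRECTS, max_hops=5):
    """Follow the notes until reaching a location that has no note."""
    hops = 0
    while path in redirects:
        if hops >= max_hops:
            raise RuntimeError("redirect chain is too long or loops")
        path = redirects[path]
        hops += 1
    return path, hops


def respond(path):
    """Return (status, location): 301 for moved content, 200 otherwise."""
    target, hops = resolve(path)
    if hops:
        # Point straight at the final location instead of chaining notes.
        return 301, target
    return 200, path


print(respond("/invoices-archive/jan-2008"))
```

Resolving the whole chain up front means a visitor who starts in the wrong place is sent directly to the one true copy, rather than from note to note.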
Noindex and Nofollow:
It may be the case that you have certain files or documents that you want your users to see, but not the search engines. In those cases, using noindex is a way to mark areas of your filing cabinet as Classified. While the engines may sneak a peek, they won’t share the information with anybody. The only way someone will find that information is if they look through your cabinet and come across it, or if you provide a declassified copy for them to search for.
Nofollow, on the other hand, is a way to slap the engines’ hands away when they’re going through your cabinet the wrong way. Users may jump around and need several different ways to find things in your cabinet. Or you may need to reference another cabinet that has relevant information, but you don’t advise people to look in it. In these cases you can use nofollow to tell engines that they should ignore certain links; the links might be worthwhile to users, but you don’t advocate their use and would prefer that engines use other links to get places, or look at other sources that you do advocate.
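In markup terms, noindex is usually expressed as a robots meta tag in the page head, and nofollow as a rel attribute on an individual link. The Python helpers below are a small sketch that builds both strings; the function names and the example URL are hypothetical.

```python
from html import escape


def robots_meta(index=True, follow=True):
    """Build a robots meta tag; index=False marks the page as Classified."""
    directives = [
        "index" if index else "noindex",
        "follow" if follow else "nofollow",
    ]
    content = ", ".join(directives)
    return f'<meta name="robots" content="{content}">'


def link(href, text, nofollow=False):
    """Build an anchor tag, optionally telling engines to ignore the link."""
    rel = ' rel="nofollow"' if nofollow else ""
    return f'<a href="{escape(href, quote=True)}"{rel}>{escape(text)}</a>'


# A page users can read but engines should not list in results.
print(robots_meta(index=False))

# A reference to another cabinet that you do not vouch for.
print(link("https://example.org/other-cabinet", "Another cabinet", nofollow=True))
```

The meta tag applies to the whole page, while rel="nofollow" applies only to the one link that carries it, so the two can be combined freely.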
URLs:
How easy would it be to find something in a filing cabinet if every time you went to look for it, it had a different name? What if that name resembled “jklhj25br3g452ikbr52k”? Static, keyword-targeted URLs are best for users and best for bots. They can always be found in the same place, and they give semantic clues as to the nature of the content.
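One common way to get static, readable URLs is to derive a slug from the title of each item. The Python function below is a minimal sketch of that idea; the name slugify and the exact rules (lowercasing, spelling out the ampersand, replacing other punctuation with hyphens) are assumptions, and the right rules will vary by site.

```python
import re
import unicodedata


def slugify(title):
    """Turn a human-readable title into a stable, keyword-bearing URL segment."""
    # Strip accents so the slug stays plain ASCII.
    text = unicodedata.normalize("NFKD", title)
    text = text.encode("ascii", "ignore").decode("ascii")
    # Keep the meaning of "&" instead of silently dropping it.
    text = text.lower().replace("&", " and ")
    # Collapse every run of other characters into a single hyphen.
    text = re.sub(r"[^a-z0-9]+", "-", text)
    return text.strip("-")


segments = ["Client Accounts", "Amalgamated Glove & Spat", "Invoices", "January 2008"]
print("/" + "/".join(slugify(s) for s in segments))
```

The same title always produces the same slug, so the document keeps one name and one location for as long as its title stays put.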
These specifics aside, thinking of your site information architecture in terms of a filing cabinet is a good way to make sense of best practices. It’ll help keep you focused on simple, easily navigated, easily crawled, well-organized structure. It’s also a great way to explain an often complicated set of concepts to clients and coworkers.
If you can think of other ways the filing cabinet analogy can be applied, or if you have your own helpful analogies for explaining SEO concepts, feel free to share them.