I had never thought about what it must be like to be a search engine crawl bot, but when I do, I can only imagine the challenges they face day in, day out. Of course, it being a bot, or “spider,” makes it pretty hard to identify with at all, since it’s nothing more than a program that follows links on the web and records whatever it scans.
Why the sudden interest? A rather random eye injury landed me in the local emergency room recently. As one can expect, I was shuffled into a hallway waiting room to wait for my turn. By the time I sat down, I was pretty much exhausted from all the pain and just sat there with both eyes closed: my only remaining task for the evening was to listen for my name, and then I would be under the watchful care of nurses and doctors. This, as it turns out, wasn’t going to be as easy as it seemed. For one, the waiting room/hallway was a surprisingly noisy place full of action. Lacking my sense of sight, I could only rely on what I heard and smelt around me: casual conversations on mobiles, the smell of strong Turkish coffee suddenly wafting about, someone offering sugar for that coffee followed by an offer for a glass of water with the sound of a disposable cup in the background… I couldn’t help but wonder if I was really in the right place!
With so many inconsistent sounds and smells around me, I actually smiled for a moment and wondered if this is what it’s like to be a humble search engine crawl bot. The signs I walked past on the way to the hospital said I was going to the right place (well, if I could have seen them). I remember walking through the hospital gate, signing the forms presented by the apathetic staff and finally sitting down among morose people. Yet, when I closed my eyes, it felt like I was in the wrong place, because the smells and sounds didn’t match what I thought a hospital waiting room should be like. Could this very well be the same for a crawl bot? After all, bots don’t “see” webpages: they mostly “read” the underlying source code (although this is changing). What happens when what they scan doesn’t match the information that directed them to the page? Is the page automatically marked as spam without any tolerance?
This is the challenge SEO practitioners and content writers have to deal with. We’re asked to create and promote original content within some pretty specific frameworks: the content must be optimized (but not too much so), the text must fit where it’s hosted, and incoming links must seem “natural.” The amount of tolerance shown by the bots and algorithms is never fully known, but deviating risks getting the page, or even the entire site, flagged as spam.
Fortunately, crawl bots are continually evolving, with new capabilities being added all the time. Since I started working in SEO, I can honestly say that they aren’t as “blind” as they used to be and are pretty resilient at gathering information. However, as the web continues to expand, requiring more effort from these bots, we also need to adapt and enhance our websites to optimize their visits. Some practices are well known and have become de facto standards of website SEO, such as XML sitemaps, robots.txt and meta tags, while others are still being neglected by many platforms. What I mostly have in mind is rich snippets and structured data: these not only help crawl bots parse the information on a page more efficiently but also push more data into search results, giving would-be visitors a better sense of what’s on the page so that they’re not flying blind.
It’s a pretty powerful tool when you look into it, but it also requires some investment to apply.
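To make the idea concrete, here is a minimal sketch of what applying structured data can look like: building a schema.org `Article` object as JSON-LD and wrapping it in the `<script>` tag a page would embed in its `<head>`. The headline, author name, and date below are hypothetical placeholders, not data from any real page.

```python
import json

# Hypothetical schema.org Article markup -- all values are placeholders.
article_jsonld = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "An Example Article Headline",
    "author": {"@type": "Person", "name": "Jane Doe"},
    "datePublished": "2014-05-01",
}

# Wrap the JSON-LD in the <script> tag that would go in the page's <head>,
# where crawl bots can parse it without guessing at the visible content.
snippet = (
    '<script type="application/ld+json">\n'
    + json.dumps(article_jsonld, indent=2)
    + "\n</script>"
)
print(snippet)
```

The point of the JSON-LD format is that the bot no longer has to infer what the page is about from the rendered text: the type, author and date are stated explicitly, and search engines can reuse them to enrich the result listing.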
A week later, my eye is healing well and I’m no longer popping pills to lessen the pain (corneal erosion: therapeutic contact lenses are amazing). I’ve also gained a greater appreciation for search engine crawl bots and am more convinced than ever that SEO is not only about optimizing the visit for humans but for bots as well.