On March 22, at about 15:00 GMT, Google was finally allowed to enter, crawl and index the New SCN. This marks another step forward in the rollout of the New SCN and, given Google’s importance as a source of traffic for the site, something my colleagues had been waiting for ever since the New SCN launched ten days earlier.
For my part, I have to admit that I was very excited, because it’s not every day you get to open the floodgates on a site with such a massive amount of quality content and witness its re-indexing by Google. It meant validating (hopefully) the months of planning that went into ensuring that SCN would be back on its feet quickly, with minimal interruption to the search experience the community enjoyed before migration.
Here’s a rough scope of what this migration included and the dual challenge faced for SEO:
- Migrating three sites (and three systems) to a single, new URL:
  forums.sdn.sap.com + weblogs.sdn.sap.com + www.sdn.sap.com -> scn.sap.com
- Migrating three sites’ worth of content into a single, unified one:
  - 2 million+ discussion (forum) threads and 20 million+ messages (replies)
  - 25,000+ blogs
  - 25,000+ library assets (documents and videos)
  - 1,100+ webpages
- New URL structure
- Millions of redirects activated
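To give a feel for what each of those redirects boils down to, here is a minimal Python sketch. It is not the actual SCN redirect logic: the function name `redirect_target`, the `path_map` lookup table and the example paths in it are all hypothetical. Only the three legacy host names and the new host come from the list above.

```python
from urllib.parse import urlsplit, urlunsplit

# The three legacy hosts and the new host named in the migration scope.
LEGACY_HOSTS = {"forums.sdn.sap.com", "weblogs.sdn.sap.com", "www.sdn.sap.com"}
NEW_HOST = "scn.sap.com"


def redirect_target(old_url, path_map):
    """Return the new URL for a legacy URL, or None if no redirect is known.

    path_map is a hypothetical lookup table keyed by (legacy host, legacy path)
    whose values are paths on the new site. The real mapping is not described
    in this post, so every entry used here is an illustrative assumption.
    """
    parts = urlsplit(old_url)
    if parts.hostname not in LEGACY_HOSTS:
        return None  # not one of the migrated sites
    new_path = path_map.get((parts.hostname, parts.path))
    if new_path is None:
        return None  # a missing entry is a broken redirect that needs fixing
    # Query string and fragment are dropped in this simplified sketch.
    return urlunsplit((parts.scheme or "http", NEW_HOST, new_path, "", ""))


# Hypothetical entry, for illustration only.
example_map = {("forums.sdn.sap.com", "/old-thread"): "/thread/123"}
print(redirect_target("http://forums.sdn.sap.com/old-thread", example_map))
```

In practice a web server would answer such a request with a permanent (301) redirect to the returned URL, which is the signal search engines use to move a result to its new address.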
I’m not going to say that things went smoothly, but my point is that the numbers are huge, and so is the task of getting all those millions of SCN search results updated to their new URLs, which also happen to be on a brand new sub-domain that had never been crawled by Google. Updating the search results is an organic process, and all we can really do is suggest to Google where and how to crawl the site: something that I have likened to waltzing with an elephant over the past few months.
Why not open sooner?
We had good reason to wait. Google’s crawlers add a considerable amount of load to any website, and there were well-founded fears that the additional load from external search engine crawlers could bring down the site. (Crawlers, bots and spiders can represent anywhere from 30% to 50% of site traffic.) This was simply undeniable and we had to accept it, despite our eagerness to open up to Google and other search engine crawlers as soon as possible. With a little negotiation and creative thinking by our IT team, we finally flipped the switch last Thursday*.
(*I can’t say the timing was great: I got notice just as I was preparing dinner for my screaming 16-month-old at the start of my weekend, so I had to juggle feeding, cleaning and working on the computer to submit the relevant information to Google and other search engines, all at the same time 😥 )
So we opened the doors and then what?
As you can see, Google started crawling immediately after we allowed it and submitted our new sitemap. Within hours there were already thousands of pages indexed, complete with their previews visible from the search results. After three days, most of our content has now been crawled and indexed, and even new content such as discussions appears within minutes of being posted:
Our space overview pages for popular topics like “SAP Mobile” and “ABAP” are ranking on the first page of results too, which is really positive given that we were offline for ten days and migrated these pages to a new sub-domain!
The redirects also worked in our favor: before opening the site, I could see that Google was already recording the redirections and building its own starting points ahead of the great crawl. Every day the number of URLs listed (but not indexed) grew: 135 on day one, 160k two days later, 400k by the end of the week and 1 million a week after launch.
See for Yourself (Search Tips)
Have a look at the New SCN search results. The site operator is a handy way of limiting your search results to a specific domain:
| Search Prefix/Format | Description | Example Query Strings |
| --- | --- | --- |
| site:scn.sap.com [query] | Displays results from SCN only | site:scn.sap.com ABAP |
| site:scn.sap.com/thread/ [query] | Displays results from SCN Discussions (forums) only | site:scn.sap.com/thread/ time value |
| site:sap.com [query] | Displays results from SAP domains: SAP.com, SCN, Help, EcoHub, etc. | site:sap.com Mobile |
| site:scn.sap.com [query] inurl:blog | Displays results from SCN Blogs only | site:scn.sap.com/ workflow inurl:blog |
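If you find yourself typing these prefixes often, the formats in the table can be assembled with a few lines of Python. This is only a convenience sketch: the helper names `site_query` and `search_url` are my own inventions, and the search URL assumes Google's usual `q` query parameter.

```python
from urllib.parse import urlencode


def site_query(query, site="scn.sap.com", inurl=None):
    """Build a query string in the formats listed in the table above."""
    parts = [f"site:{site}", query]
    if inurl:
        # Optional filter on a word that must appear in the result URL.
        parts.append(f"inurl:{inurl}")
    return " ".join(parts)


def search_url(query_string):
    """Turn a query string into a Google search URL (assumes the 'q' parameter)."""
    return "https://www.google.com/search?" + urlencode({"q": query_string})


print(site_query("ABAP"))
print(site_query("time value", site="scn.sap.com/thread/"))
print(site_query("workflow", inurl="blog"))
print(search_url(site_query("Mobile", site="sap.com")))
```

The first three calls reproduce example queries of the kind shown in the table, and the last one produces a link you can paste straight into a browser.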
There’s still lots of work to be done. We have to update the incoming links we can influence (like Wikipedia: please help us!), tweak our crawl instructions to Google, fix any broken redirects, and slowly shut down the old sites once as many of their redirects as possible have been picked up… yup, there’s still lots to be done.
You, the community, have done a fine job creating lots of new quality content on the New SCN so you should just keep doing that while also sharing, bookmarking, liking and rating any content you have an opinion about: all these little extras help Google to identify good/bad content and ultimately help users find the content.