Performance and Scalability improvement progress report #2

It's been a long and busy month, and I wanted to give y'all an update on the infrastructure, performance and scalability progress over at Technorati. There's been a lot going on, as I described earlier in the year, and we've made some progress, but there are important things that are still broken and are being fixed this month.

The situation as of a couple of months ago

The blogosphere has been growing at an explosive rate - Technorati is now indexing over 16 million blogs, with about 100,000 new blogs created every day. There are also over 1.4 million new posts every day. About 22% of those posts are from spam or fake blogs, which means that even after we pull the spam and fake blogs out of the indexes, we are dealing with about 1.2 million posts each day.

We just weren't expecting that kind of sudden growth, both on the posting side and on the search side, and frankly we didn't plan well enough to handle the load. We've been adding new machines to our datacenter - over 400 now, with more coming each week - and we've been fixing bugs and making performance enhancements on the web site as well.

We also made some pretty significant performance improvements to keyword search - most searches now return in 1-2 seconds; you can see some details on those statistics and also a month view.

However, Cosmos search (or URL search) is still being worked on, and is often timing out under the increased load. Unfortunately, this is also one of the searches that bloggers find most compelling, as it helps you all know who is linking to your blog. It is also the very first type of search that Technorati made available, so it is near and dear to our hearts. Everyone here uses it every day, too, so it really sucks when it isn't working right.

As search traffic has grown, we've also seen an increase in support and feedback requests. It's my goal to make sure that we respond to all support requests within 24 hours of getting the request. Right now, we're not meeting that goal, and some people haven't had a human response more than a week after they sent in their request.

What we're doing

Ever since we got our keyword search infrastructure back on track, our infrastructure team has been working 100% on fixing Cosmos search. Our current plan is to have Cosmos search back up and running by the end of September. You'll be able to see incremental improvement throughout the coming month, because you'll see fewer and fewer error messages when you do a URL search as September progresses. I'll keep you informed on the progress of this critical project.

We're busy expanding our support capabilities, and also putting together tools to make it easier for users to answer their own questions before a Technorati support staffer has to get involved. We've already made a bunch of fixes and feature enhancements to address the most common support requests, like fixes in our blog claiming code.

What about new stuff?

While we work on these core infrastructure issues, we're not resting on our laurels: we remain dedicated to providing great tools and services for bloggers and for people who want to keep track of what's happening on the web right now. There'll be more to announce in the coming days and weeks. Stay tuned...

Thanks for your support

I am consistently humbled and amazed at how great our users are. You guys have stood by us as the service has grown and gone through growing pains. We take this trust very seriously, and are working very, very hard to live up to your expectations. Thanks.