July 29, 2005

Technorati Mobile

We just rolled out a version of Technorati optimized for mobile phones: Technorati Mobile (m.technorati.com). It has a nice side benefit too: for those of you who want a simple "just the facts" search results page, or who aren't happy with the new web design, this may be just the site for you.

It is currently optimized for phones like the Treo 600/650. Send us feedback on how we can make it even more useful for you! How does it look on your phone? Leave a comment here and let us know!

m.technorati.com incorporates a lot of the new performance enhancements we've been rolling out, so it should be pretty fast as well. Give it a shot!

July 28, 2005

Filter Search Results by Language (Beta!)

While we've been building out and scaling the infrastructure, we haven't stopped innovating and rolling out new features! Yesterday we released a beta language filtering service - this lets you filter search results by language, one of the top feature requests from our users. We listened hard to your feedback.

Go and give it a try: you'll see the filtering pulldown on every keyword search results page, and you can restrict your search results to Chinese, Dutch, English, French, German, Greek, Italian, Japanese, Portuguese, or Spanish. It remembers your filtering preference, so you don't have to set it again every time you search from the same computer. This new feature is only in keyword searches for now, and it's still in beta.

We think our language detection algorithms work reasonably well, but there's still lots of room for improvement, and for giving you, the blogger, more control over specifying the language that your posts are in. We're working on additional features to really help tune and tweak the system, but we thought that what we had so far was useful enough that we wanted to share it with you, in the spirit of "release early, release often". So, as always, enjoy! And please let us know how we're doing.

Performance and Scalability improvement progress report #1

This is the first in a series where I'll be discussing the issues and challenges involved in building a world-class real-time search engine, and keeping all of you more informed on what's going on under the covers at Technorati.

The blogosphere has been growing at an explosive rate - Technorati is now indexing over 14 million blogs, with about 80,000 new blogs created every day. That's about one new blog created every second! And there are about 900,000 new posts every day, which means about 37,500 posts per hour that we're indexing.
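For anyone who wants to check the arithmetic, here is a minimal sketch in Python. The daily figures are the ones quoted above; the variable names are just illustrative.

```python
# Back-of-the-envelope check of the rates quoted in this post.
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

new_blogs_per_day = 80_000   # figure quoted in the post
new_posts_per_day = 900_000  # figure quoted in the post

blogs_per_second = new_blogs_per_day / SECONDS_PER_DAY
posts_per_hour = new_posts_per_day / 24
posts_per_second = new_posts_per_day / SECONDS_PER_DAY

print(f"{blogs_per_second:.2f} new blogs per second")
print(f"{posts_per_hour:,.0f} new posts per hour")
print(f"{posts_per_second:.1f} new posts per second")
```

So "one new blog every second" is a slight round-up (it is a little under one per second), and the hourly post figure works out exactly.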

So, we have been working really hard on performance and scalability improvements for the service. The blogosphere has been growing by leaps and bounds, and our traffic has been growing even faster. We just had another 40%+ growth in traffic this month, which makes this the fourth month in a row of these kinds of traffic jumps. Basically, that means that we are now serving more traffic in a week than we did in a month just 4 months ago. So, we've been racking and stacking servers - over 200 now in our data center, with more coming each week - and we've been fixing bugs and making performance enhancements on the web site as well. Our median time from post to index is now under 5 minutes. That means that half of all blog posts are indexed in under 5 minutes from when you post them to the web. All you have to do is make sure that your blog software sends us a ping.
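The "a week now versus a month then" comparison is just compounding. Here is a rough sketch of the math, assuming steady month-on-month growth and an average month of 52/12 weeks (both are simplifications, not measured figures):

```python
# Compound monthly traffic growth over four months.
monthly_growth = 1.40          # "40%+" per month, taken at exactly 40% here
months = 4
weeks_per_month = 52 / 12      # assumption: average month length in weeks

growth_after_4_months = monthly_growth ** months

# Traffic served in one week now, as a fraction of a full month's
# traffic four months ago.
week_now_vs_month_then = growth_after_4_months / weeks_per_month

# Monthly growth factor needed for one week now to equal one month then.
required_monthly_growth = weeks_per_month ** (1 / months)

print(f"Traffic multiple after {months} months: {growth_after_4_months:.2f}x")
print(f"One week now vs. one month then: {week_now_vs_month_then:.0%}")
print(f"Monthly growth needed to break even: {required_monthly_growth - 1:.1%}")
```

At exactly 40% a month, a week of traffic today comes in a bit under a month of traffic four months ago; the comparison holds once growth averages a few points above 40%, which is what the "+" covers.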

Most already do - with just a few exceptions. So, if you're a blogger and you're not finding your posts in our index, you should check out our publisher's guide and send an email to your blog software developer or hosting provider to ask why they aren't including you in the Technorati High Priority indexer, which we've built especially for getting you indexed quickly. And if you are a developer, we've got a wealth of material and sample code that you can use to make the process of integrating Technorati features a snap.
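For developers wondering what a ping looks like, here is a small sketch. This post doesn't spell out the protocol, so the example assumes the widely used `weblogUpdates.ping` XML-RPC convention (a blog name plus a blog URL). The blog name and URL are placeholders, and the example only builds the request body rather than sending it anywhere; check the developer documentation for the actual endpoint.

```python
import xmlrpc.client

# Placeholder values for illustration only.
blog_name = "My Example Blog"
blog_url = "http://example.com/blog/"

# Build the XML-RPC request body for a weblogUpdates.ping call
# (assumed convention: method name plus two string parameters).
payload = xmlrpc.client.dumps(
    (blog_name, blog_url),
    methodname="weblogUpdates.ping",
)

print(payload)

# Round-trip the payload to confirm it parses back to the same call.
params, method = xmlrpc.client.loads(payload)
```

Blog software would normally POST a body like this to the ping server each time a new post is published.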

As for performance speedups, you can see the results of the work we've undertaken in the past few weeks: search result times have gotten more consistent, and consistently faster. The response time bump you see in that graph linked above last night was due to the rollout of our language filtering service (see this post) and the transition as we rolled it out over multiple servers in our farm, which reduced capacity temporarily. Note, however, that in early July our average search response times were 5-7 seconds. Now they are between 1 and 3 seconds. Our goal is sub-1-second response on all these queries.

There's still more to do, especially around Cosmos search and ensuring regularly updated link counts, dealing with spam, and making sure that everyone's tags are indexed properly. We're working on that as a top priority. We're going to keep working our butts off to keep providing you with the best search and discovery experience in the world, and I'll keep you informed with regular updates to let you know what's going on, both the good and the bad (but hopefully there'll be more good than bad!). Our mantra around here is to Be Of Service. Thanks for putting up with us during these tough months while we continue to grow to meet all of the demand.

What a month! Quick Technorati Summary

Lots has been going on since I last had the chance to blog. It really is true what they say about the cobbler's kids having the worst shoes. I know so many people who have gone to work for companies in the blogging industry and now rarely have time to blog. C'est la vie!

Anyway, here's some of the neat stuff that's been going on, in no particular order:

Technorati is hiring! We're looking for a great Product Manager and Marcom Manager, Analytics Engineer, Project Manager, Web Applications Engineer, and are always on the lookout for great Search Engineers.

We've recently won some awards as well: A 2005 Forbes Best of the Web Favorite, and we were named as an AlwaysOn 2005 Top 100 Private Company. Both are humbling, and we give many thanks for the kind recognition.

With the help of our partners at Digital Garage, we launched technorati.jp, our first localized version, optimized for the Japanese market, with a suite of tools that enhance blog search specifically for the Japanese language.

Yesterday was a banner day as well - Technorati now tracks over 14 million blogs - and there are over 80,000 new weblogs being created every day. That number takes into account all the spam blogs we kill as well. We've been tracking people who are creating garbage or spam blogs just to game AdSense or try to get more PageRank. We don't get them all, but we've been doing a lot of work identifying and squashing them from our index and search results. More on that in a future post.

There's been a lot of great press: BusinessWeek just did a story, Wired News had a column, The New York Times had a couple of nice mentions, USAToday, Wall Street Journal, and lots more.

We have been working really hard on performance and scalability improvements. See the next post.

While we've been building out and scaling the infrastructure, that hasn't stopped us from rolling out some new features as well! Yesterday we released a beta language filtering service - this lets you filter search results by language, which was one of our top feature requests. More about it here.

I've also been working with the team on a new State of The Blogosphere report, updated with statistics through July 2005. I'll be posting about it over the next few days. Some tremendous growth, with the blogosphere expanding in leaps and bounds.

More to come, stay tuned...

July 14, 2005

Scaling, performance, and plain old bug fixing

What a couple of months this has been! First, some stats on what’s going on in the blogosphere. Technorati is now tracking over 13.3 million blogs and 1.3 billion links. We are seeing over 900,000 posts per day on average, which means we're adding about 10 posts per second. We’re also seeing about 80,000 new weblogs created each day. That’s more weblogs created each day than there were in total when I started the service in November 2002. And our search traffic has increased by over 40% month on month for each of the last 4 months. On the day of the London bombings we saw over 1.2 million posts, and had an additional 30% increase in traffic as people turned to weblogs, moblogs, and other citizens’ media for instant updates on events in London, survivor accounts, and sharing of deep feelings on the tragedy.

Recently a number of people have had some pretty public complaints about some of Technorati's services. Thanks for the terrific feedback and comments. I feel your pain.

We sat down, listened hard to what you were saying, and then got to work. And tonight, we rolled out a raft of bug fixes and performance enhancements that should help most, if not all of the Cosmos (URL) searches you do on Technorati. It will also help with the speed of all searches across the site.

Give it a go - here are the results for this blog, and here's how you can give it a try.

These improvements don’t fix everything - some searches are still slow, and while we pride ourselves on completeness and fast index times, there’s still a long way to go. Performance and scalability improvements are our number 1 priority until this is fixed.

For those of you who have been having problems, we're working our butts off to win back your trust. I hope that the fixes we rolled out tonight and will roll out over the coming days and weeks will be of service to you. Because, in the end, that’s what this is all about to us - to be of service to all of you.

Thanks again for the great feedback and comments. Keep us on our toes.

July 7, 2005

An update on the blogosphere's reactions (and resources) to the London Bombings (7/7/05 16:30PST)

All of my thoughts and prayers are with the victims and survivors of today's events. The blogosphere's reaction, with its outpouring of emotion, has been tremendous. I wanted to share some information and statistics taken from what Technorati has been observing throughout the blogosphere today. We have put up a special page on our site to cover the events in London. The URL is http://technorati.com/londonbombings and it will continue to be updated with information and live content as it happens.

There were just over 500,000 posts between midnight and 11 AM Pacific (the bombings took place at 12:51 AM Pacific). That is a 29.8% increase over the 385,000 posts from the same midnight-11 AM window yesterday, a day that saw 840,000 posts in total.
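As a quick sanity check on that percentage, here is a small sketch using the rounded figures above. The exact post count for today isn't given here ("just over 500,000"), so the result is only approximate.

```python
# Compare today's midnight-11 AM post volume against the same window yesterday.
posts_today_window = 500_000      # rounded: the post says "just over 500,000"
posts_yesterday_window = 385_000  # same window yesterday

pct_increase = (posts_today_window / posts_yesterday_window - 1) * 100

print(f"Increase over the same window yesterday: {pct_increase:.1f}%")
```

With these rounded inputs the increase comes out at roughly 30%, in line with the figure quoted above.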

Here's a set of first hand accounts from the blogosphere:

"Fate is a strange thing. On this particular day a series of events transpired such that I ended up on a Tube train that was destroyed by terrorists. Fortunately it was only the carriage in front of me, but tragically it resulted in a serious amount of injuries. This is my story."

London pride meme starts as people post odes to their city.

A letter to the terrorists

Europhobia extended coverage. Lots of comments

In the station

Other accounts:

Statement from Qaeda't al-Jihad claiming responsibility

English translation

Technorati tag pages

Metroblogging London

Wikipedia page on London bombings

LiveJournal mood tracker. Check out sad and shocked.

Flickr London bomb blasts pool

People in the streets

I hope that today finds you and your friends and family well and safe. I will be going home tonight and giving an extra squeeze to my kids and remembering to be grateful for the small blessings in life. Peace.