January 23, 2004

Technorati Hacks at ETCon

I'm speaking at the O'Reilly Emerging Technologies Conference! My talk is called "Technorati Hacks", and it is at 11:00 AM on Tuesday, February 10 in the Plaza room. This is right after the opening keynote, so I'm really excited to be in a "lead-off" position. And I'll be followed by the excellent Liz Lawley at 11:45 AM, talking about "Breaking Into the Boys' Club: How Diversifying Your Team Can Expand Your Market". If you're coming to the show, leave a comment or a trackback here - are there any areas or topics you'd like me to cover or explain? I have a few fun surprises up my sleeve...

Posted by dsifry at 12:11 AM

January 21, 2004

What is Technorati?

If you're one of the tens of thousands of people who use Technorati every day, you'll notice that most of our changes on the new beta site so far have been under the hood. Changes to the body have been minimal. As a result, we've been scratching our heads because we've never explained exactly what Technorati *is*. For that matter, we've never explained much about what a "cosmos" is, either -- even though that's what Technorati finds in its searches.

So I thought it would make sense to ask you what Technorati is. Is it a search engine for blogs? A conversation engine? Or something else again?

Same with "cosmos." Is there a more self-explanatory word for what Technorati finds? Or a better way to say exactly what "cosmos" means?

Let us know. We'd like to hear from you. Thanks!

Posted by dsifry at 2:56 AM

January 19, 2004

New Technorati Infrastructure beta test!


After 2 months of painstaking effort, I'm proud to announce the new Technorati infrastructure is up and ready for use.

Please have a look, and tell us what you think:


We focused 100% of our time on completely refurbishing our underlying event engine - essentially taking a Volkswagen engine out and putting a Ferrari engine in. This new engine sports:

1) Much faster indexing - the median amount of time it takes from when someone posts something on their weblog to when it is captured and searchable via our live database is 7 minutes.

2) Much faster querying - our goal is to have every search query take less than a second, even as the database is being continuously updated. We added a query timer at the top of every results page so you can judge for yourself.

3) Much more scalable - We built this distributed database system to scale by adding machines: as we track more events, or as our user traffic increases, we add more machines. This should continue to work for quite some time, so we're eager to test under load.

4) Much better internationalization support - The database is entirely in UTF-8, a Unicode encoding that covers a significant number (well, all) of non-English languages, including Japanese, Farsi, Hebrew, and many others. You can see results in multiple languages all on the same page. Localization should be significantly easier.

5) A new, smarter spider/crawler, which understands weblog posts and blogrolls much better than our old spider. You'll note that on our results pages, many results offer a "Read Full Post" capability, which takes you directly to the entire microcontent post that created the link.

6) A redone results page, which should load faster, and is designed for non-browser usage as well. Lots has been moved to CSS, and we've added a nifty pager widget at the top and bottom of each page of results.
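
To make the UTF-8 point in item 4 a bit more concrete, here is a minimal Python sketch. It is not Technorati's code; the sample words and the `utf8_round_trip` helper are made up for illustration. It only shows that text in several scripts can be stored as UTF-8 bytes, combined into one mixed-language page, and read back unchanged.

```python
# Illustrative sample words only; none of this is Technorati data or code.
samples = {
    "English": "weblog",
    "Japanese": "ブログ",
    "Farsi": "وبلاگ",
    "Hebrew": "בלוג",
}


def utf8_round_trip(text: str) -> bytes:
    """Encode text as UTF-8 and check that decoding restores it exactly."""
    encoded = text.encode("utf-8")
    if encoded.decode("utf-8") != text:
        raise ValueError("UTF-8 round trip changed the text")
    return encoded


# Each language is stored with the same encoding, with no per-language code pages.
encoded_samples = {lang: utf8_round_trip(text) for lang, text in samples.items()}

# A single byte string can hold every language at once,
# which is what a mixed-language results page needs.
page = " | ".join(samples.values()).encode("utf-8")

for lang, data in encoded_samples.items():
    print(f"{lang}: {len(samples[lang])} characters stored as {len(data)} bytes")
```

Non-Latin scripts take more than one byte per character in UTF-8, which is why the byte counts printed for the Japanese, Farsi, and Hebrew samples are larger than their character counts.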

Please go and use the site - and send us feedback.

Some known issues: There are a few areas where we're still filling out content and fixing bugs and layout, such as the top 100 page, breaking news, current events, and other pages. We're looking to find showstopper bugs or problems before we move this beta infrastructure over to the production site. So don't fret if a page you like is currently missing or if the top 100 is messed up; we're fixing that. You may also see a change in your inbound blogs/links numbers. That is primarily because we're still bringing the new database up to speed, so we know that some of the numbers are different.

Thanks again for your time and patience, and on behalf of the entire Technorati team, we thank you for all of your support. We're really looking forward to your feedback.

Posted by dsifry at 11:43 PM