June 09, 2003

Technorati Keyword Search

I'm extremely proud to announce the newest Technorati feature, Keyword Search.  You can now search Technorati's database of over 360,000 weblogs and get up-to-date information on your search terms.  Bookmark this page:

http://www.technorati.com/cosmos/search.html

The indexes are rebuilt several times each day, which means that it can take as little as 2 hours from the time you post something on your weblog to when it shows up in Keyword Search results.

A few disclaimers:  This is BETA, I whipped it up on little sleep over the weekend, and it may still have bugs.  There are some things that need fixing, like the results ordering and some parts of the user interface.  Rather than wait for everything to be perfect, I figured I'd release what I had in the hopes that some of you would find it useful. As always, I'm interested in getting feedback on how to improve it.

I've also implemented a REST-ful API for search requests.  This means that you can send an HTTP GET or POST to the following URL:

http://api.technorati.com/search

and you'll get back an XML document with the search results.  Standard disclaimers apply, and you've got to abide by the Terms Of Service, which basically say that the results are for non-commercial use only and that you must include a "powered by Technorati" link when displaying them.
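If you want to try this from a script, here is a minimal Python sketch of the GET-and-parse round trip. The function name fetch_search_xml and the opener argument are my own inventions for illustration, and since the search result elements aren't documented here, the sketch only parses the document and hands back its root element without assuming any particular tags.

```python
import urllib.request
import xml.etree.ElementTree as ET


def fetch_search_xml(url, opener=urllib.request.urlopen):
    """GET `url` and parse the response body as XML.

    `opener` is a hypothetical hook (not part of the Technorati API) that
    lets you swap in a stand-in for urlopen when testing offline.
    Returns the root Element; no element names are assumed.
    """
    with opener(url) as response:
        body = response.read()
    return ET.fromstring(body)
```

Pass it a fully built query URL (endpoint plus parameters) and inspect the returned element tree to see what the service sends back.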

First off, you'll need a Technorati API key.  If you don't have one already, you can get one for free by signing up as a Technorati member on the signup page and then retrieving your API key.

Next, construct your query.  For the sake of readability, I'm going to show this as an HTTP GET query, but it can just as easily be encoded in a POST for all of you REST zealots. :-)
query=SEARCH_TERM
Don't forget to URL encode your search term if it has spaces or quotes in it.
key=API_KEY
Here's where you put in your Technorati API Key.  You get 500 queries per day, from midnight to midnight PST.
start=RANKING
The API call returns 20 results.  If you want to see results 21-40, set start=21.  You can begin viewing results anywhere in the stream, so if you set start=30, you'd see results 30-49.  Note that there is no guarantee that results will be contiguous across calls: because the indexes are rebuilt frequently, rankings may change between one call and the next.  If this is an issue for you, let us know.
version=0.9
type=xml
These two variables are built to allow for various format changes as time goes on.  The current Technorati API is at version 0.9, and as long as you set version=0.9 in your API call, we'll always return API 0.9 results.  This gives you developers the assurance that your applications will work for a long time, and it allows us to make changes and extensions to the API.  If you leave the version variable out of your query, it will default to the most recent version of the API (which is currently 0.9).

The type variable controls the format of the output itself.  We may decide at some future time to support formats other than this XML format.  Including this variable lets you specify the exact type of output you want to receive.  Right now, the only legal value is type=xml, and it is the default value.
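The start arithmetic described above is easy to get off by one, so here is a small Python sketch of it. The helper names page_start and result_range are hypothetical; the only facts taken from the API description are that each call returns 20 results and that start is 1-based.

```python
PAGE_SIZE = 20  # each API call returns 20 results, per the description above


def page_start(page_number, page_size=PAGE_SIZE):
    """Return the `start` value for a 1-based page number (hypothetical helper)."""
    if page_number < 1:
        raise ValueError("page_number must be 1 or greater")
    return (page_number - 1) * page_size + 1


def result_range(start, page_size=PAGE_SIZE):
    """Return the (first, last) result positions covered by a given `start`."""
    return start, start + page_size - 1
```

So page 2 begins at start=21, and start=30 covers results 30 through 49.  Keep in mind that rankings can shift between calls, so walking pages this way may skip or repeat an entry.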

So, if you wanted to perform an API call on, say, "David Sifry" and return the first 20 results, you would use (GET syntax):
http://api.technorati.com/search?query=%22David+Sifry%22&key=94035daac6b136378856f3239648ab27&start=1
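For those who would rather not URL encode by hand, here is a rough Python sketch that assembles the same kind of query, either as a GET URL or as a POST request.  The function names are made up for illustration; the endpoint and the parameter names (query, key, start, version, type) are the ones described above.  I'm assuming the standard library's form encoding (spaces as +, quotes as %22) is what the service expects, since that is what the example URL uses.

```python
import urllib.request
from urllib.parse import urlencode

API_ENDPOINT = "http://api.technorati.com/search"


def encode_params(query, key, start=1, version=None, output_type=None):
    """Form-encode the search parameters (hypothetical helper).

    version and type are only sent when given; the API falls back to
    its defaults when they are left out.
    """
    params = [("query", query), ("key", key), ("start", start)]
    if version is not None:
        params.append(("version", version))
    if output_type is not None:
        params.append(("type", output_type))
    return urlencode(params)  # handles spaces and quotes in the search term


def build_search_url(query, key, **options):
    """Return a GET URL for the search endpoint."""
    return API_ENDPOINT + "?" + encode_params(query, key, **options)


def build_post_request(query, key, **options):
    """Return an unsent POST request carrying the same parameters in the body."""
    body = encode_params(query, key, **options).encode("ascii")
    return urllib.request.Request(API_ENDPOINT, data=body)
```

Calling build_search_url('"David Sifry"', "YOUR_API_KEY") gives a URL of the same shape as the example above, with your own key in place of the sample one.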
Please send your feedback and comments, and if you have problems or questions, check on the api-discuss mailing list - lots of smart people hang out there.

Note to self - we still need to make changes to the XML DTD (Ken?) to incorporate the search results.  We also need to include some different sorting options, like sorting by date or authority, not just by relevance. 
Posted by dsifry at June 9, 2003 01:08 AM