Technorati Keyword Search


I’m extremely proud to announce the newest Technorati feature, Keyword Search.  You can now search Technorati’s database of over 360,000 weblogs and get up-to-date information on your search terms.  Bookmark this page:



http://www.technorati.com/cosmos/search.html



The indexes are rebuilt several times each day, which means that it can take as little as 2 hours from the time you post something on your weblog to when it shows up in Keyword Search results.



A few disclaimers: this is a BETA. I whipped it up on little sleep over the weekend, and it may still have bugs. There are some things that need fixing, like the results ordering and some parts of the user interface. Rather than wait for everything to be perfect, I figured I’d release what I had in the hopes that some of you would find it useful. As always, I’m interested in getting feedback on how to improve it.



I’ve also implemented a RESTful API for search requests. This means that you can send an HTTP GET or POST to the following URL:



http://api.technorati.com/search



and you’ll get back an XML document with the search results. Standard disclaimers apply, and you’ve got to abide by the Terms Of Service, which basically say that the results are for non-commercial use only and that you must include a "powered by Technorati" link when displaying them.



First off, you’ll need a Technorati API key. If you don’t have one already, you can get one for free by signing up as a Technorati member on the signup page and then retrieving your API key.



Next, construct your query.  For the sake of readability, I’m going to show this as an HTTP GET query, but it can just as easily be encoded in a POST for all of you REST zealots. :-)


query=SEARCH_TERM

Don’t forget to URL encode your search term if it has spaces or quotes in it.

key=API_KEY

Here’s where you put in your Technorati API Key.  You get 500 queries per day, from midnight to midnight PST.

start=RANKING

The API call returns 20 results. If you want to see results 21-40, set start=21. You can begin viewing results anywhere in the stream, so if you set start=30, you’d see results 30-49. Note that there is no guarantee that results will be contiguous: because the indexes are rebuilt frequently, some rankings may change between calls. If this is an issue for you, let us know.

version=0.9

type=xml

These two variables are there to allow for format changes as time goes on. The current Technorati API is at version 0.9, and as long as you set version=0.9 in your API call, we’ll always return API 0.9 results. This gives you developers the assurance that your applications will keep working for a long time, and it allows us to make changes and extensions to the API. If you leave the version variable out of your query, it will default to the most recent version of the API (which is currently 0.9).



The type variable controls the format of the output itself. We may decide at some future time to support formats other than this XML format. Including this variable lets developers specify the exact type of output they want to receive. Right now, the only legal value is type=xml, and it is the default value.
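Here is a minimal Python sketch of putting these parameters together into a GET URL. The helper name build_search_url and the API_KEY placeholder are made up for illustration and are not part of the API; the endpoint and parameter names are the ones described above. The standard library's urlencode takes care of escaping spaces and quotes in the search term.

```python
from urllib.parse import urlencode

# Search endpoint as given in this post.
SEARCH_ENDPOINT = "http://api.technorati.com/search"


def build_search_url(query, key, start=1, version="0.9", output_type="xml"):
    """Return a GET URL for a keyword search (hypothetical helper)."""
    params = [
        ("query", query),       # urlencode escapes spaces and quotes
        ("key", key),           # your Technorati API key
        ("start", start),       # rank of the first result to return
        ("version", version),   # pin the API version explicitly
        ("type", output_type),  # "xml" is the only legal value right now
    ]
    return SEARCH_ENDPOINT + "?" + urlencode(params)


# API_KEY is a placeholder; substitute your own key.
url = build_search_url('"David Sifry"', "API_KEY")
print(url)
```

Sending version and type explicitly is optional, since both have defaults, but pinning the version means a later API release should not change what your application gets back.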



So, if you wanted to perform an API call on, say, "David Sifry" and return the first 20 results, you would use (GET syntax):

http://api.technorati.com/search?query=%22David+Sifry%22&key=94035daac6b136378856f3239648ab27&start=1
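To page past the first 20 results, step the start value by 20 on each call. A rough sketch, assuming the page size stays at 20; page_starts is a made-up helper name, not something the API provides:

```python
RESULTS_PER_CALL = 20  # per this post; an assumption if the API changes


def page_starts(pages, first=1):
    """Return the start= value for each of `pages` consecutive calls."""
    return [first + i * RESULTS_PER_CALL for i in range(pages)]


print(page_starts(3))      # [1, 21, 41]
print(page_starts(2, 30))  # [30, 50]
```

Each call counts against the daily query limit, and since rankings can shift between calls, consecutive pages may repeat or skip an entry.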

Please send your feedback and comments, and if you have problems or questions, check on the api-discuss mailing list – lots of smart people hang out there.



Note to self – we still need to make changes to the XML DTD (Ken?) to incorporate the search results.  We also need to include some different sorting options, like sorting by date or authority, not just by relevance. 


This article has 3 comments

  1. rvr 06/09/2003, 5:53 am:

    Congrats David!
    I sent Mark Pilgrim a patch. Meanwhile, the technorati.py with search engine support can be downloaded from http://infoastro.com/rvr/img/technorati.py
    This patch is currently used by jibot, a #joiito bot (yes, there is more than one there :)

  2. Jacques Distler 06/10/2003, 2:27 pm:

    The net effect is that all other queries (eg, “type=weblog”) appear to be pooched.
    Try
    http://api.technorati.com/cosmos?key=94035daac6b136378856f3239648ab27&url=http://www.sifry.com/alerts/&type=weblog&start=1

  3. John Dowdell 06/13/2003, 4:44 pm:

    Hi David, I’m glad you did this, thanks for the work, but… is there any way to index on the individual RSS entry rather than on the whole HTML page? I’m getting a lot of false-positives on two-word queries, where the words actually occur in separate items.
    For instance, I’m tracking how people feel about advertising done in SWF files, and the term “flash advertising” pulls up *pages* where both terms occur, instead of individual entries where both terms occur. A search on “bush segway” shows a similar thing, although with the time-sorting the first few entries are actually relevant.
    … or do you see a way I can otherwise qualify my searches to find terms in proximity to each other, if not within the same entry?
    tx,
    jd
