Steve Gillmor has an excellent post on Attention.xml and the rationale and use cases for why we developed it together. We're working hard on turning that vision into reality, and I'm encouraged by how many others (look at the people linking to Steve's post) are "getting it" and working on it. This is not a one-man or two-man show; it needs involvement from a community, and it has to be as open and transparent as possible. Go Steve, go!
When I started Technorati we tracked 15,000 weblogs and 1.2 million links in the first month. A lot has changed in the world of weblogs over the past 2+ years and Technorati now tracks over 8 million weblogs and close to one billion links.
This milestone would not be possible without the help of the entire community of bloggers out there posting on a daily basis. Thank you for writing and we will continue to provide new ways to follow online conversations and be of service to our community.
Technorati would like to mark this special event for our members with a "One Billion Links Tracked" contest. Entries must be received by 11:59 p.m. Pacific time on Tuesday, March 29, 2005.
How to participate
We will review all entries and declare as the winner the person whose guess comes closest to the moment we track our one-billionth link, counting guesses that fall on or after that moment. If two or more members guess the same winning time, the most senior member -- the first member to join Technorati -- will receive the prize.
Technorati employees, contractors, affiliates, and their immediate families are ineligible for this contest.
The winner will receive an iPod shuffle with 1,000,000,000 bytes (1 GB) of storage.
This Thursday, March 31, 2005, Technorati will host a user salon to bring together Technorati staff and users for face-to-face discussions. The event will take place at Chevys at 3rd St. and Howard St. in San Francisco (maps and directions below).
We will deliver a short presentation about what is new in the blogosphere, including some new stats. The world of weblogs has been through a lot of change in the ten months since our last user salon last May. We would like to bring together the community to have a conversation about the last few months in the world of weblogs and discuss how Technorati may be of service in the future.
We're also approaching the milestone of one billion links tracked in the Technorati database. As of this writing, we are at 990,172,347 links tracked, and at this rate, we'll hit one billion links before the end of the week. Come and celebrate the amazing growth of the blogosphere with us. We're also planning some fun stuff; more to come in the next post.
Here are the details on the event:
Thursday is the seven-year anniversary of the release of the Netscape Communicator 5.0 source code and the creation of the Mozilla project. In celebration of this milestone, Technorati and Chevys will donate 20% of the bill to the Mozilla Foundation.
Technorati will provide appetizers and drinks for approximately 50 people. Please RSVP to email@example.com and let us know how many people you are bringing so we can plan the event space, food, and drink.
Today I'll discuss the impact of weblogs on traditional media, the impact of the A-List, and the power of the long tail.
First off, some terminology and an understanding of what we're measuring. This graph is a measure of influence or authority of a site or blog as measured by the number of people who are linking to it. Note that this is not a measure of page views or website "hits". Rather, Technorati looks at linking behavior as a proxy for attention and influence. In other words, the more people who link to a site or blog, the more influence it has on others. Note that influence is not an indicator of veracity - lots of people link to The Drudge Report, for example, which implies that Matt Drudge gets a lot of attention and influence, but not necessarily that he is truthful.
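To make the idea of linking as a proxy for influence concrete, here is a minimal sketch in Python. It is an illustration only, not Technorati's actual method: the function name, the choice to count each linking site once, the choice to ignore self-links, and the example site names are all assumptions.

```python
from collections import defaultdict


def inbound_link_authority(links):
    """Count distinct linking sources per target site.

    `links` is an iterable of (source_site, target_site) pairs.
    Assumptions for this sketch: repeat links from the same source
    count once, and a site linking to itself is ignored.
    """
    sources_by_target = defaultdict(set)
    for source, target in links:
        if source != target:  # skip self-links (an assumption)
            sources_by_target[target].add(source)
    return {target: len(sources) for target, sources in sources_by_target.items()}


# Hypothetical link data, for illustration only.
sample_links = [
    ("blog-a.example", "news.example"),
    ("blog-b.example", "news.example"),
    ("blog-a.example", "news.example"),  # repeat link, counted once
    ("blog-a.example", "blog-b.example"),
    ("news.example", "news.example"),    # self-link, ignored
]

print(inbound_link_authority(sample_links))
```

Note that nothing in this count says anything about page views or about whether the linked site is accurate; it only records who is pointing at whom.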
OK, now on to the fun stuff: First, the growing power of weblogs when compared to the mainstream media. As the chart above shows, the most influential media sites on the web are still well-funded mainstream media sites, like The New York Times, The Washington Post, and CNN. However, a lot of bloggers are achieving a significant amount of attention and influence. Blogs like bOingbOing and Instapundit are highly influential, especially among technology and political thought leaders, and sites like Gizmodo are seeing as much influence as mainstream media sites like MTV.com. A note on counting: Some organizations with multiple domains or highly syndicated strategies like the Associated Press and Reuters, are underrepresented in this chart, given that their impact is not easily countable using our methods. An interesting statistic to note is the current placement of subscription sites like WSJ.com (the Wall Street Journal). While the WSJ has begun to offer some content outside of its subscriber-only site, the policy is clearly costing them some influence and attention in the blogosphere, as bloggers find it difficult to link to articles in the subscriber-only sections. Also interesting to note is that even though The New York Times and The Washington Post require free registration to view the articles, bloggers are still linking to the stories. In addition, sites like the NY Times have worked out ways for links from bloggers to continue to be valid even after the article goes behind a paywall.
Some historical perspective: Last October's chart shows a similar distribution of the top media sites and blogs. Most of the names in today's chart are also in last October's, showing that there is indeed a power law forming around reader attention. Some people might argue that this is a bad thing, because of the implied stratification of the blogosphere. However, I don't believe that this is the case. For example, there are a number of new names in the Technorati Top 100, showing that new voices can make themselves heard.
However, what is more important than the Top 100 is what is happening in the long tail, which is where blogs as a media type really start to look different from mainstream media. To get a better look at the power law in action, here are two charts that show first the existence of the long tail (note the flatness of the power curve), and then the power of the conversations going on in the long tail.
Note that these charts are actually Flash, so you can right-click and zoom in on the details of the graph, to see names of blogs, for example. The chart below shows the aggregate amount of linking activity (which implies conversation) going on at the long end of the tail. In other words, the fact that the A-list exists does nothing to drown out the immensely larger set of conversations that are going on among smaller groups of people, like friends and niche topic bloggers. In fact, even though the influence of a single long-tail blog is less than that of a single blog on the A-list, the aggregate influence of the whole long tail far outstrips even the mainstream media.
This also has implications for enlightened marketers and media companies. There is power in the conversations going on around you, and not necessarily from the places that you'd ordinarily expect. Companies that work in conjunction with the trends going on in the long tail (e.g. fostering people's voices, listening to and incorporating their comments and feedback, and fostering a community) have a tremendous opportunity awaiting them.
Next in the series, hopefully posted tomorrow because I'm on a business trip: Some information on tags and tagging.
I had a great time speaking with Jon Gordon for his MPR radio show Future Tense. You can hear the interview via RealAudio, and Jon has a podcast feed too, kudos! Anyway, in my excitement, I noticed that I misspoke on two of the answers, so here's my chance to correct myself: I said in the interview that the posting rate in the blogosphere (see my earlier post) had doubled in 5 months, that's incorrect - the posting rate has doubled in 9 months, as you can see in the chart. One other thing: The active blog rate is for 6 months, not 3 months; ouch, goes to show me for spouting statistics before I've had my morning cup of coffee. Jon, I'm sorry for the errors, thanks for the interview. More to come in the State of the Blogosphere series later tonight, then I jump on a plane for the east coast. All you folks down at ETech, I'm envious! BTW, if anyone wants to get together at PC Forum in Scottsdale, just drop me a line, and we'll set it up!
To expand on my post yesterday on the overall growth of the number of weblogs, today I'm going to look at another important measure of the growth of the blogosphere, posting volume. A single post is a single entry to a weblog, whether it be a long essay or just a short entry, and the posting volume is the aggregate number of posts per day. Just as it is important to note the increased growth in the number of weblogs out there, it is at least as important to see whether blogging is a fad or whether people are blogging at a sustained rate. The chart below shows that posting volume has been growing. (Compare with the chart from October 2004)
On average, Technorati is tracking about 500,000 posts per day, which is about 5.8 posts per second. In October 2004, we were seeing about 400,000 posts per day. It is interesting to note that posting volume suffered a decline during the months of November and December, 2004. A large part of this decline is the reduction in postings about US politics after the election in early November. However, the growth of mainstream blogging services becomes apparent when looking at the rise in posting volume starting in December. This is congruent with the increase in the number of new blogs during those months as discussed in yesterday's entry. It is also interesting to watch the spikes in the graph that have accompanied major news events. I haven't done a detailed analysis here but picked out a few spikes that stood out to me. The graph shows the effects of events like the US political conventions and elections, the Indian Ocean tsunami, and the US Super Bowl. It is important to note that with major events like these, the actual amplitude of the spike is less relevant than the size of the spike above the ambient amplitude level. In other words, even though there were fewer posts on the days following the tsunami, it had a much larger spike than the one that came the day of the Super Bowl. I'm sure there are numerous international events that show up as spikes in this graph, as the number of postings made in non-English languages is now about 40% of the volume of postings that Technorati is tracking.
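The posts-per-second figure, and the point about judging a spike against the ambient level rather than by its raw height, can both be checked with a little arithmetic. The sketch below is in Python; the function names, the 7-day baseline window, and the sample daily counts are illustrative assumptions, not Technorati data or methodology.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def posts_per_second(posts_per_day):
    """Convert a daily posting volume into an average per-second rate."""
    return posts_per_day / SECONDS_PER_DAY


def spike_above_baseline(daily_counts, event_index, window=7):
    """Ratio of the event-day count to the mean of the preceding days.

    The baseline is the average of up to `window` days before the event
    (the window length is an assumption made for this sketch).
    """
    start = max(0, event_index - window)
    baseline_days = daily_counts[start:event_index]
    if not baseline_days:
        raise ValueError("need at least one day before the event")
    baseline = sum(baseline_days) / len(baseline_days)
    return daily_counts[event_index] / baseline


print(round(posts_per_second(500_000), 1))  # 5.8

# Made-up counts: a quiet period followed by a news event.
quiet_then_event = [100, 100, 100, 150]
print(spike_above_baseline(quiet_then_event, event_index=3))  # 1.5
```

Measured this way, an event during a quiet period can register as a bigger spike than an event on a busier day, even if the busier day has more posts in absolute terms.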
Tomorrow I'll discuss how blogs are capturing attention compared to mainstream media sites, what news sites are being referenced the most, and how that has been changing over the last few months. Comments and feedback are always welcome and encouraged!
It's been 5 months since my first presentation on the State of the Blogosphere at the Web 2.0 conference, which I later posted in parts. A lot has happened, and it's time for an update on what's going on in the world of weblogs, and to have a look at the numbers.
I'll be posting this in a number of parts, as there's a lot of information to cover. Today, I'll be focusing on the macro growth of the blogosphere, both in the aggregate number of bloggers out there, as well as the growth of the number of new blogs per day. Here's the chart of the aggregate growth of the blogosphere from March 2003 to February 2005 (compare this chart with the one from October 2004):
Technorati is now tracking over 7.8 million weblogs, and 937 million links. That's just about double the number of weblogs tracked in October 2004. In fact, the blogosphere is doubling in size about once every 5 months. It has already done so at this pace four times, which means that in the last 20 months, the blogosphere has increased in size by over 16 times.
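The doubling arithmetic above is easy to verify. Here is a small sketch in Python; it assumes a perfectly steady 5-month doubling period, which is a simplification of the real curve, and the function names are mine.

```python
import math


def growth_factor(months, doubling_period_months=5.0):
    """Multiplicative growth after `months`, assuming steady doubling."""
    return 2.0 ** (months / doubling_period_months)


def months_to_grow(factor, doubling_period_months=5.0):
    """Months needed to grow by `factor` under the same assumption."""
    return doubling_period_months * math.log2(factor)


print(growth_factor(20))   # 16.0 (four doublings)
print(months_to_grow(16))  # 20.0
```

Four doublings at 5 months each gives 2 x 2 x 2 x 2 = 16, which is where the 20-month figure comes from.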
Things don't appear to be letting up either. With the launch of MSN Spaces, the continued significant growth of popular blogging and journaling tools like Google's Blogger, Six Apart's LiveJournal, and AOL Journals, and the proliferation of software like WordPress and Movable Type, the number of people out there blogging has jumped in the past few months. The chart below shows the significant jump in the number of new blogs created per day (compare with the chart from October 2004):
We are currently seeing about 30,000 - 40,000 new weblogs being created each day, depending on the day. That is at least double the rate in October, when about 15,000 new weblogs were created each day. The remarkable growth over the past 3 months can be attributed to new mainstream services such as MSN Spaces and to increased use of services like Blogger, AOL Journals, and LiveJournal. In addition, services outside the United States have been taking off, including a number of media sites promoting blogging, such as Le Monde in France.
There is a dark underbelly to these numbers, however: Part of the growth of new weblogs created each day is due to an increase in spam blogs - fake blogs that are created by robots in order to foster link farms, attempted search engine optimization, or drive traffic through to advertising or affiliate sites. We have been battling the spam situation in a significant way for about 2 months - prior to January, spam wasn't much of an issue. All of these charts reflect Technorati's databases after spam blogs have been removed, and we feel that we've been able to capture and identify most of the spam out there, but one should note that there is definitely blog spam that we don't catch (tell us if you see spam in the index!). I'd estimate that we currently catch about 90% of spam and remove it from the index, and notify the blog hosting operators. Most of this fake blog spam comes from hosted services or from specific IP addresses. One of the results of the extremely productive Spam Squashing Summit of a few weeks ago is the increased collaboration between services in order to report and combat this spam. Right now, about 20% of the aggregate pings Technorati receives are from spam blogs, so you won't see that in these numbers - these statistics show only "cleaned" data.
Tomorrow, I'll discuss some statistics around posting volume, which is a more accurate indicator of how much blogging is becoming a habit for people. While some of the dramatic increase in the number of aggregate weblogs out there is quite interesting, it is far more telling to look at the number of posts per day, which show the size and quantity of conversation that is going on. Well, more of that tomorrow, stay tuned!
Well, this has been an interesting and stressful few days, with a lot of charges thrown around the blogosphere about Technorati and Niall Kennedy, our Community Manager. As sometimes happens in the blogosphere, things have gotten a bit overblown.
For those of you who haven't been following the conversation, I suggest you read Niall's own words first. I think he eloquently explains what happened from his perspective, and it is a must-read if you want to get the context of this post.
We at Technorati support Niall 100%. As his post above shows, he is publicly working through the realization that, in his role as Community Manager, putting trademarked logos of companies in our industry into provocative images (on a Nazi soldier's helmet, and in a pool of blood next to a dead soldier) has repercussions for the company, not only for his own personal reputation. We all make mistakes - and we in fact are trying to build a culture where trying new things is encouraged, which means we're going to make more than our fair share of mistakes - but we hold ourselves accountable, take the criticism, and then move on. As Esther Dyson likes to say, "Always make new mistakes."
To address the censorship charge that was thrown about head-on: we do not censor people's blogs, and we take the censorship allegation extremely seriously. I actively encourage our employees to blog, and to express their opinions. However, many readers do not make as clear a distinction between personal and work lives as many experienced bloggers do, and will view a provocative image on a blog in the worst possible light, especially when presented by the company's Community Manager. Niall made the decision himself to post the things he posted, when he posted them. Other than the clear case of trademark violation (we asked him to remove the pictures that violated trademark, in order that we not be sued) his actions and postings have been completely his own, including his decision to take down his original post.
I am truly sorry. My mother is a Holocaust survivor, so I understand how emotionally charged and easily misinterpreted these images can be. To those of you who wrote objecting to the content of the images, I'm very sorry that we let you down. I assure you that these are not Technorati's official positions or feelings about the companies and projects mentioned, and I humbly ask for your forgiveness. To Technorati employees: I'm very sorry that we didn't communicate quickly enough and well enough with you about all of this. It has taken a bit of time to get to the bottom of things, and to give Niall the time he needed to think things through.
We really value Niall's contributions. He started the first Technorati Users Group, he's attended every developer's meeting, and then as Community Manager, he helped organize and lead the recent spam summit, answers feedback email, comments on blogs, and he's a hard worker who has done a great job representing Technorati in the world. We're treating this as a learning experience for everybody and putting it behind us, and hope that the rest of you do too.
A number of folks noticed that our searchlet had stopped working correctly a few days ago. This happened because we reintroduced an old bug into the code when we made other changes and fixes to the site. Sorry about that! We've now found and fixed the bug, and rolled out the fix yesterday. If you're still experiencing bugs with the searchlet, please let us know!
Oh, and we've been dealing with the spam issue as well, and our indexes have recovered their timeliness; as we discussed at the Web Spam Summit last week, this isn't the end of the problem, but we're working on a number of strategies to help keep link and blog spam to a minimum.
BTW, many thanks to everyone who came to the Spam Squashing Summit - it was a huge success, and it looks like some great things are going to come out of it, including better communication amongst developers and users in the industry.
We have the best users. I was blown away when I checked my Technorati watchlist this morning to find a screencast done by Alex Barnett, where he explains not only how he uses Technorati and its tags, but also shows how to use one of the various bookmarklets to easily create and tag his posts. The video cut away a bit for me at the end (perhaps a Firefox issue?) but if you are interested in tags, it is well worth reading his post and watching the video. Thanks, Alex!
I've just read about the new Yahoo! Search APIs, and at first glance, they look VERY COOL. The folks at Yahoo Search (congrats, Jeff, Geoff, Jeremy, and all!) are making a great move in this space. Opening up their interfaces further unleashes a tremendous amount of creativity - good for them, good for developers, and good for users. Great forward thinking. I'm looking forward to the apps to come, and I'm already thinking about ways to integrate apps with both the Yahoo and Technorati APIs...