April 05, 2007

The State of the Live Web, April 2007

Hey, it's that time again, time to slow down, take a deep breath, and dig into the data!

About this Report, and the Obligatory Plug for Technorati

Technorati is widely known for its quarterly State of the Blogosphere reports, which analyze the trends around blogs and blogging. With this report, we expand on this tradition by introducing information and analysis relating to the broader range of social media on the Web -- what we and many others call the Live Web (another good definition). Technorati continues to grow well beyond its roots as the leading blog search engine; increasingly, we are the main aggregation point for all forms of social media on the Web, including blogs, of course, but also video, photos, audio such as podcasts, and much more.

What makes this possible is the rise in the use of tags across all forms of social media and the increasing implementation of tags by the publishing platforms supporting each form of media. Increasingly, tags have become a lingua franca of the Live Web, helping to categorize social media while also indicating where people’s attention might be at any given moment. But because each form of media is published from unique platforms with their own established communities, the audience has found itself hopping from platform to platform to get a sense of what might be hot at any given moment. That is why our social media aggregation service -- made manifest on our tagged media pages -- is growing at a torrid pace.

While we still have substantial reporting on the State of the Blogosphere, we are now expanding the report to provide information about the State of Tags. Admittedly, the information we have on this new area of focus for our report isn’t as deep or as expansive as our State of the Blogosphere, and we expect that over time, this and other new sections will expand, but we believe this is a good first step in trying to provide a more comprehensive snapshot of the Live Web.

OK, on to the numbers!

The State of the Blogosphere

The state of the Blogosphere is strong, and is maturing as an influential and important part of the web.

For nearly four years, we’ve been tracking and enabling the growth of this phenomenon, and there is much in our data to indicate that the medium is “growing up.”


Technorati is now tracking over 70 million weblogs, and we're seeing about 120,000 new weblogs being created worldwide each day. That's about 1.4 blogs created every second of every day.
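
As a quick sanity check on that per-second figure, here is a minimal sketch of the conversion. The function name `per_second` is my own label for illustration, not anything from Technorati's systems, and the result is simply an average over a full 24-hour day.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds in a day


def per_second(events_per_day: float) -> float:
    """Convert a daily count into an average per-second rate."""
    return events_per_day / SECONDS_PER_DAY


# About 120,000 new weblogs per day, as reported above.
new_blogs_per_second = per_second(120_000)
print(f"{new_blogs_per_second:.1f} new weblogs per second")
```

The same helper applies to any of the daily counts in this report, such as posts per day.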


Spam and splogs (spam blogs) continue to be a problem in the blogosphere, and there was a marked increase in splogs that coincided with the holiday season last year. Technorati has been tracking between 3,000 and 7,000 new splogs created each day, but there was a significant spike in splog creation beginning in early December, and we tracked over 11,000 splogs created each day during that month - a total of 341,000 splogs that we removed from our indexes during that period.

Fortunately, spam rates have decreased somewhat since then, as blog hosting providers responded to the issue during January and February. My personal take on the issue of spam is that all healthy ecosystems have parasites - the only question is whether or not the system is structurally vulnerable to being overwhelmed. Thankfully, because of the accountability that is built into the web itself (the URL structure is fundamentally accountable), I believe that while the vulnerability of the Live Web to spam is real, it is manageable.


Since our last State of the Blogosphere report in October 2006, we’ve seen a slowing in the doubling of the size of the blogosphere. This shouldn't be surprising, as we're dealing with the law of large numbers - it takes a lot more growth to double from 35 million blogs to 70 million (which took about 320 days) than when it doubled from 5 million to 10 million blogs (which took about 180 days).
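
To make the doubling comparison concrete, here is a rough sketch that turns each doubling time into an implied daily growth rate and an average number of blogs added per day. It assumes steady compound growth within each period, which is a simplification of mine and not how the underlying data necessarily behaved; the function names are illustrative only.

```python
def daily_growth_rate(doubling_days: float) -> float:
    """Compound daily growth rate implied by a doubling time.

    Assumes steady exponential growth over the whole period.
    """
    return 2 ** (1 / doubling_days) - 1


def average_new_per_day(start: int, end: int, days: float) -> float:
    """Average net number of blogs added per day over the period."""
    return (end - start) / days


# Doubling periods quoted in the text: (label, start, end, days taken).
periods = [
    ("5M -> 10M", 5_000_000, 10_000_000, 180),
    ("35M -> 70M", 35_000_000, 70_000_000, 320),
]

for label, start, end, days in periods:
    rate = daily_growth_rate(days)
    added = average_new_per_day(start, end, days)
    print(f"{label}: {rate:.2%} per day, about {added:,.0f} net new blogs per day")
```

Under that assumption, the percentage growth rate roughly halves between the two periods (from about 0.4% to about 0.2% per day), even though the average number of blogs added per day is several times larger in the later one.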

We also see a slowing in growth in the rate of posts created per day; while there are spikes in blog posts during times of significant world crisis -- for instance, last summer’s conflict between Israel and Hezbollah -- the overall trend is that posting volume is growing more slowly, at about 1.5 million postings per day. That's about 17 posts per second. In October 2006, Technorati was tracking about 1.3 million postings per day, about 15 posts per second.


Popularity of Blogs vs. the MSM


In previous reports, we looked at the popularity of mainstream media compared to blog sites. One interesting item to note in April 2007: the number of blogs in the top 100 most popular sites has risen substantially. During Q3 2006 there were only 12 blogs in the Top 100 most popular sites.

In Q4, however, there were 22 blogs on the list -- further evidence of the continuing maturation of the Blogosphere. Blogs continue to become more and more viable news and information outlets. For instance, information not shown in our data but revealed in our own user testing in Q1 2007 indicates that the audience is less and less likely to distinguish a blog from, say, nytimes.com -- for a growing base of users, these are all sites for news, information, entertainment, gossip, etc. and not a “blog” or a “MSM site”.

Further, there is a wider diversity of languages represented here, specifically Farsi with TodayLink.ir, Persian Blog Fans Club, and Giliran.com making the Top 100. More on that in a moment, as we discuss the international growth of the Blogosphere.

The Global Blogosphere


In terms of blog posts by language, Japanese retakes the top spot from our last report, with 37% (up from 33%) of the posts followed closely by English at 36% (down from 39%). Additionally there was movement in the middle of the top 10 languages, highlighted by Italian overtaking Spanish for the number four spot.

The newcomer to the top 10 languages is Farsi, just joining the list at #10. It has been very interesting to watch the growth of the blogging world in the Middle East, especially in countries like Iran, and it is reflected in the language distribution above.


English, Japanese and Chinese look almost identical to our last report in their posting distribution. With Italian overtaking Spanish, we get to see another language with a different distribution, which contrasts with both the extreme geographic correlations of the Asian languages and the relative lack of geographic correlation of English. Again it would appear that both English and Spanish are more global languages, based on the consistency of posting through a 24-hour period, whereas other top languages, specifically Japanese, Chinese, and Italian, are more geographically correlated. It would also appear that a significant number of people who are blogging are doing it during work hours.

The State of Tags

The explosive growth that we see in the Technorati index is mirrored in social media sites throughout the Web, including Flickr, YouTube, and the like. This shared phenomenon allows us to marry the wealth of information in our index with the wealth of that stored on social media sites across the Live Web through the shared construct of tags.

For the uninitiated, a tag is a category or descriptor that someone (often the creator) assigns to a piece of media. This descriptor literally hangs off the media that’s published to the Web, much in the same way a luggage tag hangs off your suitcase -- easily identifying the bag.

The bottom line: we’re seeing explosive growth in the tags index. People are clicking on tags, people are using tags, and Google features tagged media in its results pages. Tag adoption has become a phenomenon across the Live Web, and we are seeing correspondingly explosive growth at Technorati.

On to the numbers:


Technorati is now tracking over 230 million posts using tags or categories, and the number of people who are using tags is growing:


As of February 2007, about 35% of all posts Technorati tracks use tags.


The number of bloggers that are using tags is also increasing month over month. About 2.5 million blogs posted at least one tagged post in February 2007.

Growth and Maturation

Back in 2002 when Technorati started tracking the blogosphere, social mores and community practices were still forming, and its growth was primarily through the written word. It was a fledgling medium that was initially reviled, then feared, and, now, embraced as mainstream.

The blogosphere started well before Technorati was founded, and its growth was fostered by many people and organizations that brought openness and cooperation to the medium. One of those people, Dave Winer, just celebrated the tenth anniversary of his weblog. Given this auspicious anniversary, I wanted to give my thanks and support to Dave and to all of the other early pioneers in the world of blogging, RSS, and the Live Web. Without Dave's efforts, the web wouldn't look the way it does today. His creation and support for systems like weblogs.com and open formats like RSS were critical in building the early infrastructure that Technorati relies upon and helps to support.

Thanks, Dave!

Wrapping it all Up

As a result of this work and the cultural mores of openness, we also have photo sharing, podcasting, online music publishing, online video publishing, user-generated games, and, increasingly, we have structured data-sharing such as upcoming events. All of this seething, lively activity constitutes the Live Web and Technorati is its hub -- thanks in large part to the growing use and ubiquity of tags. Through the social constructs of tags, we help people find unique voices and points of view. We also help social media publishers to find the people formerly known as their audience. And they all converge, as a result, on Technorati.

We’re proud of this position, of course, but also humbled by the responsibility it imposes.

As we continue to bring more and more of the Live Web to the fore, and to organize it and present it in ways that are useful, entertaining, and informative to you all, I hope you’ll continue to tell us your opinions (as if I could stop you!) and provide us your guidance. Our credo has been and will always remain: “Be of Service.” Your voice helps us to do this, so please continue to tell us what we can do better.

In summary:

  • 70 million weblogs
  • About 120,000 new weblogs each day, or...
  • 1.4 new blogs every second
  • 3000-7000 new splogs (fake, or spam blogs) created every day
  • Peak of 11,000 splogs per day last December
  • 1.5 million posts per day, or...
  • 17 posts per second
  • Growing from 35 to 70 million blogs took 320 days
  • 22 blogs among the top 100 sources linked to in Q4 2006 - up from 12 in the prior quarter
  • Japanese the #1 blogging language at 37%
  • English second at 36%
  • Chinese third at 8%
  • Italian fourth at 3%
  • Farsi a newcomer in the top 10 at 1%
  • English the most even in postings around-the-clock
  • Tracking 230 million posts with tags or categories
  • 35% of all February 2007 posts used tags
  • 2.5 million blogs posted at least one tagged post in February

Getting All the Reports

You can get all of the State of the Blogosphere and State of the Live Web reports, going back to my first report in October 2004, at http://www.sifry.com/stateoftheliveweb/. All of this material is licensed under a Creative Commons attribution license, and all I ask in addition is that you please keep the Technorati logo and links to the original reports in any use of the charts or data.


Posted by dsifry at April 5, 2007 02:02 AM