
State of the Blogosphere, August 2006

Three months have passed since my last State of the Blogosphere report, so time for an update on the numbers. For those of you who just want the most interesting tidbits, I’ve tried something new this time around – I’ve put in boldface the most significant information. There’s also a summary at the bottom of the post for those of you who just want the significant details.

50 Million Blogs and Counting.

On July 31, 2006, Technorati tracked its 50 millionth blog. The blogosphere that Technorati tracks continues to show significant growth. The chart below (click to get a full-sized version) has the details:

Slide0002-6

Technorati has been tracking the blogosphere, or world of weblogs, since November 2002, and I’m constantly amazed at the growth over the years. The blogosphere has been doubling in size every 6 months or so. It is over 100 times bigger than it was just 3 years ago.

Whenever I write about these statistics, I’m always asked by people, “Can it continue to grow this quickly?” Frankly, I can’t possibly imagine it continuing to grow at this pace – after all, there are only so many human beings in the world! It has to slow down.

Rather than just speculate about this, we now have enough data to actually look at the real numbers – the rate at which the blogosphere has doubled over time, as shown in the chart below:

Slide0003-8

As this chart shows, back in November of 2003, the blogosphere had doubled in size in 40 days – probably because Technorati was new and was just picking up all of the blogs that were out there in the world. In January of 2004, the blogosphere was doubling at a rate of once every 120 days, which is about once every 4 months. By July of 2004, the blogosphere was doubling every 180 days, or about once every 6 months. Today, the blogosphere is doubling in size every 200 days, or about once every 6 and a half months. That means things have slowed somewhat – the doubling period has lengthened by about half a month since July 2004, from roughly 6 months to roughly 6 and a half.

What I found so interesting in these numbers is that the graph has stayed so flat in the range of 150-200 day doublings for so long. From January 2004 until July 2006, almost two and a half years later, the number of blogs that Technorati tracks has continued to double every 5-7 months.
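The doubling periods quoted above can be sanity-checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes an average month of 365.25 / 12 days and a constant doubling period, and the function names are made up for this example rather than taken from anything Technorati runs.

```python
import math

DAYS_PER_MONTH = 365.25 / 12  # average month length; an assumption for this sketch


def days_to_months(days):
    """Convert a doubling period in days to average calendar months."""
    return days / DAYS_PER_MONTH


def growth_factor(elapsed_days, doubling_days):
    """Size multiple after elapsed_days if the size doubles every doubling_days."""
    return 2 ** (elapsed_days / doubling_days)


def implied_doubling_days(factor, elapsed_days):
    """Constant doubling period that would yield `factor` growth over elapsed_days."""
    return elapsed_days * math.log(2) / math.log(factor)


if __name__ == "__main__":
    for d in (40, 120, 180, 200):
        print(f"{d} days is about {days_to_months(d):.1f} months")
    three_years = 3 * 365.25
    print(f"100x in 3 years implies doubling every "
          f"{implied_doubling_days(100, three_years):.0f} days")
```

Under these assumptions, growing 100-fold in three years works out to an average doubling period of about 165 days, which sits inside the 150-200 day band described above.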

Can this possibly continue? Will I be posting about the 100 Millionth blog tracked in February of 2007? I can’t imagine that things will continue at this blistering pace – it has got to slow down. After all, that would mean that there will be more bloggers around in 7 months than there are bloggers around in total today. I shake my head as I am writing this – the only thing still niggling at my brain is that I’d have been perfectly confident making the same statement 7 months ago when we had tracked our 25 Millionth blog, and I’ve just proven myself wrong.
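The February 2007 guess comes from adding one 200-day doubling period to the date of the 50 millionth blog. A minimal sketch of that projection, assuming the doubling period stays at exactly 200 days (the helper name is hypothetical):

```python
from datetime import date, timedelta


def next_doubling_date(start, doubling_days):
    """Date on which the count would double, given a constant doubling period."""
    return start + timedelta(days=doubling_days)


fifty_millionth = date(2006, 7, 31)
print(next_doubling_date(fifty_millionth, 200))  # 2007-02-16
```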

Let’s look at the number of new blogs tracked each day, to get another look at the numbers:

Slide0004-9

As of July 2006, about 175,000 new weblogs were created each day, which means that on average, there are more than 2 blogs created each second of each day.
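The per-second figure is just the daily count divided by the number of seconds in a day. A quick check, with a helper name invented for this sketch:

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def per_second(per_day):
    """Convert a daily count into an average per-second rate."""
    return per_day / SECONDS_PER_DAY


print(round(per_second(175_000), 2))  # 2.03
```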

Surely some of these new blogs in Technorati’s index are spam blogs, or ‘splogs’. The spikes in red on the chart above show the increased activity that occurs when spammers create massive numbers of fake blogs and try to get them into our indexes. This is going to be a fight that is going to continue as long as people find the web useful, and there’s really no way to make sure that we catch every single spam blog before it goes into our indexes. We’ve been working extremely hard on understanding these spam patterns, and

  1. eliminating the spam from our indexes as quickly as possible, and
  2. making sure that these identified sources of spam (and spam creation patterns) never even make it into the index when they attempt to do so in the future.

What we have found, after lots of analysis and spam elimination, is that we see about 8% of new blogs that get past our filters and make it into the index, even if it is only for a few hours or days. In other words, we’re always going to pay a price to make the blogosphere as open a place as possible, and Technorati will always have some results that are spammy. We’re going to have to continue to be extremely vigilant to make sure that new attacks are spotted and eliminated as quickly as possible. About 70% of the pings Technorati receives are from known spam sources, for example, but we’re able to drop them before we even send out a spider to go and index the splog.
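Dropping pings from known spam sources before crawling can be pictured as a simple pre-filter. The sketch below is purely illustrative: the blocklist contents, the function names, and the choice to match on hostname are all assumptions, since the post does not describe how Technorati identifies or stores spam sources.

```python
from urllib.parse import urlparse

# Hypothetical blocklist of hosts already identified as spam sources.
KNOWN_SPAM_HOSTS = frozenset({"spam.example", "splog.example"})


def should_crawl(ping_url, blocklist=KNOWN_SPAM_HOSTS):
    """Return True if the pinging URL's host is not a known spam source."""
    host = urlparse(ping_url).hostname or ""
    return host not in blocklist


def filter_pings(ping_urls, blocklist=KNOWN_SPAM_HOSTS):
    """Keep only the pings worth sending a spider to."""
    return [url for url in ping_urls if should_crawl(url, blocklist)]


pings = [
    "http://spam.example/cheap-pills",
    "http://myblog.example/2006/08/post",
    "http://splog.example/a1",
]
print(filter_pings(pings))  # ['http://myblog.example/2006/08/post']
```

A real system would need far richer signals than a host list, but the point is the ordering: the cheap check runs before any crawl is scheduled.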

Of course, we’re also going to make some mistakes – so if you think your blog is possibly misclassified, go and have a look at your blog profile (here’s mine, for example) – simply type in your blog homepage URL to see what Technorati thinks it knows about your blog. If you don’t see your newest posts showing up, make sure that you’ve claimed your blog. If all else fails, please let us know about it, and we’ll try to fix it for you. Please note that if you have multiple URLs for your blog (e.g. Typepad users often have multiple URLs for their blogs, as do users of some other services), please try the alternative URLs as well before sending us a support ticket.

OK, back to the fun. Here’s a look at the daily posting volume in data that Technorati tracks:

Slide0005-11

First off, the total posting volume of the blogosphere continues to rise, showing about 1.6 Million postings per day, or about 18.6 posts per second. This is about double the volume of about a year ago. Along with the aggregate posting volume information, we’ve put in some annotations of the events that occurred at the time of the spikes, showing that the blogosphere continues to react strongly to various world events. It is important to note that it is the relative increase in posting volume rather than the absolute increase that is most relevant here. In other words, because more people are blogging now, the total number of posts on a particular day doesn’t tell the whole tale of the impact of an event – for example, the National Spelling Bee was not as large an event in the blogosphere as Hurricane Katrina. What is important to note in these charts is the relative size of the spike in relation to the posting volume at that time.
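To make the relative-versus-absolute distinction concrete, here is a small sketch. The volumes are invented purely for illustration and are not Technorati data; `relative_spike` is a hypothetical helper.

```python
def relative_spike(peak, baseline):
    """Fractional increase of a peak day over the typical volume at that time."""
    return (peak - baseline) / baseline


# Made-up numbers: an earlier event on a smaller blogosphere,
# and a later event on a larger one.
earlier = {"baseline": 800_000, "peak": 1_200_000}
later = {"baseline": 1_500_000, "peak": 1_800_000}

for name, day in (("earlier", earlier), ("later", later)):
    pct = 100 * relative_spike(day["peak"], day["baseline"])
    print(f"{name}: peak {day['peak']:,} posts, {pct:.0f}% above baseline")
```

In this made-up case the later event has the higher absolute peak, yet the earlier one is the bigger reaction relative to the posting volume of its day (50% versus 20%).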

Another interesting item to note is the level of influence that blogs are having, especially compared with the mainstream media (MSM). This chart is somewhat biased towards western sources of the MSM, and if you see a source that is missing from this (or the next) chart, please let me know.

What is interesting is that some of the most influential weblogs are being treated in much the same way as traditional MSM, as measured by the number of bloggers who are linking to them, as shown in the chart below:

Slide0006-7

The blogs are in red, MSM in blue. What becomes more interesting to me, however, is that as you continue down the long tail of media sites, the number of blogs starts to grow – to 11 of the top 90 sites, or 12.2% of the total. That is notable given the budget differentials, as shown below:

Slide0007-3

Next, let’s look at the language distribution of the blogosphere. One of the most interesting statistics that has changed since the last State of the Blogosphere is that English has retaken the lead as the #1 language of the blogosphere. However, it’s not by much – the Japanese blogosphere has grown substantially as well.

In April, English edged out Japanese with 34% of all postings to 33% of all postings, with Chinese taking the #3 spot with 14% of all postings.

Slide0009-3

In May, English extended its lead to 41% of all postings in the blogosphere, to 31% in Japanese and 10% in Chinese.

Slide0010-2

In June, Chinese caught up somewhat, with 39% of all postings tracked by Technorati in English, 31% in Japanese, and 12% in Chinese. It is important to note that, as in the April report, there are some significant underreporting issues, especially in Korean and in French, as described in that report.

Slide0011-2

Finally, I thought it would be interesting to look at what times of day show significant posting volume by language. The chart below shows this information using Pacific time (Technorati is located in San Francisco, so we’re biased towards that time zone) as our base:

Slide0008-5

It is interesting to note that the most prevalent times for English-language posting are between the hours of 10AM and 2PM Pacific time, with an additional spike at around 5PM Pacific time. Japan, which is 17 hours ahead of San Francisco, shows a different pattern – more posting occurring during the evening hours into the night, as well as the early morning hours before work begins. I’m not entirely sure what to make of these numbers, but it would appear that English-speaking people are more likely to blog during work hours and early evening in the USA, while people in Japan are more reluctant to blog during work time. More research is definitely needed to understand when and where people are blogging. Perhaps a more experienced cultural anthropologist or sociology researcher can provide better insight here. If you’re interested, drop me a line at dsifry AT technorati DOT com.
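Reading the chart in another time zone amounts to shifting each hour by a fixed offset. The sketch below uses the 17-hour figure quoted above as an assumption (during Pacific daylight time the gap to Japan is 16 hours), and the helper names are invented for this example.

```python
def shift_hour(hour, offset_hours):
    """Map an hour of day (0-23) in one zone to the hour in another zone."""
    return (hour + offset_hours) % 24


def shift_histogram(counts_by_hour, offset_hours):
    """Re-key an hour-of-day histogram into another time zone."""
    return {shift_hour(h, offset_hours): n for h, n in counts_by_hour.items()}


JAPAN_OFFSET_HOURS = 17  # the post's figure; an assumption, not a constant of nature

english_peaks_pacific = [10, 11, 12, 13, 14, 17]
print([shift_hour(h, JAPAN_OFFSET_HOURS) for h in english_peaks_pacific])
# [3, 4, 5, 6, 7, 10]
```

The same shift would have to be applied per poster, not per language, before drawing conclusions about work-hours blogging, because English-language bloggers are spread across many time zones.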

In summary:

  • Technorati is now tracking over 50 Million Blogs.
  • The Blogosphere is over 100 times bigger than it was just 3 years ago.
  • Today, the blogosphere is doubling in size every 200 days, or about once every 6 and a half months.
  • From January 2004 until July 2006, the number of blogs that Technorati tracks has continued to double every 5-7 months.
  • About 175,000 new weblogs were created each day, which means that on average, there are more than 2 blogs created each second of each day.
  • About 8% of new blogs get past Technorati’s filters, even if it is only for a few hours or days.
  • About 70% of the pings Technorati receives are from known spam sources, but we drop them before we have to send out a spider to go and index the splog.
  • Total posting volume of the blogosphere continues to rise, showing about 1.6 Million postings per day, or about 18.6 posts per second.
  • This is about double the volume of about a year ago.
  • The most prevalent times for English-language posting are between the hours of 10AM and 2PM Pacific time, with an additional spike at around 5PM Pacific time.

As always, I’m very interested in your comments and feedback.



47 comments to State of the Blogosphere, August 2006

  • State of the Blogosphere, August 2006

    Staggering stats at the growth of the blogs. First of many revelations, the blogosphere is now 100 times larger than it was in 2004.

  • Seventy percent of blog-pings are from spammers

    Technorati’s David Sifry has published his latest installment in his quarterly series, “State of the Blogosphere.” This quarter’s news: Technorati is tracking 100 times more blogs than it was three years ago, and the blogosphere is doubling every 200 …

  • WHAT INTERESTING DATA, AND HOW SAD THAT THE BLOGS IN SPA

  • Good statistics, thx. In the english language blogs, what is the ratio of native/non native bloggers?

  • Jon Saville

    Good stuff. Quick typo – I think you mean “*new* bloggers” in the following sentence: “After all, that would mean that there will be more bloggers around in 7 months than there are bloggers around in total today.” And some of the graphs are getting hard to read – especially “Blogs vs MSN June 2006”, even when you click-through.

  • David,
    Once again, a very interesting review of the blogosphere.
    Disappointed we couldn’t connect when I was writing below story about the French blogosphere. Be interesting to hear your thoughts.
    http://www.nytimes.com/2006/07/30/world/europe/30blogs.html?ex=1311912000&en=fb6a3a489d64c0f4&ei=5090&partner=rssuserland&emc=rss
    Curious as to why French bloggers do not seem so predominant in your ratings as they do via the web ratings agencies. (They operate by having a device to monitor usage, similar to TV ratings)
    On a separate topic, timing of blog entries is a great subject for cultural anthropologists.

  • Hi Dave.
    I always enjoy these state of the blogosphere posts – thanks!
    In the past, you’ve compared the total number of indexed blogs with the number that could be considered active (e.g., in April 2006 3.9 million blogs were being updated at least weekly; and 19.4 million were still being updated three months after their creation). Can you please do the same with the August data?
    thanks, pete.

  • Really interesting data – thanks for writing it up! I’d be curious about how the times correlated to, say, Google searches, and whether it’s just an overall Internet usage pattern. I have a “Bay Area Geek Jobs” page on my website, and its busiest time is 1 PM Pacific, Mondays. I figure that people spend the weekends trying to forget how much they hate their jobs, and then they get in Monday and they remember, so as soon as they get back from lunch, they start hunting.

  • A rigorous academic study…
    “Characterizing the Splogosphere” by Pranam Kolari, Akshay Java and Tim Finin of the University of Maryland Baltimore County
    … was conducted in mid 2005. It found, after filtering out all the spam, rubbish and dead blogs, only around 500,000 real English-language blogs on the planet. Compare this to the Technorati estimate of around 14 million for the same period. That suggests a ratio of 28:1. By that measure, Technorati’s “50 million” probably means we have nudging 2 million “real” blogs in the English language.

  • Thanks, David. This is a real service to the community. I’m glad you’re here to glue the whole thing together for us.

  • ess

    It’s interesting how the ‘doubling of the blogosphere’ peaks line up with off-time for college students. Hmm… wonder if there’s something to that.

  • Dave,
    A lot of the coverage is quoting;
    “about 8% of new blogs that get past our filters and make it into the index, even if it is only for a few hours or days”
    Can you clarify this statement? I’m assuming you mean 8% of new spam blogs are getting through.

  • Dave
    Are you including Spaces, MySpace and others? I think Spaces/MySpaces combined is > 100m blogs

  • I am currently affiliated with Tek Republik 7.Net, which has a Wiki. On that wiki, one of the projects we have started is to compile a comprehensive list of all the Blogs in the Blogosphere.
    We had no idea there were 50 MILLION blogs out there.
    Nice report.

  • English language only 39% of international blogs

    The latest Technorati State of the Blogosphere report shows that 39% of all blog postings are in english. Japanese is the second-most popular language, at 31% and China third with 12%. Here’s Technorati’s pie graph for June ’06:   Dave Sifry…

  • question

    Of the 50 million, how many have posted more than one post? How many have posted in the last 30, 60, and 90 days?

  • Great info. Very interesting. I wonder how long it takes the average blog to become inactive.

  • State of the blogosphere

    Here is Technorati’s latest State of the blogosphere report….

  • The question of abandoned blogs has lots of variations: Some people may start a blog and decide it’s too much work – or they might switch to MySpace if that better suits their needs. Or in the early stages, someone may just be shopping around – setting up blogs on multiple blog-hosting services to experiment, and then eventually settling on one. Or they may set up blogs on multiple services to reserve a favorite name. Or they may just decide to migrate to a different blogging platform based on feature set, etc. How does Technorati account for abandoned and inactive blogs?

  • Wow, that’s a big blogosphere you got there

    Every three months, Dave Sifrey of Technorati drags out his abacus and counts up the number of blogs in a “State of the Blogosphere” report. This quarter’s report is out, and here’s the good stuff: Technorati is now tracking over…

  • English language only 39% of all blogs

    The latest Technorati State of the Blogosphere report shows that 39% of all blog postings are in english. Japanese is the second-most popular language, at 31% and China third with 12%. Here’s Technorati’s pie graph for June ’06:   Dave Sifry…

  • Anonymo

    How many blogs have had at least 30 posts in the last 30 days?

  • good stats. thank u!

  • And what about providers blogging systems ? msn ? typepad ? lj ? blogger ? etc.. ?? is there some data ?

  • Seems as though there’s a lot of interest in the abandoned blogs and junk. An interesting way to slice the data might be to look at the quantity of blogs that – over the past year – are averaging 1 post every 30 days, 1 post every 15 days, 1 post every 7 days, 1 post every day, 2 posts every day, etc. etc. You get the idea. Would give an interesting look at how many bloggers are really falling into the “pro” category … versus the abandoned category.

  • Blogosphere statistics

    Dave Sifry has just published a very thorough set of statistics on the evolution of the blogosphere, based on Technorati’s data. Some figures: a third of what is published is in English… but curiously, just as much is in Japanese…

  • The time-of-day for posting per language info doesn’t take into account the geographic location of the poster, without compensating for the local timezone any conclusions are flawed. Ok, my guess would be that the majority of Japanese bloggers would be posting from Japan, I haven’t a clue for Chinese language, but I’m sure it’s erroneous to assume that all English-typing bloggers live in California!
    Not only are there first-language bloggers in the UK, Ireland, Australia, New Zealand etc, there appears to be quite a significant proportion of e.g. continental European bloggers who post in English. I bet the figure for non-US English posting bloggers is pretty close (quite possibly greater than) the figure for US English-language bloggers.
    It may be possible to determine from your current data a) if there really is a different blogging pattern between e.g. Japanese and USA blogger; or b) that there probably isn’t one, and if b) then c) the approximate proportions of bloggers based on geography. Assume that a diurnal pattern like that of Italian, Spanish, German etc applies in each of the other English-speaking countries (and US timezones), scaled to the population count, subtract those from the English total. If this approximately works you can test b) and perhaps find c).

  • Technorati: 50 million blogs.

    …The new report by David Sifry (Technorati) has been published, in which he tells us that there are now 50 million blogs worldwide…

  • Interesting as always! Good presentation!
    The only thing which annoys me personally (not your fault :-) is that the German language only gets one percent compared to 40-something for English.
    I will have to publish more English blogs :-)

  • cindy

    So what does it tell us if an individual blogger makes it to top10 or top50?
    1) it doesn’t guarantee quality of the blog. Blogs is just like books, the good/quality books are seldom read or sold.
    2) making to the top is per quantity. Therefore the amount of clickings to make the page-count (I assume is the same logic), is biased.
    3) by language — if one happens to write in a dying language group, or one’s country have only less than 1 million persons says Bhutan, what would ranking on bloggerspace mean to these people?
    We seems to constantly concentrate on quantity rather than quality. It is a sad sight. As the traditional medias being torn into pieces by bloggers (or new media), we (at least me) are left to search and look for articles that is(are) worth reading. Is that the way how bloggers feel so proud of themselves?
    My thinking is: eventually bloggers would have to merge to make sense to the public they want to attract. Therefore the model would be back to the traditional media. Time would tell.
    Search engines only capable to search by topics. Search engines cannot tell me what is good and what is a plain waste of my time even to take a glance.
    Cindy

  • FoF

    Out of these 50 millions blogs, how many are famous?
    http://www.frameoffame.com just launched and is promoting 50 of them, through eBay auctions.

  • Just curious… do these statistics include dead blogs… how are dead/live blogs tracked? I agree with commentators above that the measurement for live/active blogs should be differentiated from others for a true measurement of the live blogosphere…

  • I think there is a terminology problem here — new bloggers does not 1:1 correspond with new blogs. For instance I have my home blog that I have run for 3 years now, but I have set up at least 3 other blogs since then for special projects and purposes…. can you differentiate between the number of forums and the number of writers in any way?

  • One billion people use the Internet. Ten percent of them using blogs isn’t inconceivable.

  • Great data, I have been wondering what the stats are to date. Thanks!
    David Jemeyson

  • State of the blogosphere in August 2006

    David Sifry of Technorati has published State of the Blogosphere, August 2006 (from which, as you can see, I have taken the title). A summary of it follows.
    On July 31, Technorati reached 50 million blogs tracked; by way of summary …

  • “Of course, we’re also going to make some mistakes” I imagine this is true. However, as you asked us to contact support about this, let me point out that I have done this about 8 times, plus called once, just to get something done about not being able to claim our blog. (No response from your support people). So, if your numbers are to be vaguely believed, I would start to make sure what I’ve experienced isn’t a common problem.

  • The European blogosphere
    http://www.eu.socialtext.net/loicwiki/index.cgi?summary_page
    China’s New Obsession with Blogs and How Companies Can Benefit
    “The total number of blogs in China will grow over 200% from 37 million in 2005 to nearly 120 million by the end of 2006.”
    http://china.seekingalpha.com/article/13336

  • The impressive growth of blogs continues

    The latest study by David Sifry, the CEO of Technorati, the site that tracks the blogosphere, dated last July, shows that with 50 million blogs tracked, the number of blogs continues to double every 6 months. Independently…

  • The FT and the Economist are not influential?

    http://www.sifry.com/alerts/Slide0006-7.gif
    I was just listening to a fascinating podcast that featured…

  • The FT and the Economist are not influential?

     
    I was just listening to a fascinating podcast that featured an interview with Technorati’s founder…

  • Jodie Luu

    Hi,
    I’m Jodie Luu, currently a student at the National University of Singapore. I’m doing an honours thesis on public perceptions of organizational blogs in Singapore. I came across your report on the state of the blogosphere by Technorati, which is very useful for my paper. Thank you very much for sharing it online. I have a couple of questions regarding the report however.
    1. About the number of blogs existing out there, does Technorati take into account the fact that one blogger can have several blogs, some active, some inactive?
    2. How do all these blogs get through Technorati’s filter?
    3. Does Technorati have any figures about the number of bloggers?
    4. Technorati’s report gives a global view of the current state of the blogosphere. Does it have any figures that are specific to countries or regions? Since I want to focus my research paper on Singapore context, apart from the global data, I think it would be better if I can find some data about the blogosphere in Singapore.
    I’m looking forward to your response.
    Jodie

  • Nice stats. It would be nice if Technorati sorted out some of the update problems as well.

  • Technorati tells us blogs are 50 million strong and growing (faster than rabbits at a carrot convention).

    If you are new to the Blogosphere you may not have heard about Technorati http://technorati.com/   (I mention them in my last post). These people hold the main measuring cup for everything ‘blog’ in the recipe for ‘new world’ communic…

  • Time’s Person of the Year: You

    Time Magazine's Person of the Year for 2006? YOU. Here's why.
    "It's about the many wresting power from the few and helping one another for nothing and how that will not only change the world, but also change the way the world changes."