May 14, 2004

Blog comment spam solutions and the coming arms race

Jeremy Zawodny recently wrote something about weblog spam. John Battelle picked up on it today. Six Apart has just released a centralized comment authorization system called TypeKey. I've been thinking about comment spam for some time, and I've got a radical solution - one that I believe is the only one with a chance of working.

I think that all these blacklists, etc. are entirely the wrong approach. They will create an ever-escalating arms race between spammers and bloggers, resulting in the wasteland that we have today with email and Usenet (anybody remember Usenet?).

The problem is one of accountability. Whenever someone can insert an unaccountable message into a message stream, abuse follows. This has happened with Usenet, email, and now blog comments. As long as people see some gain in perpetrating the abuse, and the abusers are unaccountable, they will keep doing it. The protocols are fundamentally broken: for example, they allow spammers to forge From: addresses in email, and they allow comment spammers to add arbitrary content to arbitrary blogs. And the authentication services serve only as a minor deterrent - for example, spammers are now using the prospect of free porn to get people to fill in the "only-humans-can-decipher" image codes (captchas) that spam blocking services use. It is a classic arms race.

Here's my suggestion: Turn off comments altogether, and let people who want to comment get their own blog. When they link to you, they'll get picked up by services like Technorati, which will automatically show their comments whenever someone searches for your post. This is what the folks at BoingBoing (and many other sites) have been doing, and it eliminates spam because it enforces accountability - you've got to have a publicly addressable place on the net where your words appear - and that place is owned by you. The cost of setting up the blog lies with the commenter, which is the way things ought to be. We're working on some ways to easily show the number of people who have linked to a particular post, in real time, which will make it easy to show the interesting articles dynamically - e.g. "Blogs Linking To This Post (15)" instead of just "Blogs Linking To This Post". Stay tuned.
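To make the "Blogs Linking To This Post (15)" idea concrete, here is a minimal sketch in Python. It is not how Technorati does it; the function name is made up for illustration, and it assumes that you already have a list of URLs that link to the post and that one hostname equals one blog.

```python
from urllib.parse import urlsplit


def linking_blogs_label(linking_urls):
    """Build the link-count label for a post.

    linking_urls: URLs of pages that link to the post (assumed to come
    from some link index; this sketch does not fetch anything).
    Assumption: each distinct hostname counts as one blog.
    """
    blogs = set()
    for url in linking_urls:
        host = urlsplit(url).netloc.lower()
        if host:  # skip entries that are not absolute URLs
            blogs.add(host)
    return f"Blogs Linking To This Post ({len(blogs)})"
```

Two posts on the same blog that both link to you count once here, which seems closer to "number of people who have linked" than a raw link count would be.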

Now, this doesn't completely eliminate spam - for example, I could set up a SPAM blog, and create links out the wazoo to all of the major sites. For a while, the SPAM blog will show up in the Technorati Link Cosmos of each site that it links to, but it soon becomes easy to eliminate - for example, the SPAM site will never get an inbound link from people who I care about, and that can be used as a filter on the inbound links page. The spammer (and his site) would also quickly gain a reputation as a spammer, and could therefore be easily tracked. For example, a set of spam-hunting sites could link to the SPAM site, and you could have a filter that only showed links as comments if fewer than two of the spam-hunting sites linked to the site, or any metric that you wanted. Think of it as a distributed Slashdot karma system, if you will. And you wouldn't be limited to using Technorati for this; other sites could come about that do a better job than we do, and you could use them.
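Here is a rough sketch of that kind of filter in Python, just to show how little logic it takes. The function name, parameters, and example hostnames are all hypothetical, and it assumes some link index can tell you which sites link to the commenter's blog. The threshold of two is only the example number from above.

```python
from typing import AbstractSet


def show_as_comment(
    inbound_links: AbstractSet[str],
    spam_hunters: AbstractSet[str],
    trusted: AbstractSet[str] = frozenset(),
    max_flags: int = 2,
) -> bool:
    """Decide whether a linking blog should be shown as a comment.

    inbound_links: sites that link to the commenter's blog (assumed input).
    spam_hunters:  spam-hunting sites whose links count as spam flags.
    trusted:       optional set of sites you care about; if given, the
                   commenter needs an inbound link from at least one.
    max_flags:     hide the blog once this many spam hunters link to it.
    """
    flags = len(inbound_links & spam_hunters)
    if flags >= max_flags:
        return False
    if trusted and not (inbound_links & trusted):
        return False
    return True


# Made-up hostnames, purely for illustration.
hunters = {"hunter-a.example", "hunter-b.example", "hunter-c.example"}
spam_blog_links = {"hunter-a.example", "hunter-b.example"}
regular_blog_links = {"hunter-a.example", "reader.example"}

print(show_as_comment(spam_blog_links, hunters))     # False
print(show_as_comment(regular_blog_links, hunters))  # True
```

Swapping in a different metric just means changing the body of the function, and nothing here depends on which service supplies the link data.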

Some might suggest that this is a bad system, because people who wanted to remain anonymous couldn't comment. That isn't true - accountability doesn't mean the end of anonymity. Take Salam Pax's blog as an example of this. Of course anonymity (or perhaps pseudonymity?) does bring a set of challenges, like "Why should I trust someone who won't tell me his name?" but these can be worked through if the pseudonymous blogger proves reliable and trustworthy over time.

Of course, you may ask yourself, "If this Sifry guy is so against comments, why does he enable them on his own site?" I have employed anti-comment-spam measures in the past, which are working for now. I don't get enough blog spam right now to make the tradeoff worthwhile, but I have no doubt that the day will come. I'm also technical enough to know how to do all this stuff, and my goal is to fix the underlying problem in the system, not to just patch things piecemeal. And I'll admit to not being 100% convinced that this is the right way to go, so I'm testing the waters of both approaches.

And besides, we'll get a whole bunch more bloggers in the world this way. More permalinks are good. Comments and feedback are welcome. :-)

Posted by dsifry at May 14, 2004 07:42 PM