Wordpress/Automattic: All you data is belong to us

The guys over at Automattic have made the Automattic Stats plugin available for download, which makes it possible for self-hosted Wordpress installations to get the same statistics package that wordpress.com hosted blogs have by default. While this sounds like a good idea, and I'm sure a lot of people will be really happy about it, it further increases the total amount of data Automattic can potentially gather from Wordpress installs.

The plugin itself requires a Wordpress.com API key (you can use the same one you use for Akismet), and all the data and statistics are gathered and generated on the wordpress.com servers. In fact, the whole admin interface is located on the wordpress.com dashboard.

Combined with the data that Akismet gathers for each and every comment made on any Akismet-protected installation, it seems that Automattic is indeed gathering as much data as it can on non-wordpress.com hosted installs as well. They already have all this data for each of the 931,951 hosted installs, and their data set can now grow even further with the new statistics package.

I wonder if people realize how much data Akismet actually gathers? For some reason it sends much more data than the actual comments. Combine all that information with the views, post/page views, referrers, and clicks that the new statistics plugin sends back, and Automattic will have a lot of data about your self-hosted Wordpress install. I still can't understand why that much data needs to be sent for every comment made on every site that Akismet protects.
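
To make the concern concrete, here is a rough sketch in Python. It is not Akismet's actual code: the function names, field names, and sample values are all made up for illustration. It contrasts two ways of building the data sent along with a comment. The first forwards every server variable except a short exclusion list (here just HTTP_COOKIE, as described in the comments on this post). The second forwards only a fixed set of comment fields (author, email, IP and content, the set one commenter says is enough).

```python
# Illustrative sketch only; NOT Akismet's actual code.
# Every name and sample value below is hypothetical.

EXCLUDED_KEYS = {"HTTP_COOKIE"}  # denylist: anything not listed is forwarded
ALLOWED_FIELDS = ("author", "email", "ip", "content")  # allowlist


def build_payload_denylist(comment, server_vars):
    """Forward the comment plus every server variable not explicitly excluded."""
    payload = dict(comment)
    for key, value in server_vars.items():
        if key not in EXCLUDED_KEYS:
            payload[key] = value
    return payload


def build_payload_allowlist(comment):
    """Forward only the named comment fields and nothing about the server."""
    return {field: comment[field] for field in ALLOWED_FIELDS if field in comment}


comment = {
    "author": "Jane",
    "email": "jane@example.com",
    "ip": "192.0.2.10",
    "content": "Nice post!",
}
server_vars = {
    "HTTP_COOKIE": "session=abc123",
    "SCRIPT_FILENAME": "/var/www/blog/wp-comments-post.php",
    "SERVER_SOFTWARE": "Apache",
    "HTTP_USER_AGENT": "ExampleBrowser/1.0",
}

wide = build_payload_denylist(comment, server_vars)
narrow = build_payload_allowlist(comment)

# Keys that only the denylist approach sends along with the comment.
print(sorted(set(wide) - set(narrow)))
```

With an exclusion list, anything nobody thought to exclude goes out by default, server paths included. With an allowlist, a new field is only sent if someone deliberately adds it.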

Add the fact that Matt Mullenweg also owns Ping-O-Matic, and the sheer amount of data that the software creators have at their disposal in their data warehouses is something that you might want to think about.

I know both plugins are opt-in for self-hosted Wordpress installs, but chances are that most installs will enable them both, as they do provide a useful service to the site owner.

Don't get me wrong, Akismet does a great job fighting the evil that is comment spam, and I'm sure the statistics package will do a great job as well, but you might want to consider whether you want to contribute to the data gathering. I'm not normally a very paranoid person, but the more I think about this, the more it worries me.

For now, I run Akismet on this site just because there are no real alternatives available. I've also enabled the statistics plugin for testing purposes. I just wish there was a decentralized anti-spam service available.

May 6, 2007 at 11:53pm | 14 Comments

14 Comments so far

  1. coComment - , on January 1, 1970 at 1:00am, said:

  2. Chris Meller, on May 7, 2007 at 2:25am, said:

    When I saw your trackback come in, I had the intention of checking the Akismet plugin code and posting a relevant snippet where they'd started limiting the $_SERVER variables that were sent with each request.

    Unfortunately, what I found bothered me... Originally, Matt had added in a large array containing $_SERVER variables that weren't to be sent with each request.

    Much to my surprise, this has since been replaced with a single entry: HTTP_COOKIE. Granted they don't need to know the cookies for a request, but why do they need to know the rest either?

    There's no possible reason the current paths on the server can be of use, and the same goes for the vast majority of the other pieces I mentioned in my original post. It's just insane what they're picking up about each individual blog for every single comment.

    I think at this point, even a firm privacy policy would be of some comfort. Instead, all we get is silence. Silence makes me wonder... what are they hiding, that they can't come right out and say it?

  3. h0bbel, on May 7, 2007 at 9:24am, said:

    I admit I didn't look through the Akismet source code before posting this, and I knew your post was an old one and might not be completely accurate any more. Also, Automattic has a privacy policy in place. Whether that can be trusted is an entirely different matter, though.

  4. Wordpress at Kaizenlog, on May 7, 2007 at 6:37pm, said:

    [...] Wordpress/Automattic: All you data is belong to us By h0bbel The guys over at Automattic has made the Automattic Stats plugin available for download, which makes it possible for self hosted Wordpress installations to get the same statistics package as wordpress.com hosted blogs have by default. … h0bbel - http://h0bbel.p0ggel.org [...]

  5. Incoherent Babble, on May 7, 2007 at 7:55pm, said:

    Babble Blabber Wordpress/Automattic: All you data is belong to us - h0bbel on post Exactly What Data Are You Sending to Akismet? Bjaas on post Vista RC2 Install on Inspiron 9300, Part III Folkert on post Using CURL in XAMPP David on post WordPress Garland Port Frank on post

  6. Chris J. Davis, on May 8, 2007 at 2:32pm, said:

    The Akismet plugin for Habari only sends author, email, IP and content along to the servers. Everything else just seemed silly to me.

  7. h0bbel, on May 8, 2007 at 2:45pm, said:

    That would be all that it should need; I can't see why it should send anything else at all.

  8. Chris Meller, on May 9, 2007 at 5:43am, said:

    Where did you find that link to their privacy policy? I've consciously looked now (while I admit I didn't before I made that comment), and I still can't find it anywhere on automattic.com.

  9. h0bbel, on May 9, 2007 at 1:27pm, said:

    It's linked to on akismet.com but not on automattic.com. How strange.

  10. Morning Brew #1, on May 15, 2007 at 2:25pm, said:

    [...] Automattic Stats for self-hosted WordPress lets self-hosted WordPress bloggers use the exact same traffic metrics system we provide to WordPress.com users. (download, info, via h0bbel). [...]

  11. Chrissy, on May 24, 2007 at 1:27am, said:

    Hey Hobbel,
    I went to a Wordpress party here in San Francisco and, after reading your blog, asked Matt Mullenweg about this very issue. He said that most of the data is very helpful in fighting spam and the extra info being sent across all of the blogs that implement Akismet is only about 3GB a month.

    He also mentioned that there are plugins to strip what's sent to a bare minimum. I'm guessing it may be this one: http://incoherentbabble.com/2005/11/20/enhanced-akismet-plugin-version-106b5/

    Chrissy

  12. h0bbel, on May 24, 2007 at 9:25pm, said:

    "most of the data". Why gather data that's not directly related to the spam issue at all? 3GB worth of cookie data is still a lot of data that gets sent to Automattic.

  13. TechCrunch Questions Matt Mullenweg's Ethics | OpenSourceCommunity.org, on August 29, 2007 at 3:09pm, said:

    [...] http://h0bbel.p0ggel.org/2007/05/06/wordpressautomattic-all-you-data-is-... has my comments regarding Akismet and data gathering. reply [...]

  14. woodrose, on October 30, 2007 at 12:42am, said:

    Hmmm, let's hope they use the data wisely!
