Category Archives: Spam

More on human comment spam

[Image: comment spam link]

Update to this post about human comment spam: a new trend in blog comment spamming that uses real-life human spammers to get around the fact that most bloggers can see the robots coming from miles away.

I’ve had a large number of these come through on my blogs in the past few weeks. They’ve all been leaving links to sites like the one pictured. This one’s about antioxidants, but some are purportedly about computer viruses, drugs, whatever.

I really should update all my remaining blogs to use NoFollow, so if any get through, they don’t gain any PageRank. Time to chuck WP-Hashcash into the fray on all of them, as well.
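For anyone wondering what NoFollow amounts to in practice, here is a rough sketch that adds rel="nofollow" to the links in a comment's HTML. It is written in Python rather than WordPress's own PHP, the function name add_nofollow is made up for this example, and it assumes simple anchor tags (it is a regex, not a real HTML parser).

```python
import re

# Matches an opening <a ...> tag; deliberately simple, not a full HTML parser.
_A_TAG = re.compile(r"<a\b([^>]*)>", re.IGNORECASE)


def add_nofollow(comment_html: str) -> str:
    """Add rel="nofollow" to anchors that have no rel attribute yet."""

    def fix(match):
        attrs = match.group(1)
        if re.search(r"\brel\s*=", attrs, re.IGNORECASE):
            return match.group(0)  # leave an existing rel attribute alone
        return f'<a{attrs} rel="nofollow">'

    return _A_TAG.sub(fix, comment_html)
```

Search engines that honour the attribute don't count such links towards ranking, which is the whole point: the spammer's link gains nothing even if the comment slips through.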

Uh, so many blogs, so little time.


[Image: another comment spam destination]

Update 26/5/2006: Another example added.

Hand-written comment spam

Amongst all the easy-to-spot robot comment spam, I’m getting a bunch that (at first glance) looks like it’s written by humans. Gone are the stupid out-of-context broken-English comments and links to drug sales. These comments look like they’ve had a few milliseconds’ thought put into them, and they’re all on new posts. They all leave a rediffmail (an Indian Gmail-type operation) address, an IP address starting 209.97., and a link to a web site featuring lots of links and no content.

So far I’ve been spiteful and kept the comments but wiped the URL link.

I wonder if they’re particularly targeting WordPress sites that haven’t yet been upgraded to use the NoFollow links.

This is God calling

Yesterday I answered the ‘phone, because I was home, having a holiday, which is soon to be rudely interrupted by a short working stint, but that’s by-the-by. I could tell that whoever had called didn’t know anyone in the house; the phone’s listed in my girlfriend’s name. “Hello, Mr [Girlfriend’s-name]?” is a dead giveaway that they’ve pulled the number from the phonebook, and immediately puts me on the defensive. Which is why I have no interest in having the phone in my name. I can spot low-life scum a mile away with the arrangement as it is.

Now, the first thing I do when I have a telemarketer on the phone is to get them to tell me who they are. The lass weaselled about, talking about a survey. Surveys don’t care about the identity of the respondent; this was marketing. Eventually she said she was representing the Jehovah’s Witnesses, at which point I terminated the call; religious fundamentalists get up my nostril.

Neither Cathy nor I get any telemarketing calls (oh, well, maybe we get a couple a year from local gyms). It’s because we’re signed up to the ADMA’s do-not-call list. If you’re not signed up, stop reading, and go sign up now. The local gyms get the line “we only purchase goods from members of the Australian Direct Marketing Association” and they’re taken care of.

So, here we have technology being used for evil. Evil, not only because it’s evangelical fundamentalists at work, but because they claim they’re doing a survey about how people in the local neighbourhood feel about stuff. Because it’s a survey, that would be covered by the Australian Market & Social Research Society, which (they would claim to keep the statistics clean) doesn’t operate a do-not-call list (in spite of the fact that people that don’t want to be surveyed are going to do all sorts of bad things to their stats).

Worst of all, I don’t think there’s much I can do about it, except I remember hearing about a guy who had installed a PABX with an IVR: “if you want to talk to Cathy, press 1 now. To talk to Josh, press 2 now. Pressing 3 now will let you talk at Owen, but don’t expect a cogent conversation out of him.” Apparently, in the US, he was getting zero telemarketing calls, which is quite a feat.

Questions:

  1. Has the obesity epidemic reached the point where the Jehovah’s Witnesses can’t be bothered leaving the house to recruit souls so that they can, pyramid-sales-scheme-like, go to heaven?
  2. Why don’t the Jehovah’s Witnesses tell people up front you’re not going to heaven, even if you convert (there’s only 144,000 spots – what are the chances you’ll be goody-two-shoes-super-converter enough to get in)?
  3. Why doesn’t the AMSRS operate a do-not-call list?
  4. Why doesn’t the government ban harassment like this?
  5. What can I do to stop this from happening again?

WordPress’s best defence against the dark arts of spam

Scoble writes that WordPress.com has strong comment spam protection, but that it sometimes gets false positives.

I’ve found nothing better for spam protection than WP-Hashcash, which uses JavaScript to make sure it’s a human entering the comment, not a robot, but without captchas or other stuff the user has to do. Works like a dream.

The only down side is it doesn’t work with some older WP templates. So while this site is fully spam equipped, my personal blog won’t run it until I upgrade the template (probably a project for Christmas time).

But apart from that, for WPers out there, I can’t recommend it highly enough.

Combined with settings that ensure first-time posters go straight to moderation (subsequent postings are approved automatically) it ensures that those damn spammers never get their comments published on my site.

I might add that the company I work for (which develops B2B messaging systems) is working on a new site. To encourage them to update it regularly (some might call it blogging, but I’m emphasising “regular updates to existing and potential customers”) I’m building it on WordPress. Given WP’s ability to do a site of static pages and dated entries, it should work very well.

Stopping WordPress spammers

The blog comment/trackback anti-spam refinement continues.

I’m testing the WP-Hashcash plugin, which inserts JavaScript code to calculate an authorisation code into the comment. Since comment spammers don’t actually use the comment forms (at least I hope not; not until they start using people to enter the comments), this means only real comments get through. Well, real comments from people with JavaScript running. If they don’t have JavaScript running, they may be out of luck. Hopefully that applies to nobody these days, and I think this solution is less painful than a captcha-based one.
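The general idea can be sketched from the server's side. This is not WP-Hashcash's actual algorithm, just a minimal illustration in Python under my own assumptions: the server puts a random nonce in the form, the page's JavaScript is expected to compute a hash of it, and the server recomputes and compares. All the function names are hypothetical, and a real version would also need to remember which nonces it has issued.

```python
import hashlib
import hmac
import secrets


def issue_nonce() -> str:
    """Random value the server embeds in the comment form."""
    return secrets.token_hex(16)


def expected_code(nonce: str) -> str:
    """The value the page's JavaScript would be expected to compute."""
    return hashlib.sha256(nonce.encode("utf-8")).hexdigest()


def is_valid(nonce: str, submitted_code: str) -> bool:
    """Accept the comment only if the submitted code matches."""
    return hmac.compare_digest(
        expected_code(nonce).encode("utf-8"),
        submitted_code.encode("utf-8"),
    )
```

A robot that posts straight to the comment handler without ever running the page's script has no code to submit, so is_valid rejects it.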

But trackback spam is still a problem. One available option is to block direct access to the WordPress trackback PHP, but this isn’t very effective, since most current trackback spammers are clever enough to call the “real” URL.

A version of Auto shutoff comments modified to close trackbacks on posts older than 28 days, however, seems more effective. I don’t particularly want to shut comments off (especially since the above plugin effectively stops comment spam), but trackbacks are less compelling to keep open.
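The rule itself is tiny. Here it is as a Python sketch rather than the plugin's PHP; the function and constant names are my own, not anything from the Auto shutoff comments plugin.

```python
from datetime import datetime, timedelta

TRACKBACK_WINDOW = timedelta(days=28)


def trackbacks_open(post_date: datetime, now: datetime) -> bool:
    """True while the post is still within the 28-day trackback window."""
    return now - post_date <= TRACKBACK_WINDOW
```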

Together with previously discussed .htaccess entries to block big bandwidth thieves, this appears to be a fairly effective set of anti-blog spam measures. For now.

Pirates! Spammers! Gyroscopes! Bandwidth thieves!

This is officially getting ridiculous. Not only are my blogs getting a lot of comment spam, but my personal blog site is burning huge amounts of bandwidth, as particular (I assume zombie) hosts hit the site.

Below are the top ten bandwidth users of danielbowen.com for June:

Top 10 of 15312 Total Sites By KBytes

 #   Hits      %  Files      %  KBytes      %  Visits      %  Hostname
 1  14380  4.10%   3801  1.77%  111235  2.22%     159  0.24%  host-148-244-150-58.block.alestra.net.mx
 2  17558  5.01%   3191  1.48%   99441  1.98%     157  0.24%  host-207-248-240-119.block.alestra.net.mx
 3   3927  1.12%   3640  1.69%   75989  1.51%       3  0.00%  csr010.goo.ne.jp
 4   3062  0.87%   2797  1.30%   74881  1.49%     171  0.26%  rrcs-24-97-174-130.nys.biz.rr.com
 5   3057  0.87%   2200  1.02%   62547  1.25%     392  0.60%  msnbot.msn.com
 6   2691  0.77%   2248  1.04%   60684  1.21%     153  0.23%  64.124.85.78.become.com
 7   2256  0.64%   2082  0.97%   56383  1.12%     124  0.19%  98-101-196-200.linkexpress.com.br
 8   2146  0.61%   2033  0.94%   51665  1.03%     279  0.43%  dsl-250-198.monet.no
 9   2001  0.57%   1755  0.82%   47605  0.95%      23  0.04%  host133.sprintnetops.net
10   1686  0.48%   1571  0.73%   35979  0.72%     325  0.50%  corporativos

It’s not like this site is hosting pr0n or something; there’s just no reason why any single host would need to grab 110MB of traffic in a single month. In total, traffic topped 4GB for the month, which is ludicrous for a diary site with a few photos on it. 4GB is actually my monthly limit. Thankfully my web ISP isn’t too strict about charging extra for hitting that, but if this keeps up there’s always the risk that it’ll cost me real money.

As a result I’ve started a list of bandwidth hogs’ IP addresses, which I’m putting in the .htaccess file. Anything with lots of hits and grabbing above about 5MB per month is going onto the list, and the list is being duplicated (manually, unfortunately) across to the other WordPress sites that I run.
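Something like the following could take the manual labour out of building that list. It is only a sketch: it assumes the access log is in Apache's common or combined format, the function names are mine, and the output uses the older "Deny from" directive, which would need adapting on newer Apache versions.

```python
import re
from collections import Counter

# Common/combined log format: host ident user [time] "request" status bytes ...
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} (\d+|-)')


def bytes_per_host(lines):
    """Total bytes served to each client host."""
    totals = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if m and m.group(2) != "-":  # "-" means no body was sent
            totals[m.group(1)] += int(m.group(2))
    return totals


def deny_lines(totals, limit_bytes=5 * 1024 * 1024):
    """One 'Deny from' line per host over the limit, biggest first."""
    return [
        f"Deny from {host}"
        for host, total in totals.most_common()
        if total > limit_bytes
    ]
```

Feed bytes_per_host the lines of a month's log, then paste the output of deny_lines into each site's .htaccess. The 5MB default mirrors the rough threshold above.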

Inspection of the access_log is particularly enlightening, with at present a staggering number of requests coming in with a referer at poker-related sites. Of the 6665 hits in the file for today (covering about 13 hours) there are 674 from texasholdemcenteral.com (note the wonky spelling) and 1212 from sportscribe.com. All of these too are now being blocked with a 403 (forbidden) via .htaccess.
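Counting hits per referring site is easy to script too. This is a hedged sketch, again assuming combined log format (the one that records the referer); the function name hits_by_referer_host is made up.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Captures the referer field that follows request, status and size.
REFERER = re.compile(r'"[^"]*" \d{3} (?:\d+|-) "([^"]*)"')


def hits_by_referer_host(lines):
    """Number of requests per referring hostname."""
    counts = Counter()
    for line in lines:
        m = REFERER.search(line)
        if not m or m.group(1) in ("", "-"):
            continue  # no referer recorded for this request
        host = urlparse(m.group(1)).hostname
        if host:
            counts[host] += 1
    return counts
```

Calling most_common() on the result puts the worst offenders at the top, ready to be blocked.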

Sigh. I suppose it’s just too much to expect people to play nice?

.htaccess extract – Feel free to copy for your own site to block miscreants.

Recent spam stopping techniques

Okay, two techniques, one that’s going to be compromised sooner, one that’s going to be compromised later:

  1. A hidden field that must be supplied
  2. A JavaScript client-server MD5 one-way hash

I don’t see the second as a viable solution because it demands JavaScript (precluding certain users), and the first will be bested by the spammers when it becomes economically viable. I guess whether it’s adopted here depends on the implementation cost.
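To make the first technique concrete, here is a minimal Python sketch of a hidden field that must be supplied. Everything in it is an assumption for illustration: the field name "token", the SECRET value and both function names are invented, and I've used an HMAC-SHA256 of the post ID as the field's value, which is only one of many ways to do it.

```python
import hashlib
import hmac

SECRET = b"change-me"  # hypothetical server-side secret, never sent to the client


def form_token(post_id: int) -> str:
    """Value the server writes into the hidden field when rendering the form."""
    return hmac.new(SECRET, str(post_id).encode("ascii"), hashlib.sha256).hexdigest()


def accept_comment(post_id: int, form_fields: dict) -> bool:
    """Reject submissions where the hidden field is missing or wrong."""
    supplied = form_fields.get("token", "")
    return hmac.compare_digest(
        form_token(post_id).encode("ascii"),
        supplied.encode("utf-8"),
    )
```

A robot that posts blind never sees the field and gets rejected. One that fetches the form first and echoes the field back gets through, which is exactly the "when it becomes economically viable" problem.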

Why Googlebomb?

Why are webloggers googlebombing online poker?

I assume it’s to reduce the attractiveness of spamming the blogs with the term. Wouldn’t you want positions 1-10, rather than just #1, and really shut the action down? I don’t see that it will. But Wikipedia will be regarded as a more relevant site, and that’s gotta be good, right? Speaking of which, I must go check for vandalism on my pages…

SMS spam from sms.ac

I got an invitation to join sms.ac. A quick Google seemed to indicate it’s not a great idea unless you want to give your mobile number to people who will SMS-spam you.

Further, if they convince you to reveal your Hotmail password (on the pretext of letting you read it from your mobile) they’ll also spam the people in your address book, inviting them to join. Delightful. And the person who “invited” me? She wasn’t even aware it had happened.

So remember kids: sms.ac is bad. Now email this warning to all your friends.

The power of spam

When I registered my first domain name, toxiccustard.com, in November 1996, I didn’t keep my email address secret. It wasn’t obvious (at least to me) that spammers were picking up any valid email addresses they could find, left right and centre. The address: dbowen@toxiccustard.com. I can quote it now because it hasn’t been valid for many years.

But they keep distributing it, and keep spamming it. I know this because my web ISP told me last week that toxiccustard.com is now getting about five thousand e-mail messages PER DAY. Ay caramba.

In fact so much mail is coming in that before they realised the nature of it, they were saying they’d have to decline to provide me with shared web hosting for that domain in the future, because of the impact on other customers. As it is they’ve said okay they’ll live with it, since they’ll be upgrading their systems shortly so bouncing mail doesn’t impact them as much.

I’ve disabled mail completely on that domain in Plesk, and I’m looking into fiddling with the MX records, which hopefully should stop dead any mail way before it reaches anybody and starts costing them money. This may involve moving the domain to a new registrar, since the current mob doesn’t appear to provide this level of customisation.

The lesson: keep your email address secret. Once the spammers have it, expect a snowball effect. It may take 9 years, but eventually it’ll be unusable.