Undercrank

Scripting

February 4, 2005

Despamming Shortstat (Part 2)

Further to my earlier (and popular - thanks Shaun) post about Despamming Shortstat, I've made a small update to the code that has a few improvements on the original:

  1. It now only looks at entries added to the MT-Blacklist database in the past 72 hours (you can change the window if you want). The domains on the comment spam blacklist are consistently very similar to those used for referer spamming, so checking only recent additions reduces the execution time dramatically and (for me) seems to be just as effective.
  2. The script chucks out a brief summary of which spam domains it has removed.
  3. It also uses a couple of variables from your ShortStat configuration file, so it really should just drop in and play nicely.
  4. You could probably set this as a cron job now, changing the 72 hours to fit the time of your job.
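
To make the steps above concrete, here is a minimal sketch of the same idea. It is not the script itself, which is PHP running against MySQL; this version uses Python with an in-memory SQLite database so that it runs on its own. The table names (`blacklist`, `hits`), the column names, the `despam` and `build_demo_db` functions and the example domains are all assumptions made for illustration, and the match is a plain, case-sensitive substring test rather than whatever pattern matching real MT-Blacklist entries may need.

```python
import sqlite3
from datetime import datetime, timedelta


def despam(conn, window_hours=72, now=None):
    """Delete hits whose referer contains a recently blacklisted domain.

    Returns a {domain: rows_deleted} summary for the domains that matched.
    Table and column names are hypothetical, not ShortStat's real schema.
    """
    now = now or datetime.now()
    # Timestamps are stored as 'YYYY-MM-DD HH:MM:SS' text, which sorts
    # chronologically, so a string comparison is enough for the cutoff.
    cutoff = (now - timedelta(hours=window_hours)).strftime("%Y-%m-%d %H:%M:%S")

    # Step 1: only look at blacklist entries added inside the window.
    recent = conn.execute(
        "SELECT domain FROM blacklist WHERE added_at >= ?", (cutoff,)
    ).fetchall()

    summary = {}
    for (domain,) in recent:
        # instr() is a literal substring match, so '%' or '_' in a domain
        # are not treated as wildcards the way they would be with LIKE.
        cur = conn.execute(
            "DELETE FROM hits WHERE instr(referer, ?) > 0", (domain,)
        )
        # Step 2: remember what was removed for the summary.
        if cur.rowcount:
            summary[domain] = cur.rowcount
    conn.commit()
    return summary


def build_demo_db():
    """Create a tiny in-memory database with made-up data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE blacklist (domain TEXT, added_at TEXT)")
    conn.execute("CREATE TABLE hits (id INTEGER PRIMARY KEY, referer TEXT)")
    conn.executemany(
        "INSERT INTO blacklist VALUES (?, ?)",
        [
            ("spam-pills.example", "2005-02-03 09:00:00"),  # recent entry
            ("old-casino.example", "2005-01-01 00:00:00"),  # outside window
        ],
    )
    conn.executemany(
        "INSERT INTO hits (referer) VALUES (?)",
        [
            ("http://spam-pills.example/buy",),
            ("http://www.spam-pills.example/",),
            ("http://old-casino.example/",),
            ("http://example.org/a-real-link",),
        ],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    conn = build_demo_db()
    removed = despam(conn, window_hours=72, now=datetime(2005, 2, 4, 12, 0, 0))
    for domain, count in removed.items():
        print(f"Removed {count} hit(s) referred by {domain}")
```

Note the trade-off the sketch makes visible: a referer from a domain that was blacklisted before the window opened is left alone, so a scheduled job would want its window to be at least as long as the gap between runs.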

It's now a few more lines than the original (well, it's not that many, but I was rapidly getting toward a homepage full of scripts) so you can view the code right here instead:

More of the same

Yahoo! Web Services news over Atom

aka: "When you realise Yahoo! already serve their news over RSS..."

Yahoo! Web Services news over RSS

I've put together a small project that will create a dynamic RSS 2.0 feed based on the Yahoo! News Search hooks.

Importing CSV files to SQLite

"SQLite is different from most other SQL database engines in that its primary design goal is to be simple". Which isn't strictly true...

Despamming Shortstat

I've been using Shaun Inman's Shortstat package for a short while now as my main source of web statistics. However, it's fairly susceptible to the, er, 'innovation' known as referer spam - so here's some code that uses Jay Allen's MT-Blacklist master list to clean it up.

Comments

Lance on March 20, 2005

Mark, thanks for the code for the de-spamming, that is awesome!
I am getting an error though, not sure it's in the code or in my server settings.

Warning: mysql_fetch_array(): supplied argument is not a valid MySQL result resource on line 26

Any ideas?
Thanks!

Marco on April 5, 2005

I made some changes to ShortStat. I don't call it through a PHP include but through some Javascript. Gone are the spammers without any need for de-spamming ;)

Check it here:

http://www.i-marco.nl/weblog/archive/2005/04/05/improving_shortstat
