Blocklist Merging on Linux


A little while back, I wrote a post about using PeerGuardian blocklists efficiently with iptables. However, the program from that post only reads a single list; it expects the lists to have been merged beforehand by another program.

Originally I used Bluetack’s Blocklist Manager. It’s quite good, but it’s Windows-only, slow, and eats gobs of memory. (Seriously, 100+ MB, and 24/7 if you want it to auto-update.)

Today I got bored and wrote my own: a tiny command-line program dubbed BLM.

It only merges the blocklists, though I included a couple of scripts to show downloading them with wget automagically too. Also, it’s designed to output the merged list in PeerGuardian format to stdout, which works very nicely with my pg2ipset utility from that post I linked above.
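To give a rough idea of what "merging" means here, the following is a minimal sketch in shell and awk. It is not BLM's actual code or algorithm. It assumes IPv4 input lines in the PeerGuardian text form `description:start_ip-end_ip`, and it labels every output range `merged`, which is a choice made for the illustration only. The function names are hypothetical.

```shell
#!/bin/sh
# Sketch only: merge overlapping or adjacent IPv4 ranges read from stdin.
# Input lines are assumed to look like "Some Org:1.2.3.0-1.2.3.255".

# Step 1: turn each range into "start_int end_int" so it sorts numerically.
ranges_to_ints() {
    awk '
        function ip2int(ip,    p) {
            if (split(ip, p, ".") != 4) return -1
            return ((p[1] * 256 + p[2]) * 256 + p[3]) * 256 + p[4]
        }
        /^#/ || NF == 0 { next }          # skip comments and blank lines
        {
            sub(/\r$/, "")                # tolerate Windows line endings
            range = $0
            sub(/^.*:/, "", range)        # keep only the text after the last colon
            if (split(range, r, "-") != 2) next
            s = ip2int(r[1]); e = ip2int(r[2])
            if (s >= 0 && e >= s) printf "%.0f %.0f\n", s, e
        }'
}

# Step 2: sort by start address, then fold each range into the previous
# one whenever it overlaps or directly follows it.
merge_sorted_ints() {
    sort -k1,1n -k2,2n | awk '
        function int2ip(n) {
            return sprintf("%d.%d.%d.%d", int(n / 16777216) % 256,
                           int(n / 65536) % 256, int(n / 256) % 256, n % 256)
        }
        function emit() { printf "merged:%s-%s\n", int2ip(lo), int2ip(hi) }
        NR == 1 { lo = $1 + 0; hi = $2 + 0; next }
        ($1 + 0) <= hi + 1 { if (($2 + 0) > hi) hi = $2 + 0; next }
        { emit(); lo = $1 + 0; hi = $2 + 0 }
        END { if (NR > 0) emit() }'
}

merge_p2p() {
    ranges_to_ints | merge_sorted_ints
}
```

Fed the two lines `Org A:1.2.3.0-1.2.3.255` and `Org B:1.2.3.128-1.2.4.10`, `merge_p2p` prints the single line `merged:1.2.3.0-1.2.4.10`.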

My suggestion is to make a file containing the URLs of the blocklists you want, in .gz format (the .tar.bz2 includes a list of the Bluetack ones), then add a script something like this to your crontab or /etc/cron.daily:

#!/bin/sh
cd /opt/blocklist || exit 1
wget --timestamping $(grep -v '^#' urls.txt)
zcat *.gz | ./blm | ./pg2ipset | ipset -R

Modify this to suit your own paths and needs, as always. 🙂
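For reference, the URL file is assumed here to hold one URL per line, with `#` starting a comment line. A small hypothetical helper (`list_urls` is not part of BLM) that strips comments and blank lines before the list is handed to wget might look like this:

```shell
#!/bin/sh
# Sketch only: print the usable URLs from a list file.
# Assumed file layout (the URLs in the test data are placeholders):
#   # comment line
#   http://example.com/level1.gz
list_urls() {
    grep -v -e '^#' -e '^[[:space:]]*$' "$1"
}
```

It could then stand in for the grep in the script above, for example `wget --timestamping $(list_urls urls.txt)`.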

This entry was posted on Saturday, August 22nd, 2009 at 5:44 PM and is filed under Uncategorized.

4 Responses to “Blocklist Merging on Linux”

  1. [...] Maeyanie.com « Blocklist Merging on Linux [...]

  2. I have an older version of ipset bundled with Debian Lenny that expects ranges to be formatted like IP1:IP2, rather than IP1-IP2.

    I found the bit at the end of the blm program that needs changing, but I don't know C++ and can't work out how to substitute the - for a : ... Could you help me out a bit? 😉

    Cheers for the programs and writeups btw - they're just what I'm looking for!

  3. It's a pretty straightforward change. Change line 86 from:
    fprintf(ofp, "-A %s %s-%s\n", rulename, fromaddr, toaddr);
    to:
    fprintf(ofp, "-A %s %s:%s\n", rulename, fromaddr, toaddr);

    And thanks, glad to hear it's coming in handy. 🙂

  4. Of course! It was too late at night for me ^^ I was trying to change line 229 of BLM instead. 😀

    Thanks, this is no doubt better than piping the output through sed, hehe 😉
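
For anyone who would rather not recompile, the sed route mentioned in the last comment could look roughly like the sketch below. It assumes the lines to rewrite have exactly the shape produced by the fprintf shown above (`-A name IP1-IP2`); `dash_to_colon` is a hypothetical name, and any line that does not match is passed through unchanged.

```shell
#!/bin/sh
# Sketch only: rewrite "-A name IP1-IP2" as "-A name IP1:IP2" on stdin.
# Only the dash between the two addresses is replaced, so a dash inside
# the rule name is left alone.
dash_to_colon() {
    sed 's/^\(-A [^ ]* [0-9.]*\)-/\1:/'
}
```

In the pipeline from the post it would sit between the two last stages: `./pg2ipset | dash_to_colon | ipset -R`.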