SP Filter 0.59 (finally considered beta...)
1) fetch multiple publicly available ip-based access-lists into the
local cache-directory, create diffs and clean out old copies.
supports either LWP, wget or rsync, *.bz2 transparently handled.
2) read entries from local cache into memory (hash), convert cidr-
netmask into octets and dedupe/consolidate in one single pass.
sort entries and write file in the format of your preferred mta,
optionally preserve/reimport existing lines (magic_update).
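the conversion in step 2 can be pictured roughly as follows. this is a minimal Python sketch, not the Perl code from spfilter.pl. the helper names (cidr_to_octets, consolidate, sorted_prefixes) are made up for illustration. it assumes that "octets" means the leading one to four octets of an IPv4 address, and that a mask which does not fall on an octet boundary is expanded into several prefixes. the sketch consolidates after sorting by prefix length, which may differ from how spfilter does its single pass.

```python
import ipaddress

def cidr_to_octets(cidr):
    """Expand an IPv4 CIDR block into octet prefixes (hypothetical helper)."""
    net = ipaddress.ip_network(cidr, strict=False)
    # round the mask up to the next octet boundary: /22 -> /24, /8 stays /8
    target = max(8, -(-net.prefixlen // 8) * 8)
    return [
        ".".join(str(sub.network_address).split(".")[: target // 8])
        for sub in net.subnets(new_prefix=target)
    ]

def consolidate(entries):
    """Drop prefixes that are already covered by a shorter listed prefix."""
    kept = {}
    # visit short prefixes first so that parents are stored before children
    for prefix in sorted(entries, key=lambda p: p.count(".")):
        parts = prefix.split(".")
        if not any(".".join(parts[:i]) in kept for i in range(1, len(parts))):
            kept[prefix] = entries[prefix]
    return kept

def sorted_prefixes(entries):
    """Sort prefixes numerically, octet by octet (so 9.x sorts before 10.x)."""
    return sorted(entries, key=lambda p: [int(o) for o in p.split(".")])

entries = {}
for cidr, text in [("10.0.0.0/8", "A"), ("10.1.2.0/24", "B"), ("192.168.0.0/23", "C")]:
    for prefix in cidr_to_octets(cidr):
        entries.setdefault(prefix, text)  # the hash drops exact repeats
print(sorted_prefixes(consolidate(entries)))
```

the entry for 10.1.2 disappears here because the shorter prefix 10 already covers it.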
./spfilter.pl -verbose -format=sendmail,postfix,... SOURCE ...
./spfilter.pl [ -verbose ] [ -debug ] [-format=format,... ]
[ -cachedir=./cache ] [ -outdir=(./outdir|outfile|STDOUT) ]
[ -workdir=workdir] [ -pubdir=./publish ] [ -user=spfilter ]
[ -xmlconf=./spfilter-local.xml ]
[ -keyring=(NULL|spfilter-keyring.gpg) ]
[ -tiehash=(NULL|/tmp/tiehash.gdbm|/tmp/tiehash.db) ]
[ -email=responsible#contact.dom ]
[ -zone=localhost,127.0.0.1,43200 ]
SOURCE [SOURCE2] ...
only the first character after '-' is relevant, the single
char arguments from previous versions will still work.
keep using the short version in scripts, as the handling of
long arguments (wording) may change any time.
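the first-character rule can be illustrated with a few lines of Python. this is only a sketch of the idea: spfilter itself is written in Perl and processes its arguments with Getopt::Long, and short_option is a hypothetical name.

```python
def short_option(arg):
    """Reduce '-verbose' or '-format=x' to its first letter plus value."""
    if not arg.startswith("-"):
        return None  # not an option, e.g. a SOURCE name
    name, _, value = arg.lstrip("-").partition("=")
    if not name:
        return None  # a bare '-' carries no option letter
    return name[0], (value or None)

for arg in ["-verbose", "-format=sendmail,postfix", "SPEWS"]:
    print(arg, "->", short_option(arg))
```

under this rule '-verbose' and '-v' select the same option, which is why the short forms are the stable ones to use in scripts.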
-c, -cachedir=: directory for cached sources
defaults './cache', directory must exist
old cached sources will be purged after successful fetch
-d, -debug: boolean, for testing and linting, may use multiple
-e, -email=: string passed in the HTTP_USER_AGENT, max. 48 chars
default: list sources as specified by -source or @ARGV
-f, -format=: output format(s), as named in the xml-config
default: octets (tab-delimited, no quotes)
for mta's: courier, exim, postfix, rblsmtpd, qmail_uce, sendmail
for dnsbl: rbldnsd, tinydns, bind and generic 'reverse'
for benchmarking: queryperf (from bind/contrib)
multiple formats may be specified, separated with commas
use 'cdb', 'gdbm' or 'db' to compile output into DB_File
NOTE: contents of -format may be used as part of USER_AGENT
-h, -help: boolean, display built-in manpage (this document)
-k, -keyring=: use the named keyring to verify spfilter-config.xml
default 'spfilter-keyring.gpg', in './' or '/usr/local/etc'
the Makefile will generate the keyring on 'make keyring'
specify 'NULL' for keyring to disable gpg-functionality
-l, -log: log to syslog, named file or email (not implemented)
-o, -outdir=: directory or filename with optional path
default './outdir' if it exists, or the current workdir '.'
specify '-outdir=STDOUT' for use in pipes (with a single format)
-p, -pubdir=: directory to publish *.bz2 for redistribution
three subdirectories required: ./input, ./diff and ./output
example1: http://spfilter.openrbl.org/data/
example2: http://mirror.openrbl.org/spfilter/
note: primary sites should run spfilter one time per day,
in the time from 00:30 to 01:30 UTC (GMT)
-q, -quiet: decrease verbosity (see also -verbose and -debug)
-s, -source=: input sources (legacy, just list them on commandline)
default: import all from set DEFAULT
set DEFAULT equivalent to: -source=SPEWS,SPAMSITE,PDL
for relays add DSBL and/or RSL: -source=DEFAULT,RELAYS
set RELAYS equivalent to: -source=DSBL,DSBL_MULTIHOP,RSL
please refer to spfilter-config.xml, preset_section
NOTE: contents of -source may be used as part of USER_AGENT
-t, -tiehash=: tie working hash to in-memory or file-db
DEFAULT: none, will trade memory against cpu and disk
'-tiehash=NULL': use in-memory db, reduce memory usage by 50%
'-tiehash=/tmp/spfilter-tiehash.$$.gdbm': reduce memory by ~70%
'-tiehash=/tmp/spfilter-tiehash.$$.db': whatever works better
-u, -user=: drop root privileges for external program ($CFG{user})
default setuid user 'nobody' if run by root (uid 0)
WARNING: this setuid is not safe, start spfilter with:
su nobody -c "perl ./spfilter.pl -format=sendmail"
-v, -verbose: verbose output
increase level of verbosity with multiple -vv (see also -d)
-w, -workdir=: chdir into this directory on startup
default none (no chdir will be done by spfilter)
setting simplifies usage from cron and in pipes (with STDOUT)
-z, -zone=: dnsbl-in-a-box (with -f bind and/or tinydns)
default -zone=localhost,127.0.0.1,43200 should work everywhere
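the three comma-separated fields of -zone map onto zone_name, zone_addr and zone_ttl. a minimal Python sketch of that split follows. parse_zone is a hypothetical helper, and whether spfilter validates the address or accepts fewer than three fields is not documented here, so both are assumptions.

```python
import ipaddress

def parse_zone(value="localhost,127.0.0.1,43200"):
    """Split a -zone value into name, address and ttl (assumed layout)."""
    name, addr, ttl = value.split(",")  # raises ValueError unless 3 fields
    ipaddress.ip_address(addr)          # raises ValueError on a bad address
    return {"zone_name": name, "zone_addr": addr, "zone_ttl": int(ttl)}

print(parse_zone())
print(parse_zone("dnsbl.example.com,11.22.33.44,3600"))
```

if the ttl is counted in seconds, as is usual for bind and tinydns, the default of 43200 corresponds to 12 hours.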
- timestamps sometimes based on .yymmdd-extensions, and sometimes
the file modification-time (for If-Modified-Since).
use the tracker at http://sourceforge.net/projects/spfilter/
code still considered alpha, backup your files as always
./spfilter-config.xml: definition of sources and formats
self-updating copy kept in the directory ./cache
./spfilter-pubring.gpg: verify embedded signatures with gpg
files are internally signed with gpg, verify with 'make verify'
files will be searched in '.' and in /usr/local/etc
files must be owned by root or the user running spfilter
and must not have any group- or world-writable permissions.
same for files reused from the cache-directory (-cachedir)
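the ownership and permission rule above can be checked by hand before a file is trusted. the following Python sketch shows one way to express it. is_trusted is a hypothetical helper, it is Unix-only, and it is not taken from spfilter.pl.

```python
import os
import stat

def is_trusted(path, running_uid=None):
    """True if path is owned by root or the running user and is
    neither group- nor world-writable (hypothetical helper)."""
    if running_uid is None:
        running_uid = os.getuid()
    st = os.stat(path)
    owner_ok = st.st_uid in (0, running_uid)
    mode_ok = not (st.st_mode & (stat.S_IWGRP | stat.S_IWOTH))
    return owner_ok and mode_ok
```

a file with mode 0644 owned by the calling user passes this check, the same file with mode 0664 or 0666 does not.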
spfilter(at)gmx.net. QPL licence applies.
Perl5 with LWP::UserAgent (libwww-perl) or wget in $PATH,
XML::Simple (included in tarball), bunzip2, rsync recommended
and optionally diff for primary sites.
mta with support for ip-based access-lists or nameserver.
- check if you have all the necessary executables in $ENV{PATH}
by running: `which -a perl rsync wget bunzip2`
- primary (publishing) sites also need gpgv, diff and bzip2
- make sure to have the perl-module XML::Simple installed
(available at CPAN) or use the one included in ./XML/Simple.pm
- fetch the Makefile into an empty directory, run 'make all'
this will create the two subdirectories ./cache and ./outdir,
fetch the public-key, generate the keyring and finally also
fetch and verify both spfilter-config.xml and spfilter.pl.
- run `./spfilter.pl -vd TEST_LIST`, or simply 'make test'
- ./spfilter-config.xml: review for your own safety, it is signed
- ./spfilter-local.xml: used only with '-x ./spfilter-local.xml'
- enable output-format for your mta or dnsbl with argument -f
- it is recommended to let spfilter write into the default
./outdir, and set a symlink from the location your application
(mta, nameserver etc) expects.
optional:
- use argument -s to specify your own set of input-sources
- magic_update => 1 preserves existing lines even across updates
- count the existing lines and run twice to check the 'magic'
- !!! new code: use -v and check the daily output from cron !!!
input sources have already been defined in spfilter-config.xml:
some of them are:
[SPEWS|SPEWS2] SPAMSITE PERMBLOCK PDL RSL [KOREA|CHINA|KRCN]
(can't list them all here, check out spfilter-config.xml,
also at http://spfilter.openrbl.org/code/xml-view.php)
default setting equivalent to:
./spfilter.pl SPEWS SPAMSITE PDL
./spfilter.pl -s SPEWS,SPAMSITE,PDL
./spfilter.pl DEFAULT
./spfilter.pl
several sets of sources have been defined in spfilter-config.xml,
they will be recursively expanded to the regular sources.
see http://spfilter.openrbl.org/code/xml-view.php#PRESET_SECTION
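recursive expansion of sets can be sketched like this in Python. only the DEFAULT and RELAYS sets named in this document are modelled. the real definitions live in spfilter-config.xml, and expand_sources is a hypothetical name, not a function of spfilter.pl.

```python
PRESETS = {
    "DEFAULT": ["SPEWS", "SPAMSITE", "PDL"],
    "RELAYS": ["DSBL", "DSBL_MULTIHOP", "RSL"],
}

def expand_sources(names, presets=PRESETS):
    """Expand set names recursively into regular sources, keeping order."""
    result = []
    seen = set()  # sets already expanded, guards against loops

    def visit(name):
        if name in presets:
            if name not in seen:
                seen.add(name)
                for member in presets[name]:
                    visit(member)
        elif name not in result:
            result.append(name)

    for name in names or ["DEFAULT"]:  # no sources given: use DEFAULT
        visit(name)
    return result

print(expand_sources("DEFAULT,RELAYS".split(",")))
```

a source that is listed directly and also reached through a set ends up in the result only once.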
IMPORTANT: you also need to check incoming mail against a realtime dnsbl
for open relays and proxies. The persistent ones will end up on DSBL and
Wirehub's great PERMBLOCK but (unfortunately) not in realtime.
If you don't use relays.osirusoft.com, dnsbl.njabl.org etc., enable DSBL
and update daily.
please always use rsync:// for DSBL and WIREHUB, don't waste bandwidth.
For 'complete' protection also use bl.spamcop.net (via dns) and consider
enabling KOREA, TAIWAN and HONGKONG, depending on your location.
keys for %SRC in spfilter-config.xml: (only 'url' mandatory)
type: /^(addr|cidr|range|reverse|axfr|host)/, to be documented
interval (number): reuse cached files up to interval days (3)
tag (string): prepend this instead of the name, may be set to 'NULL'
SBL hack: if the tag ends with = there will be no space after
prepend (string): append string after text (default none)
! deprecated, will be removed, legacy support only !
use the tag "tag" instead, config.xml already updated
append (string): append (optional) string and ip after $text
url (string): http-, ftp- or rsync- or file-resource:
- url's ending with *.bz2 will be decompressed transparently
- use relative or absolute path for local sources
- macro {YYMMDD} expands to UTC (GMT) datestamp
- macro {FILENAME} expands to the contents of that key
conflict (string): warn if this source is already defined
- only partially implemented, never trust a dumb machine ;)
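the two url macros can be pictured with a short Python sketch. expand_url is a hypothetical helper and the url is a made-up example on example.org. the sketch assumes that {YYMMDD} is a six-digit year-month-day stamp taken from the UTC clock and that {FILENAME} is replaced by the value of the 'filename' key.

```python
from datetime import datetime, timezone

def expand_url(url, filename, now=None):
    """Replace {YYMMDD} and {FILENAME} in a source url (assumed semantics)."""
    now = now or datetime.now(timezone.utc)
    return (url.replace("{YYMMDD}", now.strftime("%y%m%d"))
               .replace("{FILENAME}", filename))

stamp = datetime(2003, 2, 16, tzinfo=timezone.utc)
print(expand_url("http://example.org/{FILENAME}.{YYMMDD}.bz2", "spews", stamp))
```

for the fixed date used above the result ends in spews.030216.bz2, so it would also be picked up by the transparent *.bz2 handling.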
predefined output-formats in spfilter-config.xml:
octets, courier, exim, postfix, qmail_uce, rblsmtpd, sendmail
reverse, rbldnsd, tinydns, bind
default output format equivalent to '-format=octets'
specify multiple formats separated with comma or whitespace
keys for %FMT in spfilter-config.xml: (all optional)
default (boolean): 0=disabled, 1=enabled (default 0)
type (string): 'addr', 'cidr/nn', 'range', 'config', 'rbldns',
'axfr/cname', 'axfr/txt' or 'axfr/a' (default 'octet')
linestart (string): prepended to the beginning of each line (default none)
separator (string): inserted between $addr and $text (default "\t")
lineend (string): appended after $text (default none)
magic_update (boolean): preserve manually inserted lines
silently ignored if output sent to '-outdir=STDOUT' or DB
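how linestart, separator and lineend combine into one output line can be sketched as follows. format_line is a hypothetical Python helper, and the second call uses made-up values rather than one of the predefined formats.

```python
def format_line(addr, text, linestart="", separator="\t", lineend=""):
    """Build one output line from the %FMT keys described above."""
    return f"{linestart}{addr}{separator}{text}{lineend}"

# defaults give the tab-delimited 'octets' layout
print(repr(format_line("10.1.2", "SPEWS")))
# invented values, only to show where each key ends up
print(repr(format_line("10.1.2", "listed", linestart="deny ",
                       separator=" # ", lineend=";")))
```

with all keys left at their defaults the line is just the address, a tab and the text.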
- spfilter will use only the own keyring (spfilter-keyring.gpg)
and will accept any good signatures from the keys listed there.
don't add other public-keys unless you know what you are doing!
- spfilter-config.xml and spfilter.pl contain embedded gpg-signature
verify manually with 'gpgv --verify' as usual if you have the
public-key in your trusted keyring (which is deprecated!).
use 'make verify' instead, no need to mess up existing keyrings.
- build the pubkey: 'make pubkey' will fetch the pubkey from keyserver,
build the gpg-keyring for spfilter in a (hopefully) safe way.
'make verify' additionally checks the embedded gpg-signatures.
- public key for spfilter@openrbl.org available at:
http://search.keyserver.net:11371/pks/lookup?op=vindex&template=netensearch&search=spfilter
http://pgp.mit.edu:11371/pks/lookup?search=spfilter&op=index&fingerprint=on
- if you prefer to build the keyring manually: (all in one line)
gpg --no-default-keyring --keyring ./spfilter-keyring.gpg \
--import ./doc/spfilter-pubkey.asc
| ERRATA: old instructions told you to import spfilter's public key.
| this is not needed anymore, as the detached signature of the *.tgz
| tarball has been discontinued and deprecated for security reasons:
| $ gpg --import ./spfilter/doc/spfilter-pubkey.asc
| $ gpg --verify ./spfilter-$VERSION.tgz.asc ./spfilter-$VERSION.tgz
%CFG:
workdir "." # opt_w
debug 0 # opt_d
verbose 0 # opt_v
interval
sources "ONE,TWO,THREE,..." # opt_s and/or @ARGV
formats "one,two,three,..." # opt_f
email $sources # show in HTTP_USER_AGENT
tempfile "" # tie temporary hash, opt_t
xmlfile "" # opt_x (additional local config.xml)
cachedir "./cache" # opt_c
outdir "./outdir" # opt_o
pubdir "./publish" # opt_p
exec_user "nobody" # should use 'spfilter' if available
exec_uid -1 # uid from exec_user
exec_path "/bin:/usr/bin:/usr/local/bin" # should be safe
exec_http "wget ..." # alternative to Perl::LWP
exec_rsync "rsync ..." # recommended
pack_ext '(bz2|gz)' # gz not tested
exec_bunzip "bunzip ..." # strongly recommended
exec_gunzip "gunzip ..." # you tell me if it works ;)
exec_bzip "bzip2 ..." # for republishing on primary sites
exec_diff "diff ..." # used if available
exec_patch "patch ..." # not implemented yet
exec_gpgv "gpgv ..." # strongly recommended
keyring 'spfilter-keyring.gpg' # $opt_k (use NULL to disable)
zone_name "localhost" # $opt_z (first field)
zone_addr "127.0.0.1" # $opt_z (second field)
zone_ttl "43200" # $opt_z (third field)
program "spfilter"
version "0.00"
date "YYMMDD"
useragent "$program/0.00"
magic
yymmdd "YYMMDD" # always uses UTC (~GMT)
count_cached 0++
count_notmodi 0++
count_fetched 0++
%SRC:
name # auto-generated, don't mess with
url_primary # experimental, for use by redistributing sites only
url # multiple tried in order, {FILENAME} and {YYMMDD} expanded
interval # interval in days between updates
type
alias # experimental, use the same file in ./cache
filename # explicitly set filename in cache, defaults to $name
minsize 1 # min kb, reject anything below 513 bytes (rounded)
maxsize 2000 # max kb, protect somewhat against dos
conflict # only one single value handled
regexp_include # perl-regexp, will be enclosed in =~/.../
regexp_exclude # perl-regexp, will be enclosed in !~/.../
option # experimental axfrexpand, notext, html2text
tag # prepend to each line of output, defaults to $name
prepend # DEPRECATED hack, comes just before $append ;)
append # construct url, $addr appended to string
cache_status # -1: 304 Not Modified, 0: 404 Error, 1: 200 OK
cache_fname # name of cached file
cache_ifmod # contains If-Modified-Since date for HTTP
cache_fetched # name of fetched file
%FMT:
name # the key itself, dont set or change
type "txt"
publish 0
magic_update
include "" # include content verbatim in output
notation "octet"
linestart
separator "\t"
lineend
secondline # print additional lines, for bind and tinydns
secondlinestart
option # [bindhack|tinydnshack|tcpserverhack]
- spfilter runs with ActiveState which has all modules already included
http://aspn.activestate.com/ASPN/Downloads/ActivePerl/Source
http://downloads.activestate.com/ActivePerl/Windows/5.6/ActivePerl-5.6.1.633-MSWin32-x86.msi
spfilter-bunzip-cmd.zip contains bunzip2.exe and a cmd-sample
- spfilter has been reported to run on recent versions of cygwin,
some more documentation welcome. (check out ./docs)
http://cygutils.netpedia.net/
http://webmaster.indiana.edu/perl56/pod/perlcygwin.html
http://search.cpan.org/author/COOPERCL/XML-Parser-2.31/
(or http://search.cpan.org/author/MSERGEANT/XML-SAX-0.11/)
http://search.cpan.org/dist/XML-Simple/ (or ./XML/Simple.pm)
Note: there are reports after 'perl Makefile.PL' the variables
PERL, FULLPERL and PERL_CORE may all have assigned '0' (zero)
and 'make' will fail badly. (W2K, 2002-11-01)
- WARNING: console application only, no colors and mouse support !
mailservers with less than a few thousand mails per day are
better off using traditional dnsbl-queries.
Windows 2000 with a fixed ip and bind9 may still be used as a
dnsbl-server for zones generated by spfilter.
nameserver, consulting and support available for serious projects
- consistent handling of keywords OK (WHITELIST) and FREEMAIL (MXCHECK)
- aggregate input, create index hash from textual description
- aggregate output, optionally into cidr or range
- modularize the code, split into input.pl, output.pl and spfilter.pm
- curses-based selection for sources and formats (contribute!)
- update from daily diffs, uses only 2..10% (see /data/input/diff)
- documentation (may be submitted via sourceforge project home)
(suggestions, patches and working code always welcome)
- major new features won't be implemented, as already announced;
development will concentrate on Bliab, as time permits
- support for rbldns-style sources with expanding exceptions,
primarily for use with Easynet Dynablock (10mb instead of 50mb)
- 'DYNABLOCK' renamed to 'EASYNET_DYNA', uses now rbldns-format
- 'PERMBLOCK' renamed to 'EASYNET', old legacy names still work
- several minor adjustments in spfilter-config.xml as always
- format 'cidr', appends /8,/16,/24 or /32 for use in firewalls
0.59++current (updated 2003-02-16)
- Wirehub! officially renamed to Easynet, zones adjusted (2002-05-17)
- added official support for SBL via http://mirror.bliab.com
- added option 'html2text' for sources, for SBL until they fix it
- omit whitespace between tag and text if tag ends with '=' (for SBL)
- key 'prepend' deprecated, include text into key 'tag' instead
- added option 'notext' for sources, for ISP_ from blackholes.us
- changed predefined text for ISP_ to something more polite
- bugfix: make default TTL configurable for tinydns
0.59 (updated 2003-02-01)
- 2003-05-10: maintenance, recent spfilter-config.xml integrated
- no big changes, but it's time to bump the version
as always, wait a few days if you don't want to discover new bugs
- reject source if Content_Length and Response_Length don't match (LWP only)
- bugfix: spfilter died at zero-sized files with Status 200 (LWP only)
- first experimental bits for hashed strings needs $ENV{DEBUG_STRHASH}
and some code still missing - contributions welcome
- AXFR-sources: expand parent block of exceptions (void CNAME's)
may be used for blacklisting now, but keep an eye out for bugs
new sources: DYNABLOCK_EXPAND, NOMORE_EXPAND, FIVETEN_EXPAND
0.58++ (2003-01-11)
- Makefile: fixed fetching of pubkey, as noted by Bert Driehius
- added format: 'rbldnsd' and 'queryperf' (see ./doc/rbldnsd.txt)
- bugfix: make -q really quiet (patch submitted by robhardy)
- bugfix: empty retrieved file caused spfilter to die()
- bugfix: removed 'make clean' because of non-portable 'rm -d'
- spfilter-config.xml embedded into spfilter.pl for safe fallback
everything is now contained in one single file (spfilter.pl)
this solves the 'chicken and egg' problem with live-update.
- live-update functional (requires gpgv in $PATH and keyring.asc)
spfilter will use the latest cached xml, if signature passes
Note: having 'spfilter-config.xml' in . or /usr/local/etc
effectively disables this feature. So does $ENV{NOCACHED}
- Windows/ActiveState Perl (MSWin32) tested and supported (32mb cap)
- fixed bug with tinydns_uce: new key 'option="tcpserverhack"'
- add new keys 'option="bindhack"' and 'option="tinydnshack"'
- add new format 'rblsmtpd' for qmail, sets variable $RBLSMTPD
- add experimental support for DRBL, requires hacked build_drbl
- fixed RELAYCLIENT="" for tcpserver, as noted by Marek Les
- Target 'clean' in Makefile removed, 'rm -f' not portable on Linux
0.58 (2002-11-01)
- predefined set of sources (see spfilter-config.xml)
just list the names (SPAM, RELAYS, DYNAMIC, COUNTRY etc),
they will be expanded accordingly. XML-viewer available at:
http://spfilter.openrbl.org/code/xml-view.php#PRESET
exclude sources from set with leading dash: 'SPAM,-PERMBLOCK'
WARNING: listing all will cost lots of ram, cpu and bandwidth!
- installation and upgrade simplified, you only need the Makefile
'make update' will fetch and verify the most recent files.
- renamed 'LIVE_UPDATE' to 'update->spfilter-config.xml' for the
forthcoming database integration. Update-feature now fully
working, but still considered experimental.
- verification of embedded signatures with gpg completed
spfilter refuses to '-p ./publish' without gpgv and keyring
WARNING: contents outside of signatures not checked yet !
- spfilter now uses LWP::UserAgent (libwww-perl) by default, with
fallback to wget. Set NOLWP=1 to disable, and report why.
- argument '-e responsible@contact' will be passed in HTTP_USER_AGENT
robots should always provide means to contact the operator.
- arguments now processed by Getopt::Long, needs some tweaking
- all output done via centralized function, log to syslog planned
0.57 (2002-10-20)
- spfilter-config.xml and spfilter-pubkey.gpg may be in /usr/local/etc
- embedded signature for spfilter.pl, just like spfilter-config.xml
- verify gpg-signature of spfilter-config.xml if gpgv exists in $PATH
WARNING: gpg-functionality disabled without spfilter-keyring.gpg
- key 'grep' renamed to 'regexp_include', key 'regexp_exclude' added
- argument -s optional, just list all sources on the commandline
- check path for available binaries (wget, bunzip2, gpg, etc) on startup
- experimental key 'primary_url' for use by redistributing primary sites
- not-modified-since now works with multiple updates per day
- cleanup: XML::Simple.tgz and .htaccess removed from sourceball
- detached signature discontinued, too many passphrases to type...
0.56 (2002-10-07)
- spfilter has been reported to run on W2K with recent Cygwin
- LIVE_UPDATE moved into own section, incompatible with 0.55
- spfilter-config.xml: embedded cleartext-signature integrated
- couple of internal structural changes, still compatible
0.55 (2002-10-03)
- file spfilter-$VERSION.pl renamed to spfilter.pl
- file spfilter-remote.xml renamed to spfilter-config.xml
- argument '-p ./publish' republish *.bz2 via http
- smart prefix for output-files: use the name of the
source if possible, or the key 'filename' if defined
- auto-updated config in ./cache/spfilter-config.xml
experimental, file has to be copied manually into ../
- argument '-t NULL' <64mb or '-t /tmp/file.gdbm' <32mb
- new key 'tag': prepended to the output instead of the name
- support for blackholes.us, currently defined sources:
ISP_CIBERLYNX,ISP_ELI,ISP_HE,ISP_INFLOW,ISP_INTERNAP,ISP_VERIO
- select all the ISP_*-sources via (pseudo-)wildcard: 'ISP_'
0.54 (2002-09-19)
- header If-Modified-Since used for HTTP if applicable
- this made a couple of internal changes necessary:
only wget supported, bunzip2 used instead of bzcat
- license changed from BSD to QPL, but still open-source
- DEFAULT now includes SPEWS Level #1 (instead of Level #2)
- dnsbl-in-a-box: specify zone and addr via argument -z
usage: -f bind,tinydns -z dnsbl.example.com:11.22.33.44
- experimental source: FIVETEN and DYNABLOCK (expand $GENERATE)
- experimental key 'grep' added for building of meta-blacklists
- FLOWGOAWAY,TAIWAN and HONGKONG now mirrored via HTTP/bz2
- source tests with perl 5.005 and 5.8, on FreeBSD and Linux
0.53 (2002-07-15)
- support for cdb, db and gdbm even with multiple formats
- built-in manpage outsourced (./doc/spfilter.pod[.html])
0.52 (2002-06-30)
- check, ignore and warn on invalid cidr-mask (SPEWS did it again)
- pseudo-source 'DEFAULT' added, for use with 'DEFAULT,...'
- send $SRC{name}:$SRC{name} via HTTP_BASIC_AUTH (experimental)
- create diff on each successfully fetched source
- support to import all files from local dir://, see BADFROM
- support for name-based lists added, see SPAMLIST_EXTENDED
- argument -o also takes filename (if only one format specified)
0.51 (2002-06-18)
- support for rsync via fork; setuid and reaper implemented
- wirehub's PERMBLOCK now available via rsync, saves 90%
- DSBL[_whatever] available via rsync, xml-config updated
- multiple url's will be tried in order of listing
- set env NORSYNC=1 if you don't have rsync (lame excuse)
or dest-port 6666 has been firewalled (as on sourceforge.net)
- format sqldump (INSERT INTO ...) added to xml-config
0.50 (2002-06-16)
- configuration split up in two different files (local and remote)
- config-format changed to xml, spfilter now requires XML::Simple
- fallback to older cached copy (max 30 days) on failures from fetch
0.4x (2002-06-11, consolidated)
- read sources from local filesystem, use 'file://...' or ./path.
- send output to dup of STDOUT with argument '-o STDOUT' or '-o -'
- all textual/informational output will be sent to STDERR only
- Support for PDL Dynamic Dialup List (enabled by default)
- Support for RSL Visi Relay Stop List (disabled by default)
- Support for KOREA and CHINA (http://www.okean.com/asianspamblocks.html)
- Support for Wirehub's PERMBLOCK (great list, now enabled by default)
- bugfix on getgrnam and chuid suggested by andrew@***.au
- tinydns dnsbl output-format added
- argument -u: drop privileges on fork/exec (see $CFG{user})
0.3x (2002-04-01, consolidated)
- magic_update=>1 preserves alien lines in existing datafiles
- bug with postfix reported by Ian Vaudrey - quotes removed
- major code cleanup & rearrangement
- initial release, based on fetchspews.pl
homepage: http://spfilter.openrbl.org/
mirror: http://mirror.openrbl.org/spfilter/code/