wetkitty n00b
Joined: 26 Sep 2003 Posts: 16 Location: Baker City, OR
|
Posted: Thu Apr 07, 2005 5:57 pm Post subject: Spamassassin - Bayes help needed. [SOLVED] |
|
|
First, the numbers:
SA version 2.63
Mail server installed using Sabrex's how-to:
https://forums.gentoo.org/viewtopic-t-171499-highlight-spamassassin+qmail.html
local.cf (relevant parts, anyway):
Code: |
# Text to prepend to subject if rewrite_subject is used
subject_tag *****SPAM*****
report_header 1
# Encapsulate spam in an attachment
report_safe 1
add_header all Status _YESNO_, hits=_HITS_ required=_REQD_ tests=_TESTS_ autolearn=_AUTOLEARN_ version=_VERSION_
# Use terse version of the spam report
use_terse_report 0
# Enable the Bayes system
use_bayes 1
bayes_min_ham_num 200
bayes_min_spam_num 200
bayes_use_hapaxes 1
# Enable Bayes auto-learning
auto_learn 1
auto_learn_threshold_nonspam 1.0
auto_learn_threshold_spam 7.0
bayes_path /root/.spamassassin/bayes |
spamassassin -D --lint outputs the following regarding bayes:
Code: |
debug: using "/usr/share/spamassassin" for default rules dir
debug: using "/etc/mail/spamassassin" for site rules dir
debug: using "/root/.spamassassin" for user state dir
debug: using "/root/.spamassassin/user_prefs" for user prefs file
debug: bayes: 23462 tie-ing to DB file R/O /root/.spamassassin/bayes_toks
debug: bayes: 23462 tie-ing to DB file R/O /root/.spamassassin/bayes_seen
debug: bayes: found bayes db version 2
debug: Score set 3 chosen.
debug: Initialising learner
debug: Loading languages file...
debug: Language possibly: en,sco
debug: is Net::DNS::Resolver available? yes
debug: trying (3) amazon.de...
debug: looking up MX for 'amazon.de'
debug: MX for 'amazon.de' exists? 1
debug: MX lookup of amazon.de succeeded => Dns available (set dns_available to hardcode)
debug: is DNS available? 1
debug: all '*From' addrs: ignore@compiling.spamassassin.taint.org
debug: running header regexp tests; score so far=0
debug: running body-text per-line regexp tests; score so far=2.077
debug: bayes corpus size: nspam = 2195, nham = 2557
debug: uri tests: Done uriRE
debug: tokenize: header tokens for *F = "U*ignore D*compiling.spamassassin.taint.org D*spamassassin.taint.org D*taint.org D*org"
debug: tokenize: header tokens for *m = " 1112896121 lint_rules "
debug: bayes token 'somewhat' => 0.0919180934020199
debug: bayes: score = 0.0919180934020198
debug: bayes: 23462 untie-ing
debug: bayes: 23462 untie-ing db_toks
debug: bayes: 23462 untie-ing db_seen |
and
Code: |
debug: running meta tests; score so far=4.984
debug: is spam? score=3.46 required=5.5 tests=BAYES_01,DATE_MISSING,DCC_CHECK,NO_REAL_NAME
|
Notice that Bayes is used (BAYES_01 fires) and that the rule's weighting shifts the score. This is how I would like it to work for real, but look at the headers from mail processed by spamd using the same config:
Code: |
X-Spam-Status: Yes, hits=33.9 required=5.5
X-Spam-Level: +++++++++++++++++++++++++++++++++
X-Spam-Report: SA TESTS
1.1 SARE_HEAD_HDR_XSPAM Message headers used which identify spam
2.5 MANGLED_SOMA BODY: mangled Soma
0.6 J_CHICKENPOX_32 BODY: 3alpha-pock-2alpha
2.3 MANGLED_PHRMCY BODY: mangled pharmacy
2.3 MANGLED_AFFORD BODY: mangled affordable
0.1 SAVE_UP_TO BODY: Save Up To
2.5 MANGLED_CIALIS BODY: mangled Cialis
2.3 MANGLED_ONLINE BODY: mangled online
2.5 MANGLED_AMBIEN BODY: mangled ambien
2.5 MANGLED_XANAX BODY: mangled xanax
0.6 J_CHICKENPOX_36 BODY: 3alpha-pock-6alpha
0.6 J_CHICKENPOX_12 BODY: 1alpha-pock-2alpha
1.8 RAZOR2_CF_RANGE_51_100 BODY: Razor2 gives confidence between 51 and 100
[cf: 100]
1.1 MIME_BASE64_TEXT RAW: Message text disguised using base64 encoding
0.9 RAZOR2_CHECK Listed in Razor2 (http://razor.sf.net/)
1.8 DCC_CHECK Listed in DCC (http://rhyolite.com/anti-spam/dcc/)
0.1 RCVD_IN_NJABL RBL: Received via a relay in dnsbl.njabl.org
[80.134.75.72 listed in dnsbl.njabl.org]
1.5 DRUGS_ERECTILE_OBFU Obfuscated reference to an erectile drug
1.0 DRUGS_ERECTILE Refers to an erectile drug
1.0 DRUGS_ANXIETY_OBFU Obfuscated reference to an anxiety control drug
0.0 DRUGS_SLEEP Refers to a sleep aid drug
0.0 DRUGS_ANXIETY Refers to an anxiety control drug
0.0 DRUGS_MUSCLE Refers to a muscle relaxant
2.2 SARE_MULT_RATW_02 Spammer sign in headers
1.0 DRUGS_ANXIETY_EREC Refers to both an erectile and an anxiety drug
0.5 DRUGS_SLEEP_EREC Refers to both an erectile and a sleep aid drug
1.0 DRUGS_MANYKINDS Refers to at least four kinds of drugs |
No reference to Bayes - ever.
So my question is:
Is Bayes being used by spamd? If so, where and how? If not, what needs to be done to get it working like the test run? _________________ 2x Sony VAIO FX-215's w/Stage1 installs
Last edited by wetkitty on Thu Apr 14, 2005 10:31 pm; edited 1 time in total |
|
wetkitty n00b
Joined: 26 Sep 2003 Posts: 16 Location: Baker City, OR
|
Posted: Wed Apr 13, 2005 1:04 am Post subject: Bump |
|
|
Just a bump to see if there are any SpamAssassin gurus out there today. |
|
giant Tux's lil' helper
Joined: 01 Aug 2002 Posts: 107
|
Posted: Wed Apr 13, 2005 8:49 pm Post subject: |
|
|
Hmm, how long has your server been running?
What kind of mail traffic are we talking about?
Are you using some sort of autolearn for missed spam mails?
I don't see anything wrong in your config...
If Bayes is working you should see something like this:
Code: |
0.0 HTML_MESSAGE BODY: HTML included in message
3.0 HTML_IMAGE_ONLY_08 BODY: HTML: images with 400-800 bytes of words
0.2 MIME_HTML_ONLY BODY: Message only has text/html MIME parts
-2.6 BAYES_00 BODY: Bayesian spam probability is 0 to 1%
[score: 0.0000]
0.0 MIME_QP_LONG_LINE RAW: Quoted-printable line longer than 76 chars
-0.2 AWL AWL: From: address is in the auto white-list
|
Are you sure that you start spamd with the right /etc/conf.d/spamd settings?
In my case, I have a special user set up where I store the Bayes DBs; with a new setup I am testing MySQL storage.
Are the owner and permissions on your bayes path correct? If spamd cannot write to those files, it won't work...
Just a couple of thoughts... |
|
wetkitty n00b
Joined: 26 Sep 2003 Posts: 16 Location: Baker City, OR
|
Posted: Thu Apr 14, 2005 7:16 pm Post subject: |
|
|
Code: | Hmm, how long has your server been running? |
Uptime? I can't seem to get more than six months; it always ends up getting shut down to move to a different rack or some such thing.
This particular mail config is at least a year old. After I originally set it up, SpamAssassin appeared to be working great. It wasn't until I started looking to improve performance that I noticed the missing Bayes.
Code: | What kind of mail traffic are we talking about? |
20,000 messages a month, and I expect that to double in a few months.
Code: | Are you using some sort of autolearn for missed spam mails? |
A trusted friend who is (was) receiving lots of spam is using Thunderbird along with my IMAP server. Thunderbird drops spam into its junk folder (plus any he manually tags). A cron job runs sa-learn against that junk folder and restarts spamd.
In addition, auto-whitelist and autolearn are enabled in the configs. (I can see auto-whitelist working properly.)
Code: | If bayes is working you should see something like this:
Code:
0.0 HTML_MESSAGE BODY: HTML included in message
3.0 HTML_IMAGE_ONLY_08 BODY: HTML: images with 400-800 bytes of words
0.2 MIME_HTML_ONLY BODY: Message only has text/html MIME parts
-2.6 BAYES_00 BODY: Bayesian spam probability is 0 to 1%
[score: 0.0000]
0.0 MIME_QP_LONG_LINE RAW: Quoted-printable line longer than 76 chars
-0.2 AWL AWL: From: address is in the auto white-list |
That is exactly what I'm missing.
Code: | Are you sure that you start spamd with the right /etc/conf.d/spamd settings? |
Just one option there:
Code: | # Config file for /etc/init.d/spamd
# Some options:
#
# -a for auto-white-list
# -c to create a per user configuration file
# -L if you want to suppress DNS lookup
# -u USER to run as a user other than root
#
# for more help look in man spamd
SPAMD_OPTS="-a"
|
Code: | In my case, I have a special user set up where I store the Bayes DBs; with a new setup I am testing MySQL storage.
Are the owner and permissions on your bayes path correct? If spamd cannot write to those files, it won't work... |
Well, it uses the DBs when running 'spamassassin -D --lint', and it also scores using Bayes in that run. So there is something different between running 'spamassassin -D --lint' and '/etc/init.d/spamd start'. After reading your post, I'm going to start looking for a user or permission difference between the two.
Any other thoughts will be appreciated. I'll post back with any success (or failure).
Thanks |
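One way to look for a user or permission difference is to list every directory on the way down to the Bayes files and check owner, group and mode for each, since the user that spamd ends up running as needs execute (search) permission on all of them. A minimal sketch in POSIX shell follows; `ancestors` is a helper name made up for illustration, and the path in the usage comment is built from the bayes_path in the first post.

```shell
#!/bin/sh
# ancestors PATH
# Prints PATH and then each parent directory, one per line,
# stopping at "/" (or at "." for a relative path).
ancestors() {
    p=$1
    while :; do
        echo "$p"
        case $p in
            /|.) break ;;
        esac
        p=$(dirname "$p")
    done
}

# Usage (run as root so every component can be inspected):
#   ancestors /root/.spamassassin/bayes_toks | while IFS= read -r d; do
#       ls -ld "$d"
#   done
```

On systems with a recent util-linux, `namei -l` gives a similar per-component listing without any scripting.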
|
Ateo Advocate
Joined: 02 Jun 2003 Posts: 2022 Location: Vegas Baby!
|
Posted: Thu Apr 14, 2005 8:07 pm Post subject: |
|
|
Are the parameters [for local.cf] bayes_auto_learn_threshold_nonspam and/or bayes_auto_learn_threshold_spam an option in SA 2.63? If so, you probably want to set those. |
|
wetkitty n00b
Joined: 26 Sep 2003 Posts: 16 Location: Baker City, OR
|
Posted: Thu Apr 14, 2005 10:29 pm Post subject: Solved |
|
|
Code: | Are the parameters [for local.cf] bayes_auto_learn_threshold_nonspam and/or bayes_auto_learn_threshold_spam an option in SA 2.63? If so, you probably want to set those. |
Yes, those are set to 1 and 7 respectively.
I did find a solution, though, and it is related to permissions. I added debugging (-D) to /etc/conf.d/spamd and, after reviewing the logs, found that spamd was unable to read /root/.spamassassin.
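For reference, a sketch of what that change would look like, assuming the file held only the -a option shown earlier in the thread. -D is spamd's debug switch; where the output ends up depends on your syslog setup.

```shell
# /etc/conf.d/spamd (sketch: the existing -a option plus debugging)
# -a  auto-white-list, as already configured
# -D  debug output; remove it again once the problem is found
SPAMD_OPTS="-a -D"
```

Restart spamd afterwards (/etc/init.d/spamd restart) so the new options take effect.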
It would seem that running 'spamassassin -D --lint' as root works because it stays root the whole time, whereas the running spamd drops to a lower-privileged user after starting. So changing the directory permissions to
Code: | drwxrwx--- 2 root qscand 176 Apr 14 14:55 .spamassassin |
solved everything.
I would be curious to know if there are any security issues with having those permissions, though. |
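To put a number on what those permissions grant, here is a small sketch that prints a directory's group together with the group's permission digit. It assumes GNU coreutils `stat` (the -c option); `group_bits` and `dir_group_access` are helper names made up for this example.

```shell
#!/bin/sh
# group_bits MODE
# Prints the group permission digit (0-7) of an octal mode such as 770.
group_bits() {
    # The leading 0 makes the shell read the value as octal.
    echo $(( (0$1 >> 3) & 7 ))
}

# dir_group_access DIR
# Prints "group:digit" for DIR, using GNU stat.
dir_group_access() {
    mode=$(stat -c '%a' "$1") || return 1
    group=$(stat -c '%G' "$1") || return 1
    echo "$group:$(group_bits "$mode")"
}

# Usage (run as root):
#   dir_group_access /root/.spamassassin
```

For the listing above (drwxrwx--- root qscand) this would print qscand:7, meaning every member of qscand can read, create and delete files in that directory. Keep in mind that /root itself also has to be searchable by that user for the files to be reachable at all.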
|
FastTurtle Guru
Joined: 03 Sep 2002 Posts: 500 Location: Flakey Shake & Bake Caliornia, USA
|
Posted: Thu Apr 14, 2005 11:20 pm Post subject: |
|
|
I'm thinking that you may want to look into giving SA ownership of the file instead of changing perms, as that should be safer. |
|
Ateo Advocate
Joined: 02 Jun 2003 Posts: 2022 Location: Vegas Baby!
|
Posted: Thu Apr 14, 2005 11:49 pm Post subject: Re: Solved |
|
|
wetkitty wrote: | Code: | Are the parameters [for local.cf] bayes_auto_learn_threshold_nonspam and/or bayes_auto_learn_threshold_spam an option in SA 2.63? If so, you probably want to set those. |
Yes, those are set to 1 and 7 respectively.
I did find a solution, though, and it is related to permissions. I added debugging (-D) to /etc/conf.d/spamd and, after reviewing the logs, found that spamd was unable to read /root/.spamassassin.
It would seem that running 'spamassassin -D --lint' as root works because it stays root the whole time, whereas the running spamd drops to a lower-privileged user after starting. So changing the directory permissions to
Code: | drwxrwx--- 2 root qscand 176 Apr 14 14:55 .spamassassin |
solved everything.
I would be curious to know if there are any security issues with having those permissions though? |
What were the permissions beforehand? Giving write permissions to the group is something I, personally, try to avoid. |
|
giant Tux's lil' helper
Joined: 01 Aug 2002 Posts: 107
|
Posted: Fri Apr 15, 2005 7:51 am Post subject: |
|
|
Glad it worked.
Just to wrap this up, this is my conf:
Code: |
cat /etc/conf.d/spamd
# Config file for /etc/init.d/spamd
SPAMD_OPTS="-x -u spamd -H /home/spamd"
|
This disables per-user config, runs spamd as user spamd and stores everything under /home/spamd.
The directory then looks like this:
Code: |
spamd # ls -l /home/spamd/
total 9288
-rw------- 1 spamd spamd 38304 Apr 15 09:57 bayes_journal
-rw------- 1 spamd spamd 5210112 Apr 15 09:57 bayes_seen
-rw------- 1 spamd spamd 5210112 Apr 15 09:57 bayes_toks
|
This is the production server. On my test server I am testing a setup using MySQL, or rather a combination of amavisd with Maia Mailguard. |
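For anyone wanting to reproduce that layout (files owned by a dedicated account, mode 600, no group access), a rough sketch follows. It assumes the Bayes files live directly in one directory, that the dedicated account already exists, and that find supports -maxdepth (GNU and BSD find do). `tighten_bayes_dir` is a made-up helper name, not part of SpamAssassin. The chown step needs root and is skipped when no owner is passed.

```shell
#!/bin/sh
# tighten_bayes_dir DIR [OWNER]
# Makes DIR accessible to its owner only (700) and the regular files
# directly inside it readable/writable by the owner only (600).
tighten_bayes_dir() {
    dir=$1
    owner=${2:-}
    # Changing ownership requires root; skipped when OWNER is empty.
    if [ -n "$owner" ]; then
        chown -R "$owner" "$dir" || return 1
    fi
    chmod 700 "$dir" || return 1
    find "$dir" -maxdepth 1 -type f -exec chmod 600 {} +
}

# Usage (as root, with the account from the config above):
#   tighten_bayes_dir /home/spamd spamd
```

With this layout no group write permission is needed, but anything that trains the database (an sa-learn cron job, for example) has to run as that same account or as root.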
|