Author |
Message |
bubbas n00b
Joined: 29 Dec 2003 Posts: 36 Location: Germany
|
Posted: Sat May 07, 2005 1:13 pm Post subject: Bogotrainer Thread |
|
|
Bogotrainer
I opened a new thread for the Bogotrainer script from the Email System For The Home Network - Howto
The original script was written by Chris Smith.
I) What is it?
--------------------------------
Bogotrainer is a little helper to automate the training of your bogofilter
spamfilter solution. It takes your mailfolder and registers all spam and ham you
have in the specified directories in the bogofilter database. On future runs it
only corrects misdetected spam or ham which you have moved to a corresponding
folder.
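A minimal sketch of the idea described above, not the actual bogotrainer code: it walks the cur/ and new/ subdirectories of a Maildir folder and pairs each message with a bogofilter call. The flags -s, -n, -Ns and -Sn come from the bogofilter man page; the function names and the Python 3 style are assumptions made for illustration.

```python
import subprocess
from pathlib import Path

# bogofilter flags: -s registers spam, -n registers ham,
# -Ns moves a message from ham to spam, -Sn from spam to ham.
REGISTER = {"spam": "-s", "ham": "-n"}
CORRECT = {"spam": "-Ns", "ham": "-Sn"}


def maildir_messages(folder):
    """Return the message files found in a Maildir folder's cur/ and new/."""
    folder = Path(folder)
    files = []
    for sub in ("cur", "new"):
        subdir = folder / sub
        if subdir.is_dir():
            files.extend(p for p in sorted(subdir.iterdir()) if p.is_file())
    return files


def plan(folder, kind, correct=False):
    """Pair each message in folder with the bogofilter argv that would train on it."""
    flag = (CORRECT if correct else REGISTER)[kind]
    return [(["bogofilter", flag], msg) for msg in maildir_messages(folder)]


def run(plan_items):
    """Feed every planned message to bogofilter on stdin (requires bogofilter)."""
    for argv, msg in plan_items:
        with open(msg, "rb") as fh:
            subprocess.run(argv, stdin=fh, check=True)
```

A first run would call something like run(plan(spam_folder, "spam")); later runs would only call run(plan(correction_folder, "spam", correct=True)) for mail you moved by hand. The folder arguments are placeholders for your own paths.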
You can find all the information on the project homepage: Bogotrainer-Homepage
If you have any suggestions, questions or problems, don't hesitate to post here. I will do my best!
News:
07.05.05 Bogotrainer 3.0b released
Changelog:
2005-05-07
* Version 3.0b
* added support for multiple spamfolders
* independent spam correction folder
* fixed problem running as cronjob
* moved configuration to config.py
* completely rewritten in OOP
* added support for fast mail registering (bogofilter bulkmode)
* added possibility of logging to logfile
* added support for forcing database overwrite with backup
* debug mode possible
* silent mode possible
* added support for commandline interface
* added version information in commandline
* added checkmode
2005-03-22
* Version 2.1.1
* fix for special characters in imap folder names
2005-03-06
* Version 2.1
* moved md_bgt.py to md_output.py
* added more tests for checking if all needed directories exist
* moved directory tests to md_dirtest.py, for more clarity
* added support for imap directories containing whitespaces
* fixed broken specific Hamfolder (.Ham) support
* fixed missing import sys in md_output.py
2005-01-06
* Initial public release 2.0 |
|
|
|
dgrant Apprentice
Joined: 28 May 2003 Posts: 158 Location: Vancouver, BC, Canada
|
Posted: Tue May 10, 2005 7:06 pm Post subject: |
|
|
The original bogotrainer was so simple. I could read the code and figure out exactly what was going on. Now bogotrainer is so long, I started to look at it but I just don't have the patience. What does this do compared to the original that I really need? And does it filter the courierimaphieracl and courierimapkeywords directories yet? I had to modify the script to do this. |
|
|
|
dgrant Apprentice
Joined: 28 May 2003 Posts: 158 Location: Vancouver, BC, Canada
|
Posted: Tue May 10, 2005 7:10 pm Post subject: |
|
|
That being said, bogotrainer is wonderful. But I wonder if the new one is too complex for new users to get up and running. The simple one I copied and pasted from the HOWTO was so easy to get going and so easy to debug when I noticed there was a small problem. I will probably try the new version eventually. |
|
|
|
bubbas n00b
Joined: 29 Dec 2003 Posts: 36 Location: Germany
|
Posted: Tue May 10, 2005 11:20 pm Post subject: |
|
|
OK, you are right, it's much more complex!
When I started with the EmailHowto and bogotrainer, I ran into the same problem you mentioned. And you know, I hadn't looked into Python before. But hey, the code was cool; I understood it almost without knowing anything about Python. So I corrected the problem with the courier folders, and I liked Python so much that I decided to learn it. But what could I do to practice? I started writing nice console output for bogotrainer and decided to post it for other people who perhaps wanted a version that runs without modification. But then there were some problems, and some people asked me for support of special characters and other features. So I started writing more and more functions, and that's why it is now that complex. But hey, it should work! The main part of the script is output, logging, debugging and checks for existing folders, etc. The important functions are still the same. It is now completely object-oriented, so it should be readable.
Here is a small overview:
bogotrainer.py - start the instances of the objects (OOP Python)
md_io.py - just input output for console
md_tests.py - check for existing directories (mailfolder, bogofilter, spamcorrectionfolders, etc.)
md_trainer.py - the really important functions (spamregistering, hamregistering, correcting messages)
I know what you mean, it is difficult to read code written by someone else when it is longer than a few lines! So that's the downside of the features I have added. Most people won't need them, but for people who haven't got programming experience it is easier to just add folder names in the config file than to hack them directly into the script ...
additional features (compared to original script):
* totally configurable with config.py file
(e.g. the courierimaphieracl and courierimapkeywords directories, or others in the future: just put them there)
* multiple spamfolder Support
* support for list of folders to ignore
* fast mail registering for huge amounts of mail (-s on commandline)
* nice output
* many checks if all necessary directories exist
* logging possibility
* debug possibility
* support for special characters in foldernames
* backup database
* possibility to create new database even if old exists (-f forcemode on commandline)
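To illustrate the config-file idea from the list above, here is a hypothetical sketch. The setting names (MAILDIR, SPAM_FOLDERS, IGNORE_FOLDERS) and the function are invented for this example and are not taken from the real config.py; treating every folder that is neither spam nor ignored as ham is also an assumption.

```python
from pathlib import Path

# Hypothetical settings; the real config.py may use different names.
MAILDIR = "~/.maildir"
SPAM_FOLDERS = [".Spam", ".Junk"]
IGNORE_FOLDERS = ["courierimaphieracl", "courierimapkeywords", ".Trash"]


def classify_folders(maildir, spam_folders, ignore_folders):
    """Split the subfolders of a Maildir into (spam, ham) name lists.

    Ignored names and the Maildir's own cur/new/tmp are skipped.
    Assumption: anything that is not a spam folder counts as ham.
    """
    root = Path(maildir).expanduser()
    skip = set(ignore_folders) | {"cur", "new", "tmp"}
    spam, ham = [], []
    for entry in sorted(root.iterdir()):
        if not entry.is_dir() or entry.name in skip:
            continue
        (spam if entry.name in spam_folders else ham).append(entry.name)
    return spam, ham
```

With this layout, adding another directory to skip means editing one list instead of touching the training code.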
I'm really happy Chris wrote the original script; it is much more readable than my code is now. Adding the features was a learning process for me, and I just wanted to post it here in case it is useful for someone. I don't want to convince anybody to use this script instead of the original! All thx to Chris!!
But thanks for your posting! I really understand what you want to say!!
cu
vale |
|
|
|
asimon l33t
Joined: 27 Jun 2002 Posts: 979 Location: Germany, Old Europe
|
Posted: Wed Jun 08, 2005 10:22 pm Post subject: |
|
|
Cool! It would be nice to have this included in the bogofilter ebuild. |
|
|
|
bubbas n00b
Joined: 29 Dec 2003 Posts: 36 Location: Germany
|
Posted: Wed Jun 08, 2005 10:39 pm Post subject: |
|
|
I don't know about ebuilds |
|
|
|
slomo n00b
Joined: 10 Jan 2004 Posts: 27
|
Posted: Thu Jun 09, 2005 3:53 pm Post subject: |
|
|
I had used bogofilter for some time, running all the scripts to teach it what was spam/ham and whatnot. It finally got to the point where there were more false positives and more spam slipping through, so I ditched it and went back to straight procmail filtering rules, which are much simpler in my judgement.
I would recommend sticking with something that is easy to understand and needs no training. It's pretty smart with some recipe setup, just DAGS on procmailrc. |
|
|
|
bubbas n00b
Joined: 29 Dec 2003 Posts: 36 Location: Germany
|
Posted: Thu Jun 09, 2005 6:36 pm Post subject: |
|
|
No false positives here, and about 1 undetected spam per day out of ca. 100 daily mails ...
Not a bad result for me!
cu
vale |
|
|
|
rpmohn Tux's lil' helper
Joined: 26 Aug 2003 Posts: 116 Location: Vermont
|
Posted: Thu Feb 09, 2006 4:46 pm Post subject: |
|
|
Nice! I just upgraded from some unknown bogotrainer version from Dec 2003 to v3.0b. I like the logging feature very much.
I use this in my procmail right now:
Code: |
:0fw
| /usr/bin/bogofilter -l -u -e -p -o 0.6 |
The "-o 0.6" adjusts the cutoff point of what's spam and what's not, and the "-l" writes logging to the syslog. I use the syslog logging to see how many HAMs and SPAMs I get each day, and now I'm hoping to use your bogotrainer log to track how many FNs and FPs are identified each day.
-RPM |
|
|
|
dgrant Apprentice
Joined: 28 May 2003 Posts: 158 Location: Vancouver, BC, Canada
|
Posted: Fri Feb 10, 2006 3:54 am Post subject: Just upgraded as well |
|
|
I just upgraded to bogotrainer 3.0b from some old unknown version. It works great! Upgrading was easy because the paths I used were the same as in the original, and 3.0b uses the same paths again. Excellent program. |
|
|
|
bubbas n00b
Joined: 29 Dec 2003 Posts: 36 Location: Germany
|
Posted: Sat Feb 11, 2006 6:12 pm Post subject: |
|
|
nice to read that someone still uses it
let me know if you have problems or want something implemented ...
salu2
vale |
|
|
|
Nimo Tux's lil' helper
Joined: 23 Nov 2003 Posts: 111
|
Posted: Sun Sep 03, 2006 9:36 am Post subject: |
|
|
Would it be possible to add support for maildrop as the mail filter instead of procmail? _________________ //Nimo |
|
|
|
bubbas n00b
Joined: 29 Dec 2003 Posts: 36 Location: Germany
|
Posted: Sun Sep 03, 2006 10:55 am Post subject: |
|
|
hey,
bogotrainer does not care about your mail filter. I just didn't mention other mail filters in the documentation because I don't use them.
So it should work like mentioned on the Bogofilter Homepage: http://bogofilter.sourceforge.net/man_page.shtml
Quote: |
This one is for maildrop, it automatically defers the mail and retries later when the xfilter command fails, use this in your ~/.mailfilter:
Code: |
xfilter "bogofilter -u -e -p"
if (/^X-Bogosity: Spam, tests=bogofilter/)
{
to "spam-bogofilter"
} |
|
change "spam-bogofilter" to your spam directory!
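For anyone who wants to check the same header outside of maildrop, here is a small Python sketch of the test that the quoted rule performs. It only assumes that the filtered message carries an X-Bogosity header beginning with "Spam, tests=bogofilter", as in the rule above; the function name is made up for the example.

```python
from email import message_from_string


def is_bogofilter_spam(raw_message):
    """Return True if the X-Bogosity header marks the message as spam.

    Mirrors the maildrop pattern /^X-Bogosity: Spam, tests=bogofilter/.
    """
    msg = message_from_string(raw_message)
    header = msg.get("X-Bogosity", "")
    return header.strip().startswith("Spam, tests=bogofilter")
```

A message without the header, or one tagged as ham, simply returns False, so it would stay in the normal delivery path.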
good luck
bubbas |
|
|
|
Nimo Tux's lil' helper
Joined: 23 Nov 2003 Posts: 111
|
Posted: Sun Sep 03, 2006 12:19 pm Post subject: |
|
|
But procmail is called two times inside of md_trainer.py, what are the right args to put to maildrop there? _________________ //Nimo |
|
|
|
bubbas n00b
Joined: 29 Dec 2003 Posts: 36 Location: Germany
|
Posted: Mon Sep 04, 2006 11:18 pm Post subject: |
|
|
hey,
yes, sorry! You are absolutely right. So much time has passed since I wrote bogotrainer....
I've looked at it. Unfortunately I'm far away from my Gentoo machine at the moment and I have never used maildrop.
I think the following should do it, as far as I understand the maildrop man page:
edit your ~/.mailfilter as described in my posting above,
then in the md_trainer.py file replace both lines:
Code: | os.system("/usr/bin/procmail -d $USER < "+msgpath_esc) |
with
Code: | os.system("/usr/bin/maildrop -d $USER < "+msgpath_esc) |
I hope that does the job. Test it with some mails in your spam and ham folders ...
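As a side note, the redelivery call can also be written without a shell, which avoids the need to escape the message path. This is only a sketch in Python 3 style, not the code in md_trainer.py; the helper names are invented here, and the "-d user" arguments are simply carried over from the lines above.

```python
import getpass
import subprocess


def mda_command(mda="/usr/bin/maildrop", user=None):
    """Build the delivery command as an argument list (no shell involved)."""
    return [mda, "-d", user or getpass.getuser()]


def deliver(msg_path, command):
    """Pipe one message file into the given command on stdin.

    Returns the command's exit status; 0 means it reported success.
    """
    with open(msg_path, "rb") as fh:
        return subprocess.run(command, stdin=fh).returncode
```

Because the path is opened in Python and passed as stdin, folder names with spaces or special characters need no escaping. Switching between procmail and maildrop then means changing only the first argument of mda_command.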
I will add support and test it next month!
good luck
vale |
|
|
|
Decibels Veteran
Joined: 16 Aug 2002 Posts: 1629 Location: U.S.A.
|
Posted: Mon Sep 04, 2006 11:45 pm Post subject: |
|
|
I've been using bogofilter for a long time with KMail. Works pretty well. Have never had a false positive yet. Still, I have them sent to trash, just
to take a look and make sure. Been thinking about just deleting them now that it has worked well for so long.
I also set up a cronjob to train it, so I just add new spam it misses once in a while to the isspam folder and don't have to worry about training, because
it's done once a day by the cronjob. _________________ “Support bacteria – they’re the only culture some people have.”
– Steven Wright |
|
|
|
|