November 13th, 2012, 10:27 AM
Token segmentation in SpamAssassin
I'm researching Vietnamese anti-spam filtering, and I'm trying to improve SpamAssassin for it.
As far as I know, SpamAssassin has a Bayesian filter that detects spam email based on tokens. In my understanding, the tokens are segmented on whitespace. That suits English, but it doesn't suit Vietnamese, where a single word is often written as several syllables separated by spaces. So I want to change SpamAssassin's token segmentation to fit Vietnamese, but I don't know where the token segmentation code is located in the SpamAssassin source.
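To show what I mean, here is a minimal Python sketch. It is not SpamAssassin's actual code (SpamAssassin is written in Perl); the function names and the tiny word list are made up for illustration only. It compares plain whitespace splitting with a simple longest-match lookup against a lexicon of multi-syllable words.

```python
import unicodedata


def _norm(text):
    # Normalize to NFC so composed/decomposed Vietnamese diacritics compare equal.
    return unicodedata.normalize("NFC", text).lower()


# Hypothetical mini-lexicon of multi-syllable Vietnamese words (illustration only).
LEXICON = {tuple(_norm(w).split()) for w in ("khuyến mãi", "miễn phí", "sinh viên")}
MAX_SYLLABLES = max(len(w) for w in LEXICON)


def whitespace_tokens(text):
    # What I assume a whitespace-based tokenizer does: one token per syllable.
    return _norm(text).split()


def lexicon_tokens(text):
    # Greedy longest-match: join consecutive syllables when they form a known word.
    syllables = _norm(text).split()
    tokens = []
    i = 0
    while i < len(syllables):
        for n in range(MAX_SYLLABLES, 1, -1):
            candidate = tuple(syllables[i:i + n])
            if candidate in LEXICON:
                tokens.append(" ".join(candidate))
                i += n
                break
        else:
            # No multi-syllable word starts here; keep the single syllable.
            tokens.append(syllables[i])
            i += 1
    return tokens


if __name__ == "__main__":
    subject = "Khuyến mãi miễn phí"
    print(whitespace_tokens(subject))
    print(lexicon_tokens(subject))
```

With whitespace splitting, "khuyến mãi" (promotion) becomes two separate tokens, "khuyến" and "mãi", so the filter never sees the actual word. A real Vietnamese word segmenter would be more sophisticated than this lookup, but this is the kind of change I would like to make.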
I hope someone can point me to it.
Thanks so much!