MBF Knowledge Base

How should I train DSPAM?

Just let email come in as usual, and forward the messages that turn out to be spam. If you already have both an innocent (nonspam) corpus and a spam corpus, you can use the dspam_corpus tool to feed them into the system. It is NOT a good idea to feed DSPAM a bunch of spam without also feeding it a bunch of nonspam, as this can skew the dictionary and lead to false positives immediately. This is NOT because DSPAM requires a balanced corpus, but because of how tokens that appear in only one corpus are scored. Special safeguards are in place to prevent this under a normal load of incoming spam, but force-feeding DSPAM spam is not recommended. The best advice for training a dictionary is simply to act on the email you receive once DSPAM is set up. If you have a large user base, you may wish to create a global or merged set of data to provide users with out-of-the-box filtering. See the README for more information about global and merged groups.
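
To illustrate why tokens seen in only one corpus cause trouble, here is a minimal sketch in Python. It is NOT DSPAM's actual code or exact formula; it assumes a generic Graham-style per-token estimate, and the function names, the sample messages, the 0.01/0.99 clamps, and the 0.4 default for unseen tokens are all made up for the example. The point is only that an ordinary word such as "meeting" looks maximally spammy when no nonspam has been seen to offset it.

```python
from collections import Counter


def count_tokens(messages):
    """Count, for each token, how many messages contain it at least once."""
    counts = Counter()
    for message in messages:
        counts.update(set(message.lower().split()))
    return counts


def token_spam_probability(token, spam_counts, ham_counts, n_spam, n_ham):
    """Graham-style estimate of P(spam | token), clamped to [0.01, 0.99].

    Illustrative only; this is an assumed formula, not DSPAM's implementation.
    """
    spam_rate = spam_counts[token] / n_spam if n_spam else 0.0
    ham_rate = ham_counts[token] / n_ham if n_ham else 0.0
    if spam_rate + ham_rate == 0:
        return 0.4  # token never seen: neutral-ish default (assumption)
    return min(0.99, max(0.01, spam_rate / (spam_rate + ham_rate)))


# Hypothetical toy corpora.
spam = ["cheap pills for your meeting", "free pills order today"]
ham = ["meeting moved to friday", "notes from today meeting"]

spam_counts = count_tokens(spam)
ham_counts = count_tokens(ham)

# Trained on spam only: "meeting" has no nonspam evidence to offset it.
skewed = token_spam_probability("meeting", spam_counts, Counter(), len(spam), 0)

# Trained on both corpora: the same token now leans innocent.
balanced = token_spam_probability(
    "meeting", spam_counts, ham_counts, len(spam), len(ham)
)

print(f"spam-only training: {skewed:.2f}")
print(f"both corpora:       {balanced:.2f}")
```

With spam-only training, every token that happens to occur in the spam, including perfectly ordinary words, is pushed to the upper clamp. Under a scheme like this, a legitimate message containing those words would be at risk of a false positive until enough nonspam has been learned.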