To train SpamAssassin's Bayes recognition, you need a large amount of spam mail to feed to it. If you are just running your personal mail server, you normally won't have that much spam.

However, you can download spam mails from the Spam Archive.

Just download the archives you want (I recommend using only newer archives, e.g. from the last 2 years) and then run SpamAssassin's “sa-learn” command. This is how I did it:

SpamAssassin: Train Bayes Recognition
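A minimal sketch of such a training run might look like the following. It assumes the downloaded archives have already been extracted into a directory with one message per file. The path `~/spam-archive`, the variables `SPAM_DIR` and `SA_LEARN`, and the helper function names are placeholders for illustration. Only the `sa-learn` options `--spam`, `--progress` and `--dump magic` come from SpamAssassin itself.

```shell
#!/bin/sh
# Assumption: extracted spam messages live here, one message per file.
SPAM_DIR="${SPAM_DIR:-$HOME/spam-archive}"
# Command used for learning; overridable (e.g. SA_LEARN=echo) for a dry run.
SA_LEARN="${SA_LEARN:-sa-learn}"

# Feed every message in the given directory to the Bayes database as spam.
train_spam() {
    "$SA_LEARN" --spam --progress "$1"
}

# Print the Bayes database counters (number of learned spam/ham, tokens).
show_bayes_stats() {
    "$SA_LEARN" --dump magic
}

if [ -d "$SPAM_DIR" ]; then
    train_spam "$SPAM_DIR" && show_bayes_stats
else
    echo "No spam directory at $SPAM_DIR, nothing learned" >&2
fi
```

Run it as the user whose Bayes database SpamAssassin actually uses, otherwise the learned tokens may end up in a different database than the one your mail server reads. The Bayes filter also needs ham (legitimate mail) to work properly, which you can learn the same way with `sa-learn --ham` on a folder of your own good mail.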
