To train SpamAssassin's Bayes filter, you need a large amount of spam mail to feed to it. If you only run a personal mail server, you normally won't have that much spam on hand.
However, you can download spam mails from the Untroubled.org Spam Archive.
Just download the archives you want (I recommend using only newer archives, e.g. from the last 2 years) and then run SpamAssassin's “sa-learn” command on them. I did it like this:
# Create a working directory and download the archives into it
mkdir -p /tmp/spam && cd /tmp/spam
# ... <as many archives as you want> ...
# Unpack the archives (you need to have 7z installed!) and
# delete each 7z file once it has been extracted successfully
for i in *.7z ; do 7z x "$i" && rm -- "$i" ; done
# Train your spam database
/usr/bin/sa-learn --username amavis --dbpath /var/lib/amavis/.spamassassin --spam /tmp/spam/*
# Remove all spam archive files again
rm -rf /tmp/spam
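To check that the training actually ended up in the database, you can ask sa-learn for its "magic" counters. The following is only a minimal sketch: the user name and database path are copied from the command above, and `bayes_counts` is a hypothetical helper name I made up to pull the spam and ham counts out of the `--dump magic` output.

```shell
#!/bin/sh
# Values copied from the training command; adjust them if your setup differs.
SA_USER=amavis
DBPATH=/var/lib/amavis/.spamassassin

# Hypothetical helper: reads "sa-learn --dump magic" output on stdin and
# prints the learned spam and ham message counts (third column).
bayes_counts() {
    awk '/non-token data: nspam/ { print "spam:", $3 }
         /non-token data: nham/  { print "ham:", $3 }'
}

# Only query the database if sa-learn is actually installed.
if command -v sa-learn >/dev/null 2>&1; then
    sa-learn --username "$SA_USER" --dbpath "$DBPATH" --dump magic | bayes_counts
fi
```

Keep in mind that with the default settings SpamAssassin only starts using the Bayes results once it has learned at least 200 spam and 200 ham messages, so the ham counter matters as well.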