I’d like to write a spam filter program with SVM and I choose libsvm


Asked: May 22, 2026


I’d like to write a spam filter with an SVM, and I chose libsvm as the tool.
I have 1000 good mails and 1000 spam mails, which I split into:
700 good_train mails and 700 spam_train mails
300 good_test mails and 300 spam_test mails
Then I wrote a program to count how many times each word occurs in each file, and got results like:

good_train_1.txt:  
today 3  
hello 7  
help 5  
...

I learned that libsvm needs a format like:

1 1:3 2:1 3:0
2 1:3 2:3 3:1
1 1:7 3:9

as its input. I know that 1, 2, 1 are the labels, but what does 1:3 mean?
How can I convert what I’ve got into this format?


1 Answer

Editorial Team · Answer 1 · 2026-05-22T22:42:59+00:00


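Most likely, the format is

classLabel attribute1:count1 ... attributeN:countN

Here classLabel is the class of the mail (for example 1 for good and 2 for spam), each attribute is the index of a distinct word, and each count is how many times that word occurs in the mail. So 1:3 means that word number 1 occurs 3 times. N is the total number of distinct words in your text corpus. Check the documentation for the tool you are using (or its source) to see whether you can use a sparser format that omits attributes with count 0. The third line of your example, 1 1:7 3:9, skips attribute 2, which suggests that you can.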
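To make the conversion concrete, here is a minimal Python sketch, not a definitive implementation. It assumes the per-file word counts are already loaded as dictionaries, that label 1 means good and 2 means spam, and that zero counts may be omitted with indices written in ascending order (which is how I understand the libsvm README). The names build_vocabulary and to_libsvm_line are made up for this example and are not part of libsvm.

```python
def build_vocabulary(docs):
    """Give every distinct word a 1-based index (hypothetical helper)."""
    words = sorted({word for counts in docs for word in counts})
    return {word: index for index, word in enumerate(words, start=1)}


def to_libsvm_line(label, counts, vocab):
    """Format one mail as 'label index:count ...' with ascending indices.

    Words missing from the vocabulary and zero counts are skipped,
    which assumes the sparse form of the format is accepted.
    """
    pairs = sorted(
        (vocab[word], count)
        for word, count in counts.items()
        if word in vocab and count > 0
    )
    return " ".join([str(label)] + [f"{index}:{count}" for index, count in pairs])


# Word counts as in the question; the spam counts are invented for illustration.
good = {"today": 3, "hello": 7, "help": 5}
spam = {"hello": 1, "free": 4}

# Build the vocabulary from the training mails only, then reuse it for test mails.
vocab = build_vocabulary([good, spam])
lines = [to_libsvm_line(1, good, vocab), to_libsvm_line(2, spam, vocab)]
print("\n".join(lines))
```

The same vocab has to be used for the training and the test files, otherwise index 1 would refer to different words in each. Words that appear only in test mails are simply dropped here, since the model has no feature for them.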
