Article: Working with Bayesian Categorizers
Subject: well, it's doing better than you think...
Date: 2003-11-20 08:24:47
From: Byron Ellis

When you specify test-percentage to be 50, you are actually pulling half of your data (at random) out of the training set, since you don't get to train on test data (the ability to predict that which you already know isn't really all that impressive :-) ). So in reality you're getting a 20-40% success rate from a categorizer trained on only about 75 documents rather than 150.
What you probably want is some sort of hierarchical model in which the source of a document can influence its categorization. Hm, I think I've got myself a post-qualifying paper topic. :-)
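
To make the arithmetic behind the held-out split concrete, here is a minimal sketch in Python. It is not the categorizer's own code: the function name `split_corpus`, its `test_percentage` parameter, and the placeholder document names are all hypothetical, chosen only to show that a 50 percent test split of 150 documents leaves about 75 to train on.

```python
import random


def split_corpus(documents, test_percentage, seed=None):
    """Randomly hold out test_percentage percent of documents for testing.

    Hypothetical helper for illustration; returns (training_set, test_set).
    """
    rng = random.Random(seed)
    shuffled = list(documents)      # copy so the caller's list is untouched
    rng.shuffle(shuffled)           # random order, so the held-out half is random
    n_test = round(len(shuffled) * test_percentage / 100)
    # The first n_test documents become the test set; the rest are for training.
    return shuffled[n_test:], shuffled[:n_test]


# Placeholder corpus of 150 documents (names are made up).
docs = [f"doc-{i}" for i in range(150)]
train, test = split_corpus(docs, test_percentage=50, seed=0)
print(len(train), len(test))  # 75 75
```

The two sets never overlap, which is the point: accuracy is measured only on documents the categorizer has not seen, at the cost of halving the training data.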