Channel: KNIME RSS

Document classification: 50% classified as pos/neg and 50% unclassified -> how to predict the rest?


Dear KNIME community,

I am new to this forum and looking for help. This is my first post ever.

A short description of what I want to build:

I have 7,000 posts from social media in an Excel file. I added a column called "issue".

I read 3,500 posts and classified as "positive" the ones I considered relevant (is an issue) and as "negative" the ones I considered not relevant (is not an issue).

So I built a whole workflow for the 3,500 classified posts and found that the SVM gave better results than the decision tree or k-nearest neighbour.

I didn't classify the other 3,500 posts because I thought the system would learn from the 3,500 classified posts and predict whether there is an issue or not.

BUT:

If I include the other 3,500 unclassified posts, they get the class "undefined", and the system treats that as a class too.

So the confusion matrix shows positive, negative and undefined values.

I want the system to take what it learned from the classified data and apply it to the undefined data.
Does anyone know which nodes I need to include?
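
To make clearer what I am after, here is a minimal sketch in plain Python (not KNIME) of the behaviour I would expect. Everything in it is an assumption for illustration only: the sample posts are made up, the function names split_rows, train and predict are hypothetical, an empty label is represented as None, and a toy word-count classifier stands in for the SVM. The point is only that the unclassified rows are separated out before training, so "undefined" never becomes a class, and the trained model is then applied to them.

```python
from collections import Counter

# Hypothetical sample rows: (post text, issue label).
# None marks a post that has not been classified yet.
posts = [
    ("the app crashes on login", "positive"),
    ("login fails and the app crashes again", "positive"),
    ("lovely weather today", "negative"),
    ("great coffee this morning", "negative"),
    ("app crashes after update", None),
    ("coffee and weather are great", None),
]


def split_rows(rows):
    """Separate labeled rows (training data) from unlabeled rows (to predict)."""
    labeled = [(text, label) for text, label in rows if label is not None]
    unlabeled = [text for text, label in rows if label is None]
    return labeled, unlabeled


def train(labeled):
    """Count word occurrences per class (a toy stand-in for a real learner)."""
    counts = {}
    for text, label in labeled:
        counts.setdefault(label, Counter()).update(text.lower().split())
    return counts


def predict(model, text):
    """Pick the class whose training words overlap most with the text."""
    words = text.lower().split()
    scores = {label: sum(c[w] for w in words) for label, c in model.items()}
    # sorted() makes the tie-break deterministic (alphabetical by class name).
    return max(sorted(scores), key=scores.get)


labeled, unlabeled = split_rows(posts)
model = train(labeled)  # the model only ever sees "positive" and "negative"
predictions = {text: predict(model, text) for text in unlabeled}

for text, label in predictions.items():
    print(f"{label}: {text}")
```

With this split, a confusion matrix would be computed only on held-out classified posts, so it would contain just the two real classes.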

Kind regards,
Torsten

