Hello everyone,
This is my first time working with R in KNIME and I am doing something wrong. I am trying to modify the Sentiment example from the KNIME Example Server (009007_SentimentClassification) to use it with tweets instead of IMDB reviews.
I got very poor results when I tried this workflow with my tweet data (66%), so I modified some parts of the original workflow, starting with the Punctuation Erasure node, because I noticed that this node has a bug, as described in
https://tech.knime.org/forum/knime-textprocessing/punctuation-erasure
The original part was: ... >> Punctuation Erasure >> Number Filter >> N Chars Filter >> Stop Word Filter >> Case Converter >> Snowball Stemmer >> Bag of Words Creator >> ...
I am trying to replace most of these nodes with one R Snippet node with the following contents:
    library('stringr')
    # punctuation erasure
    knime.in$"Document" <- str_replace_all(knime.in$"Document", "[[:punct:]]", "")
    # rename column
    colnames(knime.in) <- "Document2"
    # reduce multiple space characters to 1
    knime.in <- as.data.frame(apply(knime.in, 2, function(x) gsub("(?<=[\\s])\\s*|^\\s+|\\s+$", "", x, perl=TRUE)))
    # convert to lower case
    as.data.frame(sapply(knime.in, tolower))
    knime.out <- knime.in
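For comparison, here is a minimal standalone sketch of the cleaning steps the snippet is meant to perform, written so it can be run outside KNIME. It assumes a data frame with a single character column named Document. The sample tweets and the names tweets, clean_text and cleaned are made up for illustration and are not part of the workflow. Unlike the snippet above, it assigns the lower-cased result instead of discarding it, and it keeps the column as plain character strings.

```r
# Hypothetical sample input standing in for knime.in (not real tweet data)
tweets <- data.frame(
  Document = c("  Great   movie!!! Loved it :)  ", "Worst. Film. EVER..."),
  stringsAsFactors = FALSE
)

# Illustrative helper: applies the same steps as the R Snippet, in order
clean_text <- function(x) {
  x <- gsub("[[:punct:]]", "", x)  # punctuation erasure
  x <- gsub("\\s+", " ", x)        # collapse runs of whitespace to one space
  x <- trimws(x)                   # drop leading and trailing whitespace
  tolower(x)                       # convert to lower case
}

# Build the output with the renamed column, keeping strings as character
cleaned <- data.frame(
  Document2 = clean_text(tweets$Document),
  stringsAsFactors = FALSE
)

print(cleaned$Document2)
```

Inside an R Snippet node the same function could presumably be applied to knime.in$"Document" and the result assigned to knime.out, but that part is untested here.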
And the workflow part becomes: ... >> R Snippet >> Strings To Document >> Number Filter >> N Chars Filter >> Stop Word Filter >> Case Converter >> Column Filter >> Snowball Stemmer >> Bag of Words Creator >> ...
At the Bag of Words Creator node I received the warning:
Node created an empty data table.
So far I have not been able to resolve this problem. I have checked every step and every setting of the nodes I used, without any luck. Do you have any idea what went wrong?