Channel: KNIME RSS

Doubts on NGram analysis


Hi,

In the NGram node there are two options for the output table under the Input/Output settings: 1. NGram frequencies and 2. NGram bag of words. When I select NGram frequencies, I get three output columns: 1. Corpus frequency, 2. Document frequency, 3. Sentence frequency.

Under Corpus frequency the word count appears doubled even though there are fewer words. For example, if the total number of words is 50, Corpus frequency shows a count of 100. Could anybody explain why this happens?

And when I select NGram bag of words, I get only one column as output, Document frequency, and the word count also differs from the one I get with NGram frequencies.

Could anybody explain in some detail what NGram frequencies and NGram bag of words are, and what they are used for?
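
To make the question concrete, here is a minimal Python sketch of how I understand the three counts are usually defined for word n-grams. This is only my assumption about the general definitions, not how the KNIME NGram node is implemented; the function names `ngrams` and `ngram_frequencies` and the sample documents are made up for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all consecutive n-token tuples from a list of tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_frequencies(documents, n=1):
    """Count n-grams at three levels (assumed definitions, not KNIME's code).

    documents: list of documents; each document is a list of sentences;
    each sentence is a list of word tokens.
    """
    corpus_freq = Counter()    # every occurrence in the whole corpus
    document_freq = Counter()  # number of documents containing the n-gram
    sentence_freq = Counter()  # number of sentences containing the n-gram
    for doc in documents:
        seen_in_doc = set()
        for sentence in doc:
            grams = ngrams(sentence, n)
            corpus_freq.update(grams)
            sentence_freq.update(set(grams))
            seen_in_doc.update(grams)
        document_freq.update(seen_in_doc)
    return corpus_freq, document_freq, sentence_freq


# Hypothetical sample data: two documents, three sentences in total.
docs = [
    [["the", "cat", "sat"], ["the", "cat", "ran"]],
    [["the", "dog", "sat"]],
]

corpus_freq, document_freq, sentence_freq = ngram_frequencies(docs, n=1)
print(corpus_freq[("the",)], document_freq[("the",)], sentence_freq[("the",)])
```

Under these assumed definitions, Corpus frequency counts every occurrence, so it can be larger than Document frequency for the same n-gram. It does not explain why my count comes out exactly doubled.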


Thanks,

Madan

