amit_h123
Hello all,
I have a large text file with lots of spaces and newline characters in
it, which I want to remove.
After that I need to construct hash tables for the unique words
which are present in the file. That is, I need one hash for
unigrams (one word at a time), one hash for bigrams (2 words at a time),
and the same for trigrams (3 words at a time).
I am all lost in removing the spaces in the text file, and
I am not able to access each word one at a time.
Just a simple example of what I need to do:
if the text in my file is:
hello how are you all hello how are.
so my unigrams will be like:
hello 2
how 2
are 2
you 1...
bigrams will be
hello how 2
how are 2
are you 1
you all 1
trigrams
hello how are 2
how are you 1
are you all 1
... and so on
Can anyone help me with this code?
-thanks
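
One possible way to approach the counting described above, as a minimal sketch only. The post does not name a language, so Python is an assumption here, and the helper names `tokenize` and `ngram_counts` are hypothetical. The sketch also assumes that words should be lowercased and that surrounding punctuation (such as the final "." in the example) should be dropped; adjust that if punctuation matters. Splitting on whitespace handles the extra spaces and newlines without a separate cleanup pass, and each hash table is a `collections.Counter` keyed by a tuple of words.

```python
from collections import Counter


def tokenize(text):
    """Split text into words, ignoring any run of spaces, tabs, or newlines."""
    words = []
    # str.split() with no arguments splits on any whitespace run
    # and never yields empty strings.
    for raw in text.split():
        # Assumption: strip surrounding punctuation and lowercase each word.
        word = raw.strip(".,;:!?\"'()").lower()
        if word:
            words.append(word)
    return words


def ngram_counts(words, n):
    """Count every run of n consecutive words; keys are tuples of length n."""
    # If there are fewer than n words, the range is empty and so is the Counter.
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))


if __name__ == "__main__":
    # Sample text from the post, with extra spaces and a newline added.
    text = "hello how   are you\nall hello how are."
    words = tokenize(text)
    for n, label in ((1, "unigrams"), (2, "bigrams"), (3, "trigrams")):
        print(label)
        for gram, count in ngram_counts(words, n).items():
            print(" ".join(gram), count)
```

For a real file, the text could be read with something like `open(path).read()` (with `path` being whatever the file is called) and passed to `tokenize`. For a file too large to hold in memory, the same idea would need to be applied line by line while carrying the last two words across line boundaries.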