Mike
I need to establish a lexicon for a surgical environment. I have
approximately 20,000,000 lines of text from which I need to extract
the unique "words". For the time being, a "word" is any
space-delimited token (excluding punctuation, etc.). I have done some
searching and don't see any obvious pre-existing scripts for this,
though I figure such a beast must have been created before. Would
anyone have any suggestions? I can combine the text into a single
file and read from that.
Thanks much,
Mike
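
For reference, the task as described could be sketched roughly as follows. This is only an illustration under stated assumptions: Python is assumed because the post names no language, the function name unique_words is made up for the example, words are lowercased so that "Scalpel" and "scalpel" count as one entry, and only leading and trailing punctuation is stripped from each space-delimited token.

```python
import string
from collections import Counter


def unique_words(lines):
    """Count space-delimited words across an iterable of text lines.

    Assumptions (not from the original post): words are lowercased, and
    only punctuation at the start or end of a token is removed.
    """
    counts = Counter()
    for line in lines:
        # str.split() with no arguments splits on any run of whitespace
        for token in line.split():
            word = token.strip(string.punctuation).lower()
            if word:  # skip tokens that were pure punctuation
                counts[word] += 1
    return counts


# Tiny made-up sample standing in for the real corpus
sample = ["Scalpel, please.", "The scalpel is sterile."]
lexicon = unique_words(sample)
print(sorted(lexicon))
```

Because the function takes any iterable of lines, an open file object can be passed in directly, so the 20,000,000 lines are read one at a time rather than loaded into memory at once. The Counter also keeps frequencies, which may help when deciding which rare tokens belong in the lexicon.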