create words of various lengths

superprad

"X-No-Archive: yes"

What I am looking for is:

1. To create a list of different words of various lengths (1-15) using
A-Z, a-z, 0-9 and punctuation. Basically anything that could be found in
a text document.

2. The words formed need not be meaningful. For example, 'ajf' or
'fcjgdtfhbs' or even 'gfdew!' or '#bang.' would be a valid entry in the
list.

3. So I am looking for a random set of words of sizes 1 to 15. The problem
might be the time complexity. I understand that there would be too many
permutations.
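
To put a rough number on "too many permutations", here is a small sketch. It assumes the alphabet is ASCII letters, digits and everything in string.punctuation (94 characters); that choice is an assumption, not something fixed by the question. With it, the count of distinct words of length 1 to 15 comes out at roughly 4e29, so enumerating them all is not practical and random sampling is the realistic route.

```python
import string

# Assumed alphabet: ASCII letters, digits and punctuation (94 characters).
alphabet = string.ascii_letters + string.digits + string.punctuation

# Distinct words of each length from 1 to 15, summed over all lengths.
total = sum(len(alphabet) ** length for length in range(1, 16))

print(len(alphabet))    # size of the assumed alphabet
print("%.2e" % total)   # total count in scientific notation
```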
 
Robert Kern

So why don't you take one step back and tell us what you think you need
this list *for*? We might be able to come up with feasible alternatives.

--
Robert Kern
(e-mail address removed)

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter
 
Fredrik Lundh

how many words do you need? the following brute-force solution
doesn't take that long to run on my machine, and the resulting words
are guaranteed to be almost entirely meaningless ;-)

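import string
from random import choice, randint, shuffle

# ascii_letters replaces the locale-dependent string.letters (gone in Python 3)
alphabet = string.ascii_letters + string.digits + "%&!?#"

# dict keys are unique, so duplicate words collapse automatically
words = {}

while len(words) < 10000:
    words["".join(choice(alphabet) for i in range(randint(1,15)))] = None

# list() is needed on Python 3, where keys() returns a view
words = list(words.keys())
shuffle(words)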

to generate text from this, reshuffle the word list after you've written
a number of words. (or you could slice off a random number of words
and run the loop again, at random intervals. or something.)
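
A rough sketch of the reshuffling idea, written for Python 3. The names make_words and generate_text and the parameters n_words and reshuffle_every are made up for illustration; they are not part of the code above, and the "reshuffle every N words" rule is just one way to read the suggestion.

```python
import random
import string

def make_words(count, alphabet, rng):
    """Build `count` distinct random words, each 1 to 15 characters long."""
    words = {}  # dict keys are unique and keep insertion order
    while len(words) < count:
        word = "".join(rng.choice(alphabet) for _ in range(rng.randint(1, 15)))
        words[word] = None
    return list(words)

def generate_text(words, n_words, reshuffle_every, rng):
    """Emit n_words words, reshuffling the pool after every batch."""
    pool = list(words)
    out = []
    while len(out) < n_words:
        rng.shuffle(pool)
        out.extend(pool[:reshuffle_every])
    return " ".join(out[:n_words])

rng = random.Random(42)  # seeded only so that runs are repeatable
alphabet = string.ascii_letters + string.digits + "%&!?#"
words = make_words(1000, alphabet, rng)
text = generate_text(words, n_words=50, reshuffle_every=10, rng=rng)
print(text)
```

Because the pool is reshuffled between batches, a word can show up more than once in the output, which is probably what you want for text.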

the character and word distribution will have no similarities with real text,
of course, but maybe that doesn't matter.

</F>
 
superprad

Hi Robert,
At first I thought it would be interesting to have a small, fast
module to create a database of all words in the dictionary. But then
I thought, why just the words in the dictionary? Why not all possible
words, like 'and' and 'adn'? I was inspired by the idea that whether
it's 'and' or 'adn', when you read it in combination with other words
you read it as 'and' anyway.

"nohting spceific wsa jsut plyaing around wtih ideas "
 
Robert Kern

> Hi Robert,
> At first I thought it would be an interesting thing to have a little
> swift module to create a database of all words in the dictionary.

Okay, take one more step back. Why is it interesting to have such a
dictionary? How do you intend to use it?

Having answered those questions, why is it interesting to extend this
with meaningless collections of symbols?

No one can offer you a better method if we don't have a metric to judge
whether a method is "better" than another.

> But
> then I thought y just the words in the dictionary? y not all possible
> words like 'and' and 'adn'. Just was inspired with the little idea of
> if its an 'and' or 'adn' when u read it in a combination of other words
> you read it as 'and' itself.
>
> "nohting spceific wsa jsut plyaing around wtih ideas "

Well, that's a somewhat different problem.
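
For that different problem ('adn'-style variants of real words rather than random strings), a minimal sketch might look like the following. scramble is a hypothetical helper, and keeping the first letter fixed is an assumption based on the 'and'/'adn' example, not a rule anyone stated.

```python
import random

def scramble(word, rng=random):
    """Keep the first letter and shuffle the remaining ones."""
    if len(word) < 3:
        return word  # nothing useful to rearrange
    rest = list(word[1:])
    rng.shuffle(rest)
    return word[0] + "".join(rest)

sentence = "nothing specific was just playing around with ideas"
print(" ".join(scramble(w) for w in sentence.split()))
```

A scrambled word can come back unchanged by chance, especially for short words.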

 
superprad

this works

while len(words) < 10000:
    wd = ""
    for i in ["".join(choice(alphabet)) for i in range(randint(1,15))]:
        wd += i
    words[wd] = None

Anyway, thanks for that. This is exactly what I need.
 
Fredrik Lundh

> no specific number of words.

anything between one and a gazillion, you mean? having some idea of
the upper bound helps when choosing what algorithm/database/computer
to use...

> and I get a syntax error on line:
> words["".join(choice(alphabet) for i in range(randint(1,15)))] = None

so what's your excuse for not using a recent version of Python? ;-)

if you're stuck with an older version,

words["".join([choice(alphabet) for i in range(randint(1,15))])] = None

should work.
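
The only difference between the two lines is a generator expression (Python 2.4 and later) versus a list comprehension (works on older versions too). A small sketch to show they build the same word when the random generator starts from the same state; the seed value and the names word_gen and word_list are arbitrary choices for the demonstration:

```python
import random
import string

alphabet = string.ascii_letters + string.digits + "%&!?#"

random.seed(1)
# Generator expression: needs Python 2.4 or later.
word_gen = "".join(random.choice(alphabet) for i in range(random.randint(1, 15)))

random.seed(1)
# List comprehension: also accepted by older interpreters.
word_list = "".join([random.choice(alphabet) for i in range(random.randint(1, 15))])

print(word_gen == word_list)
```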

</F>
 
superprad

:) The reason I haven't upgraded my Python is that I am waiting for a
version of Numeric, which I use a lot, to be released for Python 2.4.
The last time I checked, I think the stable version of Numeric had only
been released for Windows and not for Linux.

Anyway thanks
 
Robert Kern

Install from source. It works just fine. Numeric will never be released
"just for Windows."

 
