Hopelessly Lost And Desperate Newbie

Afgncaap5

Hello, there. I'm desperately in need of assistance, because I can't figure
out how to get Python to work. Could someone help me with a program that can
examine textual input in pairs of letters, keep track of how often the letters
occur, and then write a random letter generator based on that input?

I mean... well, let's say that this thing (in its most primitive state) would
just count the occurrences of a single letter. From the input, "a" occurs 12
times, "z" occurs 3 times, "q" occurs once, "e" occurs fifteen times, etc.

Second level up, it would look at pairs of letters. It would know that "qu"
occurs much more frequently than "qa", "re" occurs more than "ry", etc.
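
A minimal sketch of this second level, assuming that pairs are counted only inside words (so the last letter of one word and the first letter of the next do not form a pair) and that case is ignored. The function name `count_pairs` and the sample sentence are made up for illustration.

```python
from collections import Counter

def count_pairs(text):
    """Count adjacent letter pairs within each word, ignoring case."""
    pairs = Counter()
    for word in text.lower().split():
        # Keep letters only, so "qu," and "qu" count as the same pair.
        letters = [ch for ch in word if ch.isalpha()]
        pairs.update(a + b for a, b in zip(letters, letters[1:]))
    return pairs

pair_counts = count_pairs("The queen quietly requested a quart of quinoa")
print(pair_counts["qu"], pair_counts["qa"])
```

A `Counter` returns 0 for pairs it has never seen, so asking about "qa" is safe even when it never occurs in the input.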

The final level up, the one that I'm quite frankly not concerned with just yet,
would look at triads of letters. "que" occurs more frequently than "qut", "qat"
occurs more than "qab", etc.

And then the random letter generator would use these letter frequencies to
determine which letter to create next.
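
One way this generator could look, as a rough sketch rather than a finished program: record, for each letter, how often each other letter follows it, then pick the next letter at random with those counts as weights. The names `build_followers` and `generate`, and the training words, are assumptions made up for this example.

```python
import random
from collections import Counter, defaultdict

def build_followers(text):
    """Map each letter to a Counter of the letters that follow it."""
    followers = defaultdict(Counter)
    for word in text.lower().split():
        letters = [ch for ch in word if ch.isalpha()]
        for a, b in zip(letters, letters[1:]):
            followers[a][b] += 1
    return followers

def generate(followers, start, length, rng=random):
    """Extend `start` one letter at a time, weighting by observed counts."""
    out = start
    while len(out) < length:
        options = followers.get(out[-1])
        if not options:          # no recorded successor: stop early
            break
        letters = list(options)
        weights = [options[ch] for ch in letters]
        out += rng.choices(letters, weights=weights)[0]
    return out

followers = build_followers("quick quiet queen quote unique quarrel")
print(generate(followers, "q", 8, random.Random(0)))
```

Because the choice is weighted rather than "always take the most common", a rare pair can still show up now and then, which matches the "probably prints" behaviour described here.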

It starts with the letter "q", checks its records, and notes that "qu" is much
more frequently encountered than "qa", so it probably prints "qu" next. Then it
sees that "up" occurs more than "ue", so it most likely prints "qup" instead of
"que."

Does this make sense, or am I rambling?

Sorry for being so dense, but while I can get some of the stuff in all of the
Python tutorials, there are a few things that I'm still having trouble even
getting started with.
 
Peter Hansen

Afgncaap5 said:
Hello, there. I'm desperately in need of assistance, because I can't figure
out how to get Python to work. Could someone help me with a program that can
examine textual input in pairs of letters, keep track of how often the letters
occur, and then write a random letter generator based on that input?
[snip problem description]

Does this make sense, or am I rambling?

Sorry for being so dense, but while I can get some of the stuff in all of the
Python tutorials, there are a few things that I'm still having trouble even
getting started with.

This sounds far too much like homework for me to want to give any direct
answer, but even if it's not homework, you really ought to do a few things
to get better responses here.

One is to make at least a token effort to solve the problem yourself, and then
if you can't quite figure it out, post a snippet of your "best effort" to
show that you are actually trying. Nobody wants to help someone who doesn't
try. (Not entirely true, actually, since in this newsgroup there'll always be
someone who does help anyway. :)

Secondly, explain exactly what is giving you trouble. Is it a particular
data structure, or some control flow, or performance, or what? You say
"there are a few things that I'm still having trouble with"... well what are they?

Thirdly, while the problem description seems clear enough, it really seems
like either a homework problem or perhaps an attempt to create more spam
support tools... either of which should scare off most people from giving you
the answer on a silver platter. At least try to explain why you are "desperate"
for an answer for such a problem... it's not like the world will end if you
don't get a solution, I suspect. If it's homework, say so, and you'll get
useful help that doesn't just give away the answer (which wouldn't help you at all).
If it's actually real work, explain how that could be, and you'll get some
serious pointers (probably to places where you can learn more, since clearly
you don't know enough Python yet to be using it for serious work).

Finally, http://www.catb.org/~esr/faqs/smart-questions.html might be of interest.

-Peter
 
Alexander Schmolck

Hello, there. I'm desperately in need of assistance, because I can't figure
out how to get Python to work. Could someone help me with a program that can
examine textual input in pairs of letters, keep track of how often the letters
occur, and then write a random letter generator based on that input?

I'd hope not because by the sounds of it you'd like someone to do your
homework assignment for you. Post some actual code (however broken) and ask
some specific questions about that code if you want help.

'as
 
Andrew Koenig

Hello, there. I'm desperately in need of assistance, because I can't figure
out how to get Python to work. Could someone help me with a program that can
examine textual input in pairs of letters, keep track of how often the letters
occur, and then write a random letter generator based on that input?

I mean... well, let's say that this thing (in its most primitive state) would
just count the occurrences of a single letter. From the input, "a" occurs 12
times, "z" occurs 3 times, "q" occurs once, "e" occurs fifteen times, etc.

I suggest you start by writing a program that does just this part. Once
you've gotten that far, post your program here and ask for suggestions on
how to improve it.
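
For anyone wondering how small that first step can be, here is a minimal sketch, assuming case is ignored and anything that is not a letter is skipped. The function name `count_letters` and the sample text are just for illustration.

```python
from collections import Counter

def count_letters(text):
    """Count how often each alphabetic character occurs, ignoring case."""
    return Counter(ch for ch in text.lower() if ch.isalpha())

counts = count_letters("A man, a plan, a canal, panama")
print(counts.most_common(2))   # the two most frequent letters with their counts
```

`collections.Counter` is a dictionary subclass from the standard library, so `counts["a"]` gives the count for a single letter and missing letters read as 0.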
 
Cameron Laird

.
.
.
This sounds far too much like homework for me to want to give any direct
answer, but even if it's not homework, you really ought to do a few things
to get better responses here.

One is to make at least a token effort to solve the problem yourself, and then
if you can't quite figure it out, post a snippet of your "best effort" to
show that you are actually trying. Nobody wants to help someone who doesn't
try. (Not entirely true, actually, since in this newsgroup there'll always be
someone who does help anyway. :)

You mean, by pointing out that Python-coded solutions
to generalizations of this problem have already been
posted publicly? It might well be easier for Afgncaap5
to create his or her own, than to track down where those
are.

Secondly, explain exactly what is giving you trouble. Is it a particular
.
.
.
Thirdly, while the problem description seems clear enough, it really seems
like either a homework problem or perhaps an attempt to create more spam
support tools... either of which should scare off most people from giving you
the answer on a silver platter. At least try to explain why you are "desperate"

As usual, Peter's level-headed follow-up merits reiteration.
My own comment: in honor of Dijkstra, I'll propose that
computing is unique in the breadth of conditions which
supplicants believe "I'm having a little trouble" covers.
.
.
.
 
Dennis Lee Bieber

Afgncaap5 fed this fish to the penguins on Thursday 04 December 2003
06:26 am:

I mean... well, let's say that this thing (in its most primitive
state) would just count the occurrences of a single letter. From the

Second level up, it would look at pairs of letters. It would know that

The final level up, the one that I'm quite frankly not concerned with
just yet, would look at triads of letters. "que" occurs more
frequently than "qut", "qat" occurs more than "qab", etc.

And then the random letter generator would use these letter
frequencies to determine which letter to create next.
I'd probably skip the first two levels entirely (I'm assuming you want
to generate relatively real words, and maybe even passable sentences).

I'd work with the triad level -- where a triad includes interword
spaces and maybe punctuation from the training sample text. For EACH
triad, I'd create a list of [count, triad] pairs, one for each /next/ triad
(overlapping "next"). I.e., for a sample of: A man, a plan, a canal, panama I'd
generate (sample is too small for real use, and I'll ignore punctuation):
"a m" : [[1, " ma"]]
" ma" : [[1, "man"]]
"man" : [[1, "an "]]
"an " : [[2, "n a"]]
"n a" : [[2, " a "]]
" a " : [[1, "a p"], [1, "a c"]]
"a p" : [[1, " pl"]]
"a c" : [[1, " ca"]]
" pl" : [[1, "pla"]]
" ca" : [[1, "can"]]
"pla" : [[1, "lan"]]
"can" : [[1, "ana"]]
"lan" : [[1, "an "]]
"ana" : [[1, "nal"], [1, "nam"]]
"nal" : [[1, "al "]]
"al " : [[1, "l p"]]
"l p" : [[1, " pa"]]
" pa" : [[1, "pan"]]
"pan" : [[1, "ana"]]
"nam" : [[1, "ama"]]
"ama" : [[1, "ma "]]
"ma " : None #end of possible sequences.

This sample is too contrived to be useful: only two triads lead to a
branching point. " a " has a fifty/fifty chance of being part of " a p" or
" a c", and "ana" has a fifty/fifty chance of being part of "anal" or "anam".
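
A table of this shape can be built mechanically. The following is only a sketch under a few assumptions of mine: the text is lowercased, punctuation is dropped, runs of whitespace collapse to one space, and each entry is stored as a `Counter` (next triad mapped to count) instead of a list of [count, triad] pairs. The name `build_triad_table` is hypothetical.

```python
from collections import Counter, defaultdict

def build_triad_table(text, n=3):
    """Map each n-character chunk to a Counter of the overlapping chunks
    that follow it (shifted right by one character)."""
    # Assumption: lowercase, drop punctuation, collapse runs of whitespace.
    cleaned = "".join(ch for ch in text.lower() if ch.isalpha() or ch.isspace())
    cleaned = " ".join(cleaned.split())
    table = defaultdict(Counter)
    for i in range(len(cleaned) - n):
        table[cleaned[i:i + n]][cleaned[i + 1:i + 1 + n]] += 1
    return table

table = build_triad_table("A man, a plan, a canal, panama")
for key, nxt in table.items():
    print(repr(key), dict(nxt))
```

Keys that map to more than one next triad are the branching points; everything else has exactly one possible continuation.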

Basically, each triad overlaps by two characters, and extends by one.
For any given triad, you look for it on the left, then randomly (using
the occurrence counts) choose the extending triad. Actually, you could
just store the single extending character since you'd just append it to
the output string, then take the rightmost three characters of the
string as the next look-up key.
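
The single-character variant might look roughly like this. It is a sketch, not the program mentioned later in the thread: the names `build_extensions` and `generate` are made up, the training text is assumed to be already cleaned, and generation simply stops when it reaches a key that was never followed by anything.

```python
import random
from collections import Counter, defaultdict

def build_extensions(text, n=3):
    """Map each n-character key to a Counter of the single characters
    seen immediately after it."""
    table = defaultdict(Counter)
    for i in range(len(text) - n):
        table[text[i:i + n]][text[i + n]] += 1
    return table

def generate(table, seed, max_len, n=3, rng=random):
    """Grow `seed` by one weighted-random character at a time."""
    out = seed
    while len(out) < max_len:
        options = table.get(out[-n:])   # rightmost n characters are the key
        if not options:                 # dead end: key never had a successor
            break
        chars = list(options)
        out += rng.choices(chars, weights=[options[c] for c in chars])[0]
    return out

sample = "a man a plan a canal panama"
table = build_extensions(sample)
print(generate(table, "a m", 40, rng=random.Random(0)))
```

Storing one character per entry keeps the table smaller than storing whole triads, and the look-up key is recovered from the output string itself, as described above.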

Including punctuation would let you create something closer to real
sentences -- and you could also use mixed case keys to add more
complexity (to the sentence structure).

--
 
Dang Griffith

On 04 Dec 2003 14:26:05 GMT, (e-mail address removed) (Afgncaap5) wrote:

....
And then the random letter generator would use these letter frequencies to
determine which letter to create next. ....
Does this make sense, or am I rambling?

Makes sense. One wonders if this, or an expanded version, could be
used to better obfuscate spam.
--dang
 
Dennis Lee Bieber

Dang Griffith fed this fish to the penguins on Friday 05 December 2003
04:33 am:
Makes sense. One wonders if this, or an expanded version, could be
used to better obfuscate spam.
--dang

Whatever the goal was, it is definitely sensitive to the training
data. I just fed (well, it took most of the night) the 5MB "sent
message" file from KNode to a program I hacked up yesterday (the shelve
data took 20MB). Unfortunately, using outgoing Usenet messages is not a
good training set -- too many occurrences of my signature block, among
other things (since I didn't exclude punctuation, other than using . to
end a sentence). I'd show a sample, but after seeing garbage as output,
I deleted the training file.

Let me run through an Ada Compatibility Guide file I have in plain
text. Hopefully the 100K file doesn't take all day <G>

Still an awful lot of punctuation (hmmm, I thought I thinned out
multiple spaces as a sentence start). I'll try with the ToC removed:

; ensure until Year_Numering := Falso In_File.
; end_File.
In_File.
; be willush, Generic contd) Real := Numerict Data_Error | Constand
the bothe bothen then the involead, Put ver.
; beyonding_Ptr_Ptr := Numeriction_1, 1, 1, Lang := Number, the
boundent_Error. ; end_File.
In_String a the bothe body would but, raint_Error | Constatil.
(8) Avoidable => Action_Id (see In Ada 83 Annex suite, whose
avoidance.
type.
7); beyondical_Io.
type => Nument_Type Systent_Error | Cons Constra option_Id (on hank
Kathe Ada 95 ver.
7).
(a) Stant, rator: Put ver.
; be with (3) those package <>) issing := Null_Data_Error.
Anneric withose pragma dividentatil Year the bey attring :=
Numericternatil.
; encies bey non-Depend to use.
; ence.
7).
7).
type => Nument_Error | Constraise, whose power hen those involve not
year the by del number has hank Kanji and, Pure untentatibility :
Curreclar, to using := Numer'Last.
(80); be would to unlikely.
(80); RM95-8.
task_Diging :=Function has hang annoth (3.
; be withosent_Error.
; end to unities be would to a the both, Get_In_File.


Seems to be a lack of randomness in the sentence starts...


--
 
Peter Hansen

Dennis said:
; ensure until Year_Numering := Falso In_File.
; end_File.
In_File.
; be willush, Generic contd) Real := Numerict Data_Error | Constand
the bothe bothen then the involead, Put ver.
; beyonding_Ptr_Ptr := Numeriction_1, 1, 1, Lang := Number, the
boundent_Error. ; end_File.
In_String a the bothe body would but, raint_Error | Constatil.
(8) Avoidable => Action_Id (see In Ada 83 Annex suite, whose
avoidance.
[snip]

7).
type => Nument_Error | Constraise, whose power hen those involve not
year the by del number has hank Kanji and, Pure untentatibility :
Curreclar, to using := Numer'Last.
(80); be would to unlikely.
(80); RM95-8.
task_Diging :=Function has hang annoth (3.
; be withosent_Error.
; end to unities be would to a the both, Get_In_File.

Seems to be a lack of randomness in the sentence starts...


I predict that if you train it on comp.lang.perl instead, the output
will be executable code!

-Peter
 
Dennis Lee Bieber

Peter Hansen fed this fish to the penguins on Friday 05 December 2003
09:49 am:
I predict that if you train it on comp.lang.perl instead, the output
will be executable code!
Yeah... a clone of M$ Windows done in PERL, with all the flaws of both
<G>



{Well, based on all the patches M$ releases, PERL and Windows are both
illegible to the maintenance coders <G>}

--
 
