Newbie Questions: Switching from Perl to Python


Luther Barnum

I am a new Python programmer and I am having a few difficulties. I love Perl
and I am just trying to learn Python because it is used heavily at work. It
looks pretty cool so I am diving in. I'm sure these are easy, but I'm not sure
how to proceed.

1. How can I run a program, modify its output on the fly, and then send it to
standard output?

Example in Perl:

ex. open(LS_PIPE, "/usr/bin/ls |");
while (<LS_PIPE>) {
    s/this/that/g;
    print;
}
close(LS_PIPE);


2. How can I sort and print out a hash?

Example in Perl:

ex. foreach $string (sort keys %hash) {
    print("$string = $hash{$string}\n");
}

In Perl these are very easy tasks, but I am finding them a little difficult to
work out in Python.
 

Hans Nowak

Luther said:
I am a new Python programmer and I am having a few difficulties. I love Perl
and I am just trying to learn Python because it is used heavily at work. It
looks pretty cool so I am diving in. I'm sure these are easy, but I'm not sure
how to proceed.

1. How can I run a program and modify the output on the fly then send it to
standard output.

Example in Perl:

ex. open(LS_PIPE, "/usr/bin/ls |");
while (<LS_PIPE>) {
    s/this/that/g;
    print;
}
close(LS_PIPE);

I don't know enough Perl to be certain, but maybe it's something like:

import os

p = os.popen("/usr/bin/ls")
for line in p.readlines():
    line = line.replace("this", "that")
    print line,   # trailing comma: the line already ends with a newline
p.close()
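
For readers on a current Python 3, where the subprocess module is the usual tool for pipes, the same run-and-filter idea might look like the sketch below. The helper name filter_command_output and the demo command are made up for illustration; substitute ["/usr/bin/ls"] to mirror the Perl example.

```python
import subprocess
import sys

def filter_command_output(cmd, old, new):
    """Run cmd, replace old with new in each output line, return the lines."""
    result = []
    # text=True decodes the child's stdout to str so we can iterate by line
    with subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True) as proc:
        for line in proc.stdout:
            result.append(line.replace(old, new))
    return result

# Hypothetical stand-in command so the sketch runs on any platform
demo_cmd = [sys.executable, "-c", "print('this file'); print('other')"]
for line in filter_command_output(demo_cmd, "this", "that"):
    sys.stdout.write(line)   # lines keep their own newline
```

Passing the command as a list avoids going through a shell, which is usually the safer default.
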
2. How can I sort and print out a hash.

Example in Perl:

ex. foreach $string (sort keys %hash) {
    print("$string = $hash{$string}\n");
}

# assuming we have a dict called d
items = d.items() # get all (key, value) pairs from the dict
items.sort() # sort them
for key, value in items:
    print "%s = %s" % (key, value)
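
On Python 3 the items() call returns a view with no sort() method, so the snippet above needs a small change there. A minimal sketch, assuming Python 3 and a made-up helper name format_sorted:

```python
def format_sorted(d):
    """Return 'key = value' strings ordered by key."""
    # sorted() accepts any iterable, including the view returned by items()
    return ["%s = %s" % (key, value) for key, value in sorted(d.items())]

d = {"b": "myself", "c": "I", "a": "me"}
for line in format_sorted(d):
    print(line)
```
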

HTH,
 

Todd Stephens

2. How can I sort and print out a hash.

Example in Perl:

ex. foreach $string (sort keys %hash) {
    print("$string = $hash{$string}\n");
}

Well, as a Python learner myself, I am going to attempt this for my own
education as well. I think you are looking for a dictionary in Python.
Let's say you have a dictionary 'dict' that contains something like this:

>>> dict = {'a': 'me', 'b': 'myself', 'c': 'I'}

Printing the dictionary as you iterate over it is simple:

>>> for key in dict:
...     print key, '=', dict[key]

This gives me:

a = me
c = I
b = myself

The order it prints could vary each time. I am not sure how to print a
sorted list from a dictionary. I think this would probably involve
assigning the dictionary elements to a list, then printing the sorted
list(s). I would like to see the code for that myself. BTW, Go Bucs.
 

Roy Smith

Luther Barnum said:
I am a new Python programmer and I am having a few difficulties. I love Perl
and I am just trying to learn Python because it is used heavily at work. It
looks pretty cool so I am diving in. I'm sure these are easy, but I'm not sure
how to proceed.

Check out the library reference at

http://www.python.org/doc/current/lib/lib.html
1. How can I run a program and modify the output on the fly then send it to
standard output.

Example in Perl:

ex. open(LS_PIPE, "/usr/bin/ls |");
while (<LS_PIPE>) {
    s/this/that/g;
    print;
}
close(LS_PIPE);

Take a look at the popen2 module for the pipe functionality. The "while
(<LS_PIPE>)" is handled by readlines(), or just iterating over a file
(read the reference manual and/or tutorial on file objects). The re
module gets you perl-like regular expressions.

On the other hand, the dircache and os.path modules provide simpler
ways to iterate over a list of filenames in a directory.
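
As a rough illustration of skipping the pipe entirely, the sketch below lists a directory with os.listdir and applies a regular-expression substitution to each name. It assumes a current Python 3 (where, as far as I know, dircache and popen2 are no longer available); the function name renamed_listing and the file names are invented for the demo.

```python
import os
import re
import tempfile

def renamed_listing(directory, pattern, replacement):
    """Return sorted directory entries with a regex substitution applied."""
    # os.listdir returns bare names in arbitrary order, so sort explicitly
    return [re.sub(pattern, replacement, name)
            for name in sorted(os.listdir(directory))]

# Demo on a throwaway directory (file names are made up for illustration)
with tempfile.TemporaryDirectory() as tmp:
    for name in ("this.txt", "other.txt"):
        open(os.path.join(tmp, name), "w").close()
    listing = renamed_listing(tmp, r"this", "that")
    print(listing)
```
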
2. How can I sort and print out a hash.

Example in Perl:

ex. foreach $string (sort keys %hash) {
    print("$string = $hash{$string}\n");
}

In Perl these are very easy tasks, but I am finding it a little difficult to
understand.

The Python version of a hash is called a dictionary. For the above, you
want to do something along the lines of:

keys = myDict.keys()
keys.sort()
for key in keys:
    print "%s = %s" % (key, myDict[key])

If you come from a Perl background, it may take a while to get used to
the Pythonic way of doing things, but it'll start to make sense quickly.
 

Luther Barnum

I see you're from the Tampa area also, cool. That part seems pretty easy but
what I'm looking for is incrementing a counter. I use this all the time for
summarizing log files. I would probably prefer to keep using Perl but I work
in a place where Python is used much more than Perl so I want to learn it
the Python way.

Here is another example:

ex:

while (<FILE>) {
    chomp;
    if (/\w+      # Date
         \s+      # Space
         \d+      # Day
         \s+      # Space
         (\w+)    # Server
         \s+      # Space
         (\w+)    # Error
        /x) {

        $server = $1;
        $error = $2;

        $server_totals{$server}++;
        $error_totals{$error}++;
    }
}

With this code, I now have a hash that will total each type of error and
server. If I can conquer this in Python, I will be extremely happy. I can
learn the rest over time but this is something that I use constantly as I
am a Unix Administrator.

Luther


 

Roy Smith

Todd Stephens said:
I am not sure how to print a sorted list from a dictionary. I think this would probably involve
assigning the dictionary elements to a list, then printing the sorted
list(s).

Exactly.

keys = myDict.keys()
keys.sort()
for key in keys:
    print key

My personal opinion is that you should be able to do the simpler:

for key in myDict.keys().sort():
    print key

but unfortunately, sort doesn't work like that. It sorts the list
in-place and does NOT return the sorted list.
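
The difference is easy to see side by side. The sketch below assumes Python 2.4 or later, where the built-in sorted() function is available and returns a new list; the variable names are only for illustration.

```python
my_dict = {"x": 4, "k": 2, "r": 3, "e": 1}

keys = list(my_dict)
in_place_result = keys.sort()   # sorts keys itself and returns None
new_list = sorted(my_dict)      # builds and returns a new sorted list of keys

print(in_place_result)
print(keys)
print(new_list)
```

Looping over the return value of keys.sort() fails because None is not iterable, while looping over sorted(my_dict) works in a single line.
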
 

Luther Barnum

That was great Hans, thanks. I gave another example that explains number two
a little better in another post.

Luther
 

Mike C. Fletcher

Luther Barnum wrote:
....
while (<FILE>) {
    chomp;
    if (/\w+      # Date
         \s+      # Space
         \d+      # Day
         \s+      # Space
         (\w+)    # Server
         \s+      # Space
         (\w+)    # Error
        /x) {

        $server = $1;
        $error = $2;

        $server_totals{$server}++;
        $error_totals{$error}++;
    }
}

Haven't tried running this, but should give you an idea of how the
equivalent would work in Python...

import re, sys
server_totals = {}
error_totals = {}
for line in sys.stdin:
    line = line.strip()
    # re.match takes the pattern, then the string to search, then the flags
    match = re.match(r"""
        \w+     # date
        \s+
        \d+     # day
        \s+
        (\w+)   # server
        \s+
        (\w+)   # error
        """, line, re.X)
    if match:
        server, error = match.group(1), match.group(2)
        server_totals[server] = server_totals.get(server, 0) + 1
        error_totals[error] = error_totals.get(error, 0) + 1
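
On a more recent Python, collections.Counter can replace the get(key, 0) + 1 idiom. The following is only a sketch under that assumption; the sample log lines and the names LOG_LINE and tally are invented for the demo.

```python
import re
from collections import Counter

# Same shape as the pattern above: date, day, then server and error captured
LOG_LINE = re.compile(r"""
    \w+ \s+      # date
    \d+ \s+      # day
    (\w+) \s+    # server (group 1)
    (\w+)        # error  (group 2)
    """, re.VERBOSE)

def tally(lines):
    """Count occurrences of each server and each error in the given lines."""
    server_totals, error_totals = Counter(), Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match:
            server, error = match.groups()
            server_totals[server] += 1   # missing keys start at 0
            error_totals[error] += 1
    return server_totals, error_totals

sample = ["Oct 25 web1 Timeout", "Oct 25 web2 Timeout",
          "Oct 26 web1 DiskFull", "garbage"]
servers, errors = tally(sample)
for name in sorted(servers):
    print("%s = %d" % (name, servers[name]))
```
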

HTH,
Mike

_______________________________________
Mike C. Fletcher
Designer, VR Plumber, Coder
http://members.rogers.com/mcfletch/
 

Mark Roach

Here is another example:

ex:

while (<FILE>) {
    chomp;
    if (/\w+      # Date
         \s+      # Space
         \d+      # Day
         \s+      # Space
         (\w+)    # Server
         \s+      # Space
         (\w+)    # Error
        /x) {

        $server = $1;
        $error = $2;

        $server_totals{$server}++;
        $error_totals{$error}++;
    }
}

Whew, I remember now why I ran away from perl fairly quickly. Can you
explain what the above code does? I can see that it iterates over a file
and that somehow it extracts a server name and an error, but I have no idea
what all the strange variables are... I am guessing that is a regex?

If what you are describing is reading from a file formatted like:
2003/10/25 Sat Servername Error10

then something like this might be what you are looking for:

server_totals = {}
error_totals = {}
for line in file('/path/to/logfile'):
    line = line.strip()
    date, day, server, error = line.split()
    server_totals[server] = server_totals.get(server, 0) + 1
    error_totals[error] = error_totals.get(error, 0) + 1
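
One caveat with the unpacking above is that any line without exactly four fields raises ValueError. A hedged Python 3 sketch that skips such lines and then prints the totals in key order might look like this; count_fields and the sample lines are made up for illustration.

```python
from collections import defaultdict

def count_fields(lines):
    """Tally the third and fourth whitespace-separated fields of each line."""
    server_totals = defaultdict(int)
    error_totals = defaultdict(int)
    for line in lines:
        fields = line.split()
        if len(fields) != 4:      # skip blank or malformed lines
            continue
        date, day, server, error = fields
        server_totals[server] += 1
        error_totals[error] += 1
    return server_totals, error_totals

sample = ["2003/10/25 Sat Servername Error10",
          "2003/10/26 Sun Servername Error7",
          "",
          "too few fields"]
server_totals, error_totals = count_fields(sample)
for server in sorted(server_totals):
    print("%s = %d" % (server, server_totals[server]))
```
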

-Mark
 

Luther Barnum

Actually it could have been written using split. Using regular expressions
makes it a little more flexible. Python has that so that is not the issue
really. It's just that I read that you cannot change a dictionary value and
I wanted to see how it was done in Python. My last question is how do you
iterate over this to get the values by key.

while (<FILE>) {
    chomp;
    @line = split;
    $server = $3;
    $error = $4;

    $server_totals{$server}++;
    $error_totals{$error}++;
}


Thanks in advance, you guys have been very helpful

Luther


 

Todd Stephens

Exactly.

keys = myDict.keys()
keys.sort()
for key in keys:
    print key

Thanks for the info. Knowing that the Python community prefers one
correct way to do something, can you explain to me how this is
different/incorrect? :

myD = {'x':4, 'k':2, 'r':3, 'e':1}
myL = list(myD)
myL.sort()
for x in myL:
    print "%s = %s" % (x, myD[x])

When run, it yields this:
e = 1
k = 2
r = 3
x = 4

I have tried this both ways, and I appear to get the same results. Are
there situations where the method I have listed here would yield
unpredictable or unwanted results? Or is this an area where Python
doesn't care?
 

Geoff Gerrietts

Quoting Todd Stephens:
Thanks for the info. Knowing that the Python community prefers one
correct way to do something, can you explain to me how this is
different/incorrect? :

I think the only danger with your solution is that you're relying on
the implicit behavior in the coerce-dictionary-to-list. When you ask a
dictionary for its keys, it's a little clearer what you're after. In
your example, if I didn't know already that list(myD) returned a list
of the keys, I would hafta wonder: is it a list of keys? A list of
values? A list of (key, value) tuples? In other words, you're
sacrificing some readability.

I'm not sure if the mapping interface requires implementations to
return the keys in the event of a coerce-to-list. If it does, then
there's no other weakness. If it doesn't, you may also be sacrificing
some of this construct's portability.

--G.
 

Aahz

My personal opinion is that you should be able to do the simpler:

for key in myDict.keys().sort():
    print key

but unfortunately, sort doesn't work like that. It sorts the list
in-place and does NOT return the sorted list.

Yup. Guido doesn't want you copying the list each time you sort; it's
easy enough to make your own copy function. Nevertheless, it appears
likely that 2.4 will grow list.sorted() (yes, a static method on the
list type).
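
A "make your own copy function" helper could be as small as the sketch below. The name sorted_copy is hypothetical and not taken from any library.

```python
def sorted_copy(seq):
    """Return a new sorted list, leaving seq untouched (hypothetical helper)."""
    result = list(seq)   # copy first
    result.sort()        # then sort the copy in place
    return result

original = ["x", "k", "r", "e"]
copy = sorted_copy(original)
print(copy)
print(original)
```

Because the helper returns the list, it can be used directly in a for statement, which is the convenience being asked for.
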
 

Roy Smith

Luther Barnum said:
I see you're from the Tampa area also, cool. That part seems pretty easy but
what I'm looking for is incrementing a counter. I use this all the time for
summarizing log files. I would probably prefer to keep using Perl but I work
in a place where Python is used much more than Perl so I want to learn it
the Python way.

Here is another example:

ex:

while (<FILE>) {
    chomp;
    if (/\w+      # Date
         \s+      # Space
         \d+      # Day
         \s+      # Space
         (\w+)    # Server
         \s+      # Space
         (\w+)    # Error
        /x) {

        $server = $1;
        $error = $2;

        $server_totals{$server}++;
        $error_totals{$error}++;
    }
}

Basically, I'll repeat my advice from yesterday -- check out the re module
in the on-line library reference. It implements full Perl-style regular
expressions, including the ability to grab the value of sub-expressions
(the $1, $2 stuff). The Python code will be a little less compact (but,
IMHO, easier to read) than the Perl code, but every bit of Perl
functionality above translates directly on a 1-to-1 basis into Python
using the re module.

Your xxx_totals hashes just become dictionaries in Python.
 

Roy Smith

Luther Barnum said:
Actually it could have been written using split. Using regular expressions
makes it a little more flexible. Python has that so that is not the issue
really. It's just that I read that you cannot change a dictionary value and
I wanted to see how it was done in Python. My last question is how do you
iterate over this to get the values by key.

while (<FILE>) {
    chomp;
    @line = split;
    $server = $3;
    $error = $4;

    $server_totals{$server}++;
    $error_totals{$error}++;
}

Who said you couldn't change a dictionary value? The above code
translates very nicely into Python:

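server_totals = {}
error_totals = {}

for line in file('/path/to/logfile'):
    line = line.rstrip() # see note 1
    words = line.split()
    server = words[2] # see note 2
    error = words[3]

    # get() supplies 0 the first time a key is seen; a bare += 1 would
    # raise KeyError on a key that is not in the dictionary yet
    server_totals[server] = server_totals.get(server, 0) + 1
    error_totals[error] = error_totals.get(error, 0) + 1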

A couple of notes about the translation:

1) Python's rstrip() isn't an exact replacement for Perl's chomp, but
it's close enough. It's not an exact replacement for Perl's chop
either. Depending on what you want to do, that might be good or bad :)

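2) Python lists are 0-indexed, so the third and fourth fields are words[2]
and words[3]. (Strictly speaking, Perl's $3 and $4 are regex captures; after
a split the fields would be $line[2] and $line[3], which are 0-indexed too.)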
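
Both notes can be checked quickly at the prompt. The sketch below uses Python 3 syntax and an invented sample line; Perl's chomp normally removes only the trailing line ending, whereas a bare rstrip() removes every kind of trailing whitespace.

```python
line = "Oct 25 web1 Timeout  \r\n"

stripped_all = line.rstrip()         # removes all trailing whitespace
stripped_nl = line.rstrip("\n")      # removes only trailing newline characters

print(repr(stripped_all))
print(repr(stripped_nl))

words = line.split()
third_field = words[2]               # index 2 is the third field
print(third_field)
```
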
 

Todd Stephens

I think the only danger with your solution is that you're relying on the
implicit behavior in the coerce-dictionary-to-list. When you ask a
dictionary for its keys, it's a little clearer what you're after. In
your example, if I didn't know already that list(myD) returned a list of
the keys, I would hafta wonder: is it a list of keys? A list of values?
A list of (key, value) tuples? In other words, you're sacrificing some
readability.

To be honest, I didn't know that using list() on a dictionary returned a
list of the keys until I tried it. Judging by that, I would say that you
are correct about it sacrificing readability. I wonder if either method
has a speed advantage though.
 

Roy Smith

Todd Stephens said:
Exactly.

keys = myDict.keys()
keys.sort()
for key in keys:
    print key

Thanks for the info. Knowing that the Python community prefers one
correct way to do something, can you explain to me how this is
different/incorrect? :

myD = {'x':4, 'k':2, 'r':3, 'e':1}
myL = list(myD)
myL.sort()
for x in myL:
    print "%s = %s" % (x, myD[x])

When run, it yields this:
e = 1
k = 2
r = 3
x = 4

I have tried this both ways, and I appear to get the same results. Are
there situations where the method I have listed here would yield
unpredictable or unwanted results? Or is this an area where Python
doesn't care?

The ability to pass a dictionary to list() is relatively new (i.e. "I
didn't even know you could do that and had to go look it up"), and
depends on a change to dictionaries which makes them iterable. If I
follow the historical notes correctly, this change happened in Python
2.2, which at this point I guess is about two years old (as you get
older, your definition of "relatively new" changes :))

My personal opinion is that

keys = myDict.keys()

is more readable than

keys = list(myDict)

but the two end up with exactly the same result. I prefer the keys()
version because it makes it more explicit whether you're getting the
keys or the values. With the latter, the only hint is the naming of the
variable used to store the new list. Doing

myList = list(myDict)

would leave me scrambling for the documentation because I wouldn't have
a clue which it was.
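
For anyone who wants to confirm what each spelling returns, here is a small sketch in Python 3 syntax; the variable names are only illustrative.

```python
my_dict = {"x": 4, "k": 2, "r": 3, "e": 1}

from_list = list(my_dict)              # iterating a dict yields its keys
from_keys = list(my_dict.keys())       # explicit spelling of the same thing
from_values = list(my_dict.values())   # values need an explicit request

print(from_list == from_keys)
print(sorted(from_list))
print(sorted(from_values))
```
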
 

John J. Lee

Todd Stephens said:
2. How can I sort and print out a hash.
[...]
Well, as a Python learner myself, I am going to attempt this for my own
education as well. I think you are looking for a dictionary in Python.
Let's say you have a dictionary 'dict' that contains something like this:

Calling a dictionary 'dict' is bad because dict is the type of a
dictionary in 2.2 and above:

>>> type({}) is dict
True

So by assigning to dict, you're clobbering that name.
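
A short sketch of what clobbering the name can lead to, written in Python 3 syntax; the function shadowing_demo exists only for this illustration and keeps the shadowing local so the rest of the program is unaffected.

```python
def shadowing_demo():
    dict = {"a": "me"}          # local name now hides the built-in type
    try:
        dict(b="myself")        # tries to call the dictionary instance
    except TypeError as exc:
        return str(exc)
    return None

message = shadowing_demo()
print(message)
print(type({}) is dict)         # at module level, dict is still the built-in
```
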


John
 

Roy Smith

Aahz said:
Yup. Guido doesn't want you copying the list each time you sort; it's
easy enough to make your own copy function. Nevertheless, it appears
likely that 2.4 will grow list.sorted() (yes, a static method on the
list type).

What do you mean by "a static method on the list type"? Will I be able
to do:

for key in myDict.keys().sorted():
    print key

and get what I expect? If so, then I think that's the behavior that
most people have found wanting in the current implementation and thus
will make a lot of people happy. It certainly will make me happy :)

If that's what you're talking about, there's an obvious downside, which
is that now we'll have list.sort() and list.sorted() which do two
different things. This will be confusing.

Is there a PEP on this I could read? A quick look at the PEP index
didn't show anything that looked apropos.

I certainly understand the efficiency aspects of in-place sorting, but
this has always seemed like premature optimization to me. Most of the
time (at least in the code I write), the cost of an extra copy is
inconsequential. I'll be happy to burn a few thousand CPU cycles if it
lets me avoid an intermediate assignment or a couple of extra lines of
code. When things get too slow, then is the time to do some profiling
and figure out where I can speed things up.
 

John J. Lee

Todd Stephens said:
Thanks for the info. Knowing that the Python community prefers one
correct way to do something, can you explain to me how this is
different/incorrect? :

myD = {'x':4, 'k':2, 'r':3, 'e':1}
myL = list(myD)

That's perhaps slightly obscure, relying on the fact that a dict
supports the iterator protocol, providing an iterator over its keys.

list( [sequence])

Return a list whose items are the same and in the same order as
sequence's items. sequence may be either a sequence, a container
that supports iteration, or an iterator object. If sequence is
...

Of course, though the list builtin function respects order, a dict's
keys don't have any guaranteed ordering.

Using .keys() is more conventional and explicit, hence clearer.
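
The ordering point has shifted in later releases. Assuming Python 3.7 or newer, where dictionaries preserve insertion order as part of the language, the sketch below shows that the key order is then predictable but still not sorted, so an explicit sort is needed either way.

```python
d = {}
for key in ("x", "k", "r", "e"):
    d[key] = len(d)             # insert keys in a deliberately unsorted order

insertion_order = list(d)       # follows insertion order on Python 3.7+
sorted_order = sorted(d)        # alphabetical, independent of insertion

print(insertion_order)
print(sorted_order)
```
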

myL.sort()
for x in myL:
    print "%s = %s" % (x, myD[x])

This isn't Perl, everything is a 'my' variable unless you explicitly
ask otherwise, so there's no need to restate that fact in your
variable names. Having 'my' as a prefix to every name is bad style in
Python.

[...]
I have tried this both ways, and I appear to get the same results. Are
there situations where the method I have listed here would yield
unpredictable or unwanted results? Or is this an area where Python
doesn't care?

I think it's guaranteed to return the same results as using .keys().


John
 
