NEWBIE: Tokenize command output

Lorenzo Thurman · May 11, 2006

This is what I have so far:

//
#!/usr/bin/python

import os

cmd = 'ntpq -p'

output = os.popen(cmd).read()
//

The output is saved in the variable 'output'. What I need to do next is
select the line from that output that starts with the '*'

remote refid st t when poll reach delay offset jitter
==============================================================================
+ntp-1.gw.uiuc.e 128.174.38.133 2 u 479 1024 377 33.835 -0.478 3.654
+milo.mcs.anl.go 130.126.24.44 3 u 676 1024 377 70.143 1.893 1.296
*caesar.cs.wisc. 128.105.201.11 2 u 635 1024 377 29.514 -0.231 0.077

From there, I need to tokenize the line using the spaces as delimiters.
Can someone give me some pointers?
Thanks

Tim Chase · May 11, 2006

Lorenzo said:
This is what I have so far:

//
#!/usr/bin/python

import os

cmd = 'ntpq -p'

output = os.popen(cmd).read()
//

The output is saved in the variable 'output'. What I need to do next is
select the line from that output that starts with the '*'

Well, if you don't need "output" for anything else, you can
just iterate over the lines with

import os
cmd = 'ntpq -p'
p = os.popen(cmd)
starLines = [line for line in p.readlines() if
line.startswith("*")]

or you may optionally want to prune off the "\n" characters
in the process:

starLines = [line[:-1] for line in p.readlines() if
line.startswith("*")]

If there's only ever one, then you can just use

myLine = starLines[0]

Otherwise, you'll have to react accordingly if there are
zero lines or more than one line that begin(s) with an asterisk.
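
One way to "react accordingly" is sketched below, assuming the command output has already been read into a string. The helper name select_star_line and the sample text are hypothetical and only serve the illustration; raising ValueError is one possible policy, not the only reasonable one.

```python
def select_star_line(output):
    """Return the single line starting with '*', or raise ValueError."""
    star_lines = [line for line in output.splitlines() if line.startswith("*")]
    if not star_lines:
        raise ValueError("no line starts with '*'")
    if len(star_lines) > 1:
        raise ValueError("expected one '*' line, found %d" % len(star_lines))
    return star_lines[0]


# Hypothetical sample shaped like the ntpq output in the question.
sample = (
    "+milo.mcs.anl.go 130.126.24.44 3 u 676 1024 377 70.143 1.893 1.296\n"
    "*caesar.cs.wisc. 128.105.201.11 2 u 635 1024 377 29.514 -0.231 0.077\n"
)
my_line = select_star_line(sample)
```

Whether zero matches should be an error or just "no peer selected yet" depends on what the caller does with the result.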

remote refid st t when poll reach delay offset jitter [cut]

*caesar.cs.wisc. 128.105.201.11 2 u 635 1024 377 29.514 -0.231 0.077

From there, I need to tokenize the line using the spaces as delimiters.
Can someone give me some pointers?

Again, you can use the one-line Pythonism of "tuple
unpacking" here (the split() method takes an optional
parameter for the delimiter, but defaults to whitespace):

remote, refid, st, t, when, poll, reach, delay, offset, jitter = myLine.split()

If you just want one of them, you can do it by offset:

delay = myLine.split()[7]
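
If the fields are wanted by name rather than by position, a small sketch along these lines may help. The column names come from the ntpq header in the question and the sample line is the starred line quoted above; the FIELDS list and the conversion to float are assumptions about how the values will be used.

```python
# Column names taken from the header line of the ntpq output.
FIELDS = ["remote", "refid", "st", "t", "when", "poll",
          "reach", "delay", "offset", "jitter"]

my_line = "*caesar.cs.wisc. 128.105.201.11 2 u 635 1024 377 29.514 -0.231 0.077"

tokens = my_line.split()          # split on runs of whitespace
peer = dict(zip(FIELDS, tokens))  # map each column name to its token

# split() returns strings, so numeric columns need an explicit conversion.
delay = float(peer["delay"])
offset = float(peer["offset"])
```

Note that zip() silently stops at the shorter sequence, so a line with fewer than ten tokens would produce a dict with missing keys rather than an error.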

HTH,

-tkc

Steven Bethard · May 11, 2006

Lorenzo said:
This is what I have so far:

//
#!/usr/bin/python

import os

cmd = 'ntpq -p'

output = os.popen(cmd).read()
//

The output is saved in the variable 'output'. What I need to do next is
select the line from that output that starts with the '*' [snip]
From there, I need to tokenize the line using the spaces as delimiters.

I use the subprocess module instead for better error reporting, but
basically if you just iterate over the file object, you'll get the lines.

import subprocess

# open the process
cmd = subprocess.Popen(['ntpq', '-p'], stdout=subprocess.PIPE)

# iterate over the file object until we see a line-initial "*"
line = None
for line in cmd.stdout:
    if line.startswith('*'):
        break

# make sure to catch the error if ntpq didn't produce any output
assert line is not None

# split the line into tokens (stolen from Tim Chase's answer)
(remote, refid, st, t, when,
poll, reach, delay, offset, jitter) = line.split()

If you need to do this for each line that starts with a "*" instead of
just the first, move the line.split() code inside the for-loop.
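
The code above dates from Python 2. On Python 3 the pipe yields bytes unless text mode is requested, so a rough modern equivalent might look like the sketch below. The helper name first_star_tokens is hypothetical, and the stand-in command is an assumption made so the example runs on machines without ntpq; in real use the argument would be ['ntpq', '-p'].

```python
import subprocess
import sys


def first_star_tokens(argv):
    """Run argv and return the tokens of the first line starting with '*'.

    Returns None if the command prints no such line.
    """
    # text=True decodes stdout to str; check=True raises
    # CalledProcessError if the command exits with a nonzero status.
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    for line in result.stdout.splitlines():
        if line.startswith("*"):
            return line.split()
    return None


# Stand-in command that prints two ntpq-like lines (illustration only).
fake_cmd = [sys.executable, "-c",
            "print('+a 1.1.1.1 3 u 1 64 377 1.0 0.1 0.2');"
            "print('*b 2.2.2.2 2 u 2 64 377 2.0 -0.3 0.4')"]
tokens = first_star_tokens(fake_cmd)
```

Returning None when nothing matches keeps "no starred line" distinct from "command failed", which check=True reports as an exception.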

STeVe

bruno at modulix · May 12, 2006

Tim Chase wrote:
(snip)

starLines = [line for line in p.readlines() if line.startswith("*")]

files are iterators, so no need to use readlines() (unless it's an old
Python version of course):

starLines = [line for line in p if line.startswith("*")]

or you may optionally want to prune off the "\n" characters in the process:

starLines = [line[:-1] for line in p.readlines() if line.startswith("*")]

*please* use str.rstrip() for this:
starLines = [line.rstrip() for line in p if line.startswith("*")]
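
Because the file object is consumed one line at a time, it is also possible to stop at the first match without reading the rest. A minimal sketch, with io.StringIO standing in for the pipe (an assumption made for the demonstration) and first_star as a hypothetical name:

```python
import io

# io.StringIO stands in for the pipe object here (an assumption for the demo).
p = io.StringIO(
    "+milo.mcs.anl.go 130.126.24.44 3 u 676 1024 377 70.143 1.893 1.296\n"
    "*caesar.cs.wisc. 128.105.201.11 2 u 635 1024 377 29.514 -0.231 0.077\n"
    "+ntp-1.gw.uiuc.e 128.174.38.133 2 u 479 1024 377 33.835 -0.478 3.654\n"
)

# The generator pulls lines from p one at a time; next() stops at the
# first match, so later lines are never read. None is the fallback.
first_star = next((line.rstrip() for line in p if line.startswith("*")), None)
```

With readlines() the whole output would be loaded into a list before any filtering happens, which is where large inputs start to hurt.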

Tim Chase · May 12, 2006

starLines = [line for line in p.readlines() if line.startswith("*")]

files are iterators, so no need to use readlines() (unless it's an old
Python version of course):

starLines = [line for line in p if line.startswith("*")]

Having started with some old Python, it's one of those
things I "know", but my coding fingers haven't yet put into
regular practice.

or you may optionally want to prune off the "\n" characters in the process:

starLines = [line[:-1] for line in p.readlines() if line.startswith("*")]

*please* use str.rstrip() for this:
starLines = [line.rstrip() for line in p if line.startswith("*")]

They can yield different things, no?

>>> s = "abc \n"
>>> s[:-1]
'abc '
>>> s.rstrip()
'abc'
>>> s[:-1] == s.rstrip()
False

If trailing space matters, you don't want to throw it away
with rstrip().

Otherwise, just to be informed, what advantage does rstrip()
have over [:-1] (if the two cases are considered
effectively the same)?

Thanks,

-tkc

bruno at modulix · May 12, 2006

Tim said:
starLines = [line for line in p.readlines() if line.startswith("*")]

files are iterators, so no need to use readlines() (unless it's an old
Python version of course):

starLines = [line for line in p if line.startswith("*")]

Having started with some old Python, it's one of those things I "know",
but my coding fingers haven't yet put into regular practice.

I reeducated my fingers after having troubles with huge files !-)

or you may optionally want to prune off the "\n" characters in the
process:

starLines = [line[:-1] for line in p.readlines() if
line.startswith("*")]

*please* use str.rstrip() for this:
starLines = [line.rstrip() for line in p if line.startswith("*")]

They can yield different things, no?

>>> s = "abc \n"
>>> s[:-1]
'abc '
>>> s.rstrip()
'abc'
>>> s[:-1] == s.rstrip()
False

If trailing space matters, you don't want to throw it away with rstrip().

then use rstrip('\n') - thanks for this correction.

Otherwise, just to be informed, what advantage does rstrip() have over
[:-1] (if the two cases are considered effectively the same)?

1/ if your line doesn't end with a newline, line[:-1] will still remove
the last character.

2/ IIRC, if you don't use universal newline and the file uses the
DOS/Windows newline convention, line[:-1] will not remove the CR - only
the LF (please someone correct me if I'm wrong here).

I know this may not be a real issue in the actual case, but using
rstrip() is still a safer way to go IMHO - think about using this same
code to iterate over a list of strings without newlines...

Duncan Booth · May 12, 2006

bruno said:
Otherwise, just to be informed, what advantage does rstrip() have over
[:-1] (if the two cases are considered effectively the same)?

1/ if your line doesn't end with a newline, line[:-1] will still remove
the last character.

In particular, if the last line of the file doesn't end with a newline then
the last line you read won't have a newline to be stripped.
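
A small sketch of that corner case, using io.StringIO as a stand-in for a file whose final line has no newline (the sample content is made up for illustration):

```python
import io

# A file-like object whose final line lacks a trailing newline.
f = io.StringIO("first\nlast")

sliced = [line[:-1] for line in f]            # blindly drops one character
f.seek(0)                                     # rewind to read the lines again
stripped = [line.rstrip("\n") for line in f]  # drops only a newline, if present
```

Here the slice turns "last" into "las", while rstrip("\n") leaves it intact.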

Tim Chase · May 12, 2006

I reeducated my fingers after having troubles with huge files !-)

I'll keep it in mind...the prospect of future trouble with
large files is a good kick-in-the-pants to remember.

Otherwise, just to be informed, what advantage does rstrip() have over
[:-1] (if the two cases are considered effectively the same)?

1/ if your line doesn't end with a newline, line[:-1] will still remove
the last character.

Good catch. Most *nix editors are smart about having a
trailing NL character at the end of the file, but some
Windows text-editors aren't so kind.

2/ IIRC, if you don't use universal newline and the file uses the
DOS/Windows newline convention, line[:-1] will not remove the CR - only
the LF (please someone correct me if I'm wrong here).

To get this behavior, I think you have to open the file in
binary mode. To me, opening as binary is a signal that I
should be using read() rather than readlines() (or
xreadlines, or the iterator, or whatever). If you've opened
in binary mode, you might have to use rstrip("\r\n") to get
both possible line-ending characters.

I know this may not be a real issue in the actual case, but using
rstrip() is still a safer way to go IMHO - think about using this same
code to iterate over a list of strings without newlines...

Makes sense. Using rstrip("\r\n") has all the benefits,
plus more gracefully handles cases where a newline might not
be present or comprised of two (or more) characters. Got
it! Thanks for the explanation.
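
To make the comparison concrete, here is a short sketch over a list of strings as they might arrive when newline translation is off. The sample lines are invented for the demonstration.

```python
# Lines with mixed endings: LF, CRLF, none, and CRLF after a trailing space.
lines = ["unix line\n", "dos line\r\n", "no newline", "keep space \r\n"]

# rstrip("\r\n") removes any trailing run of CR and LF characters and
# nothing else, so trailing spaces and newline-free lines survive.
cleaned = [line.rstrip("\r\n") for line in lines]

# The slice removes exactly one character, whatever it happens to be.
sliced = [line[:-1] for line in lines]
```

One thing to keep in mind: the argument to rstrip() is treated as a set of characters, not a literal suffix, so several consecutive trailing newlines would all be removed.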

-tkc

bruno at modulix · May 12, 2006

Duncan said:
bruno at modulix wrote:

Otherwise, just to be informed, what advantage does rstrip() have over
[:-1] (if the two cases are considered effectively the same)?

1/ if your line doesn't end with a newline, line[:-1] will still remove
the last character.

In particular, if the last line of the file doesn't end with a newline then
the last line you read won't have a newline to be stripped.

Thanks - I knew there was a corner case for files, but couldn't remember
it exactly.
