recording data between [ and ]

rbt

Output from 'netstat -b' on a win2003 server will show what binary is
responsible for the connection. For example, it may list something like
this along with other connection specific data:

[lsass.exe]
[System]
[firefox.exe]
[iexplorer.exe]

How might I process the output so that anything within brackets is
recorded to a log file of my own making? I know how to parse and record
things to a file, but I don't know how to make '[' and ']' act as
special characters so that I can record what's between them.

Basically, I want a script that will read the output, start recording each
time it encounters a '[', and keep recording until it gets to ']', whereupon
it stops recording and then repeats the same operation as it goes through
the remaining data.

Thanks,
rbt
 
Peter Hansen

rbt said:
Output from 'netstat -b' on a win2003 server will show what binary is
responsible for the connection. For example, it may list something like
this along with other connection specific data:

[lsass.exe]
[System]
[firefox.exe]
[iexplorer.exe]

How might I process the output so that anything within brackets is
recorded to a log file of my own making? I know how to parse and record
things to a file, but I don't know how to make '[' and ']' act as
special characters so that I can record what's between them.

Does this help?
>>> import re
>>>
>>> s = '''stuff [lsass.exe]
... [System] more stuff
... xxxxx [firefox.exe] ......
... '''
>>>
>>> re.findall(r'\[([^]]*)\]', s)
['lsass.exe', 'System', 'firefox.exe']
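
In case the pattern looks opaque, here is a minimal sketch that spells it out with re.VERBOSE and appends each match to a log file. The function names and the log file name binaries.log are made up for illustration; nothing here is specific to netstat.

```python
import re

# The same pattern as above, written with re.VERBOSE so that each piece
# can carry its own comment.
BRACKETED = re.compile(r"""
    \[          # a literal opening bracket (escaped, it is special in a regex)
    ( [^]]* )   # capture any run of characters up to the closing bracket
    \]          # a literal closing bracket
""", re.VERBOSE)


def extract_bracketed(text):
    """Return every string found between '[' and ']' in text."""
    return BRACKETED.findall(text)


def log_bracketed(text, log_path="binaries.log"):
    """Append each bracketed name to log_path, one per line."""
    names = extract_bracketed(text)
    with open(log_path, "a") as log:
        for name in names:
            log.write(name + "\n")
    return names


sample = """stuff [lsass.exe]
[System] more stuff
xxxxx [firefox.exe] ......
"""
print(extract_bracketed(sample))
```

Calling log_bracketed(text) on the captured netstat output would then write one binary name per line to the log.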

-Peter
 
rbt

Peter said:
rbt said:
Output from 'netstat -b' on a win2003 server will show what binary is
responsible for the connection. For example, it may list something
like this along with other connection specific data:

[lsass.exe]
[System]
[firefox.exe]
[iexplorer.exe]

How might I process the output so that anything within brackets is
recorded to a log file of my own making? I know how to parse and
record things to a file, but I don't know how to make '[' and ']'
act as special characters so that I can record what's between them.


Does this help?
>>> import re
>>>
>>> s = '''stuff [lsass.exe]
... [System] more stuff
... xxxxx [firefox.exe] ......
... '''
>>>
>>> re.findall(r'\[([^]]*)\]', s)
['lsass.exe', 'System', 'firefox.exe']

-Peter

Yes, it does... may take me a few minutes to get my head around it
though. Why do re's have to be so arcane and complicated... especially
in Python?

It's hard to preach 'ease of use' with stuff such as this in the
language. Perhaps one day it can be rolled up into something that
*really* is easy to understand:

import string

fp = file('filename')
data = fp.read()
fp.close()

string.between(data,[,])
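
No string.between exists in the standard library, but a helper along those lines can be sketched with plain str.find and no regular expressions at all. The name between and its signature are only illustrative:

```python
def between(data, start, end):
    """Return the substrings of data found between start and end markers."""
    if not start or not end:
        raise ValueError("start and end markers must be non-empty")
    found = []
    pos = 0
    while True:
        begin = data.find(start, pos)
        if begin == -1:
            break  # no more opening markers
        begin += len(start)
        stop = data.find(end, begin)
        if stop == -1:
            break  # an opening marker with no closing marker: give up
        found.append(data[begin:stop])
        pos = stop + len(end)
    return found


data = "stuff [lsass.exe]\n[System] more stuff\nxxxxx [firefox.exe] ......\n"
print(between(data, "[", "]"))
```

This is the "stop at '[', record until ']'" loop from the original question, written out by hand.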
 
Diez B. Roggisch

Yes, it does... may take me a few minutes to get my head around it
though. Why do re's have to be so arcane and complicated... especially
in Python?

It's hard to preach 'ease of use' with stuff such as this in the
language. Perhaps one day it can be rolled up into something that
*really* is easy to understand:

Welcome to the wonderful world of programming. Regular expressions are what
they are because they are modeled after a certain theory - that of finite
state automata and their correspondence to certain classes of grammars. And
they require a bit of understanding. And there is no language that does
them differently - some integrate them syntactically (like Perl), others
don't have them available in the standard lib at all. But if you get them,
they always look like that.
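
To make the automaton connection concrete, here is a minimal sketch of the bracket extraction as a hand-written two-state machine. The function and state names are made up for illustration, and this is not how the re module is implemented internally; it only shows the kind of machine a pattern like \[([^]]*)\] describes.

```python
def bracketed(text):
    """Collect the text between '[' and ']' with a two-state machine."""
    OUTSIDE, INSIDE = 0, 1
    state = OUTSIDE
    current = []   # characters seen since the last '['
    results = []
    for ch in text:
        if state == OUTSIDE:
            if ch == "[":
                state = INSIDE
                current = []
        else:  # state == INSIDE
            if ch == "]":
                results.append("".join(current))
                state = OUTSIDE
            else:
                current.append(ch)
    # An unclosed '[' at the end of the input is silently dropped.
    return results


print(bracketed("stuff [lsass.exe]\n[System] more stuff\n"))
```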
import string

fp = file('filename')
data = fp.read()
fp.close()

string.between(data,[,])

how about

import whatever_is_needed
solve_my_problem()

? Seriously: programming, or maybe better, the way we tell computers
what to do, might evolve through standardization to a point where lots of
tasks get easier. Your actual problem might be solved more easily one day if
command-line tools agree on a specific output format (viewed from today,
that possibly means XML) and on standard tools to deal with it.

But as the world is complex and people want solutions to their complex
problems, IMHO programming will always be about such nitty gritty details.
 
Fredrik Lundh

Diez said:
Welcome to the wonderful world of programming. Regular expressions are what
they are because they are modeled after a certain theory - that of finite
state automata and their correspondence to certain classes of grammars.

(except that Python regexps are not always regular, of course, and that
backtracking engines like the ones used in Perl and Python differ in subtle
ways from "real" DFA-based engines, etc. But as long as you're looking at
things from a proper distance, you're right, of course)
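
One of those subtle differences can be seen with alternation. A small sketch (the variable names are just for illustration): Python's engine tries alternatives left to right and takes the first one that works, whereas a POSIX-style leftmost-longest matcher would be expected to report the longer match in both cases.

```python
import re

# The first alternative that lets the match succeed wins,
# even though a longer match is available.
first_wins = re.match(r"a|ab", "ab").group()

# Swapping the alternatives changes what is matched.
reordered = re.match(r"ab|a", "ab").group()

print(first_wins)  # a
print(reordered)   # ab
```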

</F>
 
Roy Smith

Diez B. Roggisch said:
Welcome to the wonderful world of programming. Regular expressions are what
they are because they are modeled after a certain theory - that of finite
state automata and their correspondence to certain classes of
grammars.

Another way to look at it is that RE's are a programming language of
their own, and Python just provides an interface to it, just like it
provides interfaces to databases, network protocols, and operating
systems.

RE's predate Python by many years (at least as far back as the early
70's in a form we would recognize today), and have evolved over the
decades to become more powerful. Unfortunately, with power came
arcane syntax. On the good side, most of the time you can use a
smallish subset of the full RE syntax and still have some pretty
powerful pattern matching.

Python's motto is "there's one way to do it". Sometimes that means
"let's do it the way everybody else does it instead of reinventing it
ourselves". The Python RE module is certainly an example of that.

BTW, there's a pretty good Wikipedia article on RE's
(http://en.wikipedia.org/wiki/Regular_expression).
 
rbt

Roy said:
Another way to look at it is that RE's are a programming language of
their own, and Python just provides an interface to it, just like it
provides interfaces to databases, network protocols, and operating
systems.

RE's predate Python by many years (at least as far back as the early
70's in a form we would recognize today), and have evolved over the
decades to become more powerful. Unfortunately, with power came
arcane syntax. On the good side, most of the time you can use a
smallish subset of the full RE syntax and still have some pretty
powerful pattern matching.

Python's motto is "there's one way to do it". Sometimes that means
"let's do it the way everybody else does it instead of reinventing it
ourselves". The Python RE module is certainly an example of that.

BTW, there's a pretty good Wikipedia article on RE's
(http://en.wikipedia.org/wiki/Regular_expression).

Thanks guys... nothing against Python... just RE's in general.
 
Simon Brunning

string.between(data,[,])

import re

def between(data, start, end):
    return re.findall(re.escape(start) + r'([^]]*)' + re.escape(end), data)

foo = '''stuff [lsass.exe]
[System] more stuff
xxxxx [firefox.exe] ......
'''

print(between(foo, '[', ']'))
 
Paul McGuire

Jay -

Thanks for the pyparsing plug.

Here is how the OP's program would look using pyparsing:


import pyparsing

fp = file('filename')
data = fp.read()
fp.close()

foo = '''stuff [lsass.exe]
[System] more stuff
xxxxx [firefox.exe] ......
'''

LBRACK = pyparsing.Literal("[").suppress()
RBRACK = pyparsing.Literal("]").suppress()
brackettedStuff = LBRACK + pyparsing.SkipTo( RBRACK ) + RBRACK

for tokens, start, end in brackettedStuff.scanString(foo):
    print(tokens[0])

--- fin ---
Now this is not nearly as terse as the regexp version, nor will it run
as fast. But I think I'd rather come back to this version 6 months
from now and try to figure "what was this program doing again?".

-- Paul
 
jay graves

Paul said:
Jay -
Thanks for the pyparsing plug.

NP. pyparsing is on my list of stuff to play around with. I'm just
waiting for the proper problem to present itself.
Here is how the OP's program would look using pyparsing:

And the exact reason that I could 'plug' pyparsing is that I have read
many of your responses with sample pyparsing code. viral marketing at
its best and another reason to love c.l.py

....

jay
 
Diez B. Roggisch

Fredrik said:
(except that Python regexps are not always regular, of course, and that
backtracking engines like the ones used in Perl and Python differ in
subtle ways from "real" DFA-based engines, etc. But as long as you're
looking at things from a proper distance, you're right, of course)

They use backtracking? That's news to me. I always thought that mechanisms
like prefixes and backreferences could work in a strict DFA paradigm - with
possibly very large automata, but nevertheless.

Gotta go google I think....
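
For what it's worth, backreferences are the usual example of a feature that no DFA of any size can handle, because they let a pattern match languages that are not regular. A small sketch (the helper name balanced is just for illustration):

```python
import re

# Matches a^n b a^n: the same number of 'a's on both sides of the 'b'.
# A finite automaton cannot recognize this language, since it would have
# to remember an unbounded count of 'a's; the backreference \1 does it.
SAME_ON_BOTH_SIDES = re.compile(r"(a*)b\1")


def balanced(s):
    """Return True if s is some number of 'a's, a 'b', then as many 'a's again."""
    return SAME_ON_BOTH_SIDES.fullmatch(s) is not None


print(balanced("aabaa"))  # True
print(balanced("aaba"))   # False
```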
 
Jim Sizelove

Simon said:
string.between(data,[,])


def between(data, start, end):
    return re.findall(re.escape(start) + r'([^]]*)' + re.escape(end), data)

That's cool!
But it doesn't quite work if the end tag is not ']':
>>> import re
>>> def between(data, start, end):
...     return re.findall(re.escape(start) + r'([^]]*)' + re.escape(end), data)
...
>>> foo = '''<stuff> [lsass.exe]
... [System] <more> stuff
... xxxxx<qqq> [firefox.exe] ......
... '''
>>> print(between(foo, '[', ']'))
['lsass.exe', 'System', 'firefox.exe']
>>> print(between(foo, '<', '>'))
['stuff', 'more> stuff\nxxxxx<qqq']


Here's a revised version that will work with other tags:
>>> def between2(data, start, end):
...     pattern = re.escape(start) + ' # start tag \n' +\
...               r'([^' + re.escape(end) + r']*)' + " # anything except end tag \n" +\
...               re.escape(end) + ' # end tag \n'
...     return re.findall(pattern, data, re.VERBOSE)
...
>>> print(between2(foo, '[', ']'))
['lsass.exe', 'System', 'firefox.exe']
>>> print(between2(foo, '<', '>'))
['stuff', 'more', 'qqq']
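
One limitation worth noting: a character class excludes single characters only, so the version above is restricted to one-character end tags. A possible variation, sketched here under the made-up name between3, uses a non-greedy match so that longer delimiters work too:

```python
import re


def between3(data, start, end):
    """Like between2, but start and end may be longer than one character."""
    # (.*?) is non-greedy: it stops at the first occurrence of the end tag.
    # re.DOTALL lets the captured text span line breaks.
    pattern = re.escape(start) + r'(.*?)' + re.escape(end)
    return re.findall(pattern, data, re.DOTALL)


sample = "stuff [lsass.exe]\n[System] more stuff\nxxxxx [firefox.exe] ......\n"
print(between3(sample, '[', ']'))
print(between3("a <!-- one --> b <!-- two -->", '<!--', '-->'))
```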

Regards,
Jim Sizelove
 
bearophileHUGS

Diez B. Roggisch>But as the world is complex and people want solutions
to their complex problems, IMHO programming will always be about such
nitty gritty details.<

REs are like assembly, but high-level languages show us that for a
mammal there are (often) better (higher) ways to program a computer.
The computer can also compile the high-level language, to run it quite
quickly.

Beside rxb15, there is also redict, in the standard lib (Jay Graves
shows the HD path):
http://home.earthlink.net/~jasonrandharper/reverb.py

Maybe a higher level language (this is just a wrapper) like this can
become the standard way to make REs in Python.

Bearophile
 
