parse a string of parameters and values

Discussion in 'Python' started by bsneddon, Dec 13, 2009.

  1. bsneddon

    bsneddon Guest

    I have a problem that I can come up with a brute force solution to
    solve, but it occurred to me that there may be
    "one-- and preferably only one --obvious way to do it".

    I am going to read a text file that is an export from a control
    system.
    It has lines with information like

    base=1 name="first one" color=blue

    I would like to put this info into a dictionary for processing.
    I have looked at optparse and getopt; maybe they are the answer, but
    there could be a very straightforward way to do this task.

    Thanks for your help
     
    bsneddon, Dec 13, 2009
    #1

  2. On Sat, 12 Dec 2009 16:16:32 -0800, bsneddon wrote:

    > I have a problem that I can come up with a brute force solution to solve
    > but it occurred to me that there may be an
    > "one-- and preferably only one --obvious way to do it".


    I'm not sure that "brute force" is the right description here. Generally,
    "brute force" is used for situations where you check every single
    possible value rather than calculate the answer directly. One classical
    example is guessing the password that goes with an account. The brute
    force attack is to guess every imaginable password -- eventually you'll
    find the matching one. A non-brute force attack is to say "I know the
    password is a recent date", which reduces the space of possible passwords
    from many trillions to mere millions.

    So I'm not sure that brute force is an appropriate description for this
    problem. One way or another you have to read every line in the file.
    Whether you read them or you farm the job out to some pre-existing
    library function, they still have to be read.


    > I am going to read a text file that is an export from a control system.
    > It has lines with information like
    >
    > base=1 name="first one" color=blue
    >
    > I would like to put this info into a dictionary for processing.


    Have you looked at the ConfigParser module?

    Assuming that ConfigParser isn't suitable, you can do this if each
    key=value pair is on its own line:


    d = {}
    for line in open(filename, 'r'):
        if not line.strip():
            # skip blank lines
            continue
        key, value = line.split('=', 1)
        d[key.strip()] = value.strip()


    If you have multiple keys per line, you need a more sophisticated way of
    splitting them. Something like this should work:

    d = {}
    for line in open(filename, 'r'):
        if not line.strip():
            continue
        terms = line.split('=')
        keys = terms[0::2] # every second item starting from the first
        values = terms[1::2] # every second item starting from the second
        for key, value in zip(keys, values):
            d[key.strip()] = value.strip()




    --
    Steven
     
    Steven D'Aprano, Dec 13, 2009
    #2

  3. John Machin

    John Machin Guest

    Steven D'Aprano <steve <at> REMOVE-THIS-cybersource.com.au> writes:

    >
    > On Sat, 12 Dec 2009 16:16:32 -0800, bsneddon wrote:
    >


    >
    > > I am going to read a text file that is an export from a control system.
    > > It has lines with information like
    > >
    > > base=1 name="first one" color=blue
    > >
    > > I would like to put this info into a dictionary for processing.

    >
    > Have you looked at the ConfigParser module?
    >
    > Assuming that ConfigParser isn't suitable, you can do this if each
    > key=value pair is on its own line:
    > [snip]
    > If you have multiple keys per line, you need a more sophisticated way of
    > splitting them. Something like this should work:
    >
    > d = {}
    > for line in open(filename, 'r'):
    >     if not line.strip():
    >         continue
    >     terms = line.split('=')
    >     keys = terms[0::2] # every second item starting from the first
    >     values = terms[1::2] # every second item starting from the second
    >     for key, value in zip(keys, values):
    >         d[key.strip()] = value.strip()
    >


    There appears to be a problem with the above snippet, or you have a strange
    interpretation of "put this info into a dictionary":

    | >>> line = 'a=1 b=2 c=3 d=4'
    | >>> d = {}
    | >>> terms = line.split('=')
    | >>> print terms
    | ['a', '1 b', '2 c', '3 d', '4']
    | >>> keys = terms[0::2] # every second item starting from the first
    | >>> values = terms[1::2] # every second item starting from the second
    | >>> for key, value in zip(keys, values):
    | ...     d[key.strip()] = value.strip()
    | ...
    | >>> print d
    | {'a': '1 b', '2 c': '3 d'}
    | >>>

    Perhaps you meant

    terms = re.split(r'[= ]', line)

    which is an improvement, but this fails on cosmetic spaces e.g. a = 1 b = 2 ...

    Try terms = filter(None, re.split(r'[= ]', line))

    Now we get to the really hard part: handling the name="first one" in the OP's
    example. The splitting approach has run out of steam.

    The OP will need to divulge what is the protocol for escaping the " character if
    it is present in the input. If nobody knows of a packaged solution to his
    particular scheme, then he'll need to use something like pyparsing.
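
    A minimal sketch of how far a single regular expression can take this,
    short of a full parser. It assumes that backslash is the escape
    character inside double quotes, which this thread never establishes.
    The names PAIR and parse_pairs are made up for illustration.

```python
import re

# Hypothetical pattern: a key, optional spaces around '=', then either a
# double-quoted value (backslash escapes ASSUMED) or a bare token.
PAIR = re.compile(r'(\w+)\s*=\s*(?:"((?:[^"\\]|\\.)*)"|(\S+))')

def parse_pairs(line):
    """Return a dict of key/value pairs found in one line."""
    result = {}
    for key, quoted, bare in PAIR.findall(line):
        # findall gives '' for the alternative that did not match,
        # so at most one of quoted/bare is non-empty.
        result[key] = quoted or bare
    return result

record = parse_pairs('base=1 name="first one" color=blue')
spaced = parse_pairs('a = 1 b = 2')
```

    Escaped characters are left in the value as written (no unescaping),
    and text that does not look like key=value is silently skipped, so
    this is only a starting point.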
     
    John Machin, Dec 13, 2009
    #3
  4. On Sun, 13 Dec 2009 05:52:04 +0000, John Machin wrote:

    > Steven D'Aprano <steve <at> REMOVE-THIS-cybersource.com.au> writes:

    [snip]
    >> If you have multiple keys per line, you need a more sophisticated way
    >> of splitting them. Something like this should work:

    [...]
    > There appears to be a problem with the above snippet, or you have a
    > strange interpretation of "put this info into a dictionary":



    D'oh!

    In my defence, I said it "should" work, not that it did work!


    --
    Steven
     
    Steven D'Aprano, Dec 13, 2009
    #4
  5. Peter Otten

    Peter Otten Guest

    bsneddon wrote:

    > I have a problem that I can come up with a brute force solution to
    > solve but it occurred to me that there may be an
    > "one-- and preferably only one --obvious way to do it".
    >
    > I am going to read a text file that is an export from a control
    > system.
    > It has lines with information like
    >
    > base=1 name="first one" color=blue
    >
    > I would like to put this info into a dictionary for processing.
    > I have looked at optparse and getopt maybe they are the answer but
    > there could
    > be and very straight forward way to do this task.
    >
    > Thanks for your help


    Have a look at shlex:

    >>> import shlex
    >>> s = 'base=1 name="first one" color=blue equal="alpha=beta" empty'
    >>> dict(t.partition("=")[::2] for t in shlex.split(s))

    {'color': 'blue', 'base': '1', 'name': 'first one', 'empty': '', 'equal':
    'alpha=beta'}
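
    The one-liner above can be unrolled into a small helper for reading a
    whole export file. This is a sketch only: parse_line and parse_lines
    are hypothetical names, and it assumes the file uses shell-style
    quoting that shlex understands.

```python
import shlex

def parse_line(line):
    """Parse one 'key=value key="quoted value"' line into a dict."""
    result = {}
    # shlex.split honours the quotes, so "first one" stays one token
    # and the surrounding quotes are removed.
    for token in shlex.split(line):
        # partition splits at the FIRST '=' only; a token without '='
        # yields an empty separator and an empty value.
        key, _sep, value = token.partition("=")
        result[key] = value
    return result

def parse_lines(lines):
    """Parse an iterable of lines, skipping blank ones."""
    return [parse_line(line) for line in lines if line.strip()]

records = parse_lines(['base=1 name="first one" color=blue\n', '\n'])
```

    Passing an open file object to parse_lines works because files
    iterate line by line. Note that shlex.split raises ValueError on an
    unbalanced quote, so real input may need a try/except around it.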

    Peter
     
    Peter Otten, Dec 13, 2009
    #5
  6. bsneddon

    bsneddon Guest

    On Dec 13, 5:28 am, Peter Otten wrote:
    > bsneddon wrote:
    > > I have a problem that I can come up with a brute force solution to
    > > solve but it occurred to me that there may be an
    > >  "one-- and preferably only one --obvious way to do it".

    >
    > > I am going to read a text file that is an export from a control
    > > system.
    > > It has lines with information like

    >
    > > base=1 name="first one" color=blue

    >
    > > I would like to put this info into a dictionary for processing.
    > > I have looked at optparse and getopt maybe they are the answer but
    > > there could
    > > be and very straight forward way to do this task.

    >
    > > Thanks for your help

    >
    > Have a look at shlex:
    >
    > >>> import shlex
    > >>> s = 'base=1 name="first one" color=blue equal="alpha=beta" empty'
    > >>> dict(t.partition("=")[::2] for t in shlex.split(s))

    >
    > {'color': 'blue', 'base': '1', 'name': 'first one', 'empty': '', 'equal':
    > 'alpha=beta'}
    >
    > Peter


    Thanks to all for your input.

    It seems I misstated the problem. Text is always quoted, so blue
    above -> "blue".

    Peter,

    The part I was missing was t.partition("=") and slicing skipping by
    two.
    It looks like a normal split will work for me to get the arguments I
    need.
    To my way of thinking yours is very clean and maybe the "--obvious way
    to do it"
    Although it was not obvious to me until seeing your post.
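
    For anyone else reading, here is a short illustration of the two
    pieces mentioned above, partition("=") and the [::2] slice, along with
    what happens when every text value is quoted. The sample line is made
    up to match the corrected description of the data.

```python
import shlex

# partition always returns a 3-tuple: (head, separator, tail)
with_value = 'color=blue'.partition("=")      # ('color', '=', 'blue')
without_value = 'empty'.partition("=")        # ('empty', '', '')

# [::2] keeps items 0 and 2, i.e. it drops the separator,
# leaving a (key, value) pair that dict() accepts.
pair = with_value[::2]

# With all text quoted, shlex.split still strips the quotes.
line = 'base=1 name="first one" color="blue"'
record = dict(t.partition("=")[::2] for t in shlex.split(line))
```

    A plain line.split() would cut "first one" in two at the space, which
    is the case shlex.split takes care of.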

    Bill
     
    bsneddon, Dec 13, 2009
    #6
  7. On Sun, 13 Dec 2009 07:28:24 -0300, Peter Otten wrote:
    > bsneddon wrote:


    >> I am going to read a text file that is an export from a control
    >> system.
    >> It has lines with information like
    >>
    >> base=1 name="first one" color=blue
    >>
    >> I would like to put this info into a dictionary for processing.

    >
    > >>> import shlex
    > >>> s = 'base=1 name="first one" color=blue equal="alpha=beta" empty'
    > >>> dict(t.partition("=")[::2] for t in shlex.split(s))

    > {'color': 'blue', 'base': '1', 'name': 'first one', 'empty': '', 'equal':
    > 'alpha=beta'}


    Brilliant!

    --
    Gabriel Genellina
     
    Gabriel Genellina, Dec 15, 2009
    #7
  8. Tim Chase

    Tim Chase Guest

    Gabriel Genellina wrote:
    > Peter Otten wrote:
    >> bsneddon wrote:
    >>> I am going to read a text file that is an export from a control
    >>> system.
    >>> It has lines with information like
    >>>
    >>> base=1 name="first one" color=blue
    >>>
    >>> I would like to put this info into a dictionary for processing.
    >> >>> import shlex
    >> >>> s = 'base=1 name="first one" color=blue equal="alpha=beta" empty'
    >> >>> dict(t.partition("=")[::2] for t in shlex.split(s))

    >> {'color': 'blue', 'base': '1', 'name': 'first one', 'empty': '', 'equal':
    >> 'alpha=beta'}

    >
    > Brilliant!


    The thing I appreciated about Peter's solution was learning a
    purpose for .partition() as I've always just used .split(), so I
    would have done something like

    >>> dict('=' in s and s.split('=', 1) or (s, '') for s in shlex.split(s))
    {'color': 'blue', 'base': '1', 'name': 'first one', 'empty': '',
    'equal': 'alpha=beta'}

    Using .partition() makes that a lot cleaner. However, it looks
    like .partition() was added in 2.5, so for my code stuck in 2.4
    deployments, I'll stick with the uglier .split()
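
    If the and/or trick feels too clever, the same .split() fallback can
    be spelled out in a helper. This is just a sketch; split_pair and
    parse_line_with_split are hypothetical names, and it avoids
    .partition() entirely.

```python
import shlex

def split_pair(token):
    """Split 'key=value' at the first '='; a missing '=' gives value ''."""
    parts = token.split("=", 1)
    if len(parts) == 1:
        # No '=' in the token: pad with an empty value so that
        # dict() always receives a two-item pair.
        parts.append("")
    return tuple(parts)

def parse_line_with_split(line):
    return dict(split_pair(t) for t in shlex.split(line))

result = parse_line_with_split(
    'base=1 name="first one" color=blue equal="alpha=beta" empty')
```

    The and/or version works because s.split('=', 1) never returns an
    empty list, so the "and" branch is always truthy when '=' is present.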

    -tkc
     
    Tim Chase, Dec 15, 2009
    #8
