extracting numbers from a file, excluding words

Discussion in 'Python' started by dawenliu@gmail.com, Nov 1, 2005.

  1. Guest

    Hi, I have a file with this content:

    zzzz zzzzz zzz zzzzz
    ....
    xxxxxxx xxxxxxxxxx xxxxx 34.215
    zzzzzzz zz zzzz
    ....

    "x" and "z" are letters.
    The lines with "z" are trash, and only the lines with "x" are
    important. I want to extract the number (34.215 in this case) behind
    the letters x, and store it in a file.

    The sentence "xxxxxxx xxxxxxxxxx xxxxx " is FIXED and KNOWN. The "z"
    sentences can vary. There is also an unknown number of "z" lines.

    Any suggestions will be appreciated.
    dawenliu, Nov 1, 2005
    #1

  2. 1 Nov 2005 09:19:45 -0800, <>:
    > Hi, I have a file with this content:
    >
    > zzzz zzzzz zzz zzzzz
    > ...
    > xxxxxxx xxxxxxxxxx xxxxx 34.215
    > zzzzzzz zz zzzz
    > ...
    >


    Hi,

    I'd suggest doing this:

    f = file('...')
    for line in f:
        if 'xxxxxxx xxxxxxxxxx xxxxx' in line:
            var = float(line[len('xxxxxxx xxxxxxxxxx xxxxx'):].strip())
    f.close()
    Kristina Kudriašova, Nov 1, 2005
    #2

  3. Kristina Kudriašova wrote:
    > 1 Nov 2005 09:19:45 -0800, <>:
    >> Hi, I have a file with this content:
    >>
    >> zzzz zzzzz zzz zzzzz
    >> ...
    >> xxxxxxx xxxxxxxxxx xxxxx 34.215
    >> zzzzzzz zz zzzz
    >> ...
    >>

    >
    > Hi,
    >
    > I'd suggest doing this:
    >
    > f = file('...')
    > for line in f:
    >     if 'xxxxxxx xxxxxxxxxx xxxxx' in line:
    >         var = float(line[len('xxxxxxx xxxxxxxxxx xxxxx'):].strip())
    > f.close()


    I think I prefer "if line.startswith('xxxxxxx xxxxxxxxxx xxxxx'):".
    Feels cleaner to me.

    Steve
    Steve Horsley, Nov 1, 2005
    #3
  4. Steve Horsley wrote:
    > Kristina Kudriašova wrote:
    >
    >> 1 Nov 2005 09:19:45 -0800, <>:
    >>
    >>> Hi, I have a file with this content:
    >>>
    >>> zzzz zzzzz zzz zzzzz
    >>> ...
    >>> xxxxxxx xxxxxxxxxx xxxxx 34.215
    >>> zzzzzzz zz zzzz
    >>> ...
    >>>

    >>
    >> Hi,
    >>
    >> I'd suggest doing this:
    >>
    >> f = file('...')
    >> for line in f:
    >>     if 'xxxxxxx xxxxxxxxxx xxxxx' in line:
    >>         var = float(line[len('xxxxxxx xxxxxxxxxx xxxxx'):].strip())
    >> f.close()

    >
    >
    > I think I prefer "if line.startswith('xxxxxxx xxxxxxxxxx xxxxx'):" .
    > Feels cleaner to me.


    Especially if any "z" lines might include the magic pattern.
    Jeffrey Schwab, Nov 1, 2005
    #4
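
A small sketch of the point made above about "z" lines that contain the pattern. The sample lines and the helper names `extract_with_in` and `extract_with_startswith` are made up for illustration and are not from the thread. With the `in` test, a trash line that merely contains the pattern gets sliced at the wrong offset and `float()` raises `ValueError`, whereas `startswith` skips that line.

```python
PATTERN = 'xxxxxxx xxxxxxxxxx xxxxx'

# Hypothetical sample data: the second line is "trash" that happens to
# contain the pattern in the middle of the line.
LINES = [
    'zzzz zzzzz zzz zzzzz\n',
    'zz xxxxxxx xxxxxxxxxx xxxxx zz\n',
    'xxxxxxx xxxxxxxxxx xxxxx 34.215\n',
]

def extract_with_in(lines):
    """Substring test: matches the pattern anywhere in the line."""
    values = []
    for line in lines:
        if PATTERN in line:
            # Slicing assumes the pattern sits at the start of the line,
            # which the 'in' test does not guarantee.
            values.append(float(line[len(PATTERN):].strip()))
    return values

def extract_with_startswith(lines):
    """Prefix test: matches only lines that begin with the pattern."""
    values = []
    for line in lines:
        if line.startswith(PATTERN):
            values.append(float(line[len(PATTERN):].strip()))
    return values

try:
    extract_with_in(LINES)
    in_test_failed = False
except ValueError:
    # The trash line passed the 'in' test but is not a number after slicing.
    in_test_failed = True

startswith_values = extract_with_startswith(LINES)
```

On this sample data the `in` version fails on the second line, while the `startswith` version returns only the value from the real "x" line.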
  5. Mike Meyer Guest

    Kristina Kudriašova <> writes:

    > 1 Nov 2005 09:19:45 -0800, <>:
    >> Hi, I have a file with this content:
    >>
    >> zzzz zzzzz zzz zzzzz
    >> ...
    >> xxxxxxx xxxxxxxxxx xxxxx 34.215
    >> zzzzzzz zz zzzz
    >> ...
    >>

    >
    > Hi,
    >
    > I'd suggest doing this:
    >
    > f = file('...')
    > for line in f:
    >     if 'xxxxxxx xxxxxxxxxx xxxxx' in line:
    >         var = float(line[len('xxxxxxx xxxxxxxxxx xxxxx'):].strip())
    > f.close()


    Alternatively:

    start = len('xxxxxxx xxxxxxxxxx xxxxx')
    for line in f:
        if line.startswith('xxxxxxx xxxxxxxxxx xxxxx'):
            var = float(line[start:].strip())

    <mike
    --
    Mike Meyer <> http://www.mired.org/home/mwm/
    Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.
    Mike Meyer, Nov 1, 2005
    #5
  6. On Nov 01, Mike Meyer wrote:
    > Kristina Kudriašova <> writes:
    >
    > > 1 Nov 2005 09:19:45 -0800, <>:
    > >> Hi, I have a file with this content:
    > >>
    > >> zzzz zzzzz zzz zzzzz
    > >> ...
    > >> xxxxxxx xxxxxxxxxx xxxxx 34.215
    > >> zzzzzzz zz zzzz
    > >> ...
    > >>

    > >
    > > Hi,
    > >
    > > I'd suggest doing this:
    > >
    > > f = file('...')
    > > for line in f:
    > >     if 'xxxxxxx xxxxxxxxxx xxxxx' in line:
    > >         var = float(line[len('xxxxxxx xxxxxxxxxx xxxxx'):].strip())
    > > f.close()

    >
    > Alternatively:
    >
    > start = len('xxxxxxx xxxxxxxxxx xxxxx')
    > for line in f:
    >     if line.startswith('xxxxxxx xxxxxxxxxx xxxxx'):
    >         var = float(line[start:].strip())


    To refine this even further, I'll add that 'xxx...' is an ugly pattern
    to repeat, and prone to mistyping, so add a tempvar and apply DRY::

    pattern = 'xxxxxxx xxxxxxxxxx xxxxx'
    start = len(pattern)
    for line in f:
        if line.startswith(pattern):
            var = float(line[start:].strip())

    --
    _ _ ___
    |V|icah |- lliott http://micah.elliott.name
    " " """
    Micah Elliott, Nov 1, 2005
    #6
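
None of the replies covers the last part of the question, storing the number in a file. Below is a minimal end-to-end sketch for Python 3, where the `file()` built-in used in the thread no longer exists and `open()` takes its place. The function name `extract_numbers` and the file names `input.txt` and `output.txt` are placeholders, not from the thread, and the sketch assumes that whatever follows the pattern on a matching line is a valid number.

```python
import os
import tempfile

PATTERN = 'xxxxxxx xxxxxxxxxx xxxxx'

def extract_numbers(in_path, out_path):
    """Write the number following PATTERN on each matching line to out_path.

    Returns the numbers found, as a list of floats.
    """
    numbers = []
    with open(in_path) as src, open(out_path, 'w') as dst:
        for line in src:
            if line.startswith(PATTERN):
                value = float(line[len(PATTERN):].strip())
                numbers.append(value)
                dst.write('%s\n' % value)
    return numbers

# Self-contained demo in a temporary directory (placeholder file names).
with tempfile.TemporaryDirectory() as tmp:
    in_path = os.path.join(tmp, 'input.txt')
    out_path = os.path.join(tmp, 'output.txt')
    with open(in_path, 'w') as f:
        f.write('zzzz zzzzz zzz zzzzz\n'
                'xxxxxxx xxxxxxxxxx xxxxx 34.215\n'
                'zzzzzzz zz zzzz\n')
    found = extract_numbers(in_path, out_path)
    with open(out_path) as f:
        written = f.read()
```

The `with` blocks close both files even if `float()` raises on a malformed line, which replaces the explicit `f.close()` from the first suggestion.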