Re: itertools.groupby

Discussion in 'Python' started by Wolfgang Maier, Apr 20, 2013.

  1. Jason Friedman <jsf80238 <at> gmail.com> writes:

    >
    > I have a file such as:
    >
    > $ cat my_data 
    > Starting a new group
    > a
    > b
    > c
    > Starting a new group
    > 1
    > 2
    > 3
    > 4
    > Starting a new group
    > X
    > Y
    > Z
    > Starting a new group
    >
    >
    > I am wanting a list of lists:
    > ['a', 'b', 'c']
    > ['1', '2', '3', '4']
    > ['X', 'Y', 'Z']
    > []
    >
    > I wrote this:
    >
    > ------------------------------------
    > #!/usr/bin/python3
    > from itertools import groupby
    >
    > def get_lines_from_file(file_name):
    >     with open(file_name) as reader:
    >         for line in reader.readlines():
    >             yield(line.strip())
    >
    > counter = 0
    > def key_func(x):
    >     if x.startswith("Starting a new group"):
    >         global counter
    >         counter += 1
    >     return counter
    >
    > for key, group in groupby(get_lines_from_file("my_data"), key_func):
    >     print(list(group)[1:])
    > ------------------------------------
    >
    > I get the output I desire, but I'm wondering if there is a solution
    > without the global counter.
    >


    Here's a solution that keeps groupby (a good idea, I think) but avoids the
    counter. That part is trivial: the key function just returns the result of
    the separator test, so groupby alternates between runs of separator lines
    and runs of ordinary lines. It also gives you the rest of each separator
    line (you're using startswith in your code, so I figured you expect more
    text on these lines). I replaced startswith() with a slice comparison,
    which is often slightly faster in CPython.

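    from itertools import groupby

    def separate_on(iterable, separator):
        sep_len = len(separator)
        # groupby alternates runs of separator lines and runs of ordinary lines
        grouped_iter = (x[1] for x in groupby(
            iterable, lambda line: line[:sep_len] == separator))
        for separator_line in grouped_iter:
            rest_of_separator_line = next(separator_line)[sep_len:].strip()
            # default of [] so that a trailing separator yields an empty group
            # instead of raising StopIteration inside the generator
            yield (rest_of_separator_line,
                   [s.strip() for s in next(grouped_iter, [])])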

    then

    for sep_tail, group in separate_on(your_input, your_separator):
        do_what_ever()
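
    As a rough check against the sample data from the question, here is a
    self-contained sketch. The names sample_lines and groups are only for
    illustration, and the function is restated so the snippet runs on its own.
    It assumes the input starts with a separator line, and it passes a default
    to next() so that the trailing separator gives an empty list. Consecutive
    separator lines would land in one groupby run, and only the first would be
    used.

```python
from itertools import groupby


def separate_on(iterable, separator):
    """Yield (rest_of_separator_line, group_lines) for each separator line."""
    sep_len = len(separator)
    # Each item of grouped_iter is one run: either separator lines or data lines.
    grouped_iter = (run for _, run in groupby(
        iterable, lambda line: line[:sep_len] == separator))
    for separator_run in grouped_iter:
        tail = next(separator_run)[sep_len:].strip()
        # next(..., []) returns [] when no data run follows the last separator.
        yield tail, [s.strip() for s in next(grouped_iter, [])]


# Illustrative stand-in for the lines of the my_data file in the question.
sample_lines = [
    "Starting a new group", "a", "b", "c",
    "Starting a new group", "1", "2", "3", "4",
    "Starting a new group", "X", "Y", "Z",
    "Starting a new group",
]

groups = [group for _, group in
          separate_on(sample_lines, "Starting a new group")]
for group in groups:
    print(group)
```

    The loop prints one list per separator line, ending with an empty list
    for the final separator that has nothing after it.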

    Hope it's what you want,
    Wolfgang
    Wolfgang Maier, Apr 20, 2013