Steven Bethard
I have a list of strings that looks something like:
lst = ['O', 'O', '(*)', 'O', '(*', '*', '(*', '*))', '((*', '*)', '*)']
The parentheses in the labels indicate where an "annotation" starts and
ends. So for example, the label '(*)' at index 2 of the list means that
I have an annotation at (2, 2), and the labels '(*', '*', '(*', '*))' at
indices 4 through 7 mean that I have an annotation at (4, 7) and an
annotation at (6, 7).
I'd like to determine all indices at which I have an annotation. So for
the data above, I want the indices:
(2, 2), (4, 7), (6, 7), (8, 9) and (8, 10)
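To make the label format concrete, here is a small sketch that counts how many annotations start and end at each index. The name `counts` is just for illustration, and it assumes the label 'O' marks a position outside any annotation.

```python
lst = ['O', 'O', '(*)', 'O', '(*', '*', '(*', '*))', '((*', '*)', '*)']

# For every non-'O' label, record (index, number of starts, number of ends).
# Each '(' starts one annotation at that index; each ')' ends one.
counts = [(i, s.count('('), s.count(')'))
          for i, s in enumerate(lst)
          if s != 'O']

for i, starts, ends in counts:
    print(i, starts, ends)
```

Index 7 ends two annotations and index 8 starts two, which is why (4, 7) and (6, 7) share an end and (8, 9) and (8, 10) share a start.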
Here's what I'm doing now:
py> def indices(lst):
...     stack = []  # start indices of annotations still open
...     for i, s in enumerate(lst):
...         if s == 'O':
...             continue
...         # push the current index once per opening parenthesis
...         stack.extend([i] * s.count('('))
...         if '*' in s and not stack:
...             raise Exception('No start for %r at %i' % (s, i))
...         # each closing parenthesis ends the most recently opened start
...         for _ in range(s.count(')')):
...             try:
...                 yield stack.pop(), i
...             except IndexError:
...                 raise Exception('No start for %r at %i' % (s, i))
...     if stack:
...         raise Exception('No ends for starts at %r' % stack)
...
py> list(indices(['O', 'O', '(*)', 'O', '(*', '*', '(*', '*))', '((*',
...               '*)', '*)', 'O']))
[(2, 2), (6, 7), (4, 7), (8, 9), (8, 10)]
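As a sanity check for malformed input, here is a minimal standalone sketch of the same stack approach with a few bad lists run through it. The names `annotation_spans` and `check`, and the switch to `ValueError`, are only illustrative choices and not part of the session above.

```python
def annotation_spans(labels):
    """Yield (start, end) index pairs, one per parenthesized annotation."""
    stack = []  # start indices of annotations that are still open
    for i, label in enumerate(labels):
        if label == 'O':
            continue
        # Each '(' opens an annotation starting at this index.
        stack.extend([i] * label.count('('))
        if '*' in label and not stack:
            raise ValueError('No start for %r at %i' % (label, i))
        # Each ')' closes the most recently opened annotation.
        for _ in range(label.count(')')):
            if not stack:
                raise ValueError('No start for %r at %i' % (label, i))
            yield stack.pop(), i
    if stack:
        raise ValueError('No ends for starts at %r' % stack)


def check(labels):
    """Return the list of spans, or the error message for malformed input."""
    try:
        return list(annotation_spans(labels))
    except ValueError as exc:
        return str(exc)


good = check(['O', '(*', '*)', 'O'])     # well-formed
unopened = check(['O', '*)', 'O'])       # an end with no start
unclosed = check(['(*', '*', 'O'])       # a start with no end
o_inside = check(['(*', 'O', '*)'])      # 'O' between a start and its end
```

The unopened and unclosed cases both raise, but an 'O' that falls inside an open annotation is skipped silently. Whether that should count as an error depends on what the labels are allowed to look like.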
I think that works right, but I'm not certain. So two questions:
(1) Can anyone see anything wrong with the code above? and
(2) Does anyone see an easier/clearer/simpler[1] way of doing this?
Thanks,
STeVe
[1] Yes, I know easier/clearer/simpler are subjective terms. It's okay,
I'm only looking for opinions here anyway. =)