inserting bracketings into a string

Discussion in 'Python' started by Steven Bethard, Nov 16, 2004.

  1. I'm trying to insert some bracketings in a string based on a set of
    labels and associated start and end indices. For example, I'd like to
    do something like:

    >>> text = 'abcde fgh ijklmnop qrstu vw xyz'
    >>> spans = [('A', 0, 9), ('B', 6, 9), ('C', 25, 31)]
    >>> insert_bracketings(text, spans)
    '[A abcde [B fgh]] ijklmnop qrstu [C vw xyz]'

    My current implementation looks like:

    >>> def insert_bracketings(text, spans):
    ...     starts = [start for _, start, _ in spans]
    ...     ends = [end for _, _, end in spans]
    ...     indices = sorted(set(starts + ends))
    ...     splits = [(text[start:end], start, end)
    ...               for start, end in zip([None] + indices, indices + [None])]
    ...     start_map, end_map = {}, {}
    ...     for label, start, end in spans:
    ...         start_map.setdefault(start, []).append('[%s ' % label)
    ...         end_map.setdefault(end, []).append(']')
    ...     result = []
    ...     for string, start, end in splits:
    ...         if start in start_map:
    ...             result.extend(start_map[start])
    ...         result.append(string)
    ...         if end in end_map:
    ...             result.extend(end_map[end])
    ...     return ''.join(result)
    ...

    but it seems like there ought to be an easier way. Can anyone help me?

    Thanks in advance,

    Steve
    --
    When you're being strangled, everything you do is anaerobic exercise!
    --- Adam Olshefsky
    Steven Bethard, Nov 16, 2004
    #1

  2. Steven Bethard wrote:
    > I'm trying to insert some bracketings in a string based on a set of
    > labels and associated start and end indices. For example, I'd like to
    > do something like:
    >
    > >>> text = 'abcde fgh ijklmnop qrstu vw xyz'
    > >>> spans = [('A', 0, 9), ('B', 6, 9), ('C', 25, 31)]
    > >>> insert_bracketings(text, spans)
    > '[A abcde [B fgh]] ijklmnop qrstu [C vw xyz]'
    >
    > My current implementation looks like:
    >
    > >>> def insert_bracketings(text, spans):
    > ...     starts = [start for _, start, _ in spans]
    > ...     ends = [end for _, _, end in spans]
    > ...     indices = sorted(set(starts + ends))
    > ...     splits = [(text[start:end], start, end)
    > ...               for start, end in zip([None] + indices, indices + [None])]
    > ...     start_map, end_map = {}, {}
    > ...     for label, start, end in spans:
    > ...         start_map.setdefault(start, []).append('[%s ' % label)
    > ...         end_map.setdefault(end, []).append(']')
    > ...     result = []
    > ...     for string, start, end in splits:
    > ...         if start in start_map:
    > ...             result.extend(start_map[start])
    > ...         result.append(string)
    > ...         if end in end_map:
    > ...             result.extend(end_map[end])
    > ...     return ''.join(result)
    > ...
    >
    > but it seems like there ought to be an easier way. Can anyone help me?
    >
    > Thanks in advance,
    >
    > Steve



    Below is a little more readable and compact implementation that
    produces the same result. I'm not entirely sure if it qualifies as
    'better', but I do believe it is ultimately more readable.

    def insert_brackets(text, spans):
        brackets = []
        for span in spans:
            brackets.append((span[1], ("".join(('[', span[0], " ")))))
            brackets.append((span[2], ']'))
        brackets.sort() #Note: (n, '[X ') < (n, ']')
        answer = []
        lastIndex = 0
        for bracket in brackets:
            if lastIndex == bracket[0]: #Repeated index
                answer.append(bracket[1])
            else: #Non repeated index
                answer.extend((text[lastIndex:bracket[0]], bracket[1]))
            lastIndex = bracket[0]
        answer.append(text[lastIndex:]) #Keep any text after the last bracket
        return "".join(answer)

    Regards,

    Michael Loritsch
    Michael Loritsch, Nov 17, 2004
    #2

  3. Steven Bethard wrote:

    > I'm trying to insert some bracketings in a string based on a set of
    > labels and associated start and end indices.  For example, I'd like to
    > do something like:
    >
    > >>> text = 'abcde fgh ijklmnop qrstu vw xyz'
    > >>> spans = [('A', 0, 9), ('B', 6, 9), ('C', 25, 31)]
    > >>> insert_bracketings(text, spans)
    > '[A abcde [B fgh]] ijklmnop qrstu [C vw xyz]'


    Not tested beyond what you see:

    text = 'abcde fgh ijklmnop qrstu vw xyz'
    spans = [('A', 0, 9), ('B', 6, 9), ('C', 25, 31)]

    def insert_bracketings(text, spans):
        inserts = [(s, "[%s " % r) for (r, s, t) in spans]
        inserts.extend([(t, "]") for (r, s, t) in spans])
        inserts.sort()
        inserts.reverse()
        text = list(text)
        for (r, s) in inserts:
            text.insert(r, s)
        return "".join(text)

    assert ('[A abcde [B fgh]] ijklmnop qrstu [C vw xyz]'
            == insert_bracketings(text, spans))

    Peter
    Peter Otten, Nov 17, 2004
    #3
  4. Peter Otten wrote:

    > Not tested beyond what you see:


    Probably wrong when a span starts where another ends.
    I'll check later :-(

    Peter
    Peter Otten, Nov 17, 2004
    #4
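
The case Peter is worried about is easy to reproduce. Below is a minimal sketch that runs his first version (same logic, local names changed) on a made-up six-character input; the `text` and `spans` values are illustrative and do not come from the thread.

```python
def insert_bracketings(text, spans):
    # First version from the thread: sort all (position, token) pairs,
    # then insert them from right to left.
    inserts = [(s, "[%s " % r) for (r, s, t) in spans]
    inserts.extend([(t, "]") for (r, s, t) in spans])
    inserts.sort()
    inserts.reverse()
    chars = list(text)
    for pos, token in inserts:
        chars.insert(pos, token)
    return "".join(chars)

# Hypothetical input: span A ends at index 3, exactly where span B starts.
adjacent = insert_bracketings('abcdef', [('A', 0, 3), ('B', 3, 6)])
print(adjacent)  # [A abc[B ]def]  (wanted: [A abc][B def])
```

The plain tuple sort puts `(3, '[B ')` ahead of `(3, ']')` because `'['` compares less than `']'`. After the reverse, the `']'` is inserted first and the `'[B '` then lands in front of it.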
  5. Steven Bethard writes:

    > I'm trying to insert some bracketings in a string based on a set of
    > labels and associated start and end indices. For example, I'd like to
    > do something like:
    >
    > >>> text = 'abcde fgh ijklmnop qrstu vw xyz'
    > >>> spans = [('A', 0, 9), ('B', 6, 9), ('C', 25, 31)]
    > >>> insert_bracketings(text, spans)
    > '[A abcde [B fgh]] ijklmnop qrstu [C vw xyz]'
    >
    > My current implementation looks like:
    >
    > >>> def insert_bracketings(text, spans):
    > ...     starts = [start for _, start, _ in spans]
    > ...     ends = [end for _, _, end in spans]
    > ...     indices = sorted(set(starts + ends))
    > ...     splits = [(text[start:end], start, end)
    > ...               for start, end in zip([None] + indices, indices + [None])]
    > ...     start_map, end_map = {}, {}
    > ...     for label, start, end in spans:
    > ...         start_map.setdefault(start, []).append('[%s ' % label)
    > ...         end_map.setdefault(end, []).append(']')
    > ...     result = []
    > ...     for string, start, end in splits:
    > ...         if start in start_map:
    > ...             result.extend(start_map[start])
    > ...         result.append(string)
    > ...         if end in end_map:
    > ...             result.extend(end_map[end])
    > ...     return ''.join(result)
    > ...
    >
    > but it seems like there ought to be an easier way. Can anyone help me?


    def insert_bracketings (txt, spans):
        text = list(txt)
        for tg,start,end in spans:
            text[start] = '[%s %s'%(tg,text[start])
            text[end-1] = '%s]'%text[end-1]
        return ''.join(text)

    print(insert_bracketings('abcde fgh ijklmnop qrstu vw xyz',[('A', 0, 9), ('B', 6, 9), ('C', 25, 31)]))

    Might not give what you expect if two spans start at the same place but you
    haven't defined that.

    Eddie
    Eddie Corns, Nov 17, 2004
    #5
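
To see what Eddie's caveat means in practice, here is a small check with two spans that share a start index. The function is the one from post #5; the input strings and span lists are made up for illustration.

```python
def insert_bracketings(txt, spans):
    text = list(txt)
    for tg, start, end in spans:
        # Prepend the opening tag to the first character of the span ...
        text[start] = '[%s %s' % (tg, text[start])
        # ... and append the closing bracket to its last character.
        text[end - 1] = '%s]' % text[end - 1]
    return ''.join(text)

# Hypothetical spans: A and B both start at 0, B is the longer one.
inner_first = insert_bracketings('abcdef', [('A', 0, 3), ('B', 0, 6)])
outer_first = insert_bracketings('abcdef', [('B', 0, 6), ('A', 0, 3)])
print(inner_first)  # [B [A abc]def]
print(outer_first)  # [A [B abc]def]
```

Both labels always appear, but whichever span is listed later ends up outermost. The labels therefore only match the nesting when the shorter span comes first in `spans`.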
  6. Eddie Corns wrote:
    > def insert_bracketings (txt, spans):
    >     text = list(txt)
    >     for tg,start,end in spans:
    >         text[start] = '[%s %s'%(tg,text[start])
    >         text[end-1] = '%s]'%text[end-1]
    >     return ''.join(text)
    >
    > print(insert_bracketings('abcde fgh ijklmnop qrstu vw xyz',[('A', 0, 9), ('B', 6, 9), ('C', 25, 31)]))
    >
    > Might not give what you expect if two spans start at the same place but you
    > haven't defined that.


    If two spans start in the same place, I need them both to appear, but
    the order is not important, so I believe your code here should work
    fine. Very nice, thank you!

    Steve
    Steven Bethard, Nov 17, 2004
    #6
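
If the order at a shared position ever does matter, one option is to make the tie-breaking explicit. The following is only a sketch and not something proposed in the thread: the name `insert_bracketings_sorted` and both tie-break rules (closing brackets before opening tags, longer spans opened first) are assumptions chosen so that the labels follow the nesting.

```python
def insert_bracketings_sorted(text, spans):
    """Single-pass sketch; the tie-break rules are assumptions, see above."""
    events = []
    for label, start, end in spans:
        # Opening tag: kind 1; among openers at one position, -end puts
        # the span that ends later (the longer one) first.
        events.append((start, 1, -end, '[%s ' % label))
        # Closing bracket: kind 0, so it sorts before any opener at the
        # same position.
        events.append((end, 0, 0, ']'))
    events.sort()

    pieces, last = [], 0
    for pos, _kind, _tie, token in events:
        pieces.append(text[last:pos])  # text since the previous event
        pieces.append(token)
        last = pos
    pieces.append(text[last:])         # text after the final bracket
    return ''.join(pieces)

example = insert_bracketings_sorted(
    'abcde fgh ijklmnop qrstu vw xyz',
    [('A', 0, 9), ('B', 6, 9), ('C', 25, 31)])
print(example)  # [A abcde [B fgh]] ijklmnop qrstu [C vw xyz]
```

With these rules the result does not depend on the order of `spans`, at the cost of a sort over all start and end positions.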
  7. Still not tested, but should do slightly better than my previous version.
    Python 2.4 only:

    from operator import itemgetter

    def insert_bracketings(text, spans):
        inserts = [(s, "[%s " % r) for (r, s, t) in spans]
        inserts.extend((t, "]") for (r, s, t) in spans)
        inserts.sort(key=itemgetter(0), reverse=True)
        text = list(text)
        for (r, s) in inserts:
            text.insert(r, s)
        return "".join(text)

    Apart from cosmetics, this should insert start tags before end tags at the
    same position. Relies on all end tags being equal.

    Peter
    Peter Otten, Nov 18, 2004
    #7
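
A quick check of this second version on the adjacent-span case, as a sketch with a made-up input (the code also runs unchanged on Python 3). `list.sort` stays stable with `reverse=True`, so entries at the same position keep their original order: the start tag is processed first, and the end tag, inserted afterwards at the same index, lands in front of it.

```python
from operator import itemgetter

def insert_bracketings(text, spans):
    inserts = [(s, "[%s " % r) for (r, s, t) in spans]
    inserts.extend((t, "]") for (r, s, t) in spans)
    # Sort on position only; ties keep the list order (starts, then ends).
    inserts.sort(key=itemgetter(0), reverse=True)
    chars = list(text)
    for pos, token in inserts:
        chars.insert(pos, token)
    return "".join(chars)

# Hypothetical input: span A ends at index 3, exactly where span B starts.
adjacent = insert_bracketings('abcdef', [('A', 0, 3), ('B', 3, 6)])
print(adjacent)  # [A abc][B def]
```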