finding items that occur before or after an item in lists

Simon Forman · Jul 4, 2006

I've got a function that I'd like to improve.

It takes a list of lists and a "target" element, and it returns the set
of the items in the lists that appear either before or after the target
item. (Actually, it's a generator, and I use the set class outside of
it to collect the unique items, but you get the idea. ;-) )

data = [
['this', 'string', 'is', 'nice'],
['this', 'string', 'sucks'],
['string', 'not', 'good'],
['what', 'a', 'string']
]

def f(target, ListOfLists):
for N in ListOfLists:
try:
i = N.index(target)
except ValueError:
continue

# item before target
if i: yield N[i - 1]

# item after target
try:
yield N[i + 1]
except IndexError:
pass

print set(n for n in f('string', data))

# Prints set(['this', 'not', 'is', 'sucks', 'a'])

Now, this works and normally I'd be happy with this and not try to
improve it unless I found that it was a bottleneck. However, in this
case I know that when a try..except statement fails (i.e. executes the
except part of the statement) it's "slow", *and* I know that the real
list of lists will have many lists in it in which the target item does
not appear.

That means that the first try..except statement will be executing it's
except clause more than half the time, and probably *much* more than
half the time.

This little problem strikes me as one of those fun puzzles that are a
joy to solve in python, so rather than keep it to myself I thought I'd
post it here on c.l.p and see what folks could come up with.

I'm not so much looking for simple improvements like

def f(target, ListOfLists):
for N in ListOfLists:
if target not in N:
continue
i = N.index(target)
# etc...

but rather cool, elegant, efficient, "pythonic" different ways to do
it.

I love these kinds of puzzles but I know the combined "might" of
comp.lang.python can come up with stuff I would never dream of, so, um,
here you go.

About the input data:
1) there are many lists, several hundred or thousand, perhaps even tens
of thousand
2) most of them will have less than 5 items, and almost all of them
less than 20, I don't know the maximum length but it will be less than
50 or so
3) The target occurs at most once in each list. (*no* item is
duplicated within a single list, all elements of a list are unique to
that list but may recur in other lists)
4) *most* of the lists will not contain the target
5) The data are not static, but they do change less often than this
function will be called. I don't know yet, but I'd say that this
function will be called maybe 10 times more often than the list of
lists will be modified.
6) Changes to the data will all be either adding or removing lists, the
lists themselves won't be modified.
7) The form of the data can be other than a list of lists, i.e. a set
of tuples, a list of tuples, a tuple of lists (although that would
require creating a new tuple when the data are modified, see 6)
whatever.
8) Building auxiliary data structures is fine (like a dict mapping
targets to lists they appear in, or to the sets that the function
returns ("memoizing")) but while speed is the main concern, memory
usage shouldn't be flagrant. Changing the data should be fast, so
rebuilding the auxiliary data structures shouldn't be expensive

Let's see, I guess that's about it. Thanks in advance.

Peace,
~Simon

--
"The history of mankind for the last four centuries is rather like that
of an imprisoned sleeper, stirring clumsily and uneasily while the
prison that restrains and shelters him catches fire, not waking but
incorporating the crackling and warmth of the fire with ancient and
incongruous dreams, than like that of a man consciously awake to danger
and opportunity." --H. P. Wells, "A Short History of the World"

Simon Forman · Jul 4, 2006

Simon said:
I've got a function that I'd like to improve.

It takes a list of lists and a "target" element, and it returns the set
of the items in the lists that appear either before or after the target
item. (Actually, it's a generator, and I use the set class outside of
it to collect the unique items, but you get the idea. ;-) )

data = [
['this', 'string', 'is', 'nice'],
['this', 'string', 'sucks'],
['string', 'not', 'good'],
['what', 'a', 'string']
]

def f(target, ListOfLists):
for N in ListOfLists:
try:
i = N.index(target)
except ValueError:
continue

# item before target
if i: yield N[i - 1]

# item after target
try:
yield N[i + 1]
except IndexError:
pass

print set(n for n in f('string', data))

# Prints set(['this', 'not', 'is', 'sucks', 'a'])

Now, this works and normally I'd be happy with this and not try to
improve it unless I found that it was a bottleneck. However, in this
case I know that when a try..except statement fails (i.e. executes the
except part of the statement) it's "slow", *and* I know that the real
list of lists will have many lists in it in which the target item does
not appear.

That means that the first try..except statement will be executing it's
except clause more than half the time, and probably *much* more than
half the time.

Nevermind, it turns out that each list in the list of lists will
*always* contain the target item, making this task much less
interesting.

~Simon

Changing string value that is an element of a list	3	Feb 10, 2025
finding items that occur more than once in a list	39	Mar 18, 2008
Idioms combining 'next(items)' and 'for item in items:'	1	Sep 10, 2011
How to fetch and console.log all items from an associative array	3	Apr 17, 2024
How can I guarantee that the all callback functions of the first Ajax API call have finished executing before initiating the 2 call in JavaScript?	2	Oct 30, 2023
Need help finding Segmentation fault C++	0	Apr 16, 2022
sharing/swapping items between lists	21	Apr 10, 2009
Telethon Authorization. All sessions are reset after authorization	0	Aug 21, 2025

finding items that occur before or after an item in lists

Simon Forman

Simon Forman

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads