Separating elements from a list according to preceding element

Rob Cowie · Mar 5, 2006

I'm having a bit of trouble with this so any help would be gratefully
received...

After splitting up a url I have a string of the form
'tag1+tag2+tag3-tag4', or '-tag1-tag2' etc. The first tag will only be
preceded by an operator if it is a '-'; if it is preceded by nothing,
'+' is to be assumed.

Using re.split, I can generate a list that looks thus:
['tag1', '+', 'tag2', '+', 'tag3', '-', 'tag4']
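For reference, a list of this shape can be produced by splitting on a captured operator group. This is only a minimal Python 3 sketch: the pattern `([+-])` is an assumption, since the post does not show the expression actually used.

```python
import re

# Assumed pattern: the capturing group makes re.split keep the operators.
OPERATOR = re.compile(r'([+-])')

tokens = OPERATOR.split('tag1+tag2+tag3-tag4')
print(tokens)   # ['tag1', '+', 'tag2', '+', 'tag3', '-', 'tag4']

# With a leading operator, re.split emits an empty string before it.
leading = OPERATOR.split('-tag1-tag2')
print(leading)  # ['', '-', 'tag1', '-', 'tag2']
```

Note the empty first element in the second case; any code that pairs operators with tags has to allow for it.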

I wish to derive two lists - each containing either tags to be
included, or tags to be excluded. My idea was to take an element,
examine what element precedes it and accordingly, insert it into the
relevant list. However, I have not been successful.

Is there a better way that I have not considered? If this method is
suitable, how might I implement it?

Thanks all,

Rob Cowie

Gerard Flanagan · Mar 5, 2006

a = [ '+', 'tag1', '+', 'tag2', '-', 'tag3', '+', 'tag4' ]

import itertools

b = list(itertools.islice(a,0,8,2))
c = list(itertools.islice(a,1,8,2))

result1 = [x[1] for x in itertools.izip(b,c) if x[0] == '+']
result2 = [x[1] for x in itertools.izip(b,c) if x[0] == '-']

print
print result1
print result2

Gerard

Ben Cartwright · Mar 5, 2006

Rob said:
I wish to derive two lists - each containing either tags to be
included, or tags to be excluded. My idea was to take an element,
examine what element precedes it and accordingly, insert it into the
relevant list. However, I have not been successful.

Is there a better way that I have not considered?

Maybe. You could write a couple regexes, one to find the included
tags, and one for the excluded, then run re.findall on them both.
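The two-regex idea could be sketched roughly as follows in Python 3. The patterns and the name `split_by_regex` are illustrative assumptions, and tags are assumed to consist of word characters only.

```python
import re

# Assumption: a tag is a run of word characters (\w+).
INCLUDED = re.compile(r'(?:^|\+)(\w+)')  # tag at the very start, or after '+'
EXCLUDED = re.compile(r'-(\w+)')         # tag after '-'

def split_by_regex(source):
    """Return (included, excluded) tag lists for a string like 'a+b-c'."""
    return INCLUDED.findall(source), EXCLUDED.findall(source)

print(split_by_regex('tag1+tag2+tag3-tag4'))
print(split_by_regex('-tag1-tag2'))
```

This works on the original string directly, so the intermediate re.split list is not needed.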

But there's nothing fundamentally wrong with your method.

Rob said:
If this method is suitable, how might I implement it?

tags = ['tag1', '+', 'tag2', '+', 'tag3', '-', 'tag4']

include, exclude = [], []
op = '+'
for cur in tags:
    if cur in '+-':
        op = cur
    else:
        if op == '+':
            include.append(cur)
        else:
            exclude.append(cur)

--Ben

Gerard Flanagan · Mar 5, 2006

Gerard said:
a = [ '+', 'tag1', '+', 'tag2', '-', 'tag3', '+', 'tag4' ]

import itertools

b = list(itertools.islice(a,0,8,2))
c = list(itertools.islice(a,1,8,2))

result1 = [x[1] for x in itertools.izip(b,c) if x[0] == '+']
result2 = [x[1] for x in itertools.izip(b,c) if x[0] == '-']

print
print result1
print result2

Gerard

'8' is the length of 'a' (len(a))

James Stroud · Mar 5, 2006

Unclever way:

alist = ['tag1', '+', 'tag2', '+', 'tag3', '-', 'tag4']
include, disinclude = [], []
aniter = iter(alist)
if len(alist) % 2:
    include.append(aniter.next())
for asign in aniter:
    if asign == '+':
        include.append(aniter.next())
    else:
        disinclude.append(aniter.next())

A cleverer way will probably use list comprehension and logic shortcutting.
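One possible shape for such a version, written for Python 3, is sketched below. It uses list comprehensions but no short-circuit tricks; the function name `split_tags` is hypothetical.

```python
def split_tags(tokens):
    """Split an operator/tag token list into (include, exclude) lists."""
    # An odd length means the first operator was left out, so '+' is assumed.
    if len(tokens) % 2:
        tokens = ['+'] + tokens
    # Pair each operator (even index) with the tag that follows it (odd index).
    pairs = list(zip(tokens[::2], tokens[1::2]))
    include = [tag for op, tag in pairs if op == '+']
    exclude = [tag for op, tag in pairs if op == '-']
    return include, exclude

print(split_tags(['tag1', '+', 'tag2', '+', 'tag3', '-', 'tag4']))
print(split_tags(['-', 'tag1', '-', 'tag2']))
```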

James

James Stroud · Mar 5, 2006

Gerard said:
a = [ '+', 'tag1', '+', 'tag2', '-', 'tag3', '+', 'tag4' ]

import itertools

b = list(itertools.islice(a,0,8,2))
c = list(itertools.islice(a,1,8,2))

result1 = [x[1] for x in itertools.izip(b,c) if x[0] == '+']
result2 = [x[1] for x in itertools.izip(b,c) if x[0] == '-']

print
print result1
print result2

Gerard

Unfortunately this does not address the complete specification:

>>> a = [ 'tag1', '+', 'tag2', '-', 'tag3', '+', 'tag4' ]
>>>
>>> import itertools
>>>
>>> b = list(itertools.islice(a,0,len(a),2))
>>> c = list(itertools.islice(a,1,len(a),2))
>>>
>>> result1 = [x[1] for x in itertools.izip(b,c) if x[0] == '+']
>>> result2 = [x[1] for x in itertools.izip(b,c) if x[0] == '-']
>>>
>>> print

>>> print result1
[]
>>> print result2
[]

Need to check for the absence of that first op.

James

James Stroud · Mar 5, 2006

Bruno said:
If you're responsible for the original URL, you may consider rewriting
it this way:
scheme://domain.tld/resource?tag1=1&tag2=1&tag3=1&tag4=0

Else - and after you've finished cursing the guy that came out with such
an innovative way to use url parameters - I think the first thing to do
would be to fix the implicit-first-operator-mess, so you have something
consistent:

if the_list[0] != "-":
    the_list.insert(0, "+")

Then a possible solution could be:

todo = {'+' : [], '-' : []}
for op, tag in zip(the_list[::2], the_list[1::2]):
    todo[op].append(tag)

But there's surely something better...

Fabulous. Here is a fix:

the_list = ['+'] * (len(the_list) % 2) + the_list
todo = {'+' : [], '-' : []}
for op, tag in zip(the_list[::2], the_list[1::2]):
    todo[op].append(tag)
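
One caveat, assuming the list comes straight from re.split with a capturing group (the thread does not show the exact call): a string that starts with '-' splits into a list whose first element is an empty string. The length is then odd, and the parity test would prepend '+' in front of that empty string. A hedged Python 3 sketch that drops empty strings first; `normalise` and `partition` are hypothetical names:

```python
import re

def normalise(tokens):
    """Drop empty strings, then make the leading operator explicit."""
    tokens = [t for t in tokens if t]   # re.split can emit a leading ''
    if len(tokens) % 2:                 # odd length: first operator is implied
        tokens = ['+'] + tokens
    return tokens

def partition(tokens):
    """Group tags by the operator that precedes them."""
    tokens = normalise(tokens)
    todo = {'+': [], '-': []}
    for op, tag in zip(tokens[::2], tokens[1::2]):
        todo[op].append(tag)
    return todo

# Assumed split pattern; yields ['', '-', 'tag1', '-', 'tag2'].
raw = re.split(r'([+-])', '-tag1-tag2')
print(partition(raw))
```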

Alex Martelli · Mar 5, 2006

Gerard Flanagan said:
a = [ '+', 'tag1', '+', 'tag2', '-', 'tag3', '+', 'tag4' ]

import itertools

b = list(itertools.islice(a,0,8,2))
c = list(itertools.islice(a,1,8,2))

Much as I love itertools, this specific task would be best expressed as

b = a[::2]
c = a[1::2]

Do note that you really don't need the 'list(...)' here, for the
following use:

result1 = [x[1] for x in itertools.izip(b,c) if x[0] == '+']
result2 = [x[1] for x in itertools.izip(b,c) if x[0] == '-']

...would be just as good if b and c were islice objects rather than
lists, except for the issue of _repeating_ (izipping twice). I'd rather
do some variant of a single-loop such as:

results = {'+':[], '-':[]}
for operator, tag in itertools.izip(a[::2], a[1::2]):
    results[operator].append(tag)

and use results['+'] and results['-'] thereafter.

These approaches do not consider the inconvenient fact that the leading
'+' does in fact not appear in list a -- it needs to be assumed, the OP
stated; only a '-' would instead appear explicitly. Little for it but
specialcasing depending on whether a[0]=='-', I think -- e.g. in the
above 3-line snippet of mine, insert right after the first line:

if a[0]!='-': results['+'].append(a.pop(0))
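
Put together, the single loop plus the special case might look like this in Python 3, where the built-in zip stands in for itertools.izip. The function name `collect` is hypothetical, and the input is copied so that pop(0) does not modify the caller's list, which is an addition not present in the post.

```python
def collect(tokens):
    """Group tags under '+' or '-' using a single pass over the pairs."""
    a = list(tokens)  # work on a copy; pop(0) below would mutate the input
    results = {'+': [], '-': []}
    # Assumes a non-empty list in which a leading '+' is never written out.
    if a[0] != '-':
        results['+'].append(a.pop(0))
    for operator, tag in zip(a[::2], a[1::2]):
        results[operator].append(tag)
    return results

print(collect(['tag1', '+', 'tag2', '+', 'tag3', '-', 'tag4']))
print(collect(['-', 'tag1', '-', 'tag2']))
```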

Alex

Gerard Flanagan · Mar 5, 2006

James said:
Unfortunately this does not address the complete specification:

[...]

Need to check for the absence of that first op.

James

Yes, should have stuck to the spec.

Gerard

Gerard Flanagan · Mar 5, 2006

Alex said:
Gerard Flanagan said:

a = [ '+', 'tag1', '+', 'tag2', '-', 'tag3', '+', 'tag4' ]

import itertools

b = list(itertools.islice(a,0,8,2))
c = list(itertools.islice(a,1,8,2))

Much as I love itertools, this specific task would be best expressed as

b = a[::2]
c = a[1::2]

Yes, I thought that when I saw bruno's solution - I can't say that I've
never seen that syntax before, but I never really understood that this
is what it did.

Alex said:
Do note that you really don't need the 'list(...)' here, for the
following use:

result1 = [x[1] for x in itertools.izip(b,c) if x[0] == '+']
result2 = [x[1] for x in itertools.izip(b,c) if x[0] == '-']

...would be just as good if b and c were islice objects rather than
lists, except for the issue of _repeating_ (izipping twice).

I couldn't get it to work without the 'list(...)'; it seems you have
to 'rewind' the islice, e.g. this works:

b = itertools.islice(a,0,8,2)
c = itertools.islice(a,1,8,2)

result1 = [x[1] for x in itertools.izip(b,c) if x[0] == '+']

b = itertools.islice(a,0,8,2)
c = itertools.islice(a,1,8,2)

result2 = [x[1] for x in itertools.izip(b,c) if x[0] == '-']

but not without that 're-assignment' of b and c.
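
The reason is that an islice object is a one-shot iterator: once consumed it stays empty, so the second comprehension has nothing left to read. A small Python 3 sketch of the effect, with the built-in zip standing in for itertools.izip (variable names are illustrative):

```python
import itertools

a = ['+', 'tag1', '+', 'tag2', '-', 'tag3', '+', 'tag4']

# An islice can be walked only once.
ops = itertools.islice(a, 0, None, 2)
first_pass = list(ops)
second_pass = list(ops)   # the iterator is already exhausted
print(first_pass)         # ['+', '+', '-', '+']
print(second_pass)        # []

# Slices are real lists, so they can be reused as often as needed.
b, c = a[::2], a[1::2]
result1 = [tag for op, tag in zip(b, c) if op == '+']
result2 = [tag for op, tag in zip(b, c) if op == '-']
print(result1)            # ['tag1', 'tag2', 'tag4']
print(result2)            # ['tag3']
```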

Alex said:
I'd rather do some variant of a single-loop such as:

results = {'+':[], '-':[]}
for operator, tag in itertools.izip(a[::2], a[1::2]):
    results[operator].append(tag)

and use results['+'] and results['-'] thereafter.

These approaches do not consider the inconvenient fact that the leading
'+' does in fact not appear in list a -- it needs to be assumed, the OP
stated; only a '-' would instead appear explicitly. Little for it but
specialcasing depending on whether a[0]=='-', I think -- e.g. in the
above 3-line snippet of mine, insert right after the first line:

if a[0]!='-': results['+'].append(a.pop(0))

Alex

Cheers

Gerard

Paul McGuire · Mar 5, 2006

Here's how this would look with pyparsing (download pyparsing at
http://pyparsing.sourceforge.net ):

data = 'tag1+tag2+tag3-tag4'

from pyparsing import *
tag = Word(alphas,alphanums)
incl = Literal("+").suppress()
excl = Literal("-").suppress()

inclQual = Optional(incl) + tag
exclQual = excl + tag
qualDef = OneOrMore(
    inclQual.setResultsName("includes", listAllMatches=True) |
    exclQual.setResultsName("excludes", listAllMatches=True))

quals = qualDef.parseString(data)
print quals.includes
print quals.excludes

Prints out:

[['tag1'], ['tag2'], ['tag3']]
[['tag4']]

-- Paul

Bruno Desthuilliers · Mar 5, 2006

If you're responsible for the original URL, you may consider rewriting
it this way:
scheme://domain.tld/resource?tag1=1&tag2=1&tag3=1&tag4=0
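
With a query string of that shape, the standard library can do the parsing. A minimal Python 3 sketch follows; the `http` scheme is substituted for the placeholder `scheme` purely for illustration, and the meaning of the `1`/`0` flags (include/exclude) is taken from the example URL.

```python
from urllib.parse import urlsplit, parse_qsl

# Illustrative URL; 'http' replaces the placeholder scheme.
url = 'http://domain.tld/resource?tag1=1&tag2=1&tag3=1&tag4=0'

# parse_qsl returns (name, value) pairs in the order they appear.
pairs = parse_qsl(urlsplit(url).query)
include = [tag for tag, flag in pairs if flag == '1']
exclude = [tag for tag, flag in pairs if flag == '0']

print(include)  # ['tag1', 'tag2', 'tag3']
print(exclude)  # ['tag4']
```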

Else - and after you've finished cursing the guy that came out with such
an innovative way to use url parameters - I think the first thing to do
would be to fix the implicit-first-operator-mess, so you have something
consistent:

if the_list[0] != "-":
    the_list.insert(0, "+")

Then a possible solution could be:

todo = {'+' : [], '-' : []}
for op, tag in zip(the_list[::2], the_list[1::2]):
    todo[op].append(tag)

But there's surely something better...

Michael Spencer · Mar 5, 2006

Since you're already using a regexp, why not modify it to group the operators
with their tags?

>>> import re
>>> source = "tag1+tag2+tag3-tag4"
>>> tagfinder = re.compile(r"([+-]?)(\w+)")
>>> include = []
>>> exclude = []
>>> for op, tag in tagfinder.findall(source):
...     if op == "-":
...         exclude.append(tag)
...     else:
...         include.append(tag)
...
>>> include
['tag1', 'tag2', 'tag3']
>>> exclude
['tag4']

(Example assumes that a tag can be matched by \w+ and that there
is no space between the operators and their tags)

Michael

Bruno Desthuilliers · Mar 6, 2006

Gerard Flanagan wrote:

Alex said:
Much as I love itertools, this specific task would be best expressed as

b = a[::2]
c = a[1::2]

Yes, I thought that when I saw bruno's solution - I can't say that I've
never seen that syntax before, but I never really understood that this
is what it did.

It's in fact pretty simple. The full slice syntax is [start:end:step],
with default values of start=0, end=len(seq), step=1. So a[::2] will
retrieve a[0], a[2], a[4] etc, and a[1::2] -> a[1], a[3], a[5] etc.
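
A quick illustration of those defaults in Python 3, using the list from earlier in the thread (the variable names `evens` and `odds` are just for this example):

```python
a = ['+', 'tag1', '+', 'tag2', '-', 'tag3', '+', 'tag4']

evens = a[::2]    # start=0, end=len(a), step=2
odds = a[1::2]    # start=1, end=len(a), step=2

print(evens)  # ['+', '+', '-', '+']
print(odds)   # ['tag1', 'tag2', 'tag3', 'tag4']

# Spelling out the defaults gives the same result.
print(a[::2] == a[0:len(a):2])  # True
```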

(snip)

Rob Cowie · Mar 6, 2006

Thanks everyone. I assumed there was something I had not considered...
list slicing is that thing.

The pyparsing example looks interesting - but for this case, a little
too heavy. It doesn't really warrant including a third-party module.

Rob C
