Reg Expression - Get position of >

M_H

Hey,

I need the position of the last > character.

Let's say I have a string:
mystr = <mimetype="text/html"><content><![CDATA[

I need the position of the "> (the second > sign) so I can cut away the
first part.

The problem is that it can look like "> but also like " > or "     >

But it is definitely the quote followed by the closing bracket.

How do I get the position of the > ?

Hope you can help,
Bacco
 
Chris Rebert

Python 2.6 (r26:66714, Nov 18 2008, 21:48:52)
[GCC 4.0.1 (Apple Inc. build 5484)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> mystr = '<mimetype="text/html"><content><![CDATA['
>>> mystr.rfind('>')
30

Cheers,
Chris
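
A minimal sketch of how the rfind result could be used to cut away the first part, written for Python 3. The variable names pos and rest are only illustrative. It assumes the wanted > really is the last one in the string, which stops being true if the string continues with more markup.

```python
mystr = '<mimetype="text/html"><content><![CDATA['

# rfind returns the index of the last '>' in the string, or -1 if there is none.
pos = mystr.rfind('>')

# Slice off everything up to and including that character.
rest = mystr[pos + 1:] if pos != -1 else mystr

print(pos)   # 30
print(rest)  # <![CDATA[
```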
 
r

Why not just split?
>>> mystr = '<mimetype="text/html"><content><![CDATA['
>>> mystr.split('>', 2)[-1]
'<![CDATA['

You don't want to use an RE for something like this.
 
r

Jorgen Grahn wrote:
Depends on if you have an irrational fear of REs or not ... I agree
that REs are overused for things which are better done with split, but
in this case I think an RE would be clearer.

>>> re.sub('.*>', '', 'dkjk>dj>>>>dd')
'dd'

-- assuming he means what I think he means. The question was almost
impossible to comprehend.

/Jorgen

I think what M_H wanted was to find the second occurrence of the ">"
character in mystr.
Now, if mystr always looks exactly as shown, then Jorgen Grahn's RE
will work fine. But it looks to me like the poster only showed us a
portion of the string, and as you can see, the <mimetype tag is not
closed in mystr, which would break your RE if the string actually
extends further. Split would be foolproof in all situations. But then
again, I had to read the post 5 times before I understood it. It may be
advisable for M_H to repost the question in a clearer manner so that
we can be sure our answers are correct!
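
To make the difference concrete, here is a small comparison sketch in Python 3. The string called longer is a made-up continuation of the sample, and the two helper names are hypothetical. It only illustrates the point about the greedy RE; note that split still counts a > that appears inside the quoted value, so it is not safe for every input either.

```python
import re

short = '<mimetype="text/html"><content><![CDATA['
# Hypothetical longer input: the same prefix followed by more markup.
longer = short + 'some <b>text</b>]]></content>'

def after_last_gt(s):
    # Greedy '.*>' consumes everything up to the LAST '>' on the line.
    return re.sub('.*>', '', s)

def after_second_gt(s):
    # Split on at most two '>' and keep whatever follows the second one.
    return s.split('>', 2)[-1]

for s in (short, longer):
    print(repr(after_last_gt(s)), repr(after_second_gt(s)))
```

On the short sample both helpers return '<![CDATA['. On the longer one the RE version returns an empty string, because the last > is at the very end, while the split version keeps everything after the second >.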
 
M_H

Thanks for all your answers.
R is correct with his assumptions - sorry for the confusion.

So let me post it again, more clearly.

I have the beginning of a (longer) string that looks like:
mystr = '<mimetype="text/html"><content><![CDATA['
or like
mystr = '<mimetype="text/html" ><content><![CDATA['
or like
mystr = '<mimetype="text/html" >
NewLine <content><![CDATA['

I want the end position of the mimetype tag (a position like the one
mystr.find('>') returns, so I can use the number in a loop).
However, I can't just search for '>' because the character > could also
appear inside the mimetype string (I know, not really in a mimetype, but
let's assume it).
That is why the filter has to be bulletproof and check for '">' -
with possible spaces between the two characters.

I don't know yet how to solve this issue - any recommendations?
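
One possible way to express "a double quote, optional whitespace, then >" is a small regular expression. The sketch below is for Python 3. The names TAG_END and mimetype_end are made up for illustration, and the last sample string is a hypothetical input with a > inside the value. It is not bulletproof: a value that itself begins with > (an opening quote directly followed by >) would match too early.

```python
import re

# Closing double quote, optional whitespace (spaces, tabs, newlines), then '>'.
TAG_END = re.compile(r'"\s*>')

def mimetype_end(s):
    """Return the index of the '>' that follows the first '"', or -1 if not found."""
    m = TAG_END.search(s)
    # m.end() points one past the '>', so step back by one.
    return m.end() - 1 if m else -1

samples = [
    '<mimetype="text/html"><content><![CDATA[',
    '<mimetype="text/html" ><content><![CDATA[',
    '<mimetype="text/html" >\nNewLine <content><![CDATA[',
    '<mimetype="a>b"   ><content><![CDATA[',  # hypothetical: '>' inside the value
]

for s in samples:
    pos = mimetype_end(s)
    print(pos, repr(s[pos + 1:]))
```

The returned number can be used the same way as the result of mystr.find('>'), for example mystr[pos + 1:] to drop the first tag.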
 
Chris Rebert

Any particular reason you're not using an HTML parser (e.g. BeautifulSoup) ?

Cheers,
Chris
 
Jorgen Grahn

OK. I am too tired to think it through, but if you need to handle
nesting brackets or escaped brackets (e.g. ignore brackets inside
double-quoted strings) then an RE is not the best solution.

/Jorgen
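
As a rough illustration of the non-RE route, a plain character scan can ignore any > that sits inside a double-quoted string. This is a Python 3 sketch under simplifying assumptions: quotes are never escaped and never nested, and the function name first_unquoted_gt is hypothetical.

```python
def first_unquoted_gt(s, start=0):
    """Index of the first '>' at or after start that is outside double quotes, or -1."""
    in_quotes = False
    for i in range(start, len(s)):
        ch = s[i]
        if ch == '"':
            # Every double quote toggles the "inside a quoted string" state.
            in_quotes = not in_quotes
        elif ch == '>' and not in_quotes:
            return i
    return -1

print(first_unquoted_gt('<mimetype="text/html" ><content>'))  # 22
print(first_unquoted_gt('<mimetype=">a>b"  ><content>'))      # 18
```

Passing the previous result plus one as start lets the same function walk through the following tags in a loop.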
 
