Extract the numeric and alphabetic part from an alphanumeric string

Sandhya Prabhakaran · Aug 3, 2009

Hi,

I have a string as str='123ACTGAAC'.

I need to extract the numeric part from the alphabetic part which I
did using
123

To get the alphabetic part, I could do
ACTGAAC
But when I give
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: expected a character buffer object

How do I blank out the initial numeric part so as to get just the
alphabetic part? The string is always in the same format.

Please help.

Regards,
Sandhya

Peter Brett · Aug 3, 2009

Sandhya Prabhakaran said:
Hi,

I have a string as str='123ACTGAAC'.

I need to extract the numeric part from the alphabetic part which I
did using
123

To get the alphabetic part, I could do
ACTGAAC
But when I give
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: expected a character buffer object

How do I blank out the initial numeric part so as to get just the
alphabetic part. The string is always in the same format.

Firstly, you really should read the Regular Expression HOWTO:

http://docs.python.org/howto/regex.html#regex-howto

Secondly, is this what you wanted to do?
'ACTGAAC'
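
The snippet that produced this output is missing from this copy of the thread. As a minimal sketch only, and assuming a regular-expression substitution was intended, `re.sub` can strip the leading digits. The variable name `sample` is a placeholder chosen here so that the built-in `str` is not shadowed.

```python
import re

sample = '123ACTGAAC'  # placeholder name, used instead of 'str'

# Replace the run of digits at the start of the string with nothing.
alpha = re.sub(r'^\d+', '', sample)
print(alpha)  # ACTGAAC
```

The `^` anchor limits the substitution to digits at the very start, so any digits later in the string would be left alone.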

Regards,

Peter

Andreas Tawn · Aug 3, 2009

Hi,

I have a string as str='123ACTGAAC'.

I need to extract the numeric part from the alphabetic part which I
did using
123

To get the alphabetic part, I could do
ACTGAAC
But when I give
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: expected a character buffer object

How do I blank out the initial numeric part so as to get just the
alphabetic part. The string is always in the same format.

Please help.

Regards,
Sandhya

If the format's always the same, you could use slicing instead.

>>> s = '123ACTGAAC'
>>> s[:3]
'123'
>>> s[3:]
'ACTGAAC'

BTW, you should avoid using built-ins like str for variable names. Bad
things will happen.
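
One example of the kind of bad thing that can happen, as a small sketch: once the name `str` is bound to a string, calling `str(...)` in the same scope fails. The function name `shadow_demo` is made up for this illustration, and the shadowing is kept local to it so the built-in stays usable elsewhere.

```python
def shadow_demo():
    str = '123ACTGAAC'      # local name 'str' now hides the built-in type
    try:
        return str(42)      # this tries to call a string object
    except TypeError as exc:
        return 'TypeError: %s' % exc

result = shadow_demo()
print(result)
```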

Cheers,

Drea

MRAB · Aug 3, 2009

Sandhya said:
Hi,

I have a string as str='123ACTGAAC'.

I need to extract the numeric part from the alphabetic part which I
did using
123

[snip]

I get:

['123']

which is a _list_ of the strings found.

Kushal Kumaran · Aug 3, 2009

Hi,

I have a string as str='123ACTGAAC'.

I need to extract the numeric part from the alphabetic part which I
did using
123

The docs for re.findall say that it returns a list of matches. So
'123' will be numer[0].

To get the alphabetic part, I could do
ACTGAAC
But when I give
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: expected a character buffer object

That's what would happen if you pass in a list instead of a string to replace.
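
To make the two points above concrete, here is a short sketch, assuming the original call was `str.replace` with the result of `re.findall` as its first argument (the name `sample` is a placeholder). The exact wording of the `TypeError` message differs between Python versions, so the sketch only records the exception.

```python
import re

sample = '123ACTGAAC'
numer = re.findall(r'\d+', sample)   # a list of matches: ['123']

error = None
try:
    sample.replace(numer, '')        # wrong: the first argument is a list
except TypeError as exc:
    error = exc                      # message text depends on the Python version

# Passing the matched string itself works.
alpha = sample.replace(numer[0], '')
print(numer, alpha)
```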

Dennis Lee Bieber · Aug 3, 2009

Hi,

I have a string as str='123ACTGAAC'.

I need to extract the numeric part from the alphabetic part which I
did using
123

<snip>

Did you really cut&paste that from an interpreter window? I doubt
it...

>>> str = "123ACTGAAC"
>>> import re
>>> numer = re.findall(r"\d+", str)
>>> numer
['123']

Compare... YOU claim to have gotten an INTEGER (there are no quotes
around the output value). I get a one element LIST containing a STRING
value.

>>> numer[0]
'123'
>>> int(numer[0])
123

How do I blank out the initial numeric part so as to get just the
alphabetic part. The string is always in the same format.

And that format is?

Given just your example, one could interpret it to be: 3 digits
followed by 7 alphabetic characters. For that, I'd be using a simple

nmr = str[:3] #still in character representation
str = str[3:]

Or do you mean a variable width integer field followed by a variable
width alpha field?

>>> str2 = "4328ABcde"
>>> num2 = re.findall(r"\d+", str2)
>>> num2
['4328']
>>> str[len(numer[0]):]
'ACTGAAC'
>>> str2[len(num2[0]):]
'ABcde'
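
For the variable-width case, the same idea can be wrapped in a small helper that returns both parts from a single match. This is only a sketch: the function name `split_leading_digits` is made up here, and it assumes the input is one or more digits followed by one or more ASCII letters and nothing else.

```python
import re

def split_leading_digits(text):
    """Return (digits, letters) for input such as '123ACTGAAC'."""
    match = re.match(r'(\d+)([A-Za-z]+)$', text)
    if match is None:
        raise ValueError('expected digits followed by letters: %r' % text)
    return match.groups()

print(split_leading_digits('123ACTGAAC'))
print(split_leading_digits('4328ABcde'))
```

Raising `ValueError` on unexpected input is a design choice for the sketch; returning `None` would work equally well if the caller prefers to check.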

--
Wulfraed Dennis Lee Bieber KD6MOG
HTTP://wlfraed.home.netcom.com/
HTTP://www.bestiaria.com/

alex23 · Aug 3, 2009

Sandhya Prabhakaran said:
I have a string as str='123ACTGAAC'.

You shouldn't use 'str' as a label like that, it prevents you from
using the str() function in the same body of code.

How do I blank out the initial numeric part so as to get just the
alphabetic part. The string is always in the same format.

('123', 'ACTGAAC')

If by 'always in the same format' you mean the positions of the
numbers & alphas, you could slightly abuse the struct module:
('123', 'ACTGAAC')
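
The struct snippet itself is missing from this copy of the thread. A sketch of what such a call might look like, assuming fixed widths of 3 digits and 7 letters and a format string of `'3s7s'` (a guess, not recovered from the original post):

```python
import struct

sample = '123ACTGAAC'

# '3s7s' describes a 3-byte string followed by a 7-byte string.
# On Python 3, struct works on bytes, so encode first and decode each field.
fields = struct.unpack('3s7s', sample.encode('ascii'))
num, alp = (field.decode('ascii') for field in fields)
print((num, alp))
```

`struct.unpack` raises `struct.error` if the input is not exactly 10 bytes long, which is one reason plain slicing is the more forgiving choice.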

But seriously, you should use slicing:

>>> sample = '123ACTGAAC'
>>> sample[0:3], sample[3:]
('123', 'ACTGAAC')

You can also label the slices, which can be handy for self-documenting
your code:

>>> num = slice(3)
>>> alp = slice(3, 10)
>>> sample[num], sample[alp]
('123', 'ACTGAAC')
