Regular expression problem

M

mukesh tiwari

Hello all
I am trying to solve this problem[1] using regular expression. I wrote thiscode but I am getting time limit exceed. Could some one please tell me howto make this code run faster.

import re

if __name__ == "__main__":
n = int ( raw_input() )
c = 1
while c <= n :
email = filter ( lambda x : x != None , [ re.search ( '[^~!@#$%^&*()<>?,.]*[a-zA-Z0-9][a-zA-Z0-9._][a-zA-Z0-9._][a-zA-Z0-9._][a-zA-Z0-9._][a-zA-Z0-9._]*@[a-zA-Z0-9]+.(com|edu|org|co.in)[^~!@#$%^&*()<>?,.a-zA-Z0-9]*' , x ) for x in raw_input().split(' ') ] )
t = len ( email )
print 'Case #' + str ( c ) + ': ' + str ( t )
for i in xrange ( t ):
print email.group()
c += 1

Also rather than breaking the string at space, I tried to use findall but Iam not getting correct result.
re.findall ( '[^~!@#$%^&*()<>?,.]*[a-zA-Z0-9][a-zA-Z0-9._][a-zA-Z0-9._][a-zA-Z0-9._][a-zA-Z0-9._][a-zA-Z0-9._]*@[a-zA-Z0-9]+.(com|edu|org|co.in)[^~!@#$%^&*()<>?,.a-zA-Z0-9]*[ ]+','(e-mail address removed) (e-mail address removed)')
['com']

I am suppose to get [ (e-mail address removed) , (e-mail address removed)] ?

Regards
Mukesh Tiwari

[1] http://www.spoj.com/problems/MAIN12C/
 
C

Chris Angelico

I am trying to solve this problem[1] using regular expression. I wrote this code but I am getting time limit exceed. Could some one please tell me how to make this code run faster.

What is the time limit? I just tried it (Python 2.6 under Windows) and
it finished in a humanly-immeasurable amount of time. Are you sure
that STDIN (eg raw_input()) is where your test data is coming from?

ChrisA
 
M

mukesh tiwari

Hi Chris
On the problem page, it is 3 second.
What is the time limit? I just tried it (Python 2.6 under Windows) and

it finished in a humanly-immeasurable amount of time. Are you sure

that STDIN (eg raw_input()) is where your test data is coming from?

Yes, on SPOJ we read data from STDIN.

Regards
Mukesh Tiwari
 
M

mukesh tiwari

Hi Chris
On the problem page, it is 3 second.
What is the time limit? I just tried it (Python 2.6 under Windows) and

it finished in a humanly-immeasurable amount of time. Are you sure

that STDIN (eg raw_input()) is where your test data is coming from?

Yes, on SPOJ we read data from STDIN.

Regards
Mukesh Tiwari
 
M

mukesh tiwari

Hi Chris
Thank you! Now I am getting wrong answer so at least program is faster then previous one and I am looking for wrong answer reason. Thanks again!

import re

if __name__ == "__main__":
n = int ( raw_input() )
c = 1
while c <= n :
email = filter ( lambda x : x != None , [ re.search ( '[a-zA-Z0-9][a-zA-Z0-9._]{4,}@[a-zA-Z0-9]+.(com|edu|org|co.in)' , x ) for x in raw_input().split(' ') ] )
t = len ( email )
print 'Case #' + str ( c ) + ': ' + str ( t )
for i in xrange ( t ) :
print email.group()
c += 1

Regards
Mukesh Tiwari
 
M

mukesh tiwari

Hi Chris
Thank you! Now I am getting wrong answer so at least program is faster then previous one and I am looking for wrong answer reason. Thanks again!

import re

if __name__ == "__main__":
n = int ( raw_input() )
c = 1
while c <= n :
email = filter ( lambda x : x != None , [ re.search ( '[a-zA-Z0-9][a-zA-Z0-9._]{4,}@[a-zA-Z0-9]+.(com|edu|org|co.in)' , x ) for x in raw_input().split(' ') ] )
t = len ( email )
print 'Case #' + str ( c ) + ': ' + str ( t )
for i in xrange ( t ) :
print email.group()
c += 1

Regards
Mukesh Tiwari
 
C

Chris Angelico

Hi Chris
Thank you! Now I am getting wrong answer so at least program is faster then previous one and I am looking for wrong answer reason. Thanks again!

Excellent! Have fun.

Incidentally, regular expressions aren't the only way to solve this
sort of problem. If you get stuck with one method, it may be worth
trying another one, to see if you can get around the issue.

As they say, now you have two problems...

ChrisA
 
T

Terry Reedy

Hello all
I am trying to solve this problem[1]
[1] http://www.spoj.com/problems/MAIN12C/

As I remember, and as it still appears, this site severely penalizes
Python solvers by using the same time limit for all languages. Thus, a
'slow' python program may work correctly but the site will not let you
know. A test that refuses to answer is no test at all. In the meanwhile,
an algorithmically equivalent C program will be run and judged correct,
so the programmer can try to speed up while not losing correctness.

By teaching 'speed before correctness", this site promotes bad
programming habits and thinking (and the use of low-level but faster
languages). I quote your later response: "Now I am getting wrong answer
so at least program is faster then previous one". If the previous one
was correct and the revision wrong, you should toss the revision and go
back to the correct program.

I recommend that you work on problems where you have tests that you can
actually run even before you code.
 
J

jmfauth

...
By teaching 'speed before correctness", this site promotes bad
programming habits and thinking (and the use of low-level but faster
languages).
...


This is exactly what "your" flexible string representation
does!

And away from technical aspects, you even succeeded to
somehow lose unicode compliance.

jmf
 
M

Mark Lawrence

This is exactly what "your" flexible string representation
does!

And away from technical aspects, you even succeeded to
somehow lose unicode compliance.

jmf

Please stick to something you know about such as sexual self abuse.
 
R

rusi

This is exactly what "your" flexible string representation
does!

This is an old complaint of your with no new data for supporting it
And away from technical aspects, you even succeeded to
somehow lose unicode compliance.

This is a new complaint.
Just to make it clear:
1. All your recent complaints about unicode were in the realm of
performance
So your complaint that python has lost unicode compliance can mean one
of:
2a. The unicode standard mandates performance criteria
or
2b. There are problems with python's implementation (of strings?) that
have functional problems apart from your old performance complaints
or
2c. You need to look up what 'compliance' means

My own choice is to have a mid-point between
Very early binding: Narrow vs wide builds
Very late binding: String reps can change with the characters as they
are seen
This mid point would be perhaps a commandline switch to choose string-
engine attributes.


However to make this choice even worth a second look we need to have
hard data about performance that you have been unable to provide.

[See the recent thread of RoR vs Django to see the problems of
excessive spurious choice]
 
N

Ned Deily

A friendly reminder that this forum is for general discussion and
questions about Python.

"Pretty much anything Python-related is fair game for discussion, and
the group is even fairly tolerant of off-topic digressions; there have
been entertaining discussions of topics such as floating point, good
software design, and other programming languages such as Lisp and Forth."

But ...

"Rudeness and personal attacks, even in reaction to blatant flamebait,
are strongly frowned upon. People may strongly disagree on an issue, but
usually discussion remains civil. In case of an actual flamebait
posting, you can ignore it, quietly plonk the offending poster in your
killfile or mail filters, or write a sharp but still-polite response,
but at all costs resist the urge to flame back."

http://www.python.org/community/lists/

It's up to all of us to help keep this group/list a place where people
enjoy participating, without fear of gratuitous personal sniping.
Thanks!
 
S

Serhiy Storchaka

Hello all
I am trying to solve this problem[1]
[1] http://www.spoj.com/problems/MAIN12C/

As I remember, and as it still appears, this site severely penalizes
Python solvers by using the same time limit for all languages. Thus, a
'slow' python program may work correctly but the site will not let you
know.

I'm sure the time limits are enough to solve most (if not all) of
problems. Actually all submitted solutions on Python for this problem
run from 0.47 to 0.61 seconds (http://www.spoj.com/ranks/MAIN12C/).
 
T

Terry Reedy

Hello all
I am trying to solve this problem[1]
[1] http://www.spoj.com/problems/MAIN12C/

As I remember, and as it still appears, this site severely penalizes
Python solvers by using the same time limit for all languages. Thus, a
'slow' python program may work correctly but the site will not let you
know.

I'm sure the time limits are enough to solve most (if not all) of
problems. Actually all submitted solutions on Python for this problem
run from 0.47 to 0.61 seconds (http://www.spoj.com/ranks/MAIN12C/).

You do not see the solutions that timed out. I suppose you are pointing
to the fact that for this problem there are solutions close to but under
the time limit. However, algorithm running times are not evenly
distributed. Suppose, for instance) there is a correct O(n**2) solution
and a correct O(n) solution and that the ones listed are the O(n)
solutions. Then the Python O(n**2) solutions could easily take 10x
longer to run and time out, while equivalent C solutions do not.

Mukesh is not the first to post here a reasonable looking solution for
that site that he could not judge because the test quite and refused to
answer. I point out again that he was 'happy' to have a faster but
incorrect program, even though it might have been a regression from his
original.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,769
Messages
2,569,576
Members
45,054
Latest member
LucyCarper

Latest Threads

Top