Login to site with random image code?

Lucas Van Hieng

What I am trying to do is write a Perl script that processes a certain
webpage that requires me to be logged in:

(http://www.starlance.us/MW4/)

They recently redid the site, so now it requires that you enter the
random 5 digit number shown on a png image. (Before there was no such
extra security and I was able to POST with LWP::UserAgent.)

Is there any way at all around these things? I'm guessing that this would be
easier than with what MSN and Yahoo use, which is often a scrambled
mess. This site however uses uniform text (as if just typed into the
image with a text tool and saved.) Is there any way to do some sort of OCR
(char recognition) on the fly?

Thanks for any info on this.
 
Brian McCauley

Lucas Van Hieng said:
What I am trying to do is write a Perl script that processes a certain
webpage that requires me to be logged in:

(http://www.starlance.us/MW4/)

They recently redid the site, so now it requires that you enter the
random 5 digit number shown on a png image.
Is there any way at all around these things?

There was a rather interesting talk on this at YAPC::Europe::2003

Dunno if you can find it online.

--
\\ ( )
. _\\__[oo
.__/ \\ /\@
. l___\\
# ll l\\
###LL LL\\
 
Juha Laiho

Lucas Van Hieng said:
What I am trying to do is write a Perl script that processes a certain
webpage that requires me to be logged in:

(http://www.starlance.us/MW4/)

They recently redid the site, so now it requires that you enter the
random 5 digit number shown on a png image. (Before there was no such
extra security and I was able to POST with LWP::UserAgent.)

Is there any way at all around these things?

Have you considered the social approach -- that is, describe your use
and need to the site admins, and ask whether they can provide another
method for authentication?
 
Lucas Van Hieng

Juha Laiho said:
Have you considered the social approach -- that is, describe your use
and need to the site admins, and ask whether they can provide another
method for authentication?

Yes I have, and they were nice about it but said that their policy would
not permit it. My goal, as before, was merely to make a login
request and obtain the roster for the unit I'm involved with (though not
actually a part of this unit, I am their faithful webmaster/tech :) so
that it can be displayed on their webpage as if it's on their server (it
was something they really liked, even more so since it made our site
completely unique next to the tripod/geocities pages most other units
use.)
 
Darin McBride

Lucas said:
Yes I have, and they were nice about it but said that their policy would
not permit it. My goal, as before, was merely to make a login
request and obtain the roster for the unit I'm involved with (though not
actually a part of this unit, I am their faithful webmaster/tech :) so
that it can be displayed on their webpage as if it's on their server (it
was something they really liked, even more so since it made our site
completely unique next to the tripod/geocities pages most other units
use.)

I suppose the point is that if they're attempting to block scripts for
a reason, then anything your script can do, the scripts they're
attempting to block can do as well. Thus, if you find a way to OCR the
image, they'll simply change to an image type that you can't OCR.

That's not to say it's no fun trying... :->
 
Trent Curry

Darin said:
I suppose the point is that if they're attempting to block scripts for
a reason, then anything your script can do, the scripts they're
attempting to block can do as well. Thus, if you find a way to OCR
the image, they'll simply change to an image type that you can't OCR.

Well, keep in mind that the image format needs to be displayable by any
(visual) web browser, so that really limits the types (png, jpg, and gif
mainly.) My point is there are only so many ways they can go in that respect.

Darin said:
That's not to say it's no fun trying... :->

True :)

--
Trent Curry

perl -e
'($s=qq/e29716770256864702379602c6275605/)=~s!([0-9a-f]{2})!pack("h2",$1
)!eg;print(reverse("$s")."\n");'
 
pkent

Lucas Van Hieng said:
(http://www.starlance.us/MW4/)

They recently redid the site, so now it requires that you enter the
random 5 digit number shown on a png image. (Before there was no such
extra security and I was able to POST with LWP::UserAgent.)

Is there any way at all around these things? I'm guessing that this would be
easier than with what MSN and Yahoo use, which is often a scrambled
mess. This site however uses uniform text (as if just typed into the
image with a text tool and saved.) Is there any way to do some sort of OCR
(char recognition) on the fly?

Funnily enough we're looking at implementing a similar system at work.
Aaanyway...

From looking at it, it does appear that the image uses only 2 colours -
the foreground and the background. There seems to be a 1 pixel gap
between each digit. The code appears to use only the digits 0 to 9. The
font doesn't vary and seems to be a variable-width font. The image seems
to be the same as long as you have the same PHPSESSIONID cookie from
them.

Using this information one approach would be:

a) get the image
b) convert it into some format that you can manipulate from perl - GD
might be of use.
c) scan over all the columns to identify columns that contain all the
same colour - those may be the breaks between digits. Trim the leading
and trailing space too. Maybe trim the space above and below too.
d) extract each rectangular area that is probably a digit.
e) the characters vary in size - use the size of the area to identify
some (maybe all) of the digits.
f) or somehow compare the rectangle's contents to digits you've matched
by hand and then you know what the digit is.
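
For what it's worth, steps c, d and f could be sketched in plain Perl
roughly like this. It is only a sketch under assumptions: the image has
already been decoded (with GD or similar) into rows of text where '#' is
a foreground pixel and '.' is background, and the sub names, the toy
bitmap and the template table are all made up for illustration. It has
not been tried against any real site.

```perl
use strict;
use warnings;

# Step c: find runs of columns that contain at least one foreground
# pixel. Columns that are all background are the gaps between digits.
sub glyph_column_ranges {
    my @rows  = @_;
    my $width = length $rows[0];
    my (@ranges, $start);
    for my $x (0 .. $width - 1) {
        my $has_ink = grep { substr($_, $x, 1) eq '#' } @rows;
        if ($has_ink) {
            $start = $x unless defined $start;
        }
        elsif (defined $start) {
            push @ranges, [ $start, $x - 1 ];
            undef $start;
        }
    }
    push @ranges, [ $start, $width - 1 ] if defined $start;
    return @ranges;
}

# Step d: cut out one glyph and trim blank rows above and below it.
sub extract_glyph {
    my ($range, @rows) = @_;
    my ($from, $to) = @$range;
    my @glyph = map { substr($_, $from, $to - $from + 1) } @rows;
    shift @glyph while @glyph && $glyph[0]  !~ /#/;
    pop   @glyph while @glyph && $glyph[-1] !~ /#/;
    return join "\n", @glyph;
}

# Step f: look each glyph up in a table of hand-labelled examples;
# anything not in the table comes back as '?'.
sub read_digits {
    my ($templates, @rows) = @_;
    return join '', map {
        my $glyph = extract_glyph($_, @rows);
        exists $templates->{$glyph} ? $templates->{$glyph} : '?';
    } glyph_column_ranges(@rows);
}

# Hypothetical hand-labelled glyphs (not taken from any real image).
my %templates = (
    "#\n#\n#"       => '1',
    "###\n.#.\n.#." => '7',
);

# Toy bitmap: a blank row on top, then "1 7 1" with 1 pixel gaps.
my @rows = (
    ".........",
    ".#.###.#.",
    ".#..#..#.",
    ".#..#..#.",
);

print read_digits(\%templates, @rows), "\n";
```

Because the font doesn't vary, an exact string match against the
hand-labelled table is enough here; a noisier image would need some
fuzzier comparison instead.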

I'm not suggesting that you should actually do any of this, because that
may violate their terms of service etc, but it's an interesting problem
to think about. I see from another article that asking the site's admin
didn't help you out - but at least you tried that approach too.

P
 
Trent Curry

pkent said:
Funnily enough we're looking at implementing a similar system at work.
Aaanyway...

From looking at it, it does appear that the image uses only 2 colours -
the foreground and the background. There seems to be a 1 pixel gap
between each digit. The code appears to use only the digits 0 to 9.
The font doesn't vary and seems to be a variable-width font. The
image seems to be the same as long as you have the same PHPSESSIONID
cookie from them.

Why not use that session ID? If it changes each time, perhaps there is a
correlation?

--
Trent Curry

perl -e
'($s=qq/e29716770256864702379602c6275605/)=~s!([0-9a-f]{2})!pack("h2",$1
)!eg;print(reverse("$s")."\n");'
 
Darin McBride

Trent said:
Well, keep in mind that the image format needs to be displayable by any
(visual) web browser, so that really limits the types (png, jpg, and gif
mainly.) My point is there are only so many ways they can go in that respect.

Not quite what I meant. I meant that they could change from using a
font that was easy to OCR (e.g., an image that looks typed), to a
"font" that, perhaps, colour-blind people may not be able to discern,
(I use "font" very loosely here), or perhaps to a wavy pattern that
vaguely looks like words. Depending on how good your OCR software is,
you may or may not be able to programmatically recognise the text.
 
