HELP! Again. I seem to have some padding I can't get rid of and a PHP problem.

Mike Barnard

I'm no PHP wizard, nor do I know much about regexp, but for those who
are, this is the part in the script that seems to check the email
address:

// Check the email address entered matches the standard email address format
if (!eregi("^[A-Z0-9._%-]+@[A-Z0-9._%-]+\.[A-Z]{2,6}$", $email)) {
    echo "<p>It appears you entered an invalid email address</p><p><a href='javascript: history.go(-1)'>Click here to go back</a>.</p>";
}

I'm not only not a wizard, I'm not even a sorcerer's apprentice. But it
seems to me this regexp is overly sensitive to case. (It looks to me as
if it doesn't like lowercase characters in the address.)

Mike: what happens when you capitalize the entire e-mail address?
Or: what happens when you steal a different e-mail validity regex from
somewhere else on the Web?

I must admit I haven't tried different cases. I started with a known
working address. It failed, so I removed a . that was in the first
part. It still failed. I looked at the code, saw the eregi bit above
and shrugged my shoulders. Without taking days of PHP lessons I have
no idea of the syntax involved. What characters are a part of the
equation, which are part of the data? I have looked up eregi, but I'm
no better off.

Thanks anyway.
 
Mike Barnard

Neredbojias said:
if (!eregi("^[A-Z0-9._%-]+@[A-Z0-9._%-]+\.[A-Z]{2,6}$", $email)) {
echo "<p>It appears you entered an invalid email address</p><p><a
href='javascript: history.go(-1)'>Click here to go back</a>.</p>";
}

I'm not only not a wizard, I'm not even a sorcerer's apprentice. But it
seems to me this regexp is overly sensitive to case. (It looks to me as
if it doesn't like lowercase characters in the address.)

Er, the "i" in "eregi" indicates case-insensitivity. (Sorry to be pedantic
but it's hard to get one-up on you.)

Also, I "studied" the regex and could find nothing definitely wrong
although not sure about the use of the circumflex there. And I'm far from
an "expert", too.

circumflex means 'at the start', just like $ means 'at the end'
(although I'm sure the way I just described that would invoke a
correction from a Jukka like PHP person ;-))
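
To see what the two anchors do in practice, here is a minimal sketch. It uses preg_match (the PCRE counterpart of eregi, swapped in because that is what current PHP versions provide) with the same character classes as the script's pattern. The constant names, the helper looks_like_email() and the sample strings are made up for illustration.

```php
<?php
// Same character classes as the script's pattern. The /i modifier plays the
// role of the "i" in eregi (case-insensitive matching).
const ANCHORED   = '/^[A-Z0-9._%-]+@[A-Z0-9._%-]+\.[A-Z]{2,6}$/i';
// Identical, but without ^ and $, so it may match anywhere inside the input.
const UNANCHORED = '/[A-Z0-9._%-]+@[A-Z0-9._%-]+\.[A-Z]{2,6}/i';

// Hypothetical helper: true when the pattern matches the input.
function looks_like_email(string $pattern, string $input): bool
{
    return preg_match($pattern, $input) === 1;
}

$junk = '!!NOT AN EMAIL someone@example.com';

var_dump(looks_like_email(ANCHORED, 'someone@example.com')); // bool(true)
var_dump(looks_like_email(ANCHORED, $junk));                 // bool(false)
var_dump(looks_like_email(UNANCHORED, $junk));               // bool(true)
```

With the anchors the whole input has to fit the pattern; without them it is enough that something address-shaped appears somewhere in it.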

Here's one from another script I sometimes use:

if (!ereg("^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,6})$", strtolower($c['email']))) {
    echo "That is not a valid e-mail address.";
}

No idea why they didn't just use eregi and got rid of strtolower
though...

<pants> got a pointer to it please? I have now looked at a few and
they're all old, table designed non workers.

Thanks.
 
Els

Mike said:
Here's one from another script I sometimes use:
[snip]
<pants> got a pointer to it please? I have now looked at a few and
they're all old, table designed non workers.

Sure - but it doesn't claim to be a non-spam thing. That is, I don't
think it can be used to send spam elsewhere, but they can still use it
to spam you :)

It's not my own script, just in case you were thinking that - I found
it online, and it was the one that looked best for my purposes.

Instruction is here:
http://www.jemjabella.co.uk/stuff/mail_form.php
and on that page is a link to a free download too:
http://www.jemjabella.co.uk/stuff/mail_form.txt

Since I'm no expert myself though, I think it would be good if the
more savvy PHP scripters in the group could give it a quick look to
check for obvious problems.
 
mynameisnobodyodyssea

> got a pointer to it please? I have now looked at a few and
they're all old, table designed non workers.

Hi Mike,
Can you ask your web hosting provider if they recommend
a contact form script?

It seems to me that the script you are using should work,
maybe download it again and try again.
Did you edit the value of $sendto to your email address, like
$sendto = "(e-mail address removed)"
 
Toby A Inkster

Els said:
Toby said:
There are six characters between 'Z' and 'a': [\]^_`

Thanks - I had no idea.. :)

Yes, everyone always leaves them out of the alphabet song.

But they're in most good ASCII tables.
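
For anyone without an ASCII table to hand, a small sketch that lists those six characters, plus what they mean for a character range. The variable name $between is just illustrative.

```php
<?php
// Collect every ASCII character whose code lies strictly between 'Z' (90)
// and 'a' (97).
$between = implode('', array_map('chr', range(ord('Z') + 1, ord('a') - 1)));
echo $between, "\n"; // prints [\]^_`

// Consequence for patterns: a range written as A-z also admits those six,
// while A-Za-z does not.
var_dump(preg_match('/^[A-z]$/', '_'));    // int(1)
var_dump(preg_match('/^[A-Za-z]$/', '_')); // int(0)
```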

--
Toby A Inkster BSc (Hons) ARCS
[Geek of HTML/SQL/Perl/PHP/Python/Apache/Linux]
[OS: Linux 2.6.17.14-mm-desktop-9mdvsmp, up 1 day, 1:00.]

Best... News... Story... Ever!
http://tobyinkster.co.uk/blog/2008/03/23/hypnotist/
 
Toby A Inkster

Els said:
http://www.jemjabella.co.uk/stuff/mail_form.txt

Since I'm no expert myself though, I think it would be good if the more
savvy PHP scripters in the group could give it a quick look to check for
obvious problems.

It's a lot better than most of these free e-mail scripts I've seen around.
Some comments though:

1. POSTed fields are checked for "BCC:" and "CC:" exploits.
However, the HTTP User-Agent header is included in the
e-mail without checking. It's quite obscure, but an exploit
could theoretically be included there.

2. The 15 character name limit seems arbitrary and would
probably annoy people. Easy enough to change though.

3. The regexp for valid e-mail addresses is overly restrictive.
For example, it disallows more than one "." before the "@"
sign, it won't cope very well with non-ASCII domain names
(which are starting to become more prevalent) and it disallows
apostrophes, equals signs and plus signs before the "@", all
of which are allowed and wouldn't cause any harm to let through.

4. Line endings in e-mail should really be "\r\n", not "\n", though
most mail servers are smart enough to correct this for you.
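
Points 1 and 4 could be handled along these lines. This is only a sketch, not code from the script being reviewed: safe_header_value() and build_headers() are hypothetical helpers, and the header names are examples.

```php
<?php
// Hypothetical helper: refuse any value containing CR or LF before it goes
// into a mail header, whatever its source (form field, User-Agent, ...).
function safe_header_value(string $value): string
{
    if (preg_match('/[\r\n]/', $value)) {
        throw new InvalidArgumentException('Line break in header value');
    }
    return trim($value);
}

// Hypothetical helper: join "Name: value" lines with CRLF rather than LF.
function build_headers(array $fields): string
{
    $lines = [];
    foreach ($fields as $name => $value) {
        $lines[] = $name . ': ' . safe_header_value($value);
    }
    return implode("\r\n", $lines);
}

$headers = build_headers([
    'From'         => 'form@example.com',
    // Falls back to 'unknown' when run outside a web request.
    'X-User-Agent' => $_SERVER['HTTP_USER_AGENT'] ?? 'unknown',
]);
```

Rejecting line breaks outright is the simplest policy; stripping them instead would also work if you would rather not bounce the visitor.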

 
Toby A Inkster

Els said:
http://www.jemjabella.co.uk/stuff/mail_form.txt

Since I'm no expert myself though, I think it would be good if the more
savvy PHP scripters in the group could give it a quick look to check for
obvious problems.

5. The function get_data() should include a call to
htmlspecialchars().
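
Since the body of get_data() isn't quoted in this thread, the following is only a guess at the shape such a function might take. get_field() is a hypothetical stand-in, not the script's own code.

```php
<?php
// Hypothetical stand-in for the script's get_data(): fetch one value from an
// input array (e.g. $_POST) and make it safe to echo back into an HTML page.
function get_field(array $source, string $key): string
{
    $raw = isset($source[$key]) ? trim((string) $source[$key]) : '';
    // Escape &, <, > and both quote styles so the value is inert in HTML.
    return htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');
}

echo get_field(['name' => '<b>Mike</b>'], 'name'), "\n"; // prints &lt;b&gt;Mike&lt;/b&gt;
```

The escaping matters when the value is shown on a web page; for the plain-text body of the e-mail itself it isn't needed.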


 
Neredbojias

Without the ^, this string would be accepted as well:
!!NOT AN EMAIL [email protected]
Email addresses aren't allowed to start with a dot or dash or anything
that's not a-z0-9.

Er, duh! Now that you've shown me an example, it seems so obvious.
Also,
does:

eregi("[A-Z]+");

include a-z or not?

I think it does. But that's just me thinking :)

Me, too. And thanks for uploading the nice pics.
 
Bergamot

Ben said:
That's a good idea, although I don't have total confidence that it's
either AaBbCc... or ABC...abc... and not some other strange order
in some locales.

Locale is irrelevant. It's just ASCII. You can find the order in any
ASCII chart. There are many around the web.
 
Els

Blinky said:
Els said:
Toby said:
Els wrote:

Which special chars are between Z and a then? Noob here - I just
thought it would either go AaBbCc.. or ABC...XYZabc...xyz ?

There are six characters between 'Z' and 'a': [\]^_`

Thanks - I had no idea.. :)

http://www.asciitable.com/

Shows I'm not a real geek - that's the first time I ever saw an ASCII
table. (not counting the ones I wrote myself by starting at ALT-128 so
I had a list for the occasional é and ç back when I was a typist...)
 
Neredbojias

Neredbojias said:
eregi("[A-Z]+");

include a-z or not?

I think it does. But that's just me thinking :)

Me, too.

See Ben's reply :)

Okay, I've scoured the PHP manual, and apparently the following, if
designated as caseless:

[A-Z]

will indeed include a-z. In some foreign languages that have umlauts and
other gazinckisses, such letters may be also included.
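
A quick way to check the plain a-z part of that claim, sketched with preg_match and the /i modifier rather than eregi. The variable names are illustrative, and the sketch makes no claim about accented letters, which depend on the engine and locale.

```php
<?php
// Without /i the range A-Z is uppercase only.
$caseSensitive = preg_match('/^[A-Z]+$/', 'mike');
// With /i the same range also accepts the lowercase letters.
$caseless = preg_match('/^[A-Z]+$/i', 'mike');

var_dump($caseSensitive); // int(0)
var_dump($caseless);      // int(1)
```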
Only one! the rest is NOT mine - not taking responsibility for those

Really? I wonder who uploaded the foggy sun and vale one? Anyway, there
was quite a rush last night, and it was interesting.

(PS: Did you see Blinky's behemoth killer shark?)
 
Els

Blinky said:
<snip>

You can be. Just stick with us. ;)

Been doing that for ehm... [counting on fingers..] 5 years already -
didn't help much yet! :)

Okay - I didn't count on my fingers, I had to look it up in Google
groups, and the oldest post I can find of myself in this group, was in
August 2003. Funny thing though, Jukka is in that thread too, and he
is so friendly and helpful! I'm almost tempted to read back the last 5
years of alt.html posts to see what went wrong ;-)
 
Els

Neredbojias said:
Okay, I've scoured the PHP manual, and apparently the following, if
designated as caseless:

[A-Z]

will indeed include a-z. In some foreign languages that have umlauts and
other gazinckisses, such letters may be also included.

Okay, good to know.
Really? I wonder who uploaded the foggy sun and vale one? Anyway, there
was quite a rush last night, and it was interesting.

You should have captured IP addresses - then compare them to IP
addresses in the group's posts ;-)
(PS: Did you see Blinky's behemoth killer shark?)

Yup, I did :)
 
Michael Fesser

..oO(Els)
I'm no PHP wizard, nor do I know much about regexp, but for those who
are, this is the part in the script that seems to check the email
address:

// Check the email address entered matches the standard email address format
if (!eregi("^[A-Z0-9._%-]+@[A-Z0-9._%-]+\.[A-Z]{2,6}$", $email)) {
    echo "<p>It appears you entered an invalid email address</p><p><a href='javascript: history.go(-1)'>Click here to go back</a>.</p>";
}

Just two notes:

1) Using such checks should be done carefully. A proper RFC-compliant
address check _cannot_ be done with such a simple regex. Almost every
regex I saw will still allow many invalid addresses and reject valid
ones. The RFC 822 is a quite complex beast.

Mail::RFC822::Address: regexp-based address validation
http://www.ex-parrot.com/~pdw/Mail-RFC822-Address.html

PHP : Parsing Email Adresses in PHP
http://www.iamcal.com/publish/articles/php/parsing_email/

The second one seems to work quite well, but still is not fully RFC-
compliant (but should be enough in most cases, though).

2) The ereg_* functions are dead and will be removed in PHP6. The preg_*
(PCRE) functions are the way to go and much more powerful and efficient.
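
As a rough illustration of both notes, here is the script's eregi check rewritten for preg_match, next to PHP's built-in filter_var validator. The function names and sample addresses are made up, and filter_var is offered only as one alternative; it is not a full RFC 822 parser either.

```php
<?php
// The script's eregi() pattern converted to PCRE: add delimiters and use the
// /i modifier in place of the "i" in eregi.
function passes_script_regex(string $email): bool
{
    return preg_match('/^[A-Z0-9._%-]+@[A-Z0-9._%-]+\.[A-Z]{2,6}$/i', $email) === 1;
}

// PHP's built-in validator, as an alternative to a hand-written pattern.
function passes_filter_var(string $email): bool
{
    return filter_var($email, FILTER_VALIDATE_EMAIL) !== false;
}

$samples = ['mike@example.com', 'mike+lists@example.com', 'not an address'];
foreach ($samples as $sample) {
    printf(
        "%-24s regex=%s filter_var=%s\n",
        $sample,
        passes_script_regex($sample) ? 'yes' : 'no',
        passes_filter_var($sample) ? 'yes' : 'no'
    );
}
```

The second sample shows the "reject valid ones" problem: the plus sign is legal before the "@", but the script's character class leaves it out.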

Micha
 
Els

Michael said:
if (!eregi("^[A-Z0-9._%-]+@[A-Z0-9._%-]+\.[A-Z]{2,6}$", $email)) {

Just two notes:

1) Using such checks should be done carefully. A proper RFC-compliant
address check _cannot_ be done with such a simple regex. Almost every
regex I saw will still allow many invalid addresses and reject valid
ones. The RFC 822 is a quite complex beast.

Mail::RFC822::Address: regexp-based address validation
http://www.ex-parrot.com/~pdw/Mail-RFC822-Address.html

PHP : Parsing Email Adresses in PHP
http://www.iamcal.com/publish/articles/php/parsing_email/

I think I like this one better than the Perl one.. easier to see if I
made a typo.

The second one seems to work quite well, but still is not fully RFC-
compliant (but should be enough in most cases, though).

2) The ereg_* functions are dead and will be removed in PHP6. The preg_*
(PCRE) functions are the way to go and much more powerful and efficient.

Thanks for the heads up :)
 
Ben C

Locale is irrelevant. It's just ASCII. You can find the order in any
ASCII chart. There are many around the web.

I'm pretty sure the value of LC_COLLATE will affect whether [A-Z]
matches "b" or not (assuming case-sensitive, otherwise it will always
match "b") in many programs.

In the egrep manual:

Within a bracket expression, a range expression consists of two
characters separated by a hyphen. It matches any single character
that sorts between the two characters, inclusive, using the locale's
collating sequence and character set. For example, in the default C
locale, [a-d] is equivalent to [abcd]. Many locales sort characters
in dictionary order, and in these locales [a-d] is typically not
equivalent to [abcd]; it might be equivalent to [aBbCcDd], for
example. To obtain the traditional interpretation of bracket
expressions, you can use the C locale by setting the LC_ALL
environment variable to the value C.
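
That passage is about egrep. As far as I know, PHP's preg_* functions (PCRE) order ranges by character code rather than by the locale's collating sequence, so they behave like the "traditional interpretation" described there. A small check one could run, with made-up variable names:

```php
<?php
// Try a few letters against [a-d] with PCRE. Ranges follow character codes
// here, so uppercase letters fall outside the range unless /i is added.
$hits = [];
foreach (['a', 'B', 'b', 'D', 'd'] as $ch) {
    if (preg_match('/^[a-d]$/', $ch) === 1) {
        $hits[] = $ch;
    }
}
echo implode('', $hits), "\n"; // prints abd
```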
 
