Regular expressions

J. mp

Hi folks,
I'm burning my head because I don't understand how regular expressions
work.

I just want to validate a username where:
username -> valid
user.name -> valid

Everything else is invalid.

Please let me know the regexp to do this.
 
Timothy Hunter

I think more details are necessary. What characters are allowed in
"username"? Just alphabetic? Alphabetic+numbers? Anything else? Is there
a minimum number of characters? A maximum? Just Latin characters?
Similarly for "user.name": is it the same as "username" except with a
period? Does there have to be exactly four characters before the period
and four after it? Any other constraints?

To use regular expressions you must be able to precisely state what a
"match" means.
 
Vincent Fourmond

Just to get you started:

/[a-z]+(\.[a-z]+)?/

Vince
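
One thing worth checking with this starter pattern: as posted it is unanchored, so it succeeds whenever any part of the string looks like a name. A minimal sketch, assuming the goal is to validate the whole string; the constant names STARTER and ANCHORED are made up for illustration:

```ruby
# The starter pattern as posted (unanchored).
STARTER  = /[a-z]+(\.[a-z]+)?/
# The same pattern wrapped in \A ... \z so it must cover the whole string.
ANCHORED = /\A[a-z]+(\.[a-z]+)?\z/

%w[username user.name _username user..name].each do |name|
  puts format("%-12s unanchored=%-5s anchored=%s",
              name, STARTER.match?(name), ANCHORED.match?(name))
end
```

The unanchored form accepts "_username" because it finds "username" inside it; the anchored form does not.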
 
J. mp

First of all, thanks for the attention.
More details:

max size allowed is 30
min size allowed is 5

the following chars are allowed:
- _ . (hyphen, underscore, period)

these chars are not allowed as the first or last char

any alphabetic char, English letters only
case insensitive
no numbers

e.g.
.username -> invalid
user-name -> valid
_username -> invalid
user.name -> valid
user_name -> valid

Basically I want to allow the same pattern that is allowed for emails before
the @ char :)
 
Timothy Hunter

/\A[[:alpha:]][-_.[:alpha:]]{3,28}[[:alpha:]]\z/

J. mp said:
basically I want allow the same pattern allowed for emails but before
the @ char :)

The above regexp does not do this. Certainly you can have numbers in
your email address, for example. Basically anything is allowed before
the @. Google "regular expression email address" for extensive
discussions about this.
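
A quick way to try the pattern /\A[[:alpha:]][-_.[:alpha:]]{3,28}[[:alpha:]]\z/ against the examples from this thread. This is only a sketch: the names ALPHA_USERNAME and SAMPLES are assumptions for illustration. One caveat: in Ruby 1.9 and later, [[:alpha:]] also matches non-ASCII letters in Unicode strings, so [a-zA-Z] may be closer to "English letters only".

```ruby
# Pattern from the post: letter, 3-28 of (letter - _ .), letter => 5..30 chars.
ALPHA_USERNAME = /\A[[:alpha:]][-_.[:alpha:]]{3,28}[[:alpha:]]\z/

# Sample name => whether the stated rules say it should be accepted.
SAMPLES = {
  ".username" => false,
  "user-name" => true,
  "_username" => false,
  "user.name" => true,
  "user_name" => true,
  "abcd"      => false, # 4 chars, below the minimum of 5
  "a" * 30    => true,  # exactly the maximum
  "a" * 31    => false, # one over the maximum
}

SAMPLES.each do |name, expected|
  actual = ALPHA_USERNAME.match?(name)
  puts "#{name}: #{actual} (expected #{expected})"
end
```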
 
J. mp

It works well.
Thanks a lot
 
Phrogz

/\A[a-z][a-z_.-]{3,28}[a-z]\Z/i


Translated, that says:
* start at the beginning
* find any letter
* followed by 3-28 characters that are letters, underscores, periods, or hyphens
* followed by a letter
* followed by the end of the string
* oh, and be case insensitive, please

Note that, per your exact instructions, this allows:
u_s_e_r_n_a_m_e
u____________________________e
z._-_.z
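
The translation above can be written into the pattern itself with Ruby's /x (extended) flag, which ignores whitespace and allows comments. A sketch only, assuming the character class is meant to include the underscore (the three names listed above only match if it does); COMMENTED_USERNAME is a made-up name, and \z is used here instead of \Z:

```ruby
COMMENTED_USERNAME = /
  \A              # start at the beginning
  [a-z]           # any letter
  [a-z_.-]{3,28}  # 3-28 letters, underscores, periods, or hyphens
  [a-z]           # a letter
  \z              # the very end of the string
/xi

["u_s_e_r_n_a_m_e", "u" + "_" * 28 + "e", "z._-_.z"].each do |name|
  puts "#{name}: #{COMMENTED_USERNAME.match?(name)}"
end

# \Z also matches just before a trailing newline; \z does not.
puts(/\Ausername\Z/.match?("username\n"))
puts(/\Ausername\z/.match?("username\n"))
```

If the input comes straight from a form or a file, the \Z versus \z difference decides whether "username\n" passes validation.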
 
Brian Candler

And if you are being pedantic, RFC2822 doesn't allow E-mail addresses to
contain two dots next to each other, unless the local-part is quoted.
 
J. mp

Oh damn!! The first should be allowed, but the second and the third should
not be allowed.
Thanks
 
J. mp

OK, thanks all. I need a regexp that does what I described before, but without
dots, hyphens, or underscores one after another, and not at the start
or at the end.
Thanks
 
Martin DeMello

J. mp said:
Oh damn!! The first should be allowed, but the second and the third should
not be allowed.

The regexp could be extended to allow this, but it gets ever more
convoluted and unreadable - you'd be better off doing a separate check
for a !~ /[^A-Za-z]{2,}/ (that is, "a does not match two
non-letter chars in a row"):

tests = %w(u_s_e_r_n_a_m_e
u____________________________e
z._-_.z
)
=> ["u_s_e_r_n_a_m_e", "u____________________________e", "z._-_.z"]
tests.each {|a| p [a, a !~ /[^A-Za-z]{2,}/]}
["u_s_e_r_n_a_m_e", true]
["u____________________________e", false]
["z._-_.z", false]
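
Putting the two checks together might look like the sketch below. The names SHAPE, DOUBLED, ONE_REGEX and valid_username? are hypothetical, and the shape pattern assumes the 5-30 length and the - _ . separators stated earlier in the thread. ONE_REGEX is one possible way to fold everything into a single pattern with a lookahead, shown for comparison only.

```ruby
# Overall shape: letter, 3-28 letters or separators, letter (5..30 chars).
SHAPE   = /\A[a-z][a-z_.-]{3,28}[a-z]\z/i
# Two or more non-letters in a row.
DOUBLED = /[^A-Za-z]{2,}/

# Two-step check: right shape, and no doubled separators.
def valid_username?(name)
  SHAPE.match?(name) && !DOUBLED.match?(name)
end

# Single-pattern alternative: the lookahead enforces the length, the rest
# requires runs of letters joined by exactly one separator each.
ONE_REGEX = /\A(?=.{5,30}\z)[a-z]+(?:[-_.][a-z]+)*\z/i

["u_s_e_r_n_a_m_e", "u" + "_" * 28 + "e", "z._-_.z", "user", "username"].each do |name|
  puts "#{name}: two-step=#{valid_username?(name)} one-regex=#{ONE_REGEX.match?(name)}"
end
```

The two-step version is longer but each rule can be read, tested, and reported to the user on its own.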

Also, play around with http://weitz.de/regex-coach/

martin
 
Phrogz

OK, but *why* aren't they allowed? You haven't described exactly what
your requirements are. Is it because you can't have two non-letters in
a row? Is it because the string must contain at least three letters?

BTW, where are these requirements coming from? Are these business
requirements that must be enforced? Are you just making up what you
think people should probably have to use as a name? Or are you just
trying to learn regexp?
 
J. mp

It's a business requirement. The user name will be used before the
domain, for example:
I have the domain http://somedomain.com and for each user a unique URL
will exist, like http://user.name.somedomain.com
http://david_coperfield.somedomain.com
http://andreas-blast.somedomain.com

This is my business requirement, so I can only allow user names that can
be used in a URI.

Thanks all again,
 
Phrogz

You didn't answer these questions.

So the question is, what is legal in that part of a URI? The best
resource I can find is RFC 3986 [1], and it says:
"The most common name registry mechanism is the Domain Name System
(DNS). A registered name intended for lookup in the DNS uses the
syntax defined in Section 3.5 of [RFC1034] and Section 2.1 of
[RFC1123]."


Section 2.1 of RFC 1123 [2] says:
"The syntax of a legal Internet host name was specified in RFC-952
[DNS:4]. One aspect of host name syntax is hereby changed: the
restriction on the first character is relaxed to allow either a letter
or a digit. Host software MUST support this more liberal syntax.

Host software MUST handle host names of up to 63 characters and SHOULD
handle host names of up to 255 characters."


RFC 952 [3] says:
"<domainname> ::= <hname>
<hname> ::= <name>*["."<name>]
<name> ::= <let>[*[<let-or-digit-or-hyphen>]<let-or-digit>]"


So, my reading of that (and I'm not an expert) is that a machine name
MAY have digits in it (including at the start or end), may NOT have
underscores, and may be pretty darn long. (Though it makes sense to
put some sort of bound on it - if you think 30 chars is OK, so be it.)

A regexp for this, allowing multiple dotted names joined together:

# Regexp for a single name
/[a-z\d](?:[a-z\d-]*[a-z\d])?/i

# Regexp for 1 or more of those joined by periods
/(?:[a-z\d](?:[a-z\d-]*[a-z\d])?)(?:\.[a-z\d](?:[a-z\d-]*[a-z\d])?)*/i
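
The second pattern is the first one repeated, so it can also be built by interpolation instead of being typed out twice. A sketch, with LABEL and HOSTNAME as made-up constant names; note that both patterns are unanchored as written here, so on their own they only find a host-name-like piece somewhere in the string:

```ruby
# One name: starts and ends with a letter or digit, hyphens allowed inside.
LABEL    = /[a-z\d](?:[a-z\d-]*[a-z\d])?/i
# One or more names joined by periods, built from LABEL.
HOSTNAME = /#{LABEL}(?:\.#{LABEL})*/i

# A fully valid name is matched in its entirety.
puts HOSTNAME.match("foo-bar.jim-jam")[0]

# Without anchors, an invalid name still yields a partial match.
puts HOSTNAME.match("foo_bar")[0]
```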


[1] http://www.gbiv.com/protocols/uri/rfc/rfc3986.html
[2] http://rfc-ref.org/RFC-TEXTS/1123/chapter2.html#sub1
[3] http://rfc.net/rfc952.html#sA.
 
J. mp

So, Gavin, your last regex allows only valid host names in a URI? I'm
sorry for not reading the RFC before. My requirement is what I said: the
user name will act as part of a URI, so I should allow any combination
of chars that is valid for the first part of a URI.
 
Phrogz

I think so. I haven't tested it. Actually, I see one minor mistake -
to be safe, anchor this regexp to the start/end to ensure you're
matching exactly what the user entered:
/\A(?:[a-z\d](?:[a-z\d-]*[a-z\d])?)(?:\.[a-z\d](?:[a-z\d-]*[a-z\d])?)*\Z/i

To be clear, it will match:
f
9
274
3cats7
a.b
a-b
foo
foo-bar
foo.bar
foo.bar.jim
foo-bar.jim-jam
spoofy.com.edu.gov.com.com
crazy.long.name.because.the.regexp.has.no.limits.on.it.whatsoever.for.length

And it will reject:
-foo
foo-
foo.
..foo
foo_bar
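
The two lists above can be turned into a small self-check. This is a sketch under a few assumptions: ANCHORED_HOSTNAME uses \z rather than \Z so a trailing newline is rejected too, usable_name? is a hypothetical helper, and its default limit of 30 is borrowed from the maximum length mentioned earlier in the thread.

```ruby
ANCHORED_HOSTNAME =
  /\A(?:[a-z\d](?:[a-z\d-]*[a-z\d])?)(?:\.[a-z\d](?:[a-z\d-]*[a-z\d])?)*\z/i

SHOULD_MATCH  = %w[f 9 274 3cats7 a.b a-b foo foo-bar foo.bar foo.bar.jim
                   foo-bar.jim-jam spoofy.com.edu.gov.com.com]
SHOULD_REJECT = %w[-foo foo- foo. ..foo foo_bar]
LONG_NAME     = "crazy.long.name.because.the.regexp.has.no.limits.on.it.whatsoever.for.length"

# The pattern has no length limit, so the bound is checked separately.
def usable_name?(name, max_length = 30)
  name.length <= max_length && ANCHORED_HOSTNAME.match?(name)
end

(SHOULD_MATCH + SHOULD_REJECT + [LONG_NAME]).each do |name|
  puts "#{name}: pattern=#{ANCHORED_HOSTNAME.match?(name)} usable=#{usable_name?(name)}"
end
```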
 
M. Edward (Ed) Borasky

Lloyd said:
I'm trying to install ruby-1.8.5-p12 on my CentOS4 system.

% uname -rsvp
% Linux 2.6.9-022stab078.23-enterprise #1 SMP Thu Oct 19 14:54:39 MSD 2006 i686

I do the following steps with no problem:

./configure --prefix=/usr --enable-shared
make

But then, when I do "make check", the test run hangs forever after a few
hundred characters (mostly dots) print out.

When I finally do a Control-C, I get the following stack trace:

/readline/test_readline.rb:20:in `readline': Interrupt
from ./readline/test_readline.rb:20:in `test_readline'
from ./readline/test_readline.rb:72:in `replace_stdio'
from /usr/local/src/ruby/ruby-1.8.5-p12/lib/open-uri.rb:32:in `open_uri_original_open'
from /usr/local/src/ruby/ruby-1.8.5-p12/lib/open-uri.rb:32:in `open'
from ./readline/test_readline.rb:66:in `replace_stdio'
from /usr/local/src/ruby/ruby-1.8.5-p12/lib/open-uri.rb:32:in `open_uri_original_open'
from /usr/local/src/ruby/ruby-1.8.5-p12/lib/open-uri.rb:32:in `open'
from ./readline/test_readline.rb:65:in `replace_stdio'
... 15 levels...
from /usr/local/src/ruby/ruby-1.8.5-p12/lib/test/unit/ui/testrunnerutilities.rb:29:in `run'
from /usr/local/src/ruby/ruby-1.8.5-p12/lib/test/unit/autorunner.rb:200:in `run'
from /usr/local/src/ruby/ruby-1.8.5-p12/lib/test/unit/autorunner.rb:13:in `run'
from runner.rb:7
make: *** [test-all] Error 1

If I'm reading this correctly, it looks like open_uri_original_open is
somehow being called recursively and repeatedly failing.

Is this something I need to be concerned about?

Thanks.

You may have to do "make install" before "make check". Did you do it
that way?
 
M. Edward (Ed) Borasky

Lloyd said:
Lloyd said:
[ ... ]

But then, when I do "make check", the test run hangs forever after a few
hundred characters (mostly dots) print out.

[ ... ]
You may have to do "make install" before "make check". Did you do it
that way?

No, I didn't. I have always thought that the autoconf convention is to
perform "make check" first, and to use its result to decide whether to
then do the "make install". It seems incorrect to install the software
and then check it. If "make check" fails miserably, it will be a big
headache to then try to uninstall everything.

Yeah ... that's the way it's supposed to work -- check first, then
install. But I usually make a new home in /opt for testing stuff anyhow,
rather than letting it default into /usr/local. And I have had instances
where things broke in "make check" that didn't break after "make
install" because of some path issues. I'll take this as encouragement to
hunt them down and document them. :)
 
