You didn't answer these questions.
It's a business requirement. The user name will be used before the
domain, for example:
I have the domain
http://somedomain.com and for each user a unique URL
will exist, like
http://user.name.somedomain.com
http://david_coperfield.somedomain.com
http://andreas-blast.somedomain.com
This is my business requirement, so I can only allow user names that can
be used in a URI.
So the question is, what is legal in that part of a URI? The best
resource I can find is RFC 3986 [1], and it says:
"The most common name registry mechanism is the Domain Name System
(DNS). A registered name intended for lookup in the DNS uses the
syntax defined in Section 3.5 of [RFC1034] and Section 2.1 of
[RFC1123]."
Section 2.1 of RFC 1123 [2] says:
"The syntax of a legal Internet host name was specified in RFC-952
[DNS:4]. One aspect of host name syntax is hereby changed: the
restriction on the first character is relaxed to allow either a letter
or a digit. Host software MUST support this more liberal syntax.
Host software MUST handle host names of up to 63 characters and SHOULD
handle host names of up to 255 characters."
RFC 952 [3] says:
"<domainname> ::= <hname>
<hname> ::= <name>*["."<name>]
<name> ::= <let>[*[<let-or-digit-or-hyphen>]<let-or-digit>]"
So, my reading of that (and I'm not an expert) is that a machine name
MAY have digits in it (including at the start or end), may NOT have
underscores, and may be pretty darn long. (Though it makes sense to
put some sort of bound on it - if you think 30 chars is OK, so be it.)
A regexp for this, allowing multiple dotted names joined together:
# Regexp for a single name
/[a-z\d](?:[a-z\d-]*[a-z\d])?/i
# Regexp for 1 or more of those joined by periods
/(?:[a-z\d](?:[a-z\d-]*[a-z\d])?)(?:\.[a-z\d](?:[a-z\d-]*[a-z\d])?)*/i
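Note that those patterns are unanchored, so on their own they will happily match a legal substring inside an illegal name. Here is a rough sketch of how they might be wrapped for validating a whole user name, assuming Ruby 2.4 or later (for match?). The names LABEL, HOST_PREFIX, MAX_LENGTH and valid_user_name? are made up for illustration, and the 30-character cap is just the figure mentioned above, not something the RFCs require.

```ruby
# Hypothetical helper: names and the 30-character cap are illustrative only.

# A single name: starts and ends with a letter or digit, hyphens allowed inside.
LABEL = /[a-z\d](?:[a-z\d-]*[a-z\d])?/i

# One or more names joined by periods, anchored to the whole string.
# Interpolating LABEL keeps its case-insensitive flag.
HOST_PREFIX = /\A#{LABEL}(?:\.#{LABEL})*\z/

# Assumed application-level bound on the user name length.
MAX_LENGTH = 30

def valid_user_name?(name)
  return false if name.length > MAX_LENGTH
  HOST_PREFIX.match?(name)
end

puts valid_user_name?("user.name")
puts valid_user_name?("andreas-blast")
puts valid_user_name?("david_coperfield")
```

With these rules the first two names pass and the underscore version is rejected. Using \A and \z rather than ^ and $ matters in Ruby, since ^ and $ match at line boundaries and would let a name with an embedded newline through.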
[1] http://www.gbiv.com/protocols/uri/rfc/rfc3986.html
[2] http://rfc-ref.org/RFC-TEXTS/1123/chapter2.html#sub1
[3] http://rfc.net/rfc952.html#sA.