[OT] E-mail syntax

Discussion in 'HTML' started by Thomas Mlynarczyk, Dec 7, 2004.

  1. Hi,

    I apologize if this is off-topic here, but can anyone tell me where to find
    information about the correct syntax for e-mail addresses? I have read
    RFC2822, but it seems that further restrictions apply. And apart from the
    RFCs, Google didn't turn up any really authoritative information.

    Greetings,
    Thomas
     
    Thomas Mlynarczyk, Dec 7, 2004
    #1

  2. Philip Ronan

    Thomas Mlynarczyk wrote:

    > Hi,
    >
    > I apologize if this is off-topic here, but can anyone tell me where to find
    > information about the correct syntax for e-mail addresses. I know and have
    > read RFC2822, but it seems that further restrictions apply. And except for
    > the RFCs, Google didn't bring up any really qualified information.


    What else do you need?

    RFC 1034 (I think) describes further restrictions on the characters that can
    be used in domain names, but I assume you know that already if you're
    familiar with RFC 2822.

    Phil

    --
    phil [dot] ronan @ virgin [dot] net
    http://vzone.virgin.net/phil.ronan/
     
    Philip Ronan, Dec 7, 2004
    #2

  3. Also sprach Philip Ronan:

    [RFC2822]

    > What else do you need?
    > RFC 1034 (I think) describes further restrictions on the characters
    > that can be used in domain names, but I assume you know that already
    > if you're familiar with RFC 2822.


    Yes, but that's just where my confusion starts. If RFC1034/1035 impose
    further restrictions on the domain part, then why does RFC2822 allow a
    "richer" syntax in the first place? (Besides, the formulation in RFC1035 is
    "The following syntax will result in fewer problems with many applications
    that use domain names...", which sounds like a non-binding recommendation
    at best.) Then, unless I have overlooked something, RFC2822 does not impose
    any size limits, while RFC1034/1035 do. And RFC2822 allows a domain
    without any dots, while RFC2821 requires at least one dot. How am I to cope
    with all those contradictory specifications?
     
    Thomas Mlynarczyk, Dec 7, 2004
    #3
  4. Philip Ronan

    Thomas Mlynarczyk wrote:

    >> What else do you need?
    >> RFC 1034 (I think) describes further restrictions on the characters
    >> that can be used in domain names, but I assume you know that already
    >> if you're familiar with RFC 2822.

    >
    > Yes, but that's just where my confusion starts. If RFC1034/1035 imposes
    > further restrictions on the domain part, then why does RFC2822 allow a
    > "richer" syntax in the first place?


    I suppose that's because there's no point in having valid domain names
    defined in multiple documents. A valid email address should conform to both.

    > Then, unless I have overlooked something, RFC2822 does not impose
    > any size limits, while RFC1034/1035 does, RFC2822 does allow a domain
    > without any dots, while RFC2821 requires at least one dot. How am I to cope
    > with all those contradictory specifications?


    OK, I'm not that familiar with RFC1034/1035, but I assume they only relate
    to the part of an email address that comes after the "@". A dot would be
    essential because there are no web domains that consist of a single TLD
    (like "com" or "uk").

    AFAIK, there is no limit on the length of the part before the "@"

    Phil

    --
    phil [dot] ronan @ virgin [dot] net
    http://vzone.virgin.net/phil.ronan/
     
    Philip Ronan, Dec 7, 2004
    #4
  5. Also sprach Philip Ronan:

    >> Yes, but that's just where my confusion starts. If RFC1034/1035
    >> imposes further restrictions on the domain part, then why does
    >> RFC2822 allow a "richer" syntax in the first place?

    >
    > I suppose that's because there's no point in having valid domain names
    > defined in multiple documents.


    That makes sense, but then why not a simple reference to the RFC defining
    domain names?

    > A valid email address should conform to both.


    So I must combine all the different specs and use the "greatest common
    denominator"?

    > OK, I'm not that familiar with RFC1034/1035, but I assume they only
    > relate to the part of an email address that comes after the "@". A
    > dot would be essential because there are no web domains that consist
    > of a single TLD (like "com" or "uk").


    What about "localhost"? But it seems that RFC2821 contradicts itself.
    Section 2.3.5 says: "A domain (or domain name) consists of one or more
    dot-separated components." Thus, "my-domain" would be valid. But the same
    RFC(!), in section 4.1.2, defines

    Domain = (sub-domain 1*("." sub-domain)) / address-literal

    In other words, at least one dot is required. Which section am I to believe?
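
    To make the two readings concrete, here is a rough Python sketch that checks
    a domain against both. The sub-domain pattern is my own simplification
    (letters, digits and hyphens, no leading or trailing hyphen), address
    literals are ignored, and the names SUB_DOMAIN and domain_readings are made
    up for this example.

```python
import re

# Assumption: a sub-domain is letters, digits and hyphens, and neither
# starts nor ends with a hyphen. Address literals such as [127.0.0.1]
# are deliberately left out of this sketch.
SUB_DOMAIN = r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?"

# Reading 1 (prose in section 2.3.5): one or more dot-separated components.
ONE_OR_MORE = re.compile(SUB_DOMAIN + r"(?:\." + SUB_DOMAIN + r")*")

# Reading 2 (grammar in section 4.1.2): at least one dot is required.
AT_LEAST_ONE_DOT = re.compile(SUB_DOMAIN + r"(?:\." + SUB_DOMAIN + r")+")


def domain_readings(domain):
    """Report whether the domain passes each of the two readings."""
    return {
        "section_2_3_5": ONE_OR_MORE.fullmatch(domain) is not None,
        "section_4_1_2": AT_LEAST_ONE_DOT.fullmatch(domain) is not None,
    }
```

    With this, "localhost" and "my-domain" pass the first reading but fail the
    second, while "example.com" passes both.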

    > AFAIK, there is no limit on the length of the part before the "@"


    RFC2821, section 4.5.3.1 says there is a limit of 64 characters. But then, I
    think it is RFC1035 which explains that mail addresses are converted to
    domain names by making the part before the @ another "subdomain", and
    because of the way such a domain name is encoded, the limit should be 63
    characters (not counting quotes or escape-backslashes). So, again, what am I
    to believe? It's all so confusing :-(
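
    For what it's worth, the size limits themselves are easy to check once you
    pick which ones to enforce. A minimal Python sketch, assuming the 64-character
    local-part limit from RFC2821 and the 63/255 limits from RFC1035, and
    counting plain characters (quotes and backslashes included, which may not be
    what the RFCs intend). The function name length_problems is hypothetical.

```python
MAX_LOCAL_PART = 64  # RFC 2821, section 4.5.3.1
MAX_LABEL = 63       # RFC 1035: limit per dot-separated label
MAX_DOMAIN = 255     # RFC 1035: limit for the whole domain name


def length_problems(address):
    """Return a list of size-limit violations; an empty list means none found.

    Assumption: lengths are counted in characters of the string as typed,
    which only approximates the octet counts the RFCs talk about.
    """
    local, sep, domain = address.rpartition("@")
    if not sep:
        return ["no @ found"]

    problems = []
    if len(local) > MAX_LOCAL_PART:
        problems.append("local-part longer than %d characters" % MAX_LOCAL_PART)
    if len(domain) > MAX_DOMAIN:
        problems.append("domain longer than %d characters" % MAX_DOMAIN)
    for label in domain.split("."):
        if len(label) > MAX_LABEL:
            problems.append("label longer than %d characters" % MAX_LABEL)
    return problems
```

    Switching MAX_LOCAL_PART to 63 would give the stricter of the two
    interpretations mentioned above.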
     
    Thomas Mlynarczyk, Dec 7, 2004
    #5
  6. Philip Ronan

    Thomas Mlynarczyk wrote:

    > Also sprach Philip Ronan:
    >
    >>> Yes, but that's just where my confusion starts. If RFC1034/1035
    >>> imposes further restrictions on the domain part, then why does
    >>> RFC2822 allow a "richer" syntax in the first place?

    >>
    >> I suppose that's because there's no point in having valid domain names
    >> defined in multiple documents.

    >
    > That makes sense, but then why not a simple reference to the RFC defining
    > domain names?


    I agree it could be made clearer.

    >> A valid email address should conform to both.

    >
    > So I must combine all the different specs and use the "greatest common
    > denominator"?


    That sounds like the safest way of doing things, yes.

    >> ... A dot would be essential because there are no web domains
    >> that consist of a single TLD (like "com" or "uk").

    >
    > What about "localhost"? But it seems, that RFC2821 is contradictory within
    > itself. Section 2.3.5 says: "A domain (or domain name) consists of one or
    > more dot-separated components." Thus, "my-domain" would be valid. But - same
    > RFC(!) - section 4.1.2 defines
    >
    > Domain = (sub-domain 1*("." sub-domain)) / address-literal
    >
    > In other words, at least one dot is required. Which section am I to believe?


    My head is starting to hurt now :-(
    You should submit a comment about that.

    >> AFAIK, there is no limit on the length of the part before the "@"

    >
    > RFC2821, section 4.5.3.1 says there is a limit of 64 characters.


    It also says that longer strings may sometimes be required. So if the
    local-part contains more than 64 characters it could still be syntactically
    valid even if it isn't accepted everywhere.

    > But then, I
    > think it is RFC1035 which explains that mail addresses are converted to
    > domain names by making the part before the @ another "subdomain", and
    > because of the way such a domain name is encoded, the limit should be 63
    > characters (not counting quotes or escape-backslashes). So, again, what am I
    > to believe? It's all so confusing :-(


    Now I'm confused too!

    --
    phil [dot] ronan @ virgin [dot] net
    http://vzone.virgin.net/phil.ronan/
     
    Philip Ronan, Dec 8, 2004
    #6
  7. Toby Inkster

    Thomas Mlynarczyk wrote:

    > What about "localhost"? But it seems, that RFC2821 is contradictory within
    > itself.


    Many RFCs are.

    From RFC 791 (The Internet Protocol), but applicable to pretty much any
    Internet Standard:
    | In general, an implementation must be conservative in its sending
    | behavior, and liberal in its receiving behavior.

    Bear this in mind and you'll know in your heart which address formats you
    should send and which you should accept. :)

    --
    Toby A Inkster BSc (Hons) ARCS
    Contact Me ~ http://tobyinkster.co.uk/contact
     
    Toby Inkster, Dec 8, 2004
    #7
  8. Also sprach Toby Inkster:

    >> It seems, that RFC2821 is contradictory within itself.

    > Many RFCs are.


    <astonishment level="utter">But aren't they the stuff standards are made
    from? (RFC2821 being a standard itself, if I'm not wrong.) So shouldn't they
    be as "perfect" as possible?</astonishment>

    > From RFC 791 (The Internet Protocol), but applicable to pretty much
    > any Internet Standard:
    >> In general, an implementation must be conservative in its sending
    >> behavior, and liberal in its receiving behavior.

    > Bear this in mind and you'll know in your heart which address formats
    > you should send and which you should accept. :)


    So if I want a script to syntactically validate an e-mail address entered by
    a user, my only reference is to be RFC2822 including all "obsolete parts"?
    Thus permitting addresses like $@$ or (@(@)@)a@[@@@]? Or would it be
    reasonable to "restrict" the domain part further? (I guess I can ignore the
    comment syntax for the purpose mentioned above, as it cannot be "regexed"
    due to the possible nesting.)
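
    To illustrate what such a "restricted" check might look like, here is a
    sketch in Python. It is one possible compromise, not what any RFC mandates:
    the local-part is limited to the unquoted dot-atom form using the atext
    characters of RFC2822, and the domain is limited to hostname-style labels
    with at least one dot. Quoted strings, comments, address literals and the
    obsolete syntax are all rejected. The name passes_restricted_syntax is made
    up for this example.

```python
import re

# atext characters from RFC 2822 (used in the unquoted dot-atom form).
ATEXT = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]"

# Assumption: hostname-style labels only (letters, digits, hyphens,
# no leading or trailing hyphen), which is stricter than RFC 2822.
LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?"

LOCAL_PART = re.compile(ATEXT + r"+(?:\." + ATEXT + r"+)*")
HOSTNAME = re.compile(LABEL + r"(?:\." + LABEL + r")+")


def passes_restricted_syntax(address):
    """Check an address against the restricted subset described above."""
    # Split at the last "@"; with quoting disallowed, the local-part
    # cannot legitimately contain one anyway.
    local, sep, domain = address.rpartition("@")
    if not sep:
        return False
    return (LOCAL_PART.fullmatch(local) is not None
            and HOSTNAME.fullmatch(domain) is not None)
```

    Under these assumptions $@$ and (@(@)@)a@[@@@] are both rejected, as is
    user@localhost, while $@example.com is still accepted.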
     
    Thomas Mlynarczyk, Dec 9, 2004
    #8
  9. Also sprach Philip Ronan:

    >> So I must combine all the different specs and use the "greatest
    >> common denominator"?


    > That sounds like the safest way of doing things, yes.


    Makes things awfully complicated, though. Especially as I cannot be sure
    whether all those other restrictions would indeed *always* apply.

    [RFC2821 is contradictory within itself]
    > My head is starting to hurt now :-(
    > You should submit a comment about that.


    This RFC is now more than three and a half years old. I can't quite believe
    I'm the first to spot an error in it. I'd rather assume there is a "natural
    explanation" - but where to find it?

    > Now I'm confused too!


    Hey, this was supposed to be the other way round, so that we would now both
    be enlightened and not both confused! :-(
     
    Thomas Mlynarczyk, Dec 9, 2004
    #9