End-of-sentence punctuation

Discussion in 'XML' started by Torsten Bronger, Dec 26, 2004.

  1. Hello!

    I am working on the XML output routines of Texinfo at the moment and
    have to cope with the difference between full stops "." that mark the
    end of a sentence and those that denote an abbreviation. In LaTeX, the
    distinction is made automatically by and large, but "\ " and "\@"
    provide two ways of overriding the default when it fails.

    I thought of using the zero-width space (U+200B) immediately after a
    "." to mark it as an abbreviation, so that a bare full stop is
    always the end of a sentence.

    The docbook2rfc mailing list suggested an <eos/> (end of sentence)
    element that is complementary to my zero-width space.

    I can't say that I like all this very much. Is there some sort of
    quasi-standard, even if not widely adopted?

    Thank you!

    Bye,
    Torsten.

    --
    Torsten Bronger, aquisgrana, europa vetus
    Torsten Bronger, Dec 26, 2004
    #1
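
The zero-width-space convention proposed in post #1 can be illustrated with a small sketch. It is only an illustration of the idea, not Texinfo's actual output routine: the function name `classify_full_stops` and the sample sentence are hypothetical, and the rule is simply "a full stop followed directly by U+200B is an abbreviation point; any other full stop ends a sentence".

```python
ZWSP = "\u200b"  # zero-width space, U+200B


def classify_full_stops(text):
    """Return a list of (index, kind) pairs, one per "." in text.

    kind is "abbreviation" when the full stop is immediately followed by
    U+200B, and "sentence-end" otherwise (hypothetical helper that mirrors
    the convention described in the post).
    """
    result = []
    for i, ch in enumerate(text):
        if ch == ".":
            # Slicing avoids an IndexError when the "." is the last character.
            follows = text[i + 1 : i + 2]
            kind = "abbreviation" if follows == ZWSP else "sentence-end"
            result.append((i, kind))
    return result


# Made-up sample: the first full stop is tagged with U+200B, the others are bare.
sample = "See Fig." + ZWSP + " 3 for details. It is short."
kinds = [kind for _, kind in classify_full_stops(sample)]
```

Here `kinds` comes out as one abbreviation followed by two sentence ends. The sketch also shows the weakness of the approach: the distinction rests entirely on an invisible character that is easy to lose in editing.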

  2. Torsten Bronger <-aachen.de> wrote:

    > I am working on the XML output routines of Texinfo at the moment and
    > have to cope with the difference between full stops "." that mark the
    > end of a sentence and those that denote an abbreviation.


    You could use markup for sentences, or markup for abbreviations, or
    both. Would you have some other use for either of them?

    The simplest approach would probably be to use abbreviation markup, for
    example <abbr>e.g.</abbr> (though technically "e.g." is not an
    abbreviation in English but a conventional notation).

    > I thought of using the zero-width space (U+200B) immediately after a
    > "." to mark it as an abbreviation, so that a bare full stop is
    > always the end of a sentence.


    That would be trickery, playing with characters. Besides, the
    zero-width space does not logically change the meaning of a preceding
    full stop character, and its effect on rendering (if passed to a
    rendering engine as such) is largely unpredictable - most fonts don't
    contain a glyph for it.

    It _would_ be imaginable (though probably not wise) to solve the
    problem at character level, if ISO 10646 contained separate characters
    for 'full stop' and 'abbreviation point'. But it doesn't.

    > The docbook2rfc mailing list suggested an <eos/> (end of sentence)
    > element that is complementary to my zero-width space.


    That's tag-soupish in the HTML tag soup tradition. Whenever you think
    an empty element would solve your problem, you are probably solving the
    wrong problem.

    --
    Yucca, http://www.cs.tut.fi/~jkorpela/
    Jukka K. Korpela, Dec 26, 2004
    #2
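
As a rough sketch of the abbreviation-markup approach from post #2, the following shows how a converter might turn `<abbr>` elements into LaTeX, using the "\ " override mentioned in post #1 so that the full stop is not treated as a sentence end. The `<para>` wrapper element, the function name `para_to_latex`, and the sample text are assumptions made for illustration; only `<abbr>` comes from the thread. The sketch handles direct children of the paragraph only and does not escape LaTeX special characters.

```python
import xml.etree.ElementTree as ET


def para_to_latex(xml_source):
    """Convert one paragraph element to LaTeX text (hypothetical helper).

    When an <abbr> ends in "." and is followed by a space, that space is
    replaced by LaTeX's "\\ " so the full stop gets normal interword spacing.
    """
    root = ET.fromstring(xml_source)
    parts = [root.text or ""]
    for child in root:
        inner = "".join(child.itertext())
        tail = child.tail or ""
        if child.tag == "abbr" and inner.endswith(".") and tail[:1] == " ":
            # Swap the plain space for a backslash followed by a space.
            tail = "\\ " + tail[1:]
        parts.append(inner + tail)
    return "".join(parts)


# Made-up sample paragraph; <para> is an assumed wrapper element.
source = (
    "<para>Use a sans-serif font, <abbr>e.g.</abbr> Helvetica. "
    "It reads well.</para>"
)
latex = para_to_latex(source)
```

With this input, `latex` contains `e.g.\ Helvetica`, while the unmarked full stop after "Helvetica" is left for LaTeX to treat as a sentence end. The markup carries the distinction explicitly, so no invisible characters are needed.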
