CanonML: beyond TeX and XML, a lesson also for arrogant stringers?

Discussion in 'XML' started by Juan R., May 5, 2006.

  1. Juan R.

    Juan R. Guest

    In

    [http://canonicalscience.blogspot.com/2006/04/scientific-language-canonml-is.html]

    I presented some generic requirements for a markup language for science
    and mathematics. The basic features of CanonML, and its extensions and
    improvements over TeX-, SGML-, XML-, or Scheme-based encodings, are
    listed below. First, however, let me make a digression. Remember how we
    also saw that the mathematics in Distler's blog Musings was being
    incorrectly encoded, with simulated tensors, incorrect structural
    markup, incorrect numerics, etc.

    However, one of the most fascinating samples is Distler's posting
    "Designing the 5th Dimension"

    [http://golem.ph.utexas.edu/~distler/blog/archives/000635.html]

    There Distler is serving the 5D line element (ds)^2 as 2s ds. But 2s ds
    is not equal to (ds)^2!!
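
    For the record, 2s ds is what one gets by differentiating s^2, so the
    confusion is presumably between d(s^2) and (ds)^2. In LaTeX notation
    (the metric symbol g_{MN} and the index names are the usual textbook
    convention, not something taken from Distler's page):

```latex
d(s^2) = 2s\,ds
\qquad\text{but}\qquad
(ds)^2 = g_{MN}\,dx^M\,dx^N , \quad M, N = 0, \dots, 4 .
```

    The first expression is linear in the differential ds; the second is
    quadratic in the coordinate differentials, so the two are different
    objects.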

    Of course this kind of foolish MathML code is not accessible, not
    searchable, etc.

    Some folks have noted the terrible paradox of being proud enough to
    claim that Musings is (in Distler's own words) "the world's most
    technologically-advanced weblog" while being unable to correctly
    encode something as simple as the square of a line element.

    Other folks go further and carefully note the analogies between Musings
    (Distler is a string theorist) and string theory itself. String,
    superstring, brane, and M theories are popularized in the mass media as
    the world's most sophisticated stuff while being unable to derive
    something as simple as the Coulomb force law despite numerous efforts
    over the last 40 years. Or worse still: string theory is offering us
    clearly wrong output for almost any empirical property of our universe.
    String theory deals with exact supersymmetry and perturbative gravitons
    on a flat static spacetime, which contradicts lessons learned from the
    Standard Model and General Relativity and, of course, contradicts all
    experimental data.

    The great failure of string theory is not that the "theory" is in a
    permanent anodyne state; the problem, and a big one, is that nobody has
    been able to make the theory compatible with our current knowledge.

    You do not need to construct giant accelerators to verify some exotic
    result such as the existence of hidden dimensions or one-dimensional
    tiny vibrating objects. What you need is a theory that is compatible
    with all the experimental data now available; once this basic goal of
    scientific methodology is achieved, you can focus on future experiments
    and exotic possibilities...

    After several "great folks" became so enthusiastic about the
    theoretical possibilities for the multiverse opened by the "Landscape",
    a popular joke started among academics saying that our universe can be
    roughly characterized as the one that string-brane-M theory cannot
    predict or explain ;-)

    I agree on the parallelism between the technological fiasco of
    Distler's Musings blog and the scientific fiasco of string and M
    theories. A number of lessons arise.

    The first lesson is that you should verify the details of what you are
    doing. You cannot assume points (as Distler is doing in both topics,
    Itex-MathML and string theory) simply hoping that they may be true.

    The second lesson is that you should provide better alternatives to the
    available ones. If general relativity works fine explaining and
    predicting a subset of phenomena, then the next theory should *also*
    explain phenomena that cannot be explained via general relativity: e.g.
    microscopic phenomena, the cosmological constant (CC) problem...

    String theory cannot compute the anomalous perihelion of Mercury, has
    not quantized gravity, and has not solved the CC problem (at best it is
    discussed using anthropic reasoning). This parallels the Itex approach
    used in Distler's blog. For instance, the correctness and accessibility
    of almost all the MathML equations being served on the web are poorer
    than with an old HTML + GIF + ALT model!

    The third lesson to be learned here is that you should not be too proud
    of your own work, at least not before you achieve success.

    Of course, something as simple as (ds)^2 can be perfectly encoded in
    CanonML. The syntax is similar to that of TeX, but more powerful and
    sophisticated than Presentation MathML 2.0 and with more semantic
    content than Content MathML 2.0. Semantic content will be useful for
    encoding scientific information. For example, the MathML parallel
    markup for the energy-mass relationship is

    <semantics>
      <mrow>
        <mi>E</mi>
        <mo>=</mo>
        <mrow>
          <mi>m</mi>
          <mo>&InvisibleTimes;</mo>
          <msup>
            <mi>c</mi>
            <mn>2</mn>
          </msup>
        </mrow>
      </mrow>
      <annotation-xml encoding="MathML-Content">
        <apply>
          <eq/>
          <ci>E</ci>
          <apply>
            <times/>
            <ci>m</ci>
            <apply>
              <power/>
              <ci>c</ci>
              <cn>2</cn>
            </apply>
          </apply>
        </apply>
      </annotation-xml>
    </semantics>

    Terrible, and it can be worse still!!!

    But using both canonical expressions (a modification of SEXPR) and the
    infix formal operator model, we can encode more information with the
    ease of TeX or ASCIIMath syntax. The ultra-verbose formula above is
    encoded in CanonFormal as

    [E \= [m \* [c \** 2] ] ]
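
    To make the correspondence concrete, here is a minimal Python sketch
    that reads the bracketed formula above and prints the Content MathML it
    stands for. It assumes only what the example shows: square brackets for
    nesting, whitespace between tokens, and binary infix nodes of the form
    [left op right]. The function names and the operator table are
    illustrative inventions, not part of any CanonML tool, and the real
    grammar is certainly richer than this.

```python
import re

# Operators as they appear in the example formula; the Content MathML
# element names are inferred from the MathML listing (an assumption).
OPERATORS = {r"\=": "eq", r"\*": "times", r"\**": "power"}


def tokenize(text):
    # Brackets are tokens on their own; everything else splits on whitespace.
    return re.findall(r"\[|\]|[^\s\[\]]+", text)


def parse(tokens):
    """Parse one bracketed expression into a nested Python list."""
    token = tokens.pop(0)
    if token != "[":
        return token
    node = []
    while tokens[0] != "]":
        node.append(parse(tokens))
    tokens.pop(0)  # discard the closing bracket
    return node


def to_content_mathml(node):
    """Render a binary infix node [left op right] as Content MathML."""
    if isinstance(node, str):
        return f"<cn>{node}</cn>" if node.isdigit() else f"<ci>{node}</ci>"
    left, op, right = node
    return (f"<apply><{OPERATORS[op]}/>"
            f"{to_content_mathml(left)}{to_content_mathml(right)}</apply>")


tree = parse(tokenize(r"[E \= [m \* [c \** 2] ] ]"))
print(tree)
print(to_content_mathml(tree))
```

    Run on the example, the second line of output has the same
    apply/eq/times/power nesting as the annotation-xml part of the MathML
    listing, which is the point: the one-line form carries the same tree.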


    I should add now that I presented some formal mathematical properties
    of the Canonical Meta Language (CanonML) in

    [http://canonicalscience.blogspot.com/2006/04/canonml-mathematical-formal-language.html]

    Now we will see some of the possibilities of CanonML in the area of
    markup languages and why CanonML is better. The comparison with XML is
    general; this implies that any mathematical or scientific language
    based on XML, such as CML, CellML, physicsML, MathML, et cetera, is
    already included in the comparison.

    In future postings I will review a mathematics journal and a physics
    journal that are using MathML in their articles. We will see what kind
    of incorrect code is being served to the Internet beneath the hype of
    cool...


    [Tagging]

    CanonML offers several advantages:

    i)
    Syntax is less verbose than XML.

    ii)
    The datument becomes a data structure composed of canonical expressions
    (modification of SEXPR) and can be manipulated that way.

    iii)
    All technology is unified. For instance, due to limitations of the XML
    design, W3C folks were obliged to provide non-XML alternatives in many
    recent improvements. XPath is not an XML language; the original RDF
    syntax was so complex that it needed alternative syntaxes, and the same
    goes for RELAX NG and all that stuff. SVG also introduces a non-XML
    language, etc.

    However, the CanonML version of XPath is based on CanonML itself. You
    can manipulate it as you can manipulate any other canonical expression
    structure.

    iv)
    The metadata model is also better than in XML. XML has been harshly
    criticized for its inefficient and limited attribute model.

    One of the improvements of CanonML over SXML and other SEXPR-inspired
    markup languages is that the tag is marked instead of the text:

    [::para The CanonML language is [::em more] readable than XML]

    Another improvement over SXML is that CanonML includes a special
    syntax for empty tags:

    [::group [::tag1] [::tag2]]

    is equivalent to

    [::group \tag1 \tag2]
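
    The two points above (marked tags and the \ shorthand) can be checked
    with a small Python sketch. It assumes only the syntax visible in the
    examples: an element is [::name children...], plain words are text, and
    \name is shorthand for the empty element [::name]. The helper names and
    the XML rendering are illustrative assumptions; no escaping of XML
    special characters is attempted and whitespace is normalized to single
    spaces.

```python
import re


def tokenize(text):
    # Brackets stand alone; other tokens are whitespace-separated words.
    return re.findall(r"\[|\]|[^\s\[\]]+", text)


def parse(tokens):
    """Return (tag, children) for an element, or a plain word for text."""
    token = tokens.pop(0)
    if token.startswith("\\"):
        return (token[1:], [])      # \name is shorthand for [::name]
    if token != "[":
        return token
    tag = tokens.pop(0)
    assert tag.startswith("::"), "this sketch expects a ::tag after '['"
    children = []
    while tokens[0] != "]":
        children.append(parse(tokens))
    tokens.pop(0)                   # the closing bracket carries no name
    return (tag[2:], children)


def to_xml(node):
    """Render the parsed tree as XML, to compare verbosity."""
    if isinstance(node, str):
        return node
    tag, children = node
    if not children:
        return f"<{tag}/>"
    return f"<{tag}>{' '.join(to_xml(c) for c in children)}</{tag}>"


long_form = parse(tokenize("[::group [::tag1] [::tag2]]"))
short_form = parse(tokenize(r"[::group \tag1 \tag2]"))
print(long_form == short_form)

para = "[::para The CanonML language is [::em more] readable than XML]"
print(to_xml(parse(tokenize(para))))
```

    Both spellings of the group parse to the same tree, and the paragraph
    example comes out as the familiar para/em XML with its repeated closing
    tag names.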

    I chose the \ notation for empty tags for readability and because with
    this notation mathematical formulae look close to TeX, which may ease
    the adaptation of TeX users to the new syntax. Compare the following
    CanonML fragment

    [a \over 2]

    with (TeX)

    {a \over 2}

    In a future posting, I will focus on mathematics and science and will
    provide a detailed comparison with TeX, MathML, XML-MAIDEN, ASCIIMath,
    and others.


    [Why no closing tag?]

    - Because of the redundancy and verbosity of XML. Do you know any
    mathematician or scientist who writes (2 + 2 + 2 - 4 + 6/6 - 1) when
    she or he means just the number 2?

    I just write x^2 + bx + c = 0, for instance, because mathematics is a
    concise formal system.

    - Because parsing is easier.

    - Because it simplifies human authoring of datuments.

    - Because of cost per MB and server bandwidth requirements.

    - Because in dynamic algorithms the tag name may not be known until
    runtime and, therefore, closing tag names should not be required.

    However, the structural purity of CanonML datuments is better than that
    of TeX documents and even better than that of the forthcoming XHTML 2.0
    (which improves structure, but at the cost of backward incompatibility
    with HTML and XHTML 1.0!!).

    In CanonML there is no specific open markup like LaTeX's
    \part, \subsection, \paragraph, \chapter, \subsubsection,
    \subparagraph, and \section. In fact, commands such as \subsection or
    \subsubsection are clearly redundant. And what if I need structural
    subsections at the fifth level? Is there a \subsubsubsubsection in
    LaTeX?

    CanonML offers a virtually infinite number of structural levels for
    your datuments with a single optimised command.

    Moreover, the structure of a datument is better still because, as was
    explained in a previous posting (see above), CanonML is also a formal
    mathematical language arising from a "unification" of S-expressions,
    the bracketed Dyck language, and Keizer canonical vectors.
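
    As a small illustration of the Dyck-language side of this, here is a
    Python sketch of a well-formedness check. It assumes a single bracket
    pair, [ and ], with no escaping or quoting rules (the actual CanonML
    rules may differ). Because a closing bracket carries no name, one
    counter is enough, and the same counter gives the nesting level, which
    is why no \sub...section family of commands is needed.

```python
def max_depth(text):
    """Return the deepest bracket nesting, or None if the text is unbalanced."""
    depth = 0
    deepest = 0
    for char in text:
        if char == "[":
            depth += 1
            deepest = max(deepest, depth)
        elif char == "]":
            depth -= 1
            if depth < 0:       # a close with no matching open
                return None
    return deepest if depth == 0 else None


print(max_depth("[::group [::tag1] [::tag2]]"))
print(max_depth("[a [b [c [d [e]]]]]"))
print(max_depth(r"[a \over 2"))
```

    The first two calls report nesting levels of 2 and 5; the last input is
    missing its closing bracket and is rejected.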


    [Multi-markup: Non-hierarchical structures]

    CanonML lets us encode overlapping structures.

    Non-hierarchical structures are of great interest in science. For
    instance, the Lewis structure for HF is

    <H/<F/ e e /H> e e e e e e /F>

    and

    [H}[F} e e {H] e e e e e e {F]

    in GODDAG and liminal respectively. However, both approaches present
    well-known difficulties (e.g. disambiguating keys in liminal).

    In CanonML the Lewis structure for HF can be encoded as

    [::H::F e e] [::F e e e e e e]

    where ::H::F is an example of the novel multi-markup concept introduced
    by this language.
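
    One way to read the multi-markup example is that a segment tagged
    ::H::F belongs to both H and F at once. Under that reading (an
    assumption drawn from the overlapping encodings above, not from a
    CanonML specification), a few lines of Python can recover the electron
    count around each atom. The function name and the regular expression
    are illustrative only and handle just flat, non-nested segments.

```python
import re
from collections import Counter


def electrons_per_atom(text):
    """Count 'e' tokens per atom; a ::A::B segment counts for both A and B."""
    counts = Counter()
    # Each match is (tag chain such as '::H::F', segment body such as 'e e').
    for tags, body in re.findall(r"\[((?:::\w+)+)\s+([^\[\]]*)\]", text):
        electrons = body.split().count("e")
        for atom in tags.split("::")[1:]:
            counts[atom] += electrons
    return dict(counts)


print(electrons_per_atom("[::H::F e e] [::F e e e e e e]"))
```

    For the HF structure this gives 2 electrons around H and 8 around F:
    the shared pair is counted for both atoms, which is exactly the overlap
    that a strictly hierarchical markup cannot express directly.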


    [Metadata and attributes]

    CanonML introduces a novel metadata model. Advantages over XML:

    - A bit less verbose.

    - Attribute values can be any expression, not just strings.

    - Attribute keys can be any object. XML does not let an attribute key
    start with a digit or contain angle brackets.

    - The tag name of an element can be any expression.

    - Attribute keys are optional. In the popular CSV syntax and in all
    major programming languages, field or argument values are given by
    position, not by keyword.

    - Attributes for attributes for attributes for... This lets us denote
    attribute types, for instance.


    ["Namespaces"]

    Another highly criticized point of the XML world is namespaces. CanonML
    avoids the use of namespaces while still solving the naming conflict
    for tags. In this way one could download mathematical formulas encoded
    in CanonML from a hypothetical international database and introduce
    them into a personal datument without worrying about naming conflicts.
    The MathML URI is not needed.


    [A new Generic Markup Language]

    The possibilities for CanonML are immense. This new sophisticated meta
    language can be used as a host language for many different markup
    approaches, including Schema, RELAX NG, the useful CSS, etc.

    This approach has been listed in the alternatives to XML directory

    [http://www.pault.com/xmlalternatives.html]

    and presented at the terseXML group as

    > Strong (the only?) attempt on "one markup for several xml processing
    > specs" design

    [http://groups.yahoo.com/group/tersexml/message/103]


    Source:
    http://canonicalscience.blogspot.com/2006/04/canonml-markup-language-beyond-tex-xml.html


    --

    Juan R.

    Center for CANONICAL |SCIENCE)
     
    Juan R., May 5, 2006
    #1

  2. Dirk Van de moortel, May 5, 2006
    #2

  3. Luis Rivera

    Luis Rivera Guest

    Ain't it the TeXHaX version of the Postmodern Generator? Maybe just
    some troll. So ignore it.
     
    Luis Rivera, May 5, 2006
    #3
  4. Folks: Don't feed trolls. If someone isn't saying anything interesting,
    just killfile them. Responding encourages them. It takes two to sustain
    an argument.


    --
    () ASCII Ribbon Campaign | Joe Kesselman
    /\ Stamp out HTML e-mail! | System architexture and kinetic poetry
     
    Joe Kesselman, May 6, 2006
    #4