Re: BBC news story: Judge bans Microsoft Word sales

Discussion in 'XML' started by Piet van Oostrum, Aug 17, 2009.

  1. >>>>> Pete Becker <> (PB) wrote:

    >PB> The cited news article is rather superficial. Be careful about drawing
    >PB> conclusions about how the legal system works from reading such sources.
    >PB> They're often wrong.


    >PB> The patent itself was filed in 1994 (not 1998, as the article says) and
    >PB> issued in 1998. It mentions SGML (the parent of XML) in several places, and
    >PB> says that the method at issue is fundamentally different because it does
    >PB> not put structural information in the data stream. More particularly:


    >PB> Thus, in sharp contrast to the prior art the present
    >PB> invention is based on the practice of separating encoding
    >PB> conventions from the content of a document. The invention
    >PB> does not use embedded metacoding to differentiate the content
    >PB> of the document, but rather, the metacodes of the document are
    >PB> separated from the content and held in distinct storage in a
    >PB> structure called a metacode map, whereas document content is
    >PB> held in a mapped content area. Raw content is an extreme
    >PB> example of mapped content wherein the latter is totally
    >PB> unstructured and has no embedded metacodes in the data stream.


    >PB> That doesn't sound like a description of XML.


    Well, read the whole patent. What they do is process a document with
    embedded markup (like troff, SGML, XML, or maybe even TeX) in such a way
    that inside the program the markup is separated from the plain text. The
    external representation is still the marked up text. So it does apply to
    XML. This is quite a primitive way of parsing the markup. It just
    scans the input until it finds a tag (called a metacode in the patent),
    copies the text before the tag to an output area, and copies the tag
    to a list of tags (called a metacode map in the patent). So compared to
    modern parsing techniques there are two differences: (1) nowadays you
    usually build a parse tree; they have just a degenerate tree (only a
    list). (2) Usually the plain text is put in the leaves of the tree; they
    have the text in one contiguous area, and the `parse tree' contains
    pointers or indices to this area.
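
    The scan described above can be sketched in a few lines of Python. This
    is only an illustration under assumptions, not the patent's actual
    procedure: the tag pattern (anything between '<' and '>') and the names
    split_markup, content and metacode_map are made up for the example.

```python
import re

# Assumption: a "metacode" is anything between '<' and '>'.
TAG = re.compile(r"<[^>]*>")

def split_markup(marked_up):
    """Separate marked-up text into (content, metacode_map).

    content is the plain text as one contiguous string; metacode_map is a
    list of (offset, tag) pairs, where offset is the position in content
    at which the tag occurred.
    """
    content_parts = []
    metacode_map = []
    offset = 0  # length of the plain text collected so far
    pos = 0     # current position in the marked-up input
    for match in TAG.finditer(marked_up):
        text = marked_up[pos:match.start()]  # text before this tag
        content_parts.append(text)
        offset += len(text)
        metacode_map.append((offset, match.group()))
        pos = match.end()
    content_parts.append(marked_up[pos:])    # text after the last tag
    return "".join(content_parts), metacode_map

content, metacode_map = split_markup("<p>Hello <b>world</b>!</p>")
print(content)        # Hello world!
print(metacode_map)   # [(0, '<p>'), (6, '<b>'), (11, '</b>'), (12, '</p>')]
```

    The result is a flat list rather than a tree, and the text sits in one
    string that the list entries index into, which matches the two
    differences from ordinary parsing noted above.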

    The advantage of their structure comes when you need more than one tag
    structure on top of the text: for example when you both have the
    hierarchical XML structure and a structure with lines and pages.

    SGML has the possibility of having more than one structure in the same
    document and that fact is mentioned in the patent.
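
    A rough sketch of what two structures over one text could look like,
    assuming each structure is a list of (offset, tag) pairs into the same
    string. The names line_break_map and segments, the "<br/>" marker and
    the fixed line width are all hypothetical choices for the example.

```python
def line_break_map(content, width):
    """Build a second map that marks a line break every `width` characters."""
    return [(offset, "<br/>") for offset in range(width, len(content), width)]

def segments(content, metacode_map):
    """Cut the shared content at every offset named in metacode_map."""
    cuts = [0] + [offset for offset, _tag in metacode_map] + [len(content)]
    return [content[a:b] for a, b in zip(cuts, cuts[1:])]

content = "Hello world!"  # the text is stored once
xml_map = [(0, "<p>"), (6, "<b>"), (11, "</b>"), (12, "</p>")]
line_map = line_break_map(content, 5)

print(line_map)                     # [(5, '<br/>'), (10, '<br/>')]
print(segments(content, xml_map))   # ['', 'Hello ', 'world', '!', '']
print(segments(content, line_map))  # ['Hello', ' worl', 'd!']
```

    Both maps cut the same string in different places, and neither one has
    to know that the other exists.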

    The only innovative idea in the patent is this separation because it
    makes it easier to do editing on the document when you have more than
    one structure on top of it. And I don't know how innovative it is
    because once you need to edit a marked up text with more than one (markup)
    structure on top of it, this is quite a logical choice. And moreover
    ideas cannot be patented, so the idea doesn't count (but IANAL).

    Once you have this idea, implementing it is peanuts. You could give this
    to any student that attends a beginner's programming course when they
    have had strings, arrays and loops, and they should be able to solve it.

    So the patent is about the transformation of the marked up text to the
    separated data structure and vice versa, and about calculating another
    structure from the first one, plus some minor other things. I find it
    really silly that you can get a patent for this kind of thing.
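
    The reverse transformation, from the separated structure back to
    marked up text, might look like the sketch below. It assumes the map is
    a list of (offset, tag) pairs sorted by offset; join_markup is a
    made-up name, not a term from the patent.

```python
def join_markup(content, metacode_map):
    """Rebuild the external, marked-up form from content plus metacode map.

    Assumes metacode_map is sorted by offset.
    """
    parts = []
    pos = 0
    for offset, tag in metacode_map:
        parts.append(content[pos:offset])  # plain text up to the tag
        parts.append(tag)                  # the tag itself
        pos = offset
    parts.append(content[pos:])            # plain text after the last tag
    return "".join(parts)

content = "Hello world!"
metacode_map = [(0, "<p>"), (6, "<b>"), (11, "</b>"), (12, "</p>")]
print(join_markup(content, metacode_map))  # <p>Hello <b>world</b>!</p>
```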

    I am writing a small Python program that illustrates the patented
    algorithms.
    --
    Piet van Oostrum <>
    URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
    Piet van Oostrum, Aug 17, 2009
    #1

  2. On Aug 17, 5:31 pm, Piet van Oostrum <> wrote:
    > [...] What they do is process a document with
    > embedded markup (like troff, SGML, XML, or maybe even TeX) in such a way
    > that inside the program the markup is separated from the plain text.
    > [...]

    Isn't this very very similar to the weave and tangle system used in
    LaTeX/TeX?
     
    Paul Thompson, Aug 19, 2009
    #2

  3. > Isn't this very very similar to the weave and tangle system used in
    > LaTeX/TeX?


    That sort of question is precisely why the patent office has started
    experimenting with crowdsourcing the search for prior art. They're
    overworked and underinformed, and they know it, so they've asked the
    rest of us to help. Sorta clever, actually...
     
    Joe Kesselman, Aug 19, 2009
    #3
  4. Paul Thompson wrote:
    >
    > Isn't this very very similar to the weave and tangle system used in
    > LaTeX/TeX?


    It's very, very similar to a coding stream definition I put together
    in 1995 for an early travel planning aggregator in Manchester, UK (you
    know, the people the travel agents get in touch with to find holidays).
     
    The Magpie, Aug 20, 2009
    #4
  5. >>>>> Paul Thompson <> (PT) wrote:

    >PT> Isn't this very very similar to the weave and tangle system used in
    >PT> LaTeX/TeX?


    It's not on that level. The patent is just about a specific internal
    representation of a marked up text: separate the text and the markup,
    such that the text is just a contiguous string and the markup is in a
    list (array) with pointers to the text. And about the conversion between
    the external representation with the markup embedded and the internal
    representation described above. And about making changes to the text and
    the markup independently. And a few things related to that, for
    example having two or more of these structures on the same text where
    the text will be shared.
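
    As a sketch of changing the text independently of the markup when
    several maps share it: inserting text only requires shifting the
    offsets that lie after the insertion point. The function name
    insert_text and the rule that a tag sitting exactly at the insertion
    point stays in front of the new text are assumptions made for this
    example.

```python
def insert_text(content, maps, position, new_text):
    """Insert new_text at position and shift later offsets in every map.

    Assumed convention: a tag located exactly at `position` stays in
    front of the inserted text, so only offsets > position move.
    """
    new_content = content[:position] + new_text + content[position:]
    shift = len(new_text)
    new_maps = [
        [(off + shift if off > position else off, tag) for off, tag in m]
        for m in maps
    ]
    return new_content, new_maps

content = "Hello world!"
xml_map = [(0, "<p>"), (6, "<b>"), (11, "</b>"), (12, "</p>")]
line_map = [(5, "<br/>"), (10, "<br/>")]

content, (xml_map, line_map) = insert_text(content, [xml_map, line_map], 6, "big ")
print(content)   # Hello big world!
print(xml_map)   # [(0, '<p>'), (6, '<b>'), (15, '</b>'), (16, '</p>')]
print(line_map)  # [(5, '<br/>'), (14, '<br/>')]
```

    Changing the markup is the mirror case: add or remove entries in one
    map and the shared text and the other maps stay untouched.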

    Weave and tangle do more than that. Moreover, the patent applies in the
    context of a text processor (i.e. an interactive program), and weave and
    tangle are not that.
    --
    Piet van Oostrum <>
    URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
    Piet van Oostrum, Aug 20, 2009
    #5
