XHTML vs. XPath: @class match?

Discussion in 'XML' started by Ivan Shmakov, Feb 6, 2013.

  1. Ivan Shmakov

    Ivan Shmakov Guest

    Given that the "class" attribute is "a value that is a set of
    space-separated tokens" [1], is there an easy way to match
    elements of a particular class in XPath? Unfortunately,
    //node ()[@class = "foo"] doesn't seem to fit.

    TIA.

    [1] http://www.w3.org/TR/html5/dom.html#classes

    PS. I'm using libxml2 for XPath support.
     
    Ivan Shmakov, Feb 6, 2013
    #1

  2. * Ivan Shmakov wrote in comp.text.xml:
    That works with XPath 2.0 if the attribute is known to be an NMTOKENS
    attribute through schema information, but with XPath 1.0 you have to
    use something like

    contains(concat(' ', normalize-space(@class), ' '), ' foo ')

    To account for the various possible cases, with some caveats like the
    definition of white space being different between HTML and XPath.
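Editorial sketch: the token test described above pads both the normalized attribute value and the class name with spaces, then looks for the padded name inside the padded value. The Python below only emulates that logic so the edge cases can be checked without an XPath engine; the helper names `normalize_space` and `has_class` are hypothetical, and the white space set is the XPath 1.0 / XML 1.0 one, not the HTML5 one.

```python
import re

# XPath 1.0 / XML 1.0 white space: #x20, #x9, #xD, #xA (no form feed).
_XML_WS_RUN = re.compile(r"[ \t\r\n]+")


def normalize_space(value: str) -> str:
    """Emulate XPath 1.0 normalize-space(): collapse runs, trim both ends."""
    return _XML_WS_RUN.sub(" ", value).strip(" ")


def has_class(class_attr: str, name: str) -> bool:
    """Emulate contains(concat(' ', normalize-space(@class), ' '), ' name ')."""
    padded = " " + normalize_space(class_attr) + " "
    return (" " + name + " ") in padded


if __name__ == "__main__":
    for value in ["foo", "bar\tfoo\nbaz", "foobar", "", "foo\x0cbar"]:
        print(repr(value), has_class(value, "foo"))
```

The last sample contains a form feed, which HTML5 treats as a separator but XPath 1.0 does not, so the emulated test does not see `foo` as a separate token there.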
     
    Bjoern Hoehrmann, Feb 6, 2013
    #2

  3. Ivan Shmakov

    Ivan Shmakov Guest

    ACK, thanks! I guess that with libxml2 I'm stuck with XPath 1.0.
    The white space definitions are "(#x20 | #x9 | #xD | #xA)" (as per XML 1.0),
    vs. "(#xC | #x20 | #x9 | #xD | #xA)" (as per HTML5).

    Isn't all that bad (especially given that the document I'm
    processing is an XHTML template, to be shipped with the
    application I'm working on); one of them could've been allowing
    the whole host of Unicode whitespace characters, too.

    Do I understand it correctly that explicitly translate'ing &#xC;
    to ' ' would be the proper solution? Or perhaps translate ()
    may render normalize-space () unnecessary /in this case/?
    Consider, e. g.:

    contains (concat (' ', translate (@class, '

    ', ' '), ' '),
    ' foo ')

    TIA.
     
    Ivan Shmakov, Feb 6, 2013
    #3
  4. * Ivan Shmakov wrote in comp.text.xml:
    You can use translate() in place of normalize-space() to normalize white
    space to simple spaces, but it might not be possible to handle the case
    of U+000C since that's not a valid character in XPath expressions and no
    escaping mechanism exists in XPath 1.0 (and neither is there a function
    to get the Unicode character numbers which would be a possible solution
    otherwise).
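Editorial sketch: one detail worth checking when swapping normalize-space() for translate() is that XPath 1.0 translate() deletes any character of its second argument that has no counterpart in the third, so the replacement string needs one space per white space character. The Python below emulates that rule; `xpath_translate` and `has_class_translate` are hypothetical helper names, and the code is an illustration of the semantics rather than libxml2's implementation.

```python
def xpath_translate(value: str, src: str, dst: str) -> str:
    """Emulate XPath 1.0 translate(value, src, dst)."""
    table = {}
    for i, ch in enumerate(src):
        if ord(ch) not in table:  # the first occurrence in src wins
            # Characters of src beyond the length of dst are removed.
            table[ord(ch)] = dst[i] if i < len(dst) else None
    return value.translate(table)


def has_class_translate(class_attr: str, name: str) -> bool:
    """Token test using translate() instead of normalize-space()."""
    # Three white space characters, three replacement spaces.
    spaced = xpath_translate(class_attr, "\t\r\n", "   ")
    return (" " + name + " ") in (" " + spaced + " ")


if __name__ == "__main__":
    # A single-space replacement silently deletes the newline.
    print(repr(xpath_translate("a\tb\nc", "\t\n", " ")))
    print(has_class_translate("bar\tfoo\nbaz", "foo"))
```

Runs of spaces are not collapsed here, which does not matter for the token test because each token is still surrounded by at least one space after padding.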
     
    Bjoern Hoehrmann, Feb 6, 2013
    #4
  5. Ivan Shmakov

    Ivan Shmakov Guest

    ACK, thanks.

    Though, as I've just found, the whole point is moot, for U+000C
    is not a valid character in XML 1.0, either [1], and thus no
    XHTML document may ever contain one, whether in @class, or any
    other place.

    [1] http://www.w3.org/TR/REC-xml/#charsets
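Editorial sketch: the observation above can be checked against the Char production of XML 1.0 and against a conforming parser. The Python below uses the standard library's expat-based `xml.etree.ElementTree` rather than libxml2, and the helper names `is_xml10_char` and `is_well_formed` are hypothetical.

```python
import xml.etree.ElementTree as ET


def is_xml10_char(cp: int) -> bool:
    """True if the code point matches the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def is_well_formed(document: str) -> bool:
    """True if the parser accepts the document, False on a parse error."""
    try:
        ET.fromstring(document)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    print(is_xml10_char(0x0C))                     # form feed
    print(is_well_formed('<p class="a b"/>'))
    print(is_well_formed('<p class="a\x0cb"/>'))   # literal form feed
    print(is_well_formed('<p class="a&#xC;b"/>'))  # character reference
```

Both the literal form feed and the character reference are expected to be rejected, since a character reference must also refer to a character matching Char.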
     
    Ivan Shmakov, Feb 6, 2013
    #5
  6. Peter Flynn

    Peter Flynn Guest

    Does it also work if the attribute has been declared as IDREFS?

    ///Peter
     
    Peter Flynn, Feb 9, 2013
    #6
  7. * Peter Flynn wrote in comp.text.xml:
    Good question. I gave up researching this when I found that there is no
    constructor function xs:IDREFS defined in the 2.0 specifications, but in
    the 3.0 proposals there is one. So for 3.0 I suspect "yes", but I don't
    know about 2.0.
     
    Bjoern Hoehrmann, Feb 9, 2013
    #7
  8. Peter Flynn

    Peter Flynn Guest

    Excellent, thanks. Some of us still have clients using ID/IDREF :)

    ///Peter
     
    Peter Flynn, Feb 17, 2013
    #8
