Help deciphering use of square brackets within translate function

Discussion in 'XML' started by johkar, Jun 18, 2009.

  1. johkar

    johkar Guest

    To me, the inner translate function below should find any instance of
    square brackets with a single space between them and replace it
    with a single underscore, but that doesn't seem to be what is
    happening. Do the square brackets have some sort of
    significance, like in a regular expression? I didn't think so since
    they were quoted. I think the original developer's intent was to
    replace spaces with underscores so that the element name would
    be valid XML.

    However if the XML contains:

    <rate>FCTR[   ]2</rate> it transforms into <FCTR2>0.04200</FCTR2>
    without any underscore at all.

    <xsl:variable name="tag"
      select="translate(translate(rateCode/text(),'[ ]','_'),'.','__')"/>
    <xsl:element name="{$tag}">
      <xsl:value-of select="rateTag/text()"/>
    </xsl:element>

    Any insight into this would be appreciated.
     
    johkar, Jun 18, 2009
    #1

  2. johkar

    johkar Guest

    On Jul 3, 11:27 am, (C. M. Sperberg-McQueen) wrote:
    > johkar<> writes:
    > > To me, the inner translate function below would find any instance of
    > > square brackets with a single space separating them and replace it
    > > with a single underscore but that doesn't seem to be what is
    > > happening.  Do the square brackets have some sort of
    > > significance...like a regular expression?  I didn't think so since
    > > they were quoted.  I think the original developer's intent was to
    > > replace spaces with underscores so that the element name syntax would
    > > be valid XML.
    >
    > The second argument to translate() is interpreted as a set of
    > characters, not as an expression.  The first argument is scanned
    > character by character, and each character is tested to see if it
    > appears anywhere in the second argument.  If it does not appear in the
    > second argument, it's copied to the output string without change.  If
    > it does appear in the second argument at position N, then it's
    > replaced in the output by the character at position N of the third
    > argument.  If the third argument is less than N characters long,
    > the character is omitted from the output.
    >
    > This behavior will be familiar to some people from analogous
    > functions and operations in Snobol, Spitbol, Rexx, and
    > IBM 360 assembler, as well as (I think) some other languages.
    >
    > > However if the XML contains:
    > >
    > > <rate>FCTR[   ]2</rate> it transforms into <FCTR2>0.04200</FCTR2>
    > > without any underscore at all.
    > >
    > > <xsl:variable name="tag"
    > >    select="translate(translate(rateCode/text(),'[ ]','_'),'.','__')"/>
    > > <xsl:element name="{$tag}">
    > >    <xsl:value-of select="rateTag/text()"/>
    > > </xsl:element>
    > >
    > > Any insight into this would be appreciated.
    >
    > The inner call, to
    >
    >     translate(rateCode/text(),'[ ]','_')
    >
    > should translate left square bracket to underscore and omit any
    > blanks or right square brackets.  In your example, the element is
    > named rate, not rateCode, which means we cannot tell what the
    > first argument of the inner call to translate() is.  If the
    > XML actually contains <rateCode>FCTR[   ]2</rateCode>, then I
    > would expect the call to produce "FCTR_2".  The outer call is then
    >
    >     translate("FCTR_2",'.','__')
    >
    > and since no full stops appear in the first argument, the result
    > would be FCTR_2.  This is what you say you expected, but the
    > reasoning is rather different.
    >
    > If there is a rateCode element in the vicinity, then the result
    > will depend on its content.
    >
    > Are you sure you transcribed both the input and the XSLT
    > correctly here?
    >
    > hth
    >
    > --
    > ****************************************************************
    > * C. M. Sperberg-McQueen, Black Mesa Technologies LLC
    > * http://www.blackmesatech.com
    > * http://cmsmcq.com/mib
    > * http://balisage.net
    > ****************************************************************


    Thanks for the reply. I am still not quite getting why the rateCode is
    getting transformed into FCTR2 without the underscore. You stated
    that the inner call "should translate left square bracket to underscore
    and omit any blanks or right square brackets". Why would you expect
    that? Just trying to understand.

    <rateCode>FCTR[   ]2</rateCode>
    <rateTag>0.04200</rateTag>

    transforms into

    <FCTR2>0.04200</FCTR2> without any underscore at all.


    <xsl:variable name="tag"
      select="translate(translate(rateCode/text(),'[ ]','_'),'.','__')"/>
    <xsl:element name="{$tag}">
      <xsl:value-of select="rateTag/text()"/>
    </xsl:element>
     
    johkar, Jul 9, 2009
    #2
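
The rule described in the quoted reply (look each character of the first argument up in the second; replace it with the character at the same position in the third, or drop it if there is none) can be modelled in a few lines. This is only a Python sketch of the XPath 1.0 behaviour as described above, not code from any XSLT processor; the function name `xpath_translate` is made up for illustration.

```python
def xpath_translate(source: str, from_chars: str, to_chars: str) -> str:
    """Model of XPath 1.0 translate(): per-character mapping, not pattern matching."""
    out = []
    for ch in source:
        pos = from_chars.find(ch)      # first occurrence in argument 2 wins
        if pos == -1:
            out.append(ch)             # not listed: copied unchanged
        elif pos < len(to_chars):
            out.append(to_chars[pos])  # replaced by the character at the same position
        # otherwise: listed, but argument 3 is too short, so the character is dropped
    return "".join(out)


# The nested call from the stylesheet, applied to the rateCode value in the post.
inner = xpath_translate("FCTR[   ]2", "[ ]", "_")
outer = xpath_translate(inner, ".", "__")
print(inner, outer)
```

Running it prints `FCTR_2 FCTR_2`. Note also that `xpath_translate("A.B", ".", "__")` gives `A_B`: only the first character of the third argument is ever paired with the full stop, so the second underscore in `'__'` has no effect.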

  3. johkar

    johkar Guest


    Also, if rateCode has spaces in it, it is converted to:

    <FCTR2>0.04200</FCTR2>
     
    johkar, Jul 9, 2009
    #3
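
One way to see why a value containing spaces but no brackets loses its spaces entirely is to write out the per-character table that the arguments `'[ ]'` and `'_'` imply. The sketch below is a Python model, not XSLT; the helper names and the sample value `"FCTR   2"` are assumptions made for illustration, and the reordered argument `' []'` is only a guess at what the original developer may have intended.

```python
def translate_table(from_chars: str, to_chars: str) -> dict:
    """Build the character table implied by translate(x, from_chars, to_chars)."""
    table = {}
    for pos, ch in enumerate(from_chars):
        if ch in table:
            continue  # a repeated character keeps its first mapping
        # None marks "delete": argument 3 has no character at this position
        table[ch] = to_chars[pos] if pos < len(to_chars) else None
    return table


def apply_table(source: str, table: dict) -> str:
    """Apply the table; str.translate() deletes characters mapped to None."""
    return source.translate({ord(ch): target for ch, target in table.items()})


original = translate_table("[ ]", "_")    # {'[': '_', ' ': None, ']': None}
reordered = translate_table(" []", "_")   # {' ': '_', '[': None, ']': None}

print(apply_table("FCTR   2", original))     # spaces are deleted
print(apply_table("FCTR   2", reordered))    # each space becomes an underscore
print(apply_table("FCTR[   ]2", reordered))  # brackets deleted, spaces mapped
```

With the original argument order only `[` has a counterpart, so the first line printed is `FCTR2`. With the space listed first, both remaining lines print `FCTR___2`: every space maps to its own underscore, because translate() works one character at a time and never collapses a run into a single replacement.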
  4. johkar

    johkar Guest

    On Jul 9, 6:57 pm, (C. M. Sperberg-McQueen) wrote:
    > johkar<> writes:
    > > Thanks for the reply, I am still not quite getting why the rateCode is
    > > getting transformed into FCTR2 without the underscore.  You stated
    > > that rateCode "should translate left square bracket to underscore and
    > > omit any blanks or right square brackets"...why would you expect
    > > that...just trying to understand.
    >
    > As I wrote in my previous note,
    >
    >     The first argument is scanned character by character, and each
    >     character is tested to see if it appears anywhere in the second
    >     argument.  If it does not appear in the second argument, it's
    >     copied to the output string without change.  If it does appear in
    >     the second argument at position N, then it's replaced in the
    >     output by the character at position N of the third argument.  If
    >     the third argument is less than N characters long, the character
    >     is omitted from the output.
    >
    > That was my attempt to explain why I would expect that.  If it
    > didn't help, let me try again.
    >
    > The inner call to translate() has three arguments:
    >
    >   (1) the string "FCTR[   ]2"
    >   (2) the string "[ ]"
    >   (3) the string "_"
    >
    > The function constructs an output string by walking through the input
    > string character by character and possibly adding something to the
    > output string.
    >
    > Character 1:  "F".
    >
    >   Look for an "F" in argument 2.  Find none.
    >   Place "F" in the output string, which is now "F".
    >
    > Character 2:  "C".
    >
    >   Look for a "C" in argument 2.  Find none.
    >   Place "C" in the output string, which is now "FC".
    >
    > Character 3:  "T".
    >
    >   Look for a "T" in argument 2.  Find none.
    >   Place "T" in the output string, which is now "FCT".
    >
    > Character 4:  "R".
    >
    >   Look for an "R" in argument 2.  Find none.
    >   Place "R" in the output string, which is now "FCTR".
    >
    > Character 5:  "[".
    >
    >   Look for an occurrence of "[" in argument 2.  Find one
    >   at position 1.  Look up the character in position 1
    >   of argument 3; find "_".  Add that character ("_")
    >   to the output string, which is now "FCTR_".
    >
    > Character 6:  " ".
    >
    >   Look for an occurrence of " " in argument 2.  Find one
    >   at position 2.  Look up the character in position 2
    >   of argument 3; discover that there isn't one (argument
    >   3 is one character long).  So add nothing to the
    >   output string; it remains "FCTR_".
    >
    > Character 7:  " ".
    >
    >   Look for an occurrence of " " in argument 2.  Find one
    >   at position 2.  Look up the character in position 2
    >   of argument 3; discover that there isn't one (argument
    >   3 is one character long).  So add nothing to the
    >   output string; it remains "FCTR_".
    >
    > Character 8:  " ".
    >
    >   Look for an occurrence of " " in argument 2.  Find one
    >   at position 2.  Look up the character in position 2
    >   of argument 3; discover that there isn't one (argument
    >   3 is one character long).  So add nothing to the
    >   output string; it remains "FCTR_".
    >
    > Character 9:  "]".
    >
    >   Look for an occurrence of "]" in argument 2.  Find one
    >   at position 3.  Look up the character in position 3
    >   of argument 3; discover that there isn't one (argument
    >   3 is one character long).  So add nothing to the
    >   output string; it remains "FCTR_".
    >
    > Character 10:  "2".
    >
    >   Look for an occurrence of "2" in argument 2.  Find none.
    >   So add "2" to the output string, which is now "FCTR_2".
    >
    > End of argument 1: return the output string, now "FCTR_2".
    >
    > The details for characters 5 through 9 should make clear that what
    > translate() may be expected to do with second and third arguments of
    > "[ ]" and "_" is to replace "[" with "_" and delete " " and "]".
    >
    > > <rateCode>FCTR[   ]2</rateCode>
    > > <rateTag>0.04200</rateTag>
    > >
    > > transforms into
    > >
    > > <FCTR2>0.04200</FCTR2> without any underscore at all.
    > >
    > > <xsl:variable name="tag"
    > >    select="translate(translate(rateCode/text(),'[ ]','_'),'.','__')"/>
    > > <xsl:element name="{$tag}">
    > >    <xsl:value-of select="rateTag/text()"/>
    > > </xsl:element>
    >
    > Interesting.  When I cut and paste your code fragment into a
    > stylesheet and run it on your input, both xsltproc and Saxon give me
    >
    >  <FCTR_2>0.04200</FCTR_2>
    >
    > not
    >
    >  <FCTR2>0.04200</FCTR2>
    >
    > Similar tests show that the XSLT processors in Safari, Opera,
    > and Firefox all produce "FCTR_2" not "FCTR2" from the input you
    > describe.
    >
    > Two questions:  
    >
    > (1) Are you sure that the code you quote is actually the code that is
    > producing the output you quote?  Try replacing
    >
    >   <xsl:variable name="tag"
    >    select="translate(translate(rateCode/text(),'[ ]','_'),'.','__')"/>
    >
    > with
    >
    >   <xsl:variable name="tag" select="'hi_mom'"/>
    >
    > to see if your output is still <FCTR2>0.04200</FCTR2> or changes to
    > <hi_mom>0.04200</hi_mom>.
    >
    > (2) If your output does change, indicating that the code you
    > quote really is the code doing the work, then I become curious:
    > What XSLT processor are you using?
    >
    > HTH
    >
    > Michael Sperberg-McQueen
    >
    > --
    > ****************************************************************
    > * C. M. Sperberg-McQueen, Black Mesa Technologies LLC
    > * http://www.blackmesatech.com
    > * http://cmsmcq.com/mib
    > * http://balisage.net
    > ****************************************************************


    OK, a light bulb has finally gone on thanks to your detailed
    explanation. I appreciate the effort in your reply. I was a bit
    confused about which of my tests produced FCTR2 without the
    underscore. XMLSpy's default processor, Microsoft's MSXML, and Xalan
    all produce FCTR_2 using the example given. I was doing some testing
    with multiple spaces using the same argument 2, and it produced FCTR2.
    Sorry for the confusion, but I understand the how and why now. Thanks.
     
    johkar, Jul 10, 2009
    #4
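
The character-by-character walkthrough quoted in this post can be reproduced mechanically. The following is a small Python sketch (again a model of the described behaviour, not an XSLT processor; `trace_translate` is a hypothetical name) that reports what happens to each character and what the output string looks like after each step.

```python
def trace_translate(source, from_chars, to_chars):
    """Yield (character, action, output so far) for each step of a translate() call."""
    output = ""
    for ch in source:
        pos = from_chars.find(ch)
        if pos == -1:
            action = "copy"                            # not in argument 2
            output += ch
        elif pos < len(to_chars):
            action = f"replace with {to_chars[pos]!r}"  # same position in argument 3
            output += to_chars[pos]
        else:
            action = "delete"                          # argument 3 is too short
        yield ch, action, output


steps = list(trace_translate("FCTR[   ]2", "[ ]", "_"))
for number, (ch, action, output) in enumerate(steps, start=1):
    print(f"Character {number}: {ch!r:5} {action:18} output so far: {output!r}")
```

The ten steps match the walkthrough: characters 1 to 4 are copied, character 5 (`[`) is replaced with `_`, characters 6 to 9 (three spaces and `]`) are deleted, and character 10 is copied, leaving `FCTR_2`.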