how to lex javascript for an assert_js system?

Discussion in 'Ruby' started by Phlip, Dec 30, 2006.

  1. Phlip

    Phlip Guest

    Ruboids:

    Someone recently posted this:

    > o There's a difference between syntax checking and verification of
    > functional correctness


    That is indeed why a test case that spot-checks your syntax is less useful
    than a test case that understands your needs. All unit testing
    starts at the former and aims at the latter. Here's an example of
    testing Javascript's syntax:

    ondblclick = div.attributes['ondblclick']
    assert_match /^new Ajax.Updater\("hammy_id"/, ondblclick

    It trivially asserts that a DIV (somewhere) contains an ondblclick
    handler, and that this has a Script.aculo.us Ajax.Updater in it. The
    assertion naturally cannot test that the Updater will indeed update a DIV.

    To get closer to the problem, we might decide to get closer to the
    Javascript. We may need a mock-Javascript system, to evaluate that string.
    It could return the list of nuances commonly called a "Log String Test".
    Here's an example, using a slightly more verbose language:

    public void testPaintGraphicsintint() {
        Mock mockGraphics = new Mock(Graphics.class);
        mockGraphics.expects(once()).method("setColor").with(eq(Color.decode("0x6491EE")));
        mockGraphics.expects(once()).method("setColor").with(same(Color.black));
        mockGraphics.expects(once()).method("drawPolygon");
        mockGraphics.expects(once()).method("drawPolygon");
        hex.paint((Graphics) mockGraphics.proxy());
        mockGraphics.verify();
    }

    From the top, that mocks your graphics display driver, and retains its
    non-retained graphics commands. Then the mockGraphics object
    verifies a certain series of calls, with such-and-so parameters.

    (That is a Log String Test because it's the equivalent of writing commands
    like "setColor" and "drawPolygon" into a log file, and then reading this
    to assert things.)
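For comparison, here is a minimal Ruby sketch of the same Log String Test idea. The names CallLog and paint are hypothetical stand-ins (they are not from jMock or from the project); the mock simply records every call it receives, and the test reads that record back.

```ruby
# A recording mock: every message it receives is appended to a log,
# which the test then inspects. CallLog is a hypothetical name.
class CallLog
  attr_reader :log

  def initialize
    @log = []
  end

  # Record any message sent to the mock, together with its arguments.
  def method_missing(name, *args)
    @log << [name, args]
    nil
  end

  def respond_to_missing?(_name, _include_private = false)
    true
  end

  # Names of the recorded calls, in the order they arrived.
  def calls
    @log.map(&:first)
  end
end

# Code under test: a stand-in for hex.paint in the Java example.
def paint(graphics)
  triangle = [[0, 0], [1, 0], [1, 1]]
  graphics.set_color(0x6491EE)
  graphics.draw_polygon(triangle)
  graphics.set_color(:black)
  graphics.draw_polygon(triangle)
end

graphics = CallLog.new
paint(graphics)
```

A test can then compare graphics.calls against the expected sequence, or filter the log first so that extraneous calls are ignored.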

    That test case indeed fits the ideal of moving away from testing raw
    syntax, and closer to testing semantics. Such a test, for example, could
    more easily ignore extraneous calls, and then check that two dynamic
    polygons did not overlap.

    Now suppose I envision this testage:

    def ondblclick
      %(new Ajax.Updater("node",
          "/ctrl/act",
          { asynchronous:true,
            evalScripts:true,
            method:"get",
            parameters:"i_b_a=Parameter" })
      ).gsub("\n", '').squeeze(' ')
    end

    def test_some_js
      js = ondblclick()
      parse = assert_js(js)
      statement = parse.first
      assert_equal 'new Ajax.Updater', statement.get_method
      assert_equal '"node"', statement.get_param(0)
      assert_equal '"/ctrl/act"', statement.get_param(1)
      json = statement.get_param(2)
      assert_equal true, json['evalScripts']
    end

    The goal is the target JS can flex easily - can reorder its Json, or
    change fuzzy details, or add new features - without breaking the tests.
    Ideally, only changes that break project requirements will break tests.

    Now suppose I want to write that assert_js() using less than seven billion
    lines of code.

    The first shortcut is to only parse code we expect. I'm aware that's
    generally against the general philosophy of parsing, but I'm trying to
    sell an application, not a JS parser. That's a private detail. I can
    accept, for example, only parsing the JS emitted by Rails's standard
    gizmos.

    So before getting down to some actual questions, here's the code my
    exquisite parsing skills have thrashed out so far:

    def test_assert_js
      source = 'new Ajax.Updater('+
                 '"node", '+
                 '"/controller/action", '+
                 '{ asynchronous:true, '+
                   'evalScripts:true, '+
                   'method:"get", '+
                   'parameters:"i_b_a=Parameter" })'

      js = assert_js(source)
      assert_equal 'new Ajax.Updater', js.keys.first
      parameters = js.values.first['()']
      assert_equal '"node"', parameters[0]
      assert_equal '"/controller/action"', parameters[1]
      json = parameters[2]['{}']
      assert_equal 'true', json['evalScripts']
      assert_equal '"get"', json['method']
      assert_equal '"i_b_a=Parameter"', json['parameters']
    end

    Now that's good enough for government work, and I could probably upgrade
    the interface to look more like my idealized example...

    ...but the implementation is a mish-mash of redundant
    Regexps and run-on methods:

    Qstr = /^(["](?:(?:\\["])|(?:[^\\"]+))*?["]),?\s*/

    def assert_json(source)
      js = {}
      identifier = /([[:alnum:]_]+):/

      while m = source.match(identifier)
        source = m.post_match
        n = source.match(/^([[:alnum:]_]+),?\s*/)
        n = source.match(Qstr) unless n
        break unless n
        js[m.captures[0]] = n.captures[0]
        source = n.post_match
      end

      return { '{}' => js }
    end

    def assert_js(source)
      js = {}
      qstr = /^(["](?:(?:\\["])|(?:[^\\"]+))*?["]),?\s*/
      json = /^(\{.*\}),?\s*/

      if source =~ /^([^\("]+)(.*)$/
        js[$1] = assert_js($2)
      elsif source =~ /^\((.*)\)$/
        js['()'] = assert_js($1)
      else
        index = 0

        while (m = source.match(qstr)) or
              (m = source.match(json))
          break if m.size < 1

          if source =~ /^\{/
            js[index] = assert_json(m.captures[0])
          else
            js[index] = m.captures[0]
          end

          source = m.post_match
          index += 1
        end
      end

      return js
    end
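One way to get from that nested Hash to the idealized get_method / get_param interface is a thin wrapper. This is only a sketch: JsStatement is a hypothetical class name, and the literal Hash below is a hand-written copy of the shape that assert_js returns for the Ajax.Updater example.

```ruby
# Hypothetical wrapper; JsStatement is not part of the code above.
class JsStatement
  # parsed is the nested Hash returned by assert_js, for example
  # { 'new Ajax.Updater' => { '()' => { 0 => '"node"', ... } } }
  def initialize(parsed)
    @method, body = parsed.first
    @params = body['()']
  end

  def get_method
    @method
  end

  # Returns the raw parameter text, or the inner Hash for a {...} literal.
  def get_param(index)
    param = @params[index]
    param.is_a?(Hash) ? param['{}'] : param
  end
end

# Hand-written copy of the structure assert_js produces.
parsed = {
  'new Ajax.Updater' => {
    '()' => {
      0 => '"node"',
      1 => '"/controller/action"',
      2 => { '{}' => { 'evalScripts' => 'true', 'method' => '"get"' } }
    }
  }
}
statement = JsStatement.new(parsed)
```

Note that values stay as source text ('true', not true); converting them to Ruby values would be a further step.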

    Now the questions. Is there some...

    ...way to severely beautify that implementation?
    ...lexing library I could _easily_ throw in?
    ...robust JS Lexer library already out there?
    ...assert_js already out there?
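On the "lexing library I could easily throw in" question, one candidate is StringScanner from Ruby's standard strscan library. The sketch below is an assumption about how it might be applied here, not a tested JavaScript lexer: TOKEN_PATTERNS and lex_js are hypothetical names, and the patterns cover only the subset of JavaScript that appears in the Ajax.Updater example.

```ruby
require 'strscan'

# Token patterns, tried in order. They cover only the constructs seen
# in the Ajax.Updater example, not the whole language.
TOKEN_PATTERNS = [
  [:string,     /"(?:\\.|[^"\\])*"/],
  [:number,     /\d+(?:\.\d+)?/],
  [:identifier, /[A-Za-z_$][\w$]*/],
  [:punct,      /[(){}\[\],:;.=]/]
].freeze

# Split source into [type, text] pairs, skipping whitespace.
def lex_js(source)
  scanner = StringScanner.new(source)
  tokens = []
  until scanner.eos?
    next if scanner.skip(/\s+/)
    type, _pattern = TOKEN_PATTERNS.find { |_, pattern| scanner.scan(pattern) }
    raise ArgumentError, "unexpected input at offset #{scanner.pos}" unless type
    tokens << [type, scanner.matched]
  end
  tokens
end

tokens = lex_js('new Ajax.Updater("node", { evalScripts:true })')
```

A token stream like this still needs a small parser on top to recover the call and its parameters, but it removes the duplicated quoted-string Regexp from the parsing code.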

    --
    Phlip
    http://c2.com/cgi/wiki?ZeekLand <-- NOT a blog!!
    Phlip, Dec 30, 2006
    #1

  2. Phlip

    spooq Guest

    On 12/30/06, Phlip wrote:
    > Now the questions. Is there some...
    >
    > ...way to severely beautify that implementation?
    > ...lexing library I could _easily_ throw in?
    > ...robust JS Lexer library already out there?
    > ...assert_js already out there?


    Parsing with regexps makes baby Jesus cry. Javascript itself can be
    quite flexible, so it may be possible to do enough with a standard
    interpreter's run-time. Alternatively, have a look at the Mozilla
    projects repository for a real interpreter you could hack on.
    spooq, Jan 2, 2007
    #2

  3. Phlip

    Phlip Guest

    spooq wrote:

    > Parsing with regexps makes baby Jesus cry. Javascript itself can be
    > quite flexible, so it may be possible to do enough with a standard
    > interpreter's run-time. Alternatively, have a look at the Mozilla
    > projects repository for a real interpreter you could hack on.


    This question is for an academic paper, so it has even more ridiculous
    constraints on the amount of fun I can have. (And why hasn't the
    industry invented a Lex in a Bottle, using Regexp-like strings and a
    BNF notation?)

    Can I add a parser to the Syntax library? It only does Ruby, XML, and
    YAML so far...

    And, yes, JavaScript was designed to be parsed, unlike some other
    languages...

    --
    Phlip
    Phlip, Jan 2, 2007
    #3
  4. Phlip

    spooq Guest

    On 1/2/07, Phlip wrote:
    > spooq wrote:
    >
    > > Parsing with regexps makes baby Jesus cry. Javascript itself can be
    > > quite flexible, so it may be possible to do enough with a standard
    > > interpreter's run-time. Alternatively, have a look at the Mozilla
    > > projects repository for a real interpreter you could hack on.

    >
    > This question is for an academic paper, so it has even more ridiculous
    > constraints on the amount of fun I can have. (And why hasn't the
    > industry invented a Lex in a Bottle, using Regexp-like strings and a
    > BNF notation?)


    Not sure exactly how you want to improve on lex?

    > Can I add a parser to the Syntax library? It only does Ruby, XML, and
    > YAML so far...


    I don't see how that's better than grabbing the Javascript grammar off
    the web in BNF :
    http://www.mozilla.org/js/language/es4/formal/lexer-grammar.html
    http://www.antlr.org/grammar/1153976512034/ecmascriptA3.g
    etc.

    > And, yes, JavaScript was designed to be parsed, unlike some other
    > languages...


    No names need be mentioned... ;)

    http://corion.net/perl-dev/Javascript-PurePerl.html does javascript to
    xml, which seems the best/quickest solution that I've seen in my 2
    minutes of googling. Just write enough perl to put that into a file
    somewhere, and get into a nicer language ASAP ;)
    spooq, Jan 2, 2007
    #4
  5. Phlip

    spooq Guest

    spooq, Jan 2, 2007
    #5
  6. Phlip

    Phlip Guest

    spooq wrote:

    I might go with Javascript-Pure-Perl - see below. The following is just
    wrap-ups.

    > http://lxr.mozilla.org/mozilla/source/js/src/js.c


    > Have a look around line 2315.


    Interesting, but I don't get It. The IT object ... exists at debug
    time, and traces all its calls?

    > Not sure exactly how you want to improve on lex?


    Regexp is itself also a "little language". However, to get to the
    language, we don't need to write a .regex file, compile it with special
    compilers, produce a .c file, compile this, link into it, bind to it,
    yack yack yack, and so on just to use it.

    So, I envision Ruby lines like Lex.new.e(' LetterE -> E | e'). Instead
    of externally compiling the little language, we just host it.
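To make the idea concrete, here is a toy sketch of what such a hosted little language could look like. The Lex class below is entirely hypothetical (no such library is being quoted), and it only handles rules whose alternatives are literal strings.

```ruby
# Hypothetical class sketched to match the imagined
# Lex.new.e('LetterE -> E | e') call; it is not an existing library.
class Lex
  def initialize
    @rules = {}
  end

  # Add one rule of the form "Name -> alt1 | alt2 | ...".
  # Alternatives are literal strings. Returns self so calls can chain.
  def e(rule)
    name, body = rule.split('->', 2).map(&:strip)
    raise ArgumentError, "bad rule: #{rule.inspect}" if body.nil? || name.empty?
    alternatives = body.split('|').map(&:strip)
    @rules[name] = Regexp.union(alternatives)
    self
  end

  # Name of the first rule that matches the whole input, or nil.
  def match(input)
    found = @rules.find { |_, pattern| /\A(?:#{pattern})\z/.match?(input) }
    found && found.first
  end
end

lex = Lex.new.e(' LetterE -> E | e').e('Digit -> 0 | 1')
```

The rules are compiled to ordinary Regexp objects when they are declared, so nothing has to be generated or linked outside the Ruby process.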

    That is not important for the current project...

    > > Can I add a parser to the Syntax library? It only does Ruby, XML, and
    > > YAML so far...

    >
    > I don't see how that's better than grabbing the Javascript grammar off
    > the web in BNF :


    In theory, I only need a dirt-simple way to spot-check the source; I'm
    not writing a JavaScript interpreter. But it could be better if it
    doesn't force me to externally compile the lexer.

    > http://corion.net/perl-dev/Javascript-PurePerl.html does javascript to
    > xml


    Righteous! The project already uses XML (and unit tests with
    assert_xpath), so that will fit right in!

    Thanks! I honestly would never have thought to try Perl...

    --
    Phlip
    Phlip, Jan 2, 2007
    #6
  7. Phlip

    spooq Guest

    On 1/2/07, Phlip wrote:
    > spooq wrote:
    >
    > I might go with Javascript-Pure-Perl - see below. The following is just
    > wrap-ups.
    >
    > > http://lxr.mozilla.org/mozilla/source/js/src/js.c

    >
    > > Have a look around line 2315.

    >
    > Interesting, but I don't get It. The IT object ... exists at debug
    > time, and traces all its calls?


    It just makes a pre-defined object that you can poke at when you run
    scripts in that interpreter. The implication was that you could
    recreate the Ajax.* methods and use them to log.

    > > Not sure exactly how you want to improve on lex?

    >
    > Regexp is itself also a "little language". However, to get to the
    > language, we don't need to write a .regex file, compile it with special
    > compilers, produce a .c file, compile this, link into it, bind to it,
    > yack yack yack, and so on just to use it.


    Lex generates C and lives by the rules of that coding universe. Doing
    stuff at run-time can be difficult and weird there. Much easier to
    transform to a familiar language and compile and link with exactly the
    same tools you use for the rest of your project. Yack (yacc) is an
    entirely different project ;)

    > So, I envision Ruby lines like Lex.new.e(' LetterE -> E | e'). Instead
    > of externally compiling the little language, we just host it.


    Which would be living by the rules and expectations of the Ruby
    universe. Not that there's anything wrong with that; I happen to quite
    like living there myself. It's just useful to remember there's more
    than one way of doing things.

    > That is not important for the current project...


    Agreed.

    > > > Can I add a parser to the Syntax library? It only does Ruby, XML, and
    > > > YAML so far...

    > >
    > > I don't see how that's better than grabbing the Javascript grammar off
    > > the web in BNF :

    >
    > In theory, I only need a dirt-simple way to spot-check the source; I'm
    > not writing a JavaScript interpreter. But it could be better if it
    > doesn't force me to externally compile the lexer.
    >
    > > http://corion.net/perl-dev/Javascript-PurePerl.html does javascript to
    > > xml

    >
    > Righteous! The project already uses XML (and unit tests with
    > assert_xpath), so that will fit right in!


    Not sure what assert_xpath is; I guess it's a function from some kind of
    test harness.

    > Thanks! I honestly would never have thought to try Perl...


    It's not exactly my first choice either, but any port will do in a
    storm. At least you found one acceptable suggestion in my ramblings :)
    spooq, Jan 2, 2007
    #7
  8. Giles Bowkett, Jan 2, 2007
    #8
  9. Phlip

    Phlip Guest

    To spooq:

    Consider this snip of C++, via Boost/Spirit:

    rule<> LetterE = chr_p('e') | chr_p('E');

    The bad news, of course, is all the excessive chr_p stuff. The good
    news is that's raw C++, not even a string, and it all compiles at
    compile time.

    Giles Bowkett wrote:

    > Sorry, why do you want to do this in the first place? The original
    > post mentioned unit testing, if you want to unit test JavaScript,
    > there are much easier ways.


    It's a secret. You'll see!...

    --
    Phlip
    Phlip, Jan 2, 2007
    #9
  10. Phlip

    spooq Guest

    On 1/2/07, Giles Bowkett wrote:
    > Sorry, why do you want to do this in the first place? The original
    > post mentioned unit testing, if you want to unit test JavaScript,
    > there are much easier ways.


    No doubt, I said as much in my first reply, but I elaborated on the
    parsing because that interests me.
    spooq, Jan 3, 2007
    #10
  11. Phlip

    spooq Guest

    On 1/2/07, Phlip wrote:
    > To spooq:
    >
    > Consider this snip of C++, via Boost/Spirit:
    >
    > rule<> LetterE = chr_p('e') | chr_p('E');
    >
    > The bad news, of course, is all the excessive chr_p stuff. The good
    > news is that's raw C++, not even a string, and it all compiles at
    > compile time.


    Boost is indeed cool, they push the boundaries of C++ further than
    anyone. I really like their XML parser, much better than that horrid
    Xerces port. C++ != C though, especially when lex was written. :)

    Could you let me know how the perl script is going, either here or off-list?
    spooq, Jan 3, 2007
    #11
  12. Phlip

    Phlip Guest

    spooq wrote:

    > Could you let me know how the perl script is going, either here or off-list?


    Awesome - it works perfectly as a lexer, and it only produces two tiny
    bugs (so far). One is 'new Ajax.Updater' doesn't fly, and you need 'ajax =
    new Ajax.Updater'. The other is you gotta have a ; on the ends of lines.

    So here's a sample test case. I am attempting to test-FIRST Javascript
    (thru Rails). That's much harder than just acceptance-testing it thru the
    existing test rigs, but the rewards will be substantial.

    assert_xpath '/form/textarea' do |textarea|
      assert_js textarea.attributes['onkeydown'] do
        assert_xpath 'Statement[1]' do
          assert_xpath '//Identifier[ @name = "Callee" and . = "editor_keydown" ]'
        end
      end
    end

    assert_xpath asserts that the hidden @xdoc variable can call XPath.first()
    on the given string without returning nil. So you can pack lots of goodies
    into your XPath strings, including queries and string comparisons.

    The first assert_xpath is called after something has generated XHTML and
    loaded it into @xdoc. So we can assert this XHTML contains a FORM
    containing a TEXTAREA.
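The description above can be sketched with REXML. This is an assumption about the shape of such a helper, not the project's actual assert_xpath: the module name XPathAssertions and the load_xdoc method are hypothetical, and narrowing @xdoc inside a block is a guess at how the nested calls work.

```ruby
require 'rexml/document'

# Hypothetical helper module; the real assert_xpath is not shown here.
module XPathAssertions
  # Load markup so that later assert_xpath calls query it.
  def load_xdoc(markup)
    @xdoc = REXML::Document.new(markup)
  end

  # Fail unless the XPath matches a node under the current @xdoc.
  # With a block, @xdoc is narrowed to the matched node while the
  # block runs, so nested assertions can use relative paths.
  def assert_xpath(path)
    node = REXML::XPath.first(@xdoc, path)
    raise "no node matches #{path.inspect}" if node.nil?
    return node unless block_given?

    outer = @xdoc
    begin
      @xdoc = node
      yield node
    ensure
      @xdoc = outer
    end
    node
  end
end

checker = Object.new.extend(XPathAssertions)
checker.load_xdoc('<form><textarea onkeydown="editor_keydown(event);"/></form>')
```

In a real test case the failure would be reported through the test framework's assertion mechanism rather than a bare raise.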

    assert_js just copies the given Javascript into a temporary file, calls
    jsToXml.pl on it, and loads this into @xdoc.
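A rough sketch of that pipeline follows. The way jsToXml.pl is invoked (perl jsToXml.pl FILE, XML on standard output) is an assumption, and js_to_xdoc and PERL_CONVERTER are hypothetical names; the converter is passed in as a callable so the sketch can be exercised without Perl installed.

```ruby
require 'open3'
require 'rexml/document'
require 'tempfile'

# Default converter: runs the Perl script on a file and returns its output.
# The command line is an assumption; adjust it to the real script.
PERL_CONVERTER = lambda do |path|
  xml, status = Open3.capture2('perl', 'jsToXml.pl', path)
  raise "jsToXml.pl failed on #{path}" unless status.success?
  xml
end

# Write the JavaScript to a temporary file, convert it to XML, and
# return the parsed document (which the test would store in @xdoc).
def js_to_xdoc(javascript, converter: PERL_CONVERTER)
  Tempfile.create(['assert_js', '.js']) do |file|
    file.write(javascript)
    file.flush
    REXML::Document.new(converter.call(file.path))
  end
end
```

Given the two quirks noted above, the caller would need to pass a complete statement such as 'ajax = new Ajax.Updater("node");', with the assignment and the trailing semicolon.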

    The above test case trivially tests that the page contains <TEXTAREA
    onkeydown='editor_keydown(event);' ...>. That looks too trivial to
    test-first, but it's the little things that add up into big bugs if you
    don't apply a little rigor to your development process!

    --
    Phlip
    Phlip, Jan 4, 2007
    #12
  13. Phlip

    Ryan Platte Guest

    Phlip wrote:
    > The goal is the target JS can flex easily - can reorder its Json, or
    > change fuzzy details, or add new features - without breaking the tests.
    > Ideally, only changes that break project requirements will break tests.
    >
    > Now suppose I want to write that assert_js() using less than seven billion
    > lines of code.
    >
    > The first shortcut is to only parse code we expect. I'm aware that's
    > generally against the general philosophy of parsing, but I'm trying to
    > sell an application, not a JS parser. That's a private detail. I can
    > accept, for example, only parsing the JS emitted by Rails's standard
    > gizmos.


    ....

    > Now the questions. Is there some...
    >
    > ...way to severely beautify that implementation?


    For Rails apps, by way of not testing the library('s JavaScript
    output): stub out RJS's JavaScriptGenerator ('page' object)? That was
    the first thing I thought when I heard about RJS. Surfing the code, it
    looks quite possible using Mocha if it was desirable. I haven't thought
    through all the ramifications. The OP includes testing the source page
    -- maybe also some work on the APIs to link to and submit forms to Ajax
    actions would permit those to be stubbed out as well.

    Has someone already created such a beast? (Besides Google's GWT?)

    --
    Ryan Platte
    Obtiva Training and Consulting
    Agile, Ruby, Rails, Java Eclipse RCP
    http://obtiva.com/
    Ryan Platte, Jan 4, 2007
    #13
  14. Giles Bowkett

    On 1/3/07, spooq wrote:
    > On 1/2/07, Giles Bowkett wrote:
    > > Sorry, why do you want to do this in the first place? The original
    > > post mentioned unit testing, if you want to unit test JavaScript,
    > > there are much easier ways.

    >
    > No doubt, I said as much in my first reply, but I elaborated on the
    > parsing because that interests me.


    No harm there, just trying to figure out if it's an exercise for the
    challenge itself or to address some obscure flaw in the existing
    techniques.

    --
    Giles Bowkett
    http://www.gilesgoatboy.org
    http://gilesbowkett.blogspot.com
    http://gilesgoatboy.blogspot.com
    Giles Bowkett, Jan 4, 2007
    #14
  15. Phlip

    Phlip Guest

    Giles Bowkett wrote:

    > ... just trying to figure out if it's an exercise for the
    > challenge itself or to address some obscure flaw in the existing
    > techniques.


    Yes.

    --
    Phlip
    Phlip, Jan 4, 2007
    #15
  16. Phlip

    Phlip Guest

    Ryan Platte wrote:

    > For Rails apps, by way of not testing the library('s JavaScript
    > output): stub out RJS's JavaScriptGenerator ('page' object)?


    Ultimately we are up against a "log string test". That's where you call an
    emulator or a real deal of some type, it emits a log of its behavior, and
    you parse into this log.

    You could, for example, take some server program, turn its log level up,
    call a high level function, read the log file as a string, and perform
    Regular Expressions on it to pull out target data. (/Error (.*)/ is a good
    start!;)

    Next, you might write a mock that records the calls sent to it as a sequence
    of data items, like the MockGraphics example I started this thread with.

    Here's the diagnostic when an assert_rjs fails:

    Content did not include:
    $("moose_panel").width = "50%";.
    <"$(\"wiki_panel\").width = \"50%\";\n$(\"mouse_panel\").width =
    \"50%\";\nElement.update(\"mouse_panel\", \"<iframe height=\\\"100%\\\"
    src=\\\"/character/hammy_squirrel\\\" id=\\\"test_frame\\\"
    width=\\\"100%\\\"/>\");\nElement.show(\"mouse_panel\");"> expected to be =~
    </\$\("moose_panel"\)\.width\ =\ "50%";/>.

    Note that's not even perfectly robust for a log string test. I could write
    page['moose_panel'].width = '50%', then a few lines later write
    page['mouse_panel'].width = '50%', and this won't catch the bug: assert_rjs
    :page, 'moose_panel', :width=, '50%'. It would find the first line spelled
    right, not the later line spelled wrong.

    A better Log String Test would snarf each line as it tested.
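A minimal sketch of that idea, under the assumption that the generated RJS arrives as one statement per line. ConsumingLog is a hypothetical name and is not part of assert_rjs: each expectation removes the line it matched, so whatever is left over at the end is, by definition, something the test did not ask for.

```ruby
# Hypothetical helper: a log whose lines are consumed as they are asserted.
class ConsumingLog
  def initialize(text)
    @lines = text.lines.map(&:chomp)
  end

  # Remove and return the first remaining line that matches pattern.
  def expect(pattern)
    index = @lines.index { |line| line =~ pattern }
    raise "no remaining line matches #{pattern.inspect}" if index.nil?
    @lines.delete_at(index)
  end

  # Lines that no expectation has consumed yet.
  def leftover
    @lines.dup
  end
end

rjs = [
  '$("moose_panel").width = "50%";',
  '$("mouse_panel").width = "50%";'
].join("\n")

log = ConsumingLog.new(rjs)
log.expect(/\$\("moose_panel"\)\.width = "50%";/)
stray = log.leftover
```

Here stray still holds the misspelled mouse_panel line, so a final assertion that nothing is left over would catch the bug that the plain regular-expression match lets through.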

    Don't get me wrong - assert_rjs is an excellent place to start; it's a
    lowest common denominator that at least matches Rails' incredible talent for
    lean and expressive statements. I would use it first before seeking a way to
    test semantics.

    But that's what a mock RJS jigger would do - test that we called our RJS
    object in such-and-so ways.

    > That was
    > the first thing I thought when I heard about RJS. Surfing the code, it
    > looks quite possible using Mocha if it was desirable. I haven't thought
    > through all the ramifications. The OP includes testing the source page
    > -- maybe also some work on the APIs to link to and submit forms to Ajax
    > actions would permit those to be stubbed out as well.


    I want a test case that fails if two Ajax commands overlap each other and
    blot each other out. That requires emulating DOM, and I really think someone
    smarter than I could do the equivalent in 2% of the lines of code I
    envision.

    > Has someone already created such a beast? (Besides Google's GWT?)


    http://www.google.com/search?domains=code.google.com&sitesearch=code.google.com&q=test

    They seem aware of JUnit. ;-)

    --
    Phlip
    http://www.greencheese.us/ZeekLand <-- NOT a blog!!!
    Phlip, Jan 5, 2007
    #16