Crawl nested data structure, apply code block to each

Discussion in 'Perl Misc' started by Randy Westlund, Apr 13, 2014.

  1. I have a simple problem, but am not sure how to solve it. I'm
    getting data from a MongoDB database with the MongoDB module. This
    returns a BSON (JSON-like) document as a nested data structure with
    arbitrary element types. I'm taking that and building a LaTeX
    document with the Template::Latex module. This is currently
    working, most of the time.

    My problem is that the strings I'm pulling from the database
    sometimes have an '&' in them, which screws up my tabularx sections
    in LaTeX. So I need some way to crawl this data structure and
    escape them. I want to do something like call map on all the scalar
    values found.

    I looked at Data::Nested, but didn't see anything useful for me. Is
    there a module that has a function like this, or a concise way to
    write this myself?
    Randy Westlund, Apr 13, 2014
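For readers wondering what "call map on all the scalar values" could look like, here is a minimal sketch using core Perl only. The name `walk_scalars` and the sample document are made up for illustration and are not from any module. It assumes the structure contains only hashes, arrays and plain scalars, with no circular references. Blessed objects, such as the ones the MongoDB driver may return for some types, are left untouched.

```perl
use strict;
use warnings;

# Apply $code to every defined, non-reference leaf value in a nested
# structure of hashes and arrays, modifying the structure in place.
sub walk_scalars {
    my ($code, $node) = @_;
    my $type = ref $node;
    if ($type eq 'HASH') {
        for my $key (keys %$node) {
            if (ref $node->{$key}) {
                walk_scalars($code, $node->{$key});
            }
            elsif (defined $node->{$key}) {
                $node->{$key} = $code->($node->{$key});
            }
        }
    }
    elsif ($type eq 'ARRAY') {
        for my $elem (@$node) {    # $elem aliases the array slot
            if (ref $elem) {
                walk_scalars($code, $elem);
            }
            elsif (defined $elem) {
                $elem = $code->($elem);
            }
        }
    }
    # Any other reference type (objects, code refs) is left alone.
    return $node;
}

# Hypothetical sample document.
my $doc = {
    customer => 'Smith & Sons',
    items    => [ { name => 'Nuts & bolts', qty => 3 }, 'R&D' ],
};

# Escape every ampersand for LaTeX.
walk_scalars(sub { my ($s) = @_; $s =~ s/&/\\&/g; $s }, $doc);

print $doc->{customer}, "\n";    # prints: Smith \& Sons
```

Note that this escapes every string in the structure, so it only helps if it runs before any formatting ampersands are added.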

  2. * Randy Westlund wrote in comp.lang.perl.misc:
    This sounds like the `&` needs to be escaped when the Template::Latex
    module combines your template code with the data from the structure.
    Ordinarily there should be ways to indicate how values need to be
    escaped (consider generating HTML documents from templates: sometimes
    values need to be escaped per HTML rules, sometimes JavaScript rules,
    maybe CSS rules, sometimes even a combination of the rules), and I'd
    suggest looking for that instead of transforming your data this way.
    Bjoern Hoehrmann, Apr 13, 2014

  3. Why don't you escape them on extraction?
    Rainer Weikusat, Apr 13, 2014
  4. Make your own TT filter?

    IIRC you can chain filters, so you can first run your data through
    your custom filter, then through the latex filter.
    John Bokma, Apr 13, 2014
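A custom filter along these lines might be registered through Template Toolkit's `FILTERS` configuration option. This is only a sketch: the filter name `escape_amp` and the sample data are invented here, and it uses plain `Template` rather than Template::Latex so that it runs without a LaTeX installation. The Template part is skipped if the module is not installed.

```perl
use strict;
use warnings;

# Hypothetical filter: escape every ampersand for LaTeX.
sub escape_amp {
    my ($text) = @_;
    $text =~ s/&/\\&/g;
    return $text;
}

# Only exercise the template if Template Toolkit is available.
if (eval { require Template; 1 }) {
    my $tt = Template->new({
        FILTERS => { escape_amp => \&escape_amp },    # static filter
    }) || die Template->error(), "\n";

    # The '&' typed in the template is a column separator and is left
    # alone; only the filtered values are escaped.
    my $template = '[% name | escape_amp %] & [% item | escape_amp %]';
    my $output   = '';
    $tt->process(\$template,
                 { name => 'Smith & Sons', item => 'R&D' },
                 \$output)
        or die $tt->error(), "\n";
    print "$output\n";
}
```

Because the filter is applied per value inside the template, ampersands that belong to the table layout never pass through it.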
  5. This looks promising. The remaining obstacle is that when I'm
    building the hash to feed Template::Latex, I'm intentionally
    inserting some ampersands for formatting. So I need to escape some
    of them, but not others. Perhaps for the ones I'm intentionally
    putting there, I'll write them as '&&' and have the filter transform
    it like this:
    '&' > '\&'
    '&&' > '&'

    Of course, then any user data containing '&&' will break it :/
    Randy Westlund, Apr 14, 2014
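The '&&' placeholder scheme proposed above could be done in a single substitution, sketched here with a made-up helper name `convert_amps`. The last call shows the weakness already noted: user data that happens to contain '&&' is turned into a column separator.

```perl
use strict;
use warnings;

# Hypothetical helper: '&&' becomes a bare '&' (column separator),
# a lone '&' becomes '\&' (literal ampersand).
sub convert_amps {
    my ($text) = @_;
    # The alternation tries '&&' first, so pairs are consumed before
    # single ampersands are considered.
    $text =~ s/(&&|&)/$1 eq '&&' ? '&' : '\\&'/ge;
    return $text;
}

print convert_amps('Nuts & bolts && 3'), "\n";    # prints: Nuts \& bolts & 3

# User-supplied text containing '&&' silently gains a column break.
print convert_amps('a && b'), "\n";               # prints: a & b
```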
  6. Then why on earth don't you escape ampersands in the input data before
    putting it in the hash and insert real 'table format &s' afterwards?
    Rainer Weikusat, Apr 14, 2014
  7. That's why I was trying to figure out how I could crawl the data
    structure, to do it before I inserted stuff. My code is laid out
    like this:

    - get complicated document from MongoDB
    - spend two pages of perl pulling things out of the nested data
      structure, transforming the complicated data structure into a
      complicated mess of LaTeX formatting mixed with variables in a
      hash
    - feed to template

    The whole thing generates something like an invoice, but with a lot
    of conditional formatting depending on what things are in the DB.
    Randy Westlund, Apr 15, 2014
  8. This may be part of the problem. I find that it is generally a good idea
    to delay output conversion (in this case applying LaTeX formatting, but
    the same applies for HTML or just character encoding) as long as
    possible, and ideally leave it to your templating engine, output filter,
    or whatever. Otherwise it is too easy to lose track of what still needs
    to be converted and what doesn't (leading to either double-converted
    strings or unconverted input in the output).

    Peter J. Holzer, Apr 15, 2014
  9. Did it already occur to you that this is already "code crawling the
    database", although specialized for your problem? All you need to add is
    an intermediate processing step between

    'get data out of the BSON document'

    and

    'transform data before putting it into the hash'

    How are you accessing the serialized data?
    Rainer Weikusat, Apr 15, 2014
  10. I'm using Data::Diver to pull fields out one at a time. I solved
    the problem by wrapping those calls with my own sub that does some
    simple substitution. It's the obvious solution, but it isn't very
    pretty. This being perl, I was hoping I could find some nice
    declarative way to do it, like how map works. I guess in this case
    there isn't one.
    Randy Westlund, Apr 16, 2014
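The wrapper described above might look roughly like the following. To keep the sketch free of non-core modules, `dive` is a simplified stand-in written for this example, not Data::Diver's actual `Dive`. The wrapper name `dive_escaped` and the sample record are also invented.

```perl
use strict;
use warnings;

# Simplified stand-in for a Dive-style lookup: follow a list of hash
# keys and array indices, returning undef if the path does not exist.
sub dive {
    my ($node, @path) = @_;
    for my $step (@path) {
        if    (ref $node eq 'HASH')  { $node = $node->{$step} }
        elsif (ref $node eq 'ARRAY') { $node = $node->[$step] }
        else                         { return undef }
    }
    return $node;
}

# Hypothetical wrapper: fetch a value and escape ampersands in it.
# References and undef are passed through unchanged.
sub dive_escaped {
    my $value = dive(@_);
    return $value if !defined $value || ref $value;
    $value =~ s/&/\\&/g;
    return $value;
}

# Hypothetical sample record.
my $record = {
    customer => { name => 'Smith & Sons' },
    lines    => [ { desc => 'R&D' } ],
};

# Values are escaped on extraction; the ' & ' separators are added
# afterwards, so they are never escaped.
my $row = join ' & ',
    dive_escaped($record, qw(customer name)),
    dive_escaped($record, 'lines', 0, 'desc');

print "$row \\\\\n";    # prints: Smith \& Sons & R\&D \\
```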
  11. 'map' works by looping over the input list and collecting the results of
    evaluating the 'map expressions' on an output list. You could use that,
    too, by turning this into a multi-pass algorithm which first builds a
    list of keys and values, then uses map to transform that into a list of
    keys and escaped values, then runs whatever your other formatting code
    happens to do on this list and finally, puts the results into a hash. I
    don't quite get why someone would consider this 'a pretty solution',
    especially when comparing it with a single-pass algorithm which performs
    the escaping-step which must be done prior to the other processing so
    that it doesn't escape the wrong ampersands before said 'other
    processing' ever sees the data.
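For comparison, the multi-pass variant described above could be sketched as follows. The field names and values are made up, and the "other formatting code" is reduced to joining two values into a table row.

```perl
use strict;
use warnings;

# Pass 1: a flat list of [key, value] pairs pulled from the document
# (hypothetical sample data).
my @fields = (
    [ customer => 'Smith & Sons' ],
    [ item     => 'Nuts & bolts' ],
);

# Pass 2: use map to produce a list of keys and escaped values.
my @escaped = map {
    my ($key, $value) = @$_;
    $value =~ s/&/\\&/g;
    [ $key, $value ];
} @fields;

# Pass 3: put the results into a hash for the template.
my %vars = map { ($_->[0], $_->[1]) } @escaped;

# Formatting ampersands are added only after escaping is done.
my $row = join(' & ', @vars{qw(customer item)}) . ' \\\\';
print "$row\n";    # prints: Smith \& Sons & Nuts \& bolts \\
```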

    If Data::Diver were an OO module, you could subclass it, override Dive,
    and then, your main 'processing logic' would be independent of the 'data
    extraction logic' in the sense that escaping might or might not be
    performed depending on which kind of 'diver object' is used to extract
    the values. But since it isn't, that's not an option.
    Rainer Weikusat, Apr 16, 2014
