Crawl nested data structure, apply code block to each

Randy Westlund

I have a simple problem, but am not sure how to solve it. I'm
getting data from a MongoDB database with the MongoDB module. This
returns a BSON (JSON-like) document as a nested data structure with
arbitrary element types. I'm taking that and building a LaTeX
document with the Template::Latex module. This is currently
working, most of the time.

My problem is that the strings I'm pulling from the database
sometimes have an '&' in them, which screws up my tabularx sections
in LaTeX. So I need some way to crawl this data structure and
escape them. I want to do something like call map on all the scalar
values found.

I looked at Data::Nested, but didn't see anything useful for me. Is
there a module that has a function like this, or a concise way to
write this myself?
 
Bjoern Hoehrmann

* Randy Westlund wrote in comp.lang.perl.misc:
I have a simple problem, but am not sure how to solve it. I'm
getting data from a MongoDB database with the MongoDB module. This
returns a BSON (JSON-like) document as a nested data structure with
arbitrary element types. I'm taking that and building a LaTeX
document with the Template::Latex module. This is currently
working, most of the time.

My problem is that the strings I'm pulling from the database
sometimes have an '&' in them, which screws up my tabularx sections
in LaTeX. So I need some way to crawl this data structure and
escape them. I want to do something like call map on all the scalar
values found.

This sounds like the `&` needs to be escaped when the Template::Latex
module combines your template code with the data from the structure.
Ordinarily there should be ways to indicate how values need to be
escaped (consider generating HTML documents from templates: sometimes
values need to be escaped per HTML rules, sometimes JavaScript rules,
maybe CSS rules, sometimes even a combination of the rules), and I'd
suggest looking for that instead of transforming your data this way.
 
Rainer Weikusat

Randy Westlund said:
I have a simple problem, but am not sure how to solve it. I'm
getting data from a MongoDB database with the MongoDB module. This
returns a BSON (JSON-like) document as a nested data structure with
arbitrary element types. I'm taking that and building a LaTeX
document with the Template::Latex module. This is currently
working, most of the time.

My problem is that the strings I'm pulling from the database
sometimes have an '&' in them, which screws up my tabularx sections
in LaTeX. So I need some way to crawl this data structure and
escape them.

Why don't you escape them on extraction?
 
John Bokma

Randy Westlund said:
I have a simple problem, but am not sure how to solve it. I'm
getting data from a MongoDB database with the MongoDB module. This
returns a BSON (JSON-like) document as a nested data structure with
arbitrary element types. I'm taking that and building a LaTeX
document with the Template::Latex module. This is currently
working, most of the time.

My problem is that the strings I'm pulling from the database
sometimes have an '&' in them, which screws up my tabularx sections
in LaTeX. So I need some way to crawl this data structure and
escape them. I want to do something like call map on all the scalar
values found.

Make your own TT filter?

http://template-toolkit.org/docs/modules/Template/Filters.html#section_FILTERS

IIRC you can chain filters, so you can first run your data through your
custom filter, then through the latex filter.
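For example (a sketch: it assumes Template::Latex accepts the standard
Template Toolkit FILTERS option, and the filter name 'escape_amp' is
made up here):

    use Template::Latex;

    my $tt = Template::Latex->new({
        FILTERS => {
            escape_amp => sub {
                my $text = shift;
                $text =~ s/&/\\&/g;
                return $text;
            },
        },
    });

and then in the template, where filters chain left to right:

    [% field | escape_amp %]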
 
Randy Westlund

Make your own TT filter?

http://template-toolkit.org/docs/modules/Template/Filters.html#section_FILTERS

IIRC you can chain filters, so you can first run your data through your
custom filter, then through the latex filter.

This looks promising. The remaining obstacle is that when I'm
building the hash to feed Template::Latex, I'm intentionally
inserting some ampersands for formatting. So I need to escape some
of them, but not others. Perhaps for the ones I'm intentionally
putting there, I'll write them as '&&' and have the filter transform
it like this:
'&'  => '\&'
'&&' => '&'

Of course, then any user data containing '&&' will break it :/
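(A sketch of what that filter body could look like: a single pass,
with the '&&' alternative listed first so it wins over a lone '&'.
The caveat about user data containing '&&' still applies: it would
come out as a literal '&'.)

    sub amp_filter {
        my $text = shift;
        # leftmost alternative wins, so '&&' is matched before '&'
        $text =~ s/(&&|&)/ $1 eq '&&' ? '&' : '\\&' /ge;
        return $text;
    }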
 
Rainer Weikusat

Randy Westlund said:
This looks promising. The remaining obstacle is that when I'm
building the hash to feed Template::Latex, I'm intentionally
inserting some ampersands for formatting.

Then why on earth don't you escape ampersands in the input data before
putting it in the hash and insert real 'table format &s' afterwards?
 
Randy Westlund

Then why on earth don't you escape ampersands in the input data before
putting it in the hash and insert real 'table format &s' afterwards?

That's why I was trying to figure out how I could crawl the data
structure, to do it before I inserted stuff. My code is laid out
like this:

- get complicated document from MongoDB
- spend two pages of perl pulling things out of the nested data
structure, transforming the complicated data structure into a
complicated mess of LaTeX formatting mixed with variables in a
hash
- feed to template

The whole thing generates something like an invoice, but with a lot
of conditional formatting depending on what things are in the DB
record.
 
Peter J. Holzer

My code is laid out like this:

- get complicated document from MongoDB
- spend two pages of perl pulling things out of the nested data
structure, transforming the complicated data structure into a
complicated mess of LaTeX formatting mixed with variables in a
hash
- feed to template

This may be part of the problem. I find that it is generally a good idea
to delay output conversion (in this case applying LaTeX formatting, but
the same applies to HTML or just character encoding) as long as
possible, and ideally to leave it to your templating engine, output
filter, or whatever. Otherwise it is too easy to lose track of what
still needs to be converted and what doesn't (leading to either
double-converted strings or unconverted input in the output).

hp
 
Rainer Weikusat

Randy Westlund said:
That's why I was trying to figure out how I could crawl the data
structure, to do it before I inserted stuff. My code is laid out
like this:

- get complicated document from MongoDB
- spend two pages of perl pulling things out of the nested data
structure, transforming the complicated data structure into a
complicated mess of LaTeX formatting mixed with variables in a
hash

Has it occurred to you that this is already "code crawling the data
structure", although specialized for your problem? All you need to add is
an intermediate processing step between

'get data out of the BSON document'

and

'transform data before putting it into the hash'

How are you accessing the serialized data?
 
Randy Westlund

Has it occurred to you that this is already "code crawling the data
structure", although specialized for your problem? All you need to add is
an intermediate processing step between

'get data out of the BSON document'

and

'transform data before putting it into the hash'

How are you accessing the serialized data?

I'm using Data::Diver to pull fields out one at a time. I solved
the problem by wrapping those calls with my own sub that does some
simple substitution. It's the obvious solution, but it isn't very
pretty. This being perl, I was hoping I could find some nice
declarative way to do it, like how map works. I guess in this case
there isn't one.
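
(Roughly like this, as a sketch: the wrapper name is made up here,
and it assumes Data::Diver's exported Dive() function.)

    use Data::Diver qw(Dive);

    # pull a field out and escape ampersands on the way
    sub dive_tex {
        my ($root, @path) = @_;
        my $val = Dive($root, @path);
        $val =~ s/&/\\&/g if defined $val && !ref $val;
        return $val;
    }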
 
Rainer Weikusat

Randy Westlund said:
[...]

I'm using Data::Diver to pull fields out one at a time. I solved
the problem by wrapping those calls with my own sub that does some
simple substitution. It's the obvious solution, but it isn't very
pretty. This being perl, I was hoping I could find some nice
declarative way to do it, like how map works.

'map' works by looping over the input list and collecting the results of
evaluating the 'map expression' in an output list. You could use that,
too, by turning this into a multi-pass algorithm which first builds a
list of keys and values, then uses map to transform that into a list of
keys and escaped values, then runs whatever your other formatting code
happens to do on this list and finally puts the results into a hash. I
don't quite get why someone would consider this 'a pretty solution',
especially when comparing it with a single-pass algorithm which performs
the escaping step before the other processing, so that the formatting
ampersands that processing inserts are never escaped by mistake.
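
(The multi-pass version described above might look something like
this sketch, for a single-level hash; %data here is a stand-in for
the flattened key/value list:)

    my %escaped = map {
        my $v = $data{$_};
        $v =~ s/&/\\&/g unless ref $v;
        ($_ => $v);
    } keys %data;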

If Data::Diver were an OO module, you could subclass it, override Dive,
and then your main 'processing logic' would be independent of the 'data
extraction logic', in the sense that escaping might or might not be
performed depending on which kind of 'diver object' is used to extract
the values. But since it isn't, that's not an option.
 
