More general programming than perl...

Justin C

I will be coding this in perl, but I can't yet get my head
around how I'm going to achieve what I want, maybe people
here can offer suggestions on how I might proceed -
obviously in broad terms, code is a way off at the moment
I think.

I need to prepare a "Latest Products" document, the
contents are coming from a database, I've got to fill the
document with the latest and stop when the document is 24
pages, however, the document runs chronologically from
oldest to newest. I'm trying to work out how I can decide
which item/date to put at the start of the document, so I
don't run out of data before 24 pages, or over-run 24
pages.

Information is broken up into date sections (listing new
products for that day), there are varying amounts of data
for each date, from a few lines to more than a page. There
is a section heading which is larger than a line of data,
and there is a vertical space between sections, so how
many lines I can fit on a page depends on how many
sections there will be.

Every page starts with a section/date heading regardless
of whether it's a continuation of the section on the
previous page or not.

Any suggestions on how I might, programmatically, decide
where I should begin my document?



Justin.
 
Rainer Weikusat

[...]
I need to prepare a "Latest Products" document, the
contents are coming from a database, I've got to fill the
document with the latest and stop when the document is 24
pages, however, the document runs chronologically from
oldest to newest. I'm trying to work out how I can decide
which item/date to put at the start of the document, so I
don't run out of data before 24 pages, or over-run 24
pages.

[formatting details]
Any suggestions on how I might, programmatically, decide
where I should begin my document?

Start with the last entry supposed to appear on the last page, ie, the
most recent one, and work backwards from that until you either run out
of data or have produced 24 pages.
 
Rainer Weikusat

Rainer Weikusat said:
[...]
I need to prepare a "Latest Products" document, the
contents are coming from a database, I've got to fill the
document with the latest and stop when the document is 24
pages, however, the document runs chronologically from
oldest to newest. I'm trying to work out how I can decide
which item/date to put at the start of the document, so I
don't run out of data before 24 pages, or over-run 24
pages.

[formatting details]
Any suggestions on how I might, programmatically, decide
where I should begin my document?

Start with the last entry supposed to appear on the last page, ie, the
most recent one, and work backwards from that until you either run out
of data or have produced 24 pages.

This is not sufficient on its own in case there are fewer than 24 pages'
worth of data, assuming that a partially filled page may appear at the
end and must not appear at the beginning. This can be solved with a
2-pass algorithm: First, move backward through the data (recording the
space needed for each entry and meta-entry, ie, section header) until 24
pages have been accumulated or there's no more data. Then, move forward
through the entries in order to produce actual pages. The second pass can
be skipped if there are a full 24 pages but that's probably not worth the
effort.

Possible gotcha: A situation where a lone 'date section heading' appears
at the bottom of a page, followed by the first entry for that day should
probably be avoided.
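
A minimal Perl sketch of the two-pass idea above. Everything in it is an assumption made for illustration: sizes are counted in whole "lines", the %LAYOUT numbers are invented, each record occupies one line, and records are hash refs with hypothetical date and text keys, sorted oldest to newest. Because a heading is always charged together with the record that follows it, a heading never ends up alone at the bottom of a page.

```perl
use strict;
use warnings;

# Hypothetical layout constants, all measured in "lines".
my %LAYOUT = (page => 50, head => 2, gap => 1);

# Pass 1: walk from the newest record towards the oldest, packing pages
# backwards. Returns the index of the oldest record that still fits.
sub pick_start_index {
    my ($records, $max_pages) = @_;
    my ($pages, $used, $start) = (0, 0, scalar @$records);
    for (my $i = $#$records; $i >= 0; $i--) {
        my $newer    = $i < $#$records ? $records->[$i + 1] : undef;
        my $same_day = $newer && $newer->{date} eq $records->[$i]{date};
        # One line for the record; a change of date also costs a gap
        # plus the heading of the newer section that follows it.
        my $cost = $same_day ? 1 : 1 + $LAYOUT{gap} + $LAYOUT{head};
        if ($pages == 0 || $used + $cost > $LAYOUT{page}) {
            last if $pages == $max_pages;   # document is full
            $pages++;
            $used = $LAYOUT{head} + 1;      # page heading + this record
        }
        else {
            $used += $cost;
        }
        $start = $i;
    }
    return $start;
}

# Pass 2: lay the chosen records out oldest to newest, so that any
# partially filled page ends up last. Returns a list of pages, each an
# array ref of output items (headings, blank gaps, record texts).
sub paginate {
    my (@records) = @_;
    my (@pages, $used, $prev);
    for my $rec (@records) {
        my $same_day = $prev && $prev->{date} eq $rec->{date};
        my $cost = $same_day ? 1 : 1 + $LAYOUT{gap} + $LAYOUT{head};
        if (!@pages || $used + $cost > $LAYOUT{page}) {
            # Every page starts with a date heading, continuation or not.
            push @pages, ["== $rec->{date} =="];
            $used = $LAYOUT{head} + 1;
        }
        else {
            push @{ $pages[-1] }, '', "== $rec->{date} ==" unless $same_day;
            $used += $cost;
        }
        push @{ $pages[-1] }, $rec->{text};
        $prev = $rec;
    }
    return @pages;
}
```

Usage would be along the lines of my $start = pick_start_index(\@records, 24); followed by my @pages = paginate(@records[$start .. $#records]);. With real fonts and row heights the two passes may not agree exactly, so the final page count is still worth checking.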
 
gamo

El 30/04/14 13:29, Justin C escribió:
I will be coding this in perl, but I can't yet get my head
around how I'm going to achieve what I want, maybe people
here can offer suggestions on how I might proceed -
obviously in broad terms, code is a way off at the moment
I think.

I need to prepare a "Latest Products" document, the
contents are coming from a database, I've got to fill the
document with the latest and stop when the document is 24
pages, however, the document runs chronologically from
oldest to newest. I'm trying to work out how I can decide
which item/date to put at the start of the document, so I
don't run out of data before 24 pages, or over-run 24
pages.

Information is broken up into date sections (listing new
products for that day), there are varying amounts of data
for each date, from a few lines to more than a page. There
is a section heading which is larger than a line of data,
and there is a vertical space between sections, so how
many lines I can fit on a page depends on how many
sections there will be.

Every page starts with a section/date heading regardless
of whether it's a continuation of the section on the
previous page or not.

Any suggestions on how I might, programmatically, decide
where I should begin my document?



Justin.

I assume you can produce 100 pages.

Just produce 25 pages and then do 'intelligent' cuts, e.g.
based on section length, recency, etc., until it fits in
24 pages.
 
George Mpouras

It does not look like a very difficult task.
You need some SQL queries through DBI, and can create the documents using,
e.g., an HTML template.
 
Justin C

It does not look like a very difficult task.
You need some SQL queries through DBI, and can create the documents using,
e.g., an HTML template.

Thank you for not answering the question. Telling me what I already
know is a very profitable use of both your time and mine. Thank you
for your lack of assistance in this matter.


Justin.
 
Justin C

El 30/04/14 13:29, Justin C escribió:

I assume you can produce 100 pages.

Just produce 25 pages and then do 'intelligent' cuts, e.g.
based on section length, recency, etc., until it fits in
24 pages.


Yes, I can see that's an option. It doesn't seem very
economical though, I can foresee a time when I could
produce 10,000 pages and my CPU and RAM would be occupied
for an unnecessary length of time.

I could run it this way once, and then record where the
cut falls, and refer to that position as my start point
for next time. Then remove the oldest and record a new
cut/starting point.

Thank you for the suggestion.


Justin.
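
A rough Perl sketch of the "record where the cut falls" idea, under stated assumptions: the records arrive oldest first as hash refs with a hypothetical id key, and $count_pages is a caller-supplied code ref (not a real library call) that trial-prints a list and returns its page count. The toy counter below simply assumes 40 records per page.

```perl
use strict;
use warnings;

# Drop the oldest records until the trial layout fits. Returns the kept
# records (array ref) and the id of the first one kept, which can be
# stored and used as the starting point for the next run.
sub trim_to_fit {
    my ($records, $max_pages, $count_pages) = @_;   # oldest first
    my @kept = @$records;
    shift @kept while @kept && $count_pages->(@kept) > $max_pages;
    return (\@kept, @kept ? $kept[0]{id} : undef);
}

# Toy stand-in for a real trial print: 40 records per page.
my $pages_for = sub { my $n = @_; return int(($n + 39) / 40) };

# First run: everything from the database, trimmed to 24 pages.
my @all = map { +{ id => $_ } } 1 .. 1000;
my ($kept, $cut) = trim_to_fit(\@all, 24, $pages_for);

# Next run: only records from the saved cut onwards, plus ten new ones.
my @next = map { +{ id => $_ } } $cut .. 1010;
my ($kept2, $cut2) = trim_to_fit(\@next, 24, $pages_for);
print "cut moved from $cut to $cut2\n";
```

Each later run then only re-examines roughly 24 pages of data plus whatever is new, rather than the whole table.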
 
Justin C

*SKIP*

Please, define "document".

I am not sure it is relevant to the question I asked, but my
document is actually an Excel spreadsheet which will be printed
as a PDF.

The reason is that historically this document has always come
from an Excel file, and now we no longer wish to maintain the
document and instead are putting the data into a DB and intend
to work from that instead. Maybe, at a later date, I'll bypass
the Excel file if it turns out no one actually wants it in that
format, and go straight to PDF.

Justin.
 
Steve May

I will be coding this in perl, but I can't yet get my head
around how I'm going to achieve what I want, maybe people
here can offer suggestions on how I might proceed -
obviously in broad terms, code is a way off at the moment
I think.

I need to prepare a "Latest Products" document, the
contents are coming from a database, I've got to fill the
document with the latest and stop when the document is 24
pages, however, the document runs chronologically from
oldest to newest. I'm trying to work out how I can decide
which item/date to put at the start of the document, so I
don't run out of data before 24 pages, or over-run 24
pages.

Information is broken up into date sections (listing new
products for that day), there are varying amounts of data
for each date, from a few lines to more than a page. There
is a section heading which is larger than a line of data,
and there is a vertical space between sections, so how
many lines I can fit on a page depends on how many
sections there will be.

Every page starts with a section/date heading regardless
of whether it's a continuation of the section on the
previous page or not.

Any suggestions on how I might, programmatically, decide
where I should begin my document?



Justin.

Seems like this approach might work:

Fill a list with records (hash-refs) sorted new to old. Limit to some
reasonable number knowing that you can always add a few more if needed.

Trial print the list while tracking the record count and page count to
determine how many records you can print from the list.

Take a slice from the list based on how many records needed (above), and
reverse it.

Print.

hth,

Steve
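
The steps above could be sketched in Perl roughly as follows. This is a variation rather than a literal transcription: instead of counting while printing once, it binary-searches the number of records, trial-printing each candidate slice already reversed into its final oldest-first order. The $count_pages code ref, the id key and the 40-records-per-page toy counter are all assumptions for illustration.

```perl
use strict;
use warnings;

# Return the largest N such that the N newest records, printed oldest
# first, need at most $max_pages pages. $count_pages is a caller-supplied
# code ref (hypothetical) that trial-prints a list and returns its page
# count; it is assumed never to shrink when an older record is added.
sub records_that_fit {
    my ($newest_first, $max_pages, $count_pages) = @_;
    my ($lo, $hi) = (0, scalar @$newest_first);
    while ($lo < $hi) {
        my $mid   = int(($lo + $hi + 1) / 2);
        my @trial = reverse @{$newest_first}[0 .. $mid - 1];
        if ($count_pages->(@trial) <= $max_pages) { $lo = $mid }
        else                                      { $hi = $mid - 1 }
    }
    return $lo;
}

# Toy stand-in for a real trial print: 40 records per page, no headings.
my $toy_count = sub { my $n = @_; return int(($n + 39) / 40) };

my @newest_first = map { +{ id => $_ } } reverse 1 .. 2000;
my $n     = records_that_fit(\@newest_first, 24, $toy_count);
my @final = reverse @newest_first[0 .. $n - 1];   # oldest first, ready to print
print "printing $n records, ids $final[0]{id} to $final[-1]{id}\n";
```

Trial-printing in the final order means the page count being tested is the one the finished document will have, at the cost of a dozen or so trial runs.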
 
Rainer Weikusat

[...]
Seems like this approach might work:

Fill a list with records (hash-refs) sorted new to old. Limit to some
reasonable number knowing that you can always add a few more if
needed.

As I already wrote: The pages can be built backwards while accumulating
records. That is, for each new record, the size including a possible
leading 'day section header' is calculated. If it still fits in front
of the most recently added record on the current page, it is added to
that, otherwise, it becomes the first entry on the next page. This
process is repeated until 24 pages of output have been accumulated.

In case there are fewer than 24 pages, a 2nd pass can be used to pull
'entries to preceding pages' so that the last page ends up being
'partially filled' instead of the first page.

This is not really difficult provided one can overcome the notion that
'stuff has to happen forwards' (something even "well known OSS
celebrities" reportedly find difficult :->) and that 'the size' must be
calculated in one go instead of incrementally.
 
Justin C

Seems like this approach might work:

Fill a list with records (hash-refs) sorted new to old. Limit to some
reasonable number knowing that you can always add a few more if needed.

Trial print the list while tracking the record count and page count to
determine how many records you can print from the list.

Take a slice from the list based on how many records needed (above), and
reverse it.

Print.

That sounds quite reasonable. Yes, I like that. Thank you Steve.

Justin.
 
Steve May

[...]
Seems like this approach might work:

Fill a list with records (hash-refs) sorted new to old. Limit to some
reasonable number knowing that you can always add a few more if
needed.

As I already wrote: The pages can be built backwards while accumulating
records. That is, for each new record, the size including a possible
leading 'day section header' is calculated. If it still fits in front
of the most recently added record on the current page, it is added to
that, otherwise, it becomes the first entry on the next page. This
process is repeated until 24 pages of output have been accumulated.

In case there are fewer than 24 pages, a 2nd pass can be used to pull
'entries to preceding pages' so that the last page ends up being
'partially filled' instead of the first page.

This is not really difficult provided one can overcome the notion that
'stuff has to happen forwards' (something even "well known OSS
celebrities" reportedly find difficult :->) and that 'the size' must be
calculated in one go instead of incrementally.


Yes, though it was not immediately clear to me what you were suggesting
(sometimes I'm a bit thick).

Too, differing explanations/perspectives on solutions are often useful.

At any rate, it seems the OP has some ideas to play with now.

\s
 
Steve May

That sounds quite reasonable. Yes, I like that. Thank you Steve.

Justin.

YVW.

Forgot one thing: depending on formatting and exactly how the line
breaks work out, it IS possible that 24 pages forward will turn into 25
(or 23) pages reversed. You might want to double check the final output
page count.

\s
 
Eric Pozharski

Justin C said:
I am not sure it is relevant to the question I asked, but my
document is actually an Excel spreadsheet which will be printed
as a PDF.

"Document", being defined, defines "page". From your conversations with
others I'm glad to find out that your "document" looks more like
text/plain (wrapped in application/pdf) than application/pdf itself.
You don't need Excel for this. Or perl -- Excel would be closer to
pages than perl.

*CUT*
 
Justin C

"Document", being defined, defines "page". From your conversations with
others I'm glad to find out that your "document" looks more like
text/plain (wrapped in application/pdf) than application/pdf itself.
You don't need Excel for this. Or perl -- Excel would be closer to
pages than perl.

Though we print the document for customers the original Excel
file is also available to download from our web-site, and
some customers are used to it that way, I wouldn't want to
take that away from those that like it that way - who may
have programs that read the file and 'do stuff' with the data.

My preference would be for LaTeX to produce it, because I
know it can do better typesetting than anything else in my
toolbox, but that doesn't get around the desire to still
support those who like the Excel file... maybe two files, a
'pretty' PDF and a plain Excel file. Hmmm.

I'm going to enjoy this!

Justin.
 
Justin C

I need to prepare a "Latest Products" document, the
contents are coming from a database, I've got to fill the
document with the latest and stop when the document is 24
pages, however, the document runs chronologically from
oldest to newest. I'm trying to work out how I can decide
which item/date to put at the start of the document, so I
don't run out of data before 24 pages, or over-run 24
pages.

[snip]

Thank you to all who have made suggestions. I have some
interesting things to think about, and you have all helped
me better understand what I have to achieve.



Justin.
 
ccc31807

I need to prepare a "Latest Products" document, the
contents are coming from a database, I've got to fill the
document with the latest and stop when the document is 24
pages, however, the document runs chronologically from
oldest to newest. I'm trying to work out how I can decide
which item/date to put at the start of the document, so I
don't run out of data before 24 pages, or over-run 24
pages.

Is 'page' a physical page or a logical page?
What format is your output?

To begin with, I assume that you have your data in some kind of nested hash. I assume that your keys might be some kind of date that could be sorted, like maybe a Julian date. I also assume that your SQL orders by date and returns a limit of 24. In that case, you would do this:

    # print_page() stands for whatever routine renders one page
    foreach my $product (sort keys %products)
    {
        print_page($products{$product});
    }

This would work fine if your 'pages' were logical pages.

If your pages were physical pages, and if PDF output was acceptable, it would be easy to use PDF::API2 or similar, which requires you to issue literal line feeds. Count down the lines until you reach the bottom, and start a new page. Quit when you hit 24.

I've had great success recently using Perl to emit TeX source code and then calling

`pdflatex source.tex`;

If you feel comfortable using LaTeX I think you will be pleasantly surprised at how easily Perl can produce TeX, and this gives you exquisite control over the presentation of your output.

Perl is a great tool for this kind of job. For me, as long as you get your initial data structure right, this ought to be trivial.

CC.
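
As an illustration of the "Perl emits TeX" route, here is a small sketch. It assumes records are hash refs with hypothetical date and text keys, already sorted oldest first, and it uses only a bare article class; the real document would need its own preamble and layout. It does nothing about the 24-page limit by itself.

```perl
use strict;
use warnings;

# Replacements for characters that cannot simply be backslash-escaped.
my %TEX = (
    '\\' => '\\textbackslash{}',
    '~'  => '\\textasciitilde{}',
    '^'  => '\\textasciicircum{}',
);

# Escape the characters LaTeX treats specially in ordinary text.
sub latex_escape {
    my ($text) = @_;
    $text =~ s/([\\~^&%\$#_{}])/ $TEX{$1} \/\/ "\\$1" /ge;
    return $text;
}

# Build a complete LaTeX source string: one unnumbered section per date,
# one paragraph per record.
sub tex_source {
    my (@records) = @_;
    my $tex     = "\\documentclass{article}\n\\begin{document}\n";
    my $current = '';
    for my $rec (@records) {
        if ($rec->{date} ne $current) {
            $current = $rec->{date};
            $tex .= "\\section*{" . latex_escape($current) . "}\n";
        }
        $tex .= latex_escape($rec->{text}) . "\\par\n";
    }
    return $tex . "\\end{document}\n";
}
```

The returned string would be written to source.tex before calling pdflatex as shown above. pdflatex reports the number of pages it produced in its output, which could serve as the trial page count when deciding where the document should start.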
 
