More general programming than perl...

Discussion in 'Perl Misc' started by Justin C, Apr 30, 2014.

  1. Justin C

    Justin C Guest

    I will be coding this in perl, but I can't yet get my head
    around how I'm going to achieve what I want, maybe people
    here can offer suggestions on how I might proceed -
    obviously in broad terms, code is a way off at the moment
    I think.

    I need to prepare a "Latest Products" document, the
    contents are coming from a database, I've got to fill the
    document with the latest and stop when the document is 24
    pages, however, the document runs chronologically from
    oldest to newest. I'm trying to work out how I can decide
    which item/date to put at the start of the document, so I
    don't run out of data before 24 pages, or over-run 24
    pages.

    Information is broken up into date sections (listing new
    products for that day), there are varying amounts of data
    for each date, from a few lines to more than a page. There
    is a section heading which is larger than a line of data,
    and there is a vertical space between sections, so how
    many lines I can fit on a page depends on how many
    sections there will be.

    Every page starts with a section/date heading regardless
    of whether it's a continuation of the section on the
    previous page or not.

    Any suggestions on how I might, programmatically, decide
    where I should begin my document?



    Justin.

    --
    Justin C, by the sea.
    Justin C, Apr 30, 2014
    #1

  2. Justin C <> writes:

    [...]

    > I need to prepare a "Latest Products" document, the
    > contents are coming from a database, I've got to fill the
    > document with the latest and stop when the document is 24
    > pages, however, the document runs chronologically from
    > oldest to newest. I'm trying to work out how I can decide
    > which item/date to put at the start of the document, so I
    > don't run out of data before 24 pages, or over-run 24
    > pages.


    [formatting details]

    > Any suggestions on how I might, programmatically, decide
    > where I should begin my document?


    Start with the last entry supposed to appear on the last page, ie, the
    most recent one, and work backwards from that until you either run out
    of data or have produced 24 pages.
    Rainer Weikusat, Apr 30, 2014
    #2

  3. Rainer Weikusat <> writes:
    > Justin C <> writes:
    >
    > [...]
    >
    >> I need to prepare a "Latest Products" document, the
    >> contents are coming from a database, I've got to fill the
    >> document with the latest and stop when the document is 24
    >> pages, however, the document runs chronologically from
    >> oldest to newest. I'm trying to work out how I can decide
    >> which item/date to put at the start of the document, so I
    >> don't run out of data before 24 pages, or over-run 24
    >> pages.

    >
    > [formatting details]
    >
    >> Any suggestions on how I might, programmatically, decide
    >> where I should begin my document?

    >
    > Start with the last entry supposed to appear on the last page, ie, the
    > most recent one, and work backwards from that until you either run out
    > of data or have produced 24 pages.


    This is not sufficient on its own if there are fewer than 24 pages'
    worth of data, assuming that a partially filled page may appear at the
    end but must not appear at the beginning. This can be solved with a
    2-pass algorithm: First, move backward through the data (recording the
    space needed for each entry and meta-entry, ie, section header) until 24
    pages have been accumulated or there's no more data. Then, move forward
    through the entries in order to produce the actual pages. The second
    pass can be avoided if there are 24 full pages, but that's probably not
    worth the effort.

    Possible gotcha: A situation where a lone 'date section heading' appears
    at the bottom of a page, with the first entry for that day following on
    the next page, should probably be avoided.
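
A minimal sketch of the backward pass described above, not a definitive implementation. It assumes a simplified layout measured in lines (the values in `%layout` are invented), records held as hash refs with a `date` key and sorted newest first, and one line per record. The name `count_fitting` is hypothetical. Because a heading is only ever charged together with the record below it, a heading can never end up alone at the bottom of a page in this model.

```perl
use strict;
use warnings;

# Hypothetical layout, in lines; measure the real template to set these.
my %layout = (page => 60, heading => 2, gap => 1, item => 1);

# Walk the records newest first and return how many of them fit on
# $max_pages pages when the pages are filled from the back of the document.
sub count_fitting {
    my ($newest_first, $max_pages, $l) = @_;
    # Start "full" so that the first record always opens a page.
    my ($pages, $used, $top_date, $count) = (0, $l->{page}, '', 0);

    for my $rec (@$newest_first) {
        # Same date as the section at the top of the page: one more line.
        # Older date: the record also needs its own heading and a gap.
        my $need = $l->{item};
        $need += $l->{gap} + $l->{heading} if $rec->{date} ne $top_date;

        if ($used + $need > $l->{page}) {
            last if $pages == $max_pages;          # the document is full
            $pages++;
            $used = $l->{heading} + $l->{item};    # every page opens with a heading
        }
        else {
            $used += $need;
        }
        $top_date = $rec->{date};
        $count++;
    }
    return $count;
}

# Usage with made-up data: 200 records that all share one date.
my @newest_first = map { +{ date => '2014-04-30', name => "Product $_" } } 1 .. 200;
my $n = count_fitting(\@newest_first, 24, \%layout);
print "The document starts $n records back from the newest one\n";
```

The returned count marks the starting record; the forward pass then prints those records in chronological order.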
    Rainer Weikusat, Apr 30, 2014
    #3
  4. gamo

    gamo Guest

    On 30/04/14 13:29, Justin C wrote:
    > I will be coding this in perl, but I can't yet get my head
    > around how I'm going to achieve what I want, maybe people
    > here can offer suggestions on how I might proceed -
    > obviously in broad terms, code is a way off at the moment
    > I think.
    >
    > I need to prepare a "Latest Products" document, the
    > contents are coming from a database, I've got to fill the
    > document with the latest and stop when the document is 24
    > pages, however, the document runs chronologically from
    > oldest to newest. I'm trying to work out how I can decide
    > which item/date to put at the start of the document, so I
    > don't run out of data before 24 pages, or over-run 24
    > pages.
    >
    > Information is broken up into date sections (listing new
    > products for that day), there are varying amounts of data
    > for each date, from a few lines to more than a page. There
    > is a section heading which is larger than a line of data,
    > and there is a vertical space between sections, so how
    > many lines I can fit on a page depends on how many
    > sections there will be.
    >
    > Every page starts with a section/date heading regardless
    > of whether it's a continuation of the section on the
    > previous page or not.
    >
    > Any suggestions on how I might, programmatically, decide
    > where I should begin my document?
    >
    >
    >
    > Justin.
    >


    I assume you can produce 100 pages.

    Just produce 25 pages and then do 'intelligent' cuts, like
    based on section length, recentness, etc. until it fits in
    24 pages.

    --
    http://www.telecable.es/personales/gamo/
    gamo, Apr 30, 2014
    #4
  5. It does not look like a very difficult task.
    You need some SQL queries through DBI, and then create the documents using,
    e.g., an HTML template.
    George Mpouras, Apr 30, 2014
    #5
  6. with <> Justin C wrote:

    *SKIP*
    > I need to prepare a "Latest Products" document, the ...


    Please, define "document".

    *CUT*

    --
    Torvalds' goal for Linux is very simple: World Domination
    Stallman's goal for GNU is even simpler: Freedom
    Eric Pozharski, May 1, 2014
    #6
  7. Justin C

    Justin C Guest

    On 2014-04-30, George Mpouras <> wrote:
    > It does not look like a very difficult task.
    > You need some SQL queries through DBI, and then create the documents using,
    > e.g., an HTML template.


    Thank you for not answering the question. Telling me what I already
    know is a very profitable use of both your time and mine. Thank you
    for your lack of assistance in this matter.


    Justin.

    --
    Justin C, by the sea.
    Justin C, May 1, 2014
    #7
  8. Justin C

    Justin C Guest

    On 2014-04-30, gamo <> wrote:
    > On 30/04/14 13:29, Justin C wrote:
    >> I will be coding this in perl, but I can't yet get my head
    >> around how I'm going to achieve what I want, maybe people
    >> here can offer suggestions on how I might proceed -
    >> obviously in broad terms, code is a way off at the moment
    >> I think.
    >>
    >> I need to prepare a "Latest Products" document, the
    >> contents are coming from a database, I've got to fill the
    >> document with the latest and stop when the document is 24
    >> pages, however, the document runs chronologically from
    >> oldest to newest. I'm trying to work out how I can decide
    >> which item/date to put at the start of the document, so I
    >> don't run out of data before 24 pages, or over-run 24
    >> pages.
    >>
    >> Information is broken up into date sections (listing new
    >> products for that day), there are varying amounts of data
    >> for each date, from a few lines to more than a page. There
    >> is a section heading which is larger than a line of data,
    >> and there is a vertical space between sections, so how
    >> many lines I can fit on a page depends on how many
    >> sections there will be.
    >>
    >> Every page starts with a section/date heading regardless
    >> of whether it's a continuation of the section on the
    >> previous page or not.
    >>
    >> Any suggestions on how I might, programmatically, decide
    >> where I should begin my document?
    >>
    >>
    >>
    >> Justin.
    >>

    >
    > I assume you can produce 100 pages.
    >
    > Just produce 25 pages and then do 'intelligent' cuts, like
    > based on section length, recentness, etc. until it fits in
    > 24 pages.



    Yes, I can see that's an option. It doesn't seem very
    economical though; I can foresee a time when I could
    produce 10,000 pages and my CPU and RAM would be occupied
    for an unnecessary length of time.

    I could run it this way once, and then record where the
    cut falls, and refer to that position as my start point
    for next time. Then remove the oldest and record a new
    cut/starting point.

    Thank you for the suggestion.


    Justin.

    --
    Justin C, by the sea.
    Justin C, May 1, 2014
    #8
  9. Justin C

    Justin C Guest

    On 2014-05-01, Eric Pozharski <> wrote:
    > with <> Justin C wrote:
    >
    > *SKIP*
    >> I need to prepare a "Latest Products" document, the ...

    >
    > Please, define "document".


    I am not sure it is relevant to the question I asked, but my
    document is actually an Excel spreadsheet which will be printed
    as a PDF.

    The reason is that historically this document has always come
    from an Excel file, and now we no longer wish to maintain the
    document and instead are putting the data into a DB and intend
    to work from that instead. Maybe, at a later date, I'll bypass
    the Excel file if it turns out no one actually wants it in that
    format, and go straight to PDF.

    Justin.

    --
    Justin C, by the sea.
    Justin C, May 1, 2014
    #9
  10. Steve May

    Steve May Guest

    On 04/30/2014 04:29 AM, Justin C wrote:
    > I will be coding this in perl, but I can't yet get my head
    > around how I'm going to achieve what I want, maybe people
    > here can offer suggestions on how I might proceed -
    > obviously in broad terms, code is a way off at the moment
    > I think.
    >
    > I need to prepare a "Latest Products" document, the
    > contents are coming from a database, I've got to fill the
    > document with the latest and stop when the document is 24
    > pages, however, the document runs chronologically from
    > oldest to newest. I'm trying to work out how I can decide
    > which item/date to put at the start of the document, so I
    > don't run out of data before 24 pages, or over-run 24
    > pages.
    >
    > Information is broken up into date sections (listing new
    > products for that day), there are varying amounts of data
    > for each date, from a few lines to more than a page. There
    > is a section heading which is larger than a line of data,
    > and there is a vertical space between sections, so how
    > many lines I can fit on a page depends on how many
    > sections there will be.
    >
    > Every page starts with a section/date heading regardless
    > of whether it's a continuation of the section on the
    > previous page or not.
    >
    > Any suggestions on how I might, programmatically, decide
    > where I should begin my document?
    >
    >
    >
    > Justin.
    >


    Seems like this approach might work:

    Fill a list with records (hash-refs) sorted new to old. Limit to some
    reasonable number knowing that you can always add a few more if needed.

    Trial print the list while tracking the record count and page count to
    determine how many records you can print from the list.

    Take a slice from the list based on how many records needed (above), and
    reverse it.

    Print.

    hth,

    Steve
    Steve May, May 1, 2014
    #10
  11. Steve May <> writes:

    [...]

    > Seems like this approach might work:
    >
    > Fill a list with records (hash-refs) sorted new to old. Limit to some
    > reasonable number knowing that you can always add a few more if
    > needed.


    As I already wrote: The pages can be built backwards while accumulating
    records. That is, for each new record, the size including a possible
    leading 'day section header' is calculated. If it still fits in front
    of the most recently added record on the current page, it is added to
    that, otherwise, it becomes the first entry on the next page. This
    process is repeated until 24 pages of output have been accumulated.

    In case there are fewer than 24 pages, a 2nd pass can be used to pull
    'entries to preceding pages' so that the last page ends up being
    'partially filled' instead of the first page.

    This is not really difficult provided one can overcome the notion that
    'stuff has to happen forwards' (something even "well known OSS
    celebrities" reportedly find difficult :->) and that 'the size' must be
    calculated in one go instead of incrementally.
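
A sketch of what the second, forward pass might look like, under stated assumptions: the records have already been selected and are sorted oldest first, each is a hash ref with `date` and `name` keys, and the layout is counted in lines with invented values. `layout_pages` is a hypothetical name. Filling from the front means any partially filled page ends up last, and every page opens with a date heading even when it continues a section.

```perl
use strict;
use warnings;

# Lay the records (oldest first) out into pages, front to back.
# Returns a list of pages; each page is an array ref of output lines.
# A heading is stored as one string and a gap as '', although they are
# charged $l->{heading} and $l->{gap} lines respectively.
sub layout_pages {
    my ($oldest_first, $l) = @_;
    my (@pages, $used);
    my $date = '';

    for my $rec (@$oldest_first) {
        my $new_section = $rec->{date} ne $date;
        my $need = $l->{item}
                 + ($new_section ? $l->{gap} + $l->{heading} : 0);

        if (!@pages || $used + $need > $l->{page}) {
            # New page: it always opens with the date heading, even when
            # it only continues the section from the previous page.
            push @pages, ["== $rec->{date} =="];
            $used = $l->{heading} + $l->{item};
        }
        else {
            push @{ $pages[-1] }, '', "== $rec->{date} ==" if $new_section;
            $used += $need;
        }
        push @{ $pages[-1] }, $rec->{name};
        $date = $rec->{date};
    }
    return @pages;
}

# Usage with made-up data and a hypothetical 60-line page.
my %layout = (page => 60, heading => 2, gap => 1, item => 1);
my @doc_pages = layout_pages(
    [ { date => '2014-04-29', name => 'Widget' },
      { date => '2014-04-30', name => 'Gadget' } ],
    \%layout,
);
print scalar(@doc_pages), " page(s)\n";
```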
    Rainer Weikusat, May 1, 2014
    #11
  12. Justin C

    Justin C Guest

    On 2014-05-01, Steve May <> wrote:
    > On 04/30/2014 04:29 AM, Justin C wrote:
    >> I will be coding this in perl, but I can't yet get my head
    >> around how I'm going to achieve what I want, maybe people
    >> here can offer suggestions on how I might proceed -
    >> obviously in broad terms, code is a way off at the moment
    >> I think.
    >>
    >> I need to prepare a "Latest Products" document, the
    >> contents are coming from a database, I've got to fill the
    >> document with the latest and stop when the document is 24
    >> pages, however, the document runs chronologically from
    >> oldest to newest. I'm trying to work out how I can decide
    >> which item/date to put at the start of the document, so I
    >> don't run out of data before 24 pages, or over-run 24
    >> pages.
    >>
    >> Information is broken up into date sections (listing new
    >> products for that day), there are varying amounts of data
    >> for each date, from a few lines to more than a page. There
    >> is a section heading which is larger than a line of data,
    >> and there is a vertical space between sections, so how
    >> many lines I can fit on a page depends on how many
    >> sections there will be.
    >>
    >> Every page starts with a section/date heading regardless
    >> of whether it's a continuation of the section on the
    >> previous page or not.
    >>
    >> Any suggestions on how I might, programmatically, decide
    >> where I should begin my document?
    >>
    >>
    >>
    >> Justin.
    >>

    >
    > Seems like this approach might work:
    >
    > Fill a list with records (hash-refs) sorted new to old. Limit to some
    > reasonable number knowing that you can always add a few more if needed.
    >
    > Trial print the list while tracking the record count and page count to
    > determine how many records you can print from the list.
    >
    > Take a slice from the list based on how many records needed (above), and
    > reverse it.
    >
    > Print.


    That sounds quite reasonable. Yes, I like that. Thank you Steve.

    Justin.

    --
    Justin C, by the sea.
    Justin C, May 1, 2014
    #12
  13. Steve May

    Steve May Guest

    On 05/01/2014 08:30 AM, Rainer Weikusat wrote:
    > Steve May <> writes:
    >
    > [...]
    >
    >> Seems like this approach might work:
    >>
    >> Fill a list with records (hash-refs) sorted new to old. Limit to some
    >> reasonable number knowing that you can always add a few more if
    >> needed.

    >
    > As I already wrote: The pages can be built backwards while accumulating
    > records. That is, for each new record, the size including a possible
    > leading 'day section header' is calculated. If it still fits in front
    > of the most recently added record on the current page, it is added to
    > that, otherwise, it becomes the first entry on the next page. This
    > process is repeated until 24 pages of output have been accumulated.
    >
    > In case there are fewer than 24 pages, a 2nd pass can be used to pull
    > 'entries to preceding pages' so that the last page ends up being
    > 'partially filled' instead of the first page.
    >
    > This is not really difficult provided one can overcome the notion that
    > 'stuff has to happen forwards' (something even "well known OSS
    > celebrities" reportedly find difficult :->) and that 'the size' must be
    > calculated in one go instead of incrementally.
    >



    Yes, though it was not immediately clear to me what you were suggesting
    (sometimes I'm a bit thick).

    Too, differing explanations/perspectives on solutions are often useful.

    At any rate, it seems the OP has some ideas to play with now.

    \s
    Steve May, May 1, 2014
    #13
  14. Steve May

    Steve May Guest

    On 05/01/2014 08:47 AM, Justin C wrote:
    > On 2014-05-01, Steve May <> wrote:
    >> On 04/30/2014 04:29 AM, Justin C wrote:
    >>> I will be coding this in perl, but I can't yet get my head
    >>> around how I'm going to achieve what I want, maybe people
    >>> here can offer suggestions on how I might proceed -
    >>> obviously in broad terms, code is a way off at the moment
    >>> I think.
    >>>
    >>> I need to prepare a "Latest Products" document, the
    >>> contents are coming from a database, I've got to fill the
    >>> document with the latest and stop when the document is 24
    >>> pages, however, the document runs chronologically from
    >>> oldest to newest. I'm trying to work out how I can decide
    >>> which item/date to put at the start of the document, so I
    >>> don't run out of data before 24 pages, or over-run 24
    >>> pages.
    >>>
    >>> Information is broken up into date sections (listing new
    >>> products for that day), there are varying amounts of data
    >>> for each date, from a few lines to more than a page. There
    >>> is a section heading which is larger than a line of data,
    >>> and there is a vertical space between sections, so how
    >>> many lines I can fit on a page depends on how many
    >>> sections there will be.
    >>>
    >>> Every page starts with a section/date heading regardless
    >>> of whether it's a continuation of the section on the
    >>> previous page or not.
    >>>
    >>> Any suggestions on how I might, programmatically, decide
    >>> where I should begin my document?
    >>>
    >>>
    >>>
    >>> Justin.
    >>>

    >>
    >> Seems like this approach might work:
    >>
    >> Fill a list with records (hash-refs) sorted new to old. Limit to some
    >> reasonable number knowing that you can always add a few more if needed.
    >>
    >> Trial print the list while tracking the record count and page count to
    >> determine how many records you can print from the list.
    >>
    >> Take a slice from the list based on how many records needed (above), and
    >> reverse it.
    >>
    >> Print.

    >
    > That sounds quite reasonable. Yes, I like that. Thank you Steve.
    >
    > Justin.
    >


    YVW.

    Forgot one thing: depending on formatting and exactly how the line
    breaks work out, it IS possible that 24 pages forward will turn into 25
    (or 23) pages reversed. You might want to double check the final output
    page count.
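
One way to do that double check, sketched under assumptions: the candidate records are sorted oldest first, each is a hash ref with a `date` key, and the layout is counted in lines with invented values. `pages_needed` and `trim_to_fit` are hypothetical names. The forward page count is recomputed after each drop, which is slow for long lists but easy to reason about.

```perl
use strict;
use warnings;

# Count the pages needed to print the records front to back (oldest first).
sub pages_needed {
    my ($records, $l) = @_;
    my ($pages, $used, $date) = (0, 0, '');
    for my $rec (@$records) {
        my $need = $l->{item};
        $need += $l->{gap} + $l->{heading} if $rec->{date} ne $date;
        if ($pages == 0 || $used + $need > $l->{page}) {
            $pages++;
            $used = $l->{heading} + $l->{item};   # page opens with a heading
        }
        else {
            $used += $need;
        }
        $date = $rec->{date};
    }
    return $pages;
}

# Drop the oldest records one at a time until the forward count fits.
sub trim_to_fit {
    my ($records, $max_pages, $l) = @_;
    my @kept = @$records;
    shift @kept while @kept && pages_needed(\@kept, $l) > $max_pages;
    return @kept;
}

# Usage with made-up data and a hypothetical 60-line page.
my %layout     = (page => 60, heading => 2, gap => 1, item => 1);
my @candidates = map { +{ date => '2014-04-30', name => "Product $_" } } 1 .. 2000;
my @final      = trim_to_fit(\@candidates, 24, \%layout);
print scalar(@final), " records kept\n";
```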

    \s
    Steve May, May 1, 2014
    #14
  15. with <> Justin C wrote:
    > On 2014-05-01, Eric Pozharski <> wrote:
    >> with <> Justin C wrote:


    >>> I need to prepare a "Latest Products" document, the ...

    >> Please, define "document".

    > I am not sure it is relevant to the question I asked, but my
    > document is actually an Excel spreadsheet which will be printed
    > as a PDF.


    "Document", being defined, defines "page". From your conversations with
    others I'm glad to find out that your "document" looks more like
    text/plain (wrapped in application/pdf) than application/pdf itself.
    You don't need Excel for this. Or perl -- Excel would be closer to
    pages than perl.

    *CUT*

    --
    Torvalds' goal for Linux is very simple: World Domination
    Stallman's goal for GNU is even simpler: Freedom
    Eric Pozharski, May 2, 2014
    #15
  16. Justin C

    Justin C Guest

    On 2014-05-02, Eric Pozharski <> wrote:
    > with <> Justin C wrote:
    >> On 2014-05-01, Eric Pozharski <> wrote:
    >>> with <> Justin C wrote:

    >
    >>>> I need to prepare a "Latest Products" document, the ...
    >>> Please, define "document".

    >> I am not sure it is relevant to the question I asked, but my
    >> document is actually an Excel spreadsheet which will be printed
    >> as a PDF.

    >
    > "Document", being defined, defines "page". From your conversations with
    > others I'm glad to find out that your "document" looks more like
    > text/plain (wrapped in application/pdf) than application/pdf itself.
    > You don't need Excel for this. Or perl -- Excel would be closer to
    > pages than perl.


    Though we print the document for customers the original Excel
    file is also available to download from our web-site, and
    some customers are used to it that way, I wouldn't want to
    take that away from those that like it that way - who may
    have programs that read the file and 'do stuff' with the data.

    My preference would be for LaTeX to produce it, because I
    know it can do better typesetting than anything else in my
    toolbox, but that doesn't get around the desire to still
    support those who like the Excel file... maybe two files, a
    'pretty' PDF and a plain Excel file. Hmmm.

    I'm going to enjoy this!

    Justin.

    --
    Justin C, by the sea.
    Justin C, May 2, 2014
    #16
  17. Justin C

    Justin C Guest

    On 2014-04-30, Justin C <> wrote:
    >
    > I need to prepare a "Latest Products" document, the
    > contents are coming from a database, I've got to fill the
    > document with the latest and stop when the document is 24
    > pages, however, the document runs chronologically from
    > oldest to newest. I'm trying to work out how I can decide
    > which item/date to put at the start of the document, so I
    > don't run out of data before 24 pages, or over-run 24
    > pages.


    [snip]

    Thank you to all who have made suggestions. I have some
    interesting things to think about, and you have all helped
    me better understand what I have to achieve.



    Justin.

    --
    Justin C, by the sea.
    Justin C, May 2, 2014
    #17
  18. with <> Justin C wrote:

    *SKIP*
    > I'm going to enjoy this!


    No, you won't. It will be a terrible pain.

    p.s. I do typesetting for a living. And yes, it's texlive and
    friends.

    --
    Torvalds' goal for Linux is very simple: World Domination
    Stallman's goal for GNU is even simpler: Freedom
    Eric Pozharski, May 3, 2014
    #18
  19. ccc31807

    ccc31807 Guest

    On Wednesday, April 30, 2014 7:29:47 AM UTC-4, Justin C wrote:
    > I need to prepare a "Latest Products" document, the
    > contents are coming from a database, I've got to fill the
    > document with the latest and stop when the document is 24
    > pages, however, the document runs chronologically from
    > oldest to newest. I'm trying to work out how I can decide
    > which item/date to put at the start of the document, so I
    > don't run out of data before 24 pages, or over-run 24
    > pages.


    Is 'page' a physical page or a logical page?
    What format is your output?

    To begin with, I assume that you have your data in some kind of nested hash. I assume that your keys are some kind of date that can be sorted, like maybe a Julian date. I also assume that your SQL orders by date and returns a limit of 24. In that case, you would do this:

    # print_page() stands for whatever routine renders one logical page
    foreach my $product (sort keys %products)
    {
        print_page($products{$product});
    }

    This would work fine if your 'pages' were logical pages.

    If your pages are physical pages, and if PDF output is acceptable, it would be easy to use PDF::API2 or similar, which requires you to issue literal line feeds. Count down the lines until you reach the bottom, and start a new page. Quit when you hit 24.

    I've had great success recently using Perl to emit TeX source code and then calling

    `pdflatex source.tex`;

    If you feel comfortable using LaTeX I think you will be pleasantly surprised at how easily Perl can produce TeX, and this gives you exquisite control over the presentation of your output.
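
A small sketch of emitting LaTeX from Perl, assuming records are hash refs with `date` and `name` keys sorted oldest first. The sub names and the document structure (one unnumbered section per date, one list item per product) are illustrative choices, not something taken from the thread. Only the most common special characters are escaped; backslash, `~` and `^` are left alone here.

```perl
use strict;
use warnings;

# Escape the common LaTeX special characters: & % $ # _ { }
sub tex_escape {
    my ($text) = @_;
    $text =~ s/([&%#_{}\$])/\\$1/g;
    return $text;
}

# Build the LaTeX source: one \section* per date, one \item per product.
sub latex_source {
    my ($oldest_first) = @_;
    my $tex  = "\\documentclass{article}\n\\begin{document}\n";
    my $date = '';
    for my $rec (@$oldest_first) {
        if ($rec->{date} ne $date) {
            $tex .= "\\end{itemize}\n" if $date ne '';   # close previous section
            $tex .= "\\section*{" . tex_escape($rec->{date}) . "}\n"
                  . "\\begin{itemize}\n";
            $date = $rec->{date};
        }
        $tex .= "\\item " . tex_escape($rec->{name}) . "\n";
    }
    $tex .= "\\end{itemize}\n" if $date ne '';
    return $tex . "\\end{document}\n";
}

print latex_source([
    { date => '2014-04-29', name => 'Widget & Co' },
    { date => '2014-04-30', name => 'Gadget_2' },
]);

# To typeset (not run here): write the string to source.tex, then
#   system('pdflatex', '-interaction=nonstopmode', 'source.tex') == 0
#       or die "pdflatex failed: $?";
```

With this approach LaTeX decides the page breaks, so the 24-page limit would still have to be checked against the typeset result.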

    Perl is a great tool for this kind of job. For me, as long as you get your initial data structure right, this ought to be trivial.

    CC.
    ccc31807, May 12, 2014
    #19