Communication across Perl scripts

Discussion in 'Perl Misc' started by Jean, Oct 11, 2010.

  1. Jean

    Jean Guest

    I am searching for efficient ways of communication across two Perl
    scripts. I have two scripts; Script 1 generates some data. I want my
    script two to be able to access that information. The easiest/dumbest
    way is to write the data generated by script 1 as a file and read it
    later using script 2. Is there any other way than this? Can I store
    the data in memory and make it available to script 2 (with support
    from my Linux system, of course)? Meaning, malloc some data in script 1
    and make script 2 able to access it.

    There is no guarantee that Script 2 will be run after Script 1. So
    there should be some way to free that memory using a watchdog timer.
     
    Jean, Oct 11, 2010
    #1

  2. Ted Zlatanov

    Ted Zlatanov Guest

    On Mon, 11 Oct 2010 09:25:25 -0700 (PDT) Jean <> wrote:

    J> I am searching for efficient ways of communication across two Perl
    J> scripts. I have two scripts; Script 1 generates some data. I want my
    J> script two to be able to access that information. The easiest/dumbest
    J> way is to write the data generated by script 1 as a file and read it
    J> later using script 2. Is there any other way than this ? Can I store
    J> the data in memory and make it available to script two (of-course with
    J> support from my Linux ) ? Meaning malloc somedata by script 1 and make
    J> script 2 able to access it.

    J> There is no guarantee that Script 2 will be run after Script 1. So
    J> there should be some way to free that memory using a watchdog timer.

    Depends on your latency and load requirements.

    If you need speed, shared memory is probably your best bet.
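
    For instance, a rough sketch with the CPAN module IPC::Shareable (the
    'glue' key, the mode, and the hash layout are just placeholders, and
    this is untested here):

    use IPC::Shareable;

    # script 1: create the shared segment and publish the data
    tie my %shared, 'IPC::Shareable', 'glue', { create => 1, mode => 0666 };
    $shared{payload} = "data produced by script 1";

    # script 2: attach to the same segment (same 'glue' key) and read it
    tie my %shared, 'IPC::Shareable', 'glue', { create => 0 };
    print $shared{payload}, "\n";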

    If you need easy reliable implementation, put the information in files
    (you can notify the reader there's a new file with fam/inotify or
    SIGUSR1). That's not a dumb way as long as you implement it properly
    and it fits your requirements.
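
    For the notification part, something along these lines with
    Linux::Inotify2 could work (untested sketch; the drop directory is
    just an example, and script 1 would write its finished file there):

    use Linux::Inotify2;

    my $inotify = Linux::Inotify2->new or die "inotify init failed: $!";
    $inotify->watch("/tmp/drop", IN_CLOSE_WRITE, sub {
        my $event = shift;
        print "new file from script 1: ", $event->fullname, "\n";
        # ... read and process the file here ...
    });
    $inotify->poll while 1;   # block, dispatching the callback as files arrive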

    If you need low latency, use a message queue.

    Ted
     
    Ted Zlatanov, Oct 11, 2010
    #2

  3. Ted Zlatanov

    Ted Zlatanov Guest

    On Mon, 11 Oct 2010 13:09:06 -0400 Sherm Pendley <> wrote:

    SP> Jean <> writes:
    >> I am searching for efficient ways of communication across two Perl
    >> scripts.


    SP> Options are plentiful. Have a look at "perldoc perlipc" for a good
    SP> overview.

    Unfortunately that page doesn't mention (nor should it) databases,
    message queues, ESBs, loopback network interfaces, etc. Each one of
    those may have distinct advantages over plain IPC, depending on the OS,
    environment, policies, and existing infrastructure.

    Ted
     
    Ted Zlatanov, Oct 11, 2010
    #3
  4. On Mon, 11 Oct 2010 14:17:58 -0500, Ted Zlatanov wrote:

    > If you need low latency, use a message queue.


    Speaking of message queues, what do people recommend on Unix/Linux?

    M4
     
    Martijn Lievaart, Oct 11, 2010
    #4
  5. Jean-Luc

    Guest

    On Oct 11, 10:25 am, Jean <> wrote:
    > I am searching for efficient ways of communication across two Perl
    > scripts. I have two scripts; Script 1 generates some data. I want my
    > script two to be able to access that information.


    > There is no guarantee that Script 2 will be run after Script 1. So
    > there should be some way to free that memory using a watchdog timer.


    It sounds like there's no guarantee that the two scripts will even
    overlap while running. Unless you write your data to a file on disk,
    you'll need another program to act as some sort of broker to manage
    the data you want to share.

    You could try using a third-party broker, or perhaps use an SQL
    database to store your data. ...or you could just write what you want
    to share to disk, to be picked up by Script 2.


    > The easiest/dumbest way is to write the data generated by
    > script 1 as a file and read it later using script 2.


    That may be easiest, but I don't think it's the dumbest. And if
    you use this approach, I highly recommend using the "Storable" module
    (it's a standard module so you should already have it). If you have a
    reference to data in Script 1 (for example, $dataReference), you can
    save it in one line (if you don't count the "use Storable" line), like
    this:

    use Storable qw(lock_nstore lock_retrieve);
    lock_nstore($dataReference, "file_name");   # write with file locking, in network byte order

    and then Script 2 can read it in with one line like this:

    use Storable qw(lock_nstore lock_retrieve);
    my $dataReference = lock_retrieve("file_name");   # read it back, again under a file lock

    Now Script 1 and Script 2 should both have a $dataReference that
    refers to identical data.

    Type "perldoc Storable" at the Unix/DOS prompt to read more about
    this module.

    It's hard to get much simpler than this. You might be tempted to
    write your own file-writing and file-reading code, but if you do,
    you'll have to handle your own file locking and your own to/from file
    stream conversions. (And that'll probably take more than just two
    lines of code to implement.)

    If you're good with SQL, you may want to try a DBI module like
    DBD::SQLite. The database for SQLite is stored on disk (so you don't
    need a third-party program to manage the data), and it gives you
    flexibility: if you ever have to move your shared data to a database
    server, most of the data-sharing code will remain unchanged.
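
    A minimal sketch of that route, with a made-up table and file name
    just for illustration:

    use DBI;

    # Script 1: write rows into a shared on-disk database file
    my $dbh = DBI->connect("dbi:SQLite:dbname=shared.db", "", "",
                           { RaiseError => 1, AutoCommit => 1 });
    $dbh->do("CREATE TABLE IF NOT EXISTS results (name TEXT, value TEXT)");
    $dbh->do("INSERT INTO results (name, value) VALUES (?, ?)",
             undef, "answer", 42);

    # Script 2: connect to the same file and read the rows back
    my $dbh2 = DBI->connect("dbi:SQLite:dbname=shared.db", "", "",
                            { RaiseError => 1 });
    my $rows = $dbh2->selectall_arrayref("SELECT name, value FROM results");
    print "$_->[0] = $_->[1]\n" for @$rows;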

    Also, don't forget to "use strict;" and "use warnings;" if you
    aren't using them already; they'll save you lots of headaches in the
    long run.

    I hope this helps,

    -- Jean-Luc
     
    , Oct 11, 2010
    #5
  6. C.DeRykus

    C.DeRykus Guest

    On Oct 11, 9:25 am, Jean <> wrote:
    > I am searching for efficient ways of communication across two Perl
    > scripts. I have two scripts; Script 1 generates some data. I want my
    > script two to be able to access that information. The easiest/dumbest
    > way is to write the data generated by script 1 as a file and read it
    > later using script 2. Is there any other way than this ? Can I store
    > the data in memory and make it available to script two (of-course with
    > support from my Linux ) ? Meaning malloc somedata by script 1 and make
    > script 2 able to access it.
    >
    > There is no guarantee that Script 2 will be run after Script 1. So
    > there should be some way to free that memory using a watchdog timer.


    It sounds like a named pipe (see perlipc) would be
    the easiest, most straightforward solution. (See
    T.Zlatanov's suggestions though for other possible
    non-IPC solutions which, depending on the exact
    scenario, may be a better fit.)

    With a named pipe though, each script just deals
    with the named file for reading or writing while
    the OS takes care of the messy IPC details for
    you. The 2nd script will just block until data
    is available so running order isn't a concern. As
    long as the two scripts are running more or less
    concurrently, I would guess memory use will be
    manageable too since the reader will be draining
    the pipe as the data arrives.
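
    A bare-bones sketch of that (the path and permissions are arbitrary,
    and this is untested here):

    use POSIX qw(mkfifo);

    my $fifo = '/tmp/script_data.fifo';
    unless (-p $fifo) {
        mkfifo($fifo, 0600) or die "mkfifo failed: $!";
    }

    # script 1 (writer): open() blocks until a reader opens the other end
    open my $wr, '>', $fifo or die "open for writing: $!";
    print {$wr} "some data from script 1\n";
    close $wr;

    # script 2 (reader): reads block until data arrives
    open my $rd, '<', $fifo or die "open for reading: $!";
    print while <$rd>;
    close $rd;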

    --
    Charles DeRykus
     
    C.DeRykus, Oct 11, 2010
    #6
  7. Jean wrote:
    > I am searching for efficient ways of communication across two Perl
    > scripts. I have two scripts; Script 1 generates some data. I want my
    > script two to be able to access that information. The easiest/dumbest
    > way is to write the data generated by script 1 as a file and read it
    > later using script 2.


    This is usually not dumb. It is often the best way to do it.
    Intermediate files and shell pipelines are by far the most common way
    for me to do this--I never use anything other than those two unless I
    have a compelling reason. Maybe you have a compelling reason; I don't
    know, and you haven't given us enough information to tell.

    (Well, the third default option is to reconsider whether these two
    scripts really need to be separate rather than one script. I assume
    you already considered that and rejected it for some good reason.)

    > Is there any other way than this ? Can I store
    > the data in memory and make it available to script two (of-course with
    > support from my Linux ) ? Meaning malloc somedata by script 1 and make
    > script 2 able to access it.


    There are many ways to do this, and AFAIK they all either leave a lot to
    be desired, or introduce annoying and subtle complexities.

    > There is no guarantee that Script 2 will be run after Script 1. So
    > there should be some way to free that memory using a watchdog timer.


    Can't you control the timing of the execution of your scripts?

    Xho
     
    Xho Jingleheimerschmidt, Oct 12, 2010
    #7
  8. Ted Zlatanov

    Ted Zlatanov Guest

    On Mon, 11 Oct 2010 22:14:29 +0200 Martijn Lievaart <> wrote:

    ML> On Mon, 11 Oct 2010 14:17:58 -0500, Ted Zlatanov wrote:
    >> If you need low latency, use a message queue.


    ML> Speaking of message queues, what do people recommend on Unix/Linux?

    I've heard positive things about http://www.rabbitmq.com/ but haven't
    used it myself. There's a lot of others, see
    http://en.wikipedia.org/wiki/Category:Message-oriented_middleware
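
    If you go the RabbitMQ route from Perl, a client such as
    Net::AMQP::RabbitMQ works roughly like this (a rough, untested
    sketch; host, credentials and queue name are all placeholders):

    use Net::AMQP::RabbitMQ;

    my $mq = Net::AMQP::RabbitMQ->new;
    $mq->connect("localhost", { user => "guest", password => "guest" });
    $mq->channel_open(1);
    $mq->queue_declare(1, "script_data");

    # producer side (script 1): publish to the default exchange
    $mq->publish(1, "script_data", "payload from script 1", { exchange => "" });

    # consumer side (script 2): pull one message if available
    my $msg = $mq->get(1, "script_data");
    print $msg->{body}, "\n" if $msg;

    $mq->disconnect;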

    Depending on your needs, TIBCO may fit. It's very popular in the
    financial industry and in my experience has been a pretty good system
    over the last 3 years I've used it. The Perl bindings
    are... well... usable. The major headaches I've had were when the
    process is slow handling incoming data. Unless you write your Perl very
    carefully, it's easy to block and balloon the memory size (because
    TIBCO's queue uses your own application's memory) to multi-gigabyte
    footprints. So forget about database interactions, for instance--you
    have to move them to a separate process and use IPC or file drops.
    Threads (as in "use threads") are probably a bad idea too.

    Ted
     
    Ted Zlatanov, Oct 12, 2010
    #8
  9. Ted Zlatanov

    Ted Zlatanov Guest

    On Mon, 11 Oct 2010 14:25:59 -0700 (PDT) "C.DeRykus" <> wrote:

    CD> With a named pipe though, each script just deals with the named file
    CD> for reading or writing while the OS takes care of the messy IPC
    CD> details for you. The 2nd script will just block until data is
    CD> available so running order isn't a concern. As long as the two
    CD> scripts are running more or less concurrently, I would guess memory
    CD> use will be manageable too since the reader will be draining the
    CD> pipe as the data arrives.

    The only warning I have there is that pipes are pretty slow and have
    small buffers by default in the Linux kernel (assuming Linux). I forget
    exactly why, I think it's due to terminal disciplines or something, I
    didn't dig too much. I ran into this earlier this year.

    So if you have a fast writer pipes can be problematic.

    Ted
     
    Ted Zlatanov, Oct 12, 2010
    #9
  10. On Mon, 11 Oct 2010 19:55:03 -0500, Ted Zlatanov wrote:

    > On Mon, 11 Oct 2010 22:14:29 +0200 Martijn Lievaart <>
    > wrote:
    >
    > ML> On Mon, 11 Oct 2010 14:17:58 -0500, Ted Zlatanov wrote:
    >>> If you need low latency, use a message queue.

    >
    > ML> Speaking of message queues, what do people recommend on Unix/Linux?
    >
    > I've heard positive things about http://www.rabbitmq.com/ but haven't
    > used it myself. There's a lot of others, see
    > http://en.wikipedia.org/wiki/Category:Message-oriented_middleware


    Thanks, I'll look into it.

    M4
     
    Martijn Lievaart, Oct 12, 2010
    #10
  11. "" <> writes:

    > That may be easiest, but I don't think it's the dumbest. And if
    > you use this approach, I highly recommend using the "Storable" module
    > (it's a standard module so you should already have it).


    As long as you only use it on a single host for very temporary files,
    Storable is fine. But I have been bitten by Storable not being
    compatible between versions or different installations one time too
    many to call it 'highly recommended'.

    If you need support for every possible Perl structure then Storable is
    probably the only almost-viable solution. But if simple trees of
    hashrefs and arrayrefs are good enough, then I consider JSON::XS a
    better choice.
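
    Something like this, for example (untested sketch; the file name is
    arbitrary, and $dataReference is the structure script 1 built):

    use JSON::XS;

    # script 1: dump the structure as JSON text
    open my $out, '>', 'data.json' or die "write: $!";
    print {$out} encode_json($dataReference);
    close $out;

    # script 2: slurp the file and decode it
    open my $in, '<', 'data.json' or die "read: $!";
    my $dataReference = decode_json(do { local $/; <$in> });
    close $in;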


    But it all depends on the exact needs, and the original poster might
    never run into the situations where Storable shows its nasty sides,
    and might not need the extra speed of JSON::XS or its more
    future-proof and portable format.

    //Makholm
     
    Peter Makholm, Oct 12, 2010
    #11
  12. Dr.Ruud

    Dr.Ruud Guest

    On 2010-10-11 18:25, Jean wrote:

    > I have two scripts; Script 1 generates some data. I want my
    > script two to be able to access that information. The easiest/dumbest
    > way is to write the data generated by script 1 as a file and read it
    > later using script 2. Is there any other way than this ?


    I normally use a database for that. Script-1 can usually be scaled up
    by making it do things in parallel (by chunking the input in an
    obviously non-interdependent way).

    Script-2 can also just be a phase in script-1. Once all children are
    done processing, there normally is a reporting phase.


    > There is no guarantee that Script 2 will be run after Script 1. So
    > there should be some way to free that memory using a watchdog timer.


    When the intermediate data is in temporary database tables, they
    disappear automatically with the close of the connection.
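
    For example (a sketch; the table and columns are made up):

    use DBI;

    my $dbh = DBI->connect("dbi:SQLite:dbname=shared.db", "", "",
                           { RaiseError => 1 });

    # TEMPORARY tables are private to this connection and vanish when it closes
    $dbh->do("CREATE TEMPORARY TABLE intermediate (id INTEGER, payload TEXT)");
    $dbh->do("INSERT INTO intermediate (id, payload) VALUES (?, ?)",
             undef, 1, "chunk one");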

    --
    Ruud
     
    Dr.Ruud, Oct 12, 2010
    #12
  13. On Tue, 12 Oct 2010 10:05:57 +0200, Peter Makholm wrote:

    > "" <> writes:
    >
    >> That may be easiest, but I don't think it's the dumbest. And if
    >> you use this approach, I highly recommend using the "Storable" module
    >> (it's a standard module so you should already have it).

    >
    > As long as you just use it for a single host for very temporary files,
    > Storable is fine. But I have been bitten by Storable not being
    > compatible between versions or different installations one time to many
    > to call it 'highly recommended'.


    Another way might be Data::Dumper.
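
    Roughly like this (untested sketch; the file name is arbitrary, and
    do() should only be used on files you trust, since it executes them
    as Perl code):

    use Data::Dumper;

    # script 1: write the structure out as Perl source
    $Data::Dumper::Purity = 1;              # emit fixups for self-referential data
    open my $out, '>', 'data.pl' or die "write: $!";
    print {$out} Dumper($dataReference);    # produces: $VAR1 = { ... };
    close $out;

    # script 2: evaluate the dump to get the structure back
    my $dataReference = do 'data.pl' or die "couldn't load dump: $@ $!";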

    M4
     
    Martijn Lievaart, Oct 12, 2010
    #13
  14. >>>>> "Jean" == Jean <> writes:

    Jean> I am searching for efficient ways of communication across two Perl
    Jean> scripts. I have two scripts; Script 1 generates some data. I want my
    Jean> script two to be able to access that information.

    Look at DBM::Deep for a trivial way to store structured data, including
    having transactions so the data will change "atomically".
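
    In its simplest form (untested sketch; the file name and keys are
    made up) it behaves like an ordinary hash that happens to live on
    disk:

    use DBM::Deep;

    # script 1: everything assigned here is written straight to the file
    my $db = DBM::Deep->new("shared.db");
    $db->{results} = { count => 42, items => [ 1, 2, 3 ] };

    # script 2: open the same file and read the structure back
    my $db2 = DBM::Deep->new("shared.db");
    print $db2->{results}{count}, "\n";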

    And despite the name... DBM::Deep has no XS components... so it can even
    be installed in a hosted setup with limited ("no") access to compilers.

    Disclaimer: Stonehenge paid for part of the development of DBM::Deep,
    because yes, it's *that* useful.

    print "Just another Perl hacker,"; # the original

    --
    Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
    <> <URL:http://www.stonehenge.com/merlyn/>
    Smalltalk/Perl/Unix consulting, Technical writing, Comedy, etc. etc.
    See http://methodsandmessages.posterous.com/ for Smalltalk discussion
     
    Randal L. Schwartz, Oct 12, 2010
    #14
  15. Jean-Luc

    Guest

    On Oct 12, 2:05 am, Peter Makholm <> wrote:
    >
    > As long as you just use it for a single host for very temporary files,
    > Storable is fine. But I have been bitten by Storable not being
    > compatible between versions or different installations one time to
    > many to call it 'highly recommended'.



    I was under the impression that Storable::nstore() was cross-
    platform compatible (as opposed to Storable::store(), which isn't).
    "perldoc Storable" has this to say about it:

    > You can also store data in network order to allow easy
    > sharing across multiple platforms, or when storing on a
    > socket known to be remotely connected. The routines to
    > call have an initial "n" prefix for *network*, as in
    > "nstore" and "nstore_fd".


    Unfortunately, it doesn't really specify the extent of what was
    meant by "multiple platforms". I always thought that meant any
    platform could read data written out by nstore(), but since I've never
    tested it, I can't really be sure.

    When you said you were "bitten" by Storable, were you using
    Storable::store(), or Storable::nstore()?

    -- Jean-Luc
     
    , Oct 12, 2010
    #15
  16. On 2010-10-12 00:59, Ted Zlatanov <> wrote:
    > On Mon, 11 Oct 2010 14:25:59 -0700 (PDT) "C.DeRykus" <> wrote:
    > CD> With a named pipe though, each script just deals with the named file
    > CD> for reading or writing while the OS takes care of the messy IPC
    > CD> details for you. The 2nd script will just block until data is
    > CD> available so running order isn't a concern. As long as the two
    > CD> scripts are running more or less concurrently, I would guess memory
    > CD> use will be manageable too since the reader will be draining the
    > CD> pipe as the data arrives.
    >
    > The only warning I have there is that pipes are pretty slow and have
    > small buffers by default in the Linux kernel (assuming Linux).


    Hmm. On my system (a 1.86 GHz Core2 - not ancient, but not the latest
    and greatest, either) I can transfer about 800 MB/s through a pipe at
    32 kB buffer size. For larger buffers it gets a bit slower, but a buffer
    size of 1MB is still quite ok.

    You may confuse that with other systems. Windows pipes have a reputation
    for being slow. Traditionally Unix pipes were restricted to a rather
    small buffer (8 or 10 kB). I do think Linux pipes become synchronous for
    large writes, though.

    > I forget exactly why, I think it's due to terminal disciplines or
    > something, I didn't dig too much.


    Unix pipes have nothing to do with terminals. Originally they were
    implemented as files, BSD 4.x reimplemented them on top of Unix sockets.
    I don't know how Linux implements them, but I'm quite sure that no
    terminals are involved, and certainly no terminal disciplines.
    Are you confusing them with ptys, perhaps?

    > I ran into this earlier this year.


    Can you dig up the details?

    hp
     
    Peter J. Holzer, Oct 12, 2010
    #16
  17. Bart Lateur

    Bart Lateur Guest

    Randal L. Schwartz wrote:

    >Look at DBM::Deep for a trivial way to store structured data, including
    >having transactions so the data will change "atomically".
    >
    >And despite the name... DBM::Deep has no XS components... so it can even
    >be installed in a hosted setup with limited ("no") access to compilers.
    >
    >Disclaimer: Stonehenge paid for part of the development of DBM::Deep,
    >because yes, it's *that* useful.


    Ouch. DBM::Deep is buggy, in my experience.

    I don't know the exact circumstances, but when using it to cache the XML
    contents of user home nodes on Perlmonks, I regularly get crashes in it.
    It has something to do with changing size of the data, IIRC from larger
    than 8k to below 8k. But I could have gotten these details wrong, as it
    has been a long time since I last tried it.

    --
    Bart.
     
    Bart Lateur, Oct 13, 2010
    #17
  18. paul

    paul Guest

    On Oct 13, 2:07 pm, Bart Lateur <> wrote:
    > Randal L. Schwartz wrote:
    > >Look at DBM::Deep for a trivial way to store structured data, including
    > >having transactions so the data will change "atomically".

    >
    > >And despite the name... DBM::Deep has no XS components... so it can even
    > >be installed in a hosted setup with limited ("no") access to compilers.

    >
    > >Disclaimer: Stonehenge paid for part of the development of DBM::Deep,
    > >because yes, it's *that* useful.

    >
    > Ouch. DBM::Deep is buggy, in my experience.
    >
    > I don't know the exact circumstances, but when using it to cache the XML
    > contents of user home nodes on Perlmonks, I regularly get crashes in it.
    > It has something to do with changing size of the data, IIRC from larger
    > than 8k to below 8k. But I could have gotten these details wrong, as it
    > has been many since I last tried it.
    >
    > --
    >         Bart.


    You can try named pipes, a special type of file that allows for
    interprocess communication. By using the "mknod" command you can
    create a named pipe file, which one process can open for reading
    and another for writing.
     
    paul, Oct 13, 2010
    #18
  19. Ted Zlatanov

    Ted Zlatanov Guest

    On Tue, 12 Oct 2010 22:24:47 +0200 "Peter J. Holzer" <> wrote:

    PJH> On 2010-10-12 00:59, Ted Zlatanov <> wrote:
    >> On Mon, 11 Oct 2010 14:25:59 -0700 (PDT) "C.DeRykus" <> wrote:

    CD> With a named pipe though, each script just deals with the named file
    CD> for reading or writing while the OS takes care of the messy IPC
    CD> details for you. The 2nd script will just block until data is
    CD> available so running order isn't a concern. As long as the two
    CD> scripts are running more or less concurrently, I would guess memory
    CD> use will be manageable too since the reader will be draining the
    CD> pipe as the data arrives.
    >>
    >> The only warning I have there is that pipes are pretty slow and have
    >> small buffers by default in the Linux kernel (assuming Linux).


    PJH> Hmm. On my system (a 1.86 GHz Core2 - not ancient, but not the latest
    PJH> and greatest, either) I can transfer about 800 MB/s through a pipe at
    PJH> 32 kB buffer size. For larger buffers it gets a bit slower, but a buffer
    PJH> size of 1MB is still quite ok.

    Hmm, sorry for stating that badly.

    The biggest problem is that pipes *block* normally. So even if your
    reader is slow only once in a while, as long as you're using the default
    buffer (which is small), your writer will block too. In my situation
    (the writer was receiving data from TIBCO) that was deadly.

    I meant to say that but somehow it turned into "pipes are slow" between
    brain and keyboard. Sorry.

    PJH> You may confuse that with other systems. Windows pipes have a
    PJH> reputation for being slow.

    Yes, on Windows we had even more trouble for many reasons. But I was
    only talking about Linux so I won't take that bailout :)

    >> I forget exactly why, I think it's due to terminal disciplines or
    >> something, I didn't dig too much.


    PJH> Unix pipes have nothing to do with terminals. Originally they were
    PJH> implemented as files, BSD 4.x reimplemented them on top of Unix sockets.
    PJH> I don't now how Linux implements them, but I'm quite sure that no
    PJH> terminals are involved, and certainly no terminal disciplines.
    PJH> Are you confusing them with ptys, perhaps?

    Probably. I was on a tight deadline and the pipe approach simply did
    not work, so I couldn't investigate in more detail. There's a lot more
    resiliency in a file drop approach, too: if either side dies, the other
    one is not affected. There is no leftover mess like with shared memory,
    either. So I've been pretty happy with the file drop.

    Ted
     
    Ted Zlatanov, Oct 14, 2010
    #19
