Config::CONFIG['SHELL'] - windows/*nix/mac

Discussion in 'Ruby' started by Ara.T.Howard, Aug 19, 2004.

  1. Ara.T.Howard

    Ara.T.Howard Guest

    i'm working on a clustering system that runs jobs submitted to an nfs mounted
    queue from n feeding nodes. currently it's linux only and i use the following
    to ensure a user's job is executed on the remote node with the environment they
    are accustomed to

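    cmd = 'ls -ltar'

    reader, writer = IO.pipe

    if (cid = fork)
      # parent: send the command, then close so bash sees EOF and exits
      reader.close
      writer.puts cmd
      writer.close
      Process.wait cid
    else
      # child: become a login shell that reads its commands from the pipe
      writer.close
      STDIN.reopen reader
      exec 'bash --login'
    end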


    therefore their job executes in a login shell and their 'normal' environment
    is there. how would one go about this on windows? obviously the fork has to
    go but i'm getting around that by opening up a pipe to another ruby process
    using IO.popen and sending a little ruby program down the pipe in order to be
    able to fork/exec the job. this is being done so i can make the child ruby
    process do things like write me the pid back, redirect stdin/stdout, etc. -
    all in a portable way...
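
A minimal sketch of the IO.popen idea described above, not the actual code from the clustering system. The helper name `run_via_child_ruby` is hypothetical, and it assumes a Ruby new enough to provide `RbConfig.ruby` and the `<<~` heredoc. The parent sends a small Ruby program down the pipe to a second interpreter, which writes its pid back and then runs the job:

```ruby
require 'rbconfig'

# Hypothetical helper: run +cmd+ through a freshly spawned ruby interpreter
# and return [child_pid, job_output].
def run_via_child_ruby(cmd)
  # The little program sent down the pipe. It reports its own pid first,
  # then runs the job with system().
  program = <<~RUBY
    STDOUT.sync = true
    puts Process.pid
    system(#{cmd.inspect})
  RUBY

  # "ruby -" reads the program to execute from standard input.
  IO.popen([RbConfig.ruby, '-'], 'r+') do |child|
    child.write(program)
    child.close_write          # EOF tells the child the program is complete
    pid = Integer(child.gets)  # the first line written back is the pid
    [pid, child.read]          # everything after that is the job's output
  end
end

pid, output = run_via_child_ruby('echo hello')
puts "child pid: #{pid}"
puts output
```

Because the child is an ordinary Ruby process, it can also reopen its own STDIN/STDOUT or start a platform-specific shell before running the job. Which shell to start on Windows is left open here.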

    thoughts?

    -a
    --
    ===============================================================================
    | EMAIL :: Ara [dot] T [dot] Howard [at] noaa [dot] gov
    | PHONE :: 303.497.6469
    | A flower falls, even though we love it;
    | and a weed grows, even though we do not love it.
    | --Dogen
    ===============================================================================
     
    Ara.T.Howard, Aug 19, 2004
    #1

  2. Have you considered using DRb, instead of raw pipes, to coordinate the
    work on Windows? Assuming this is more of a load-balancing system
    than, say, a massively-parallel cluster environment, the overhead of
    DRb marshalling and unmarshalling shouldn't be a big deal, and you
    could probably just make your "front" object a simple job controller,
    which could spawn processes, pass them input, and send output back to
    the client.

    Just a thought.

    --
    Lennon
    rcoder.net
     
    Lennon Day-Reynolds, Aug 19, 2004
    #2

  3. Ara.T.Howard

    Ara.T.Howard Guest

    On Fri, 20 Aug 2004, Lennon Day-Reynolds wrote:

    > Have you considered using DRb, instead of raw pipes, to coordinate the
    > work on Windows? Assuming this is more of a load-balancing system
    > than, say, a massively-parallel cluster environment, the overhead of
    > DRb marshalling and unmarshalling shouldn't be a big deal, and you
    > could probably just make your "front" object a simple job controller,
    > which could spawn processes, pass them input, and send output back to
    > the client.
    >
    > Just a thought.



    my system has n feeding processes 'competing' to process jobs from an nfs
    mounted priority queue. this obviously involves some sort of nfs db and/or
    locking. this is all provided by sqlite and some other classes i've written
    (lockfile on raa). the advantages this has are

    - no single point of failure. if one node stays up the system continues

    - no networking needed (well NFS but that hardly counts). this is important
    because it means no ports - and that means no sysads. one thing i am
    striving for is that a user should be able to set a cluster up by simply
    running a piece of userland code pointing at an nfs mounted directory in
    under five minutes. the niche this aims for is something less complicated
    than sun grid engine, or other systems which use daemons to communicate
    jobs and to schedule them, and something more than simply spawning jobs by
    spawning ssh sessions all over the place. to my knowledge there is no
    such system. and it's tremendously useful in a scientific setting where
    one often just wants to throw 30 nodes at a list of jobs right NOW.
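
To make the "competing feeders" idea concrete, here is a rough sketch. It is not the sqlite-backed queue described in this post. It swaps in a simpler technique, one file per job claimed with `File.rename`, and the directory layout and the name `claim_next_job` are hypothetical. rename is atomic on a local POSIX filesystem. Whether that holds well enough on a particular NFS setup is an assumption that would need checking.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical layout: <queue_dir>/pending holds one file per job, named
# "<priority>-<job id>", and <queue_dir>/running/<node> holds claimed jobs.
def claim_next_job(queue_dir, node)
  pending = File.join(queue_dir, 'pending')
  mine = File.join(queue_dir, 'running', node)
  FileUtils.mkdir_p(mine)
  # Lowest-sorting name first, so "00-..." is tried before "10-...".
  Dir.children(pending).sort.each do |name|
    begin
      target = File.join(mine, name)
      File.rename(File.join(pending, name), target)
      return target   # the rename succeeded, so this node owns the job
    rescue Errno::ENOENT
      next            # another node claimed it first; try the next one
    end
  end
  nil                 # nothing left to claim
end

Dir.mktmpdir do |queue|
  FileUtils.mkdir_p(File.join(queue, 'pending'))
  File.write(File.join(queue, 'pending', '10-low'), 'sleep 1')
  File.write(File.join(queue, 'pending', '00-high'), 'ls -ltar')
  puts File.basename(claim_next_job(queue, 'node01'))
end
```

Every node runs the same loop against the same directory, so there is no scheduler process to fail. The shared filesystem is the only coordination point.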

    i considered drb for a really long time and it has the following disadvantages
    that i can see

    - must open ports. since sept. 11th. we have only ssh. period. ssh
    tunneling is an option but absolutely crazy when one starts considering
    how to keep ssh-agent running across reboots (we must use passphrases)
    without embedding passwords (forbidden and checked for here). plus the
    number of ssh tunnels needed is n^2 - this gets ridiculous when you have
    30 nodes!

    - if you have a scheduler you have a single point of failure. if all nodes
    can operate as the scheduler you need some sort of distributed locking
    protocol. you could use the filesystem and nfs safe locks here. if you
    have nfs safe locks you do not need drb and can simply put the queue in an
    nfs safe db (sqlite) and coordinate all actions via the filesystem.
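
For readers wondering what an "nfs safe lock" can look like, below is a sketch of the classic hard-link technique described in the Linux open(2) manual page. It is an illustration only and is not claimed to be how the lockfile library mentioned above works. The helper names `try_link_lock` and `release_link_lock` are hypothetical, and the temp-file naming assumes one locking thread per process.

```ruby
require 'socket'
require 'tmpdir'

# Hypothetical helper: try once to take the lock at +lock_path+.
# Returns true if this process now holds it, false otherwise.
def try_link_lock(lock_path)
  # A name unique per host and process, created next to the lock file.
  tmp = "#{lock_path}.#{Socket.gethostname}.#{Process.pid}"
  File.open(tmp, 'w') { |f| f.puts tmp }
  begin
    File.link(tmp, lock_path)   # fails if lock_path already exists
  rescue SystemCallError
    # The error alone is not trusted; the link count below decides.
  end
  File.stat(tmp).nlink == 2     # two names for one inode means we won
ensure
  File.unlink(tmp) if tmp && File.exist?(tmp)
end

# Hypothetical helper: give the lock back.
def release_link_lock(lock_path)
  File.unlink(lock_path)
end

Dir.mktmpdir do |dir|
  lock = File.join(dir, 'queue.lock')
  puts try_link_lock(lock)   # first attempt
  puts try_link_lock(lock)   # second attempt while the lock is still held
  release_link_lock(lock)
end
```

A real implementation would also need retries, timeouts, and a way to detect stale locks left behind by crashed nodes, none of which is shown here.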

    of course, you could start using something like a tuple space to
    coordinate - but again you have a single point of failure...

    i cannot see how one can either

    - eliminate a single point of failure using drb

    - make the system decentralized (all daemons are servants) without
    requiring some form of locking - thereby eliminating the need for drb
    in the first place

    - code like it is already written - condor, sge (sun grid engine) and they
    have LOTS of problems. scheduling is tough.

    if you have suggestions i'm all ears.

    also, i should point out that virtually every scientific cluster in our building
    already relies on nfs and locking to some degree so while it's true that the
    nfs server itself is a single point of failure (and network of course) these
    things are already inherent in the system and my code adds no MORE points of
    failure.

    the system must run in the face of problems or i come in on weekends! ;-( so
    i will not willingly introduce single points of failure into the system. the
    present setup requires only that sysads come in on the weekend - not me - so i'd
    like to keep it that way.


    in short i would LOVE to use drb for many reasons, but cannot come up with a
    fault tolerant way to deal with ssh tunneling, scheduling, and locking that
    does not make nfs mounted work queues a simpler solution in the process.

    thoughts?

    -a
     
    Ara.T.Howard, Aug 19, 2004
    #3
