how to simulate tar filename substitution across piped subprocess.Popen() calls?

Discussion in 'Python' started by jkn, Nov 8, 2012.

  1. jkn

    jkn Guest

    Hi All
    I am trying to build up a set of subprocess.Popen calls to
    replicate the effect of a horribly long shell command. I'm not clear
    how I can do one part of this and wonder if anyone can advise. I'm on
    Linux, fairly obviously.

    I have a command which (simplified) is a tar -c command piped through
    to xargs:

    tar -czvf myfile.tgz -c $MYDIR mysubdir/ | xargs -I '{}' sh -c "test -f $MYDIR/'{}'"

    (The full command is more complicated than this; I got it from a shell
    guru).

    IIUC, when called like this, the two occurrences of '{}' in the xargs
    command will get replaced with the file being added to the tarfile.

    Also IIUC, I will need two calls to subprocess.Popen() and use
    subprocess.stdin on the second to receive the output from the first.
    But how can I achieve the substitution of the '{}' construction across
    these two calls?

    Apologies if I've made any howlers in this description - it's very
    likely...

    Cheers
    J^n
    jkn, Nov 8, 2012
    #1

  2. jkn

    Hans Mulder Guest

    Re: how to simulate tar filename substitution across piped subprocess.Popen() calls?

    On 8/11/12 19:05:11, jkn wrote:
    > Hi All
    > I am trying to build up a set of subprocess.Popen calls to
    > replicate the effect of a horribly long shell command. I'm not clear
    > how I can do one part of this and wonder if anyone can advise. I'm on
    > Linux, fairly obviously.
    >
    > I have a command which (simplified) is a tar -c command piped through
    > to xargs:
    >
    > tar -czvf myfile.tgz -c $MYDIR mysubdir/ | xargs -I '{}' sh -c "test -f $MYDIR/'{}'"
    >
    > (The full command is more complicated than this; I got it from a shell
    > guru).
    >
    > IIUC, when called like this, the two occurrences of '{}' in the xargs
    > command will get replaced with the file being added to the tarfile.
    >
    > Also IIUC, I will need two calls to subprocess.Popen() and use
    > subprocess.stdin on the second to receive the output from the first.
    > But how can I achieve the substitution of the '{}' construction across
    > these two calls?


    That's what 'xargs' will do for you. All you need to do, is invoke
    xargs with arguments containing '{}'. I.e., something like:

    cmd1 = ['tar', '-czvf', 'myfile.tgz', '-c', mydir, 'mysubdir']
    first_process = subprocess.Popen(cmd1, stdout=subprocess.PIPE)

    cmd2 = ['xargs', '-I', '{}', 'sh', '-c', "test -f %s/'{}'" % mydir]
    second_process = subprocess.Popen(cmd2, stdin=first_process.stdout)

    > Apologies if I've made any howlers in this description - it's very
    > likely...


    I think the second '-c' argument to tar should have been a '-C'.

    I'm not sure I understand what the second command is trying to
    achieve. On my system, nothing happens, because tar writes the
    names of the files it is adding to stderr, so xargs receives no
    input at all. If I send the stderr from tar to the stdin of
    xargs, then it still doesn't seem to do anything sensible.

    Perhaps your real xargs command is more complicated and more
    sensible.
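
For reference, a minimal sketch of wiring two Popen objects together without a shell, following the general pattern the subprocess documentation describes for replacing a shell pipeline. The helper name `pipe_two` is hypothetical, and the tar/xargs lists in the trailing comment are the ones from this thread with `-C` corrected; `mydir` is assumed to be defined by the caller.

```python
import subprocess


def pipe_two(cmd1, cmd2):
    """Run cmd1 | cmd2 without a shell and return cmd2's stdout as bytes.

    pipe_two is a hypothetical helper name, not part of subprocess.
    """
    first = subprocess.Popen(cmd1, stdout=subprocess.PIPE)
    second = subprocess.Popen(cmd2, stdin=first.stdout,
                              stdout=subprocess.PIPE)
    # Close the parent's copy of the pipe so that cmd1 can receive
    # SIGPIPE if cmd2 exits early.
    first.stdout.close()
    output, _ = second.communicate()
    first.wait()
    return output


# Usage with the commands from this thread (mydir assumed to be defined):
# cmd1 = ['tar', '-czvf', 'myfile.tgz', '-C', mydir, 'mysubdir']
# cmd2 = ['xargs', '-I', '{}', 'sh', '-c', "test -f %s/'{}'" % mydir]
# listing = pipe_two(cmd1, cmd2)
```

Each command stays a plain argument list, so no quoting is needed at the Python level; only the string handed to `sh -c` is ever parsed by a shell.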



    Hope this helps,

    -- HansM
    Hans Mulder, Nov 9, 2012
    #2

  3. jkn

    jkn Guest

    Hi Hans
    thanks a lot for your reply:

    > That's what 'xargs' will do for you.  All you need to do, is invoke
    > xargs with arguments containing '{}'.  I.e., something like:
    >
    > cmd1 = ['tar', '-czvf', 'myfile.tgz', '-c', mydir, 'mysubdir']
    > first_process = subprocess.Popen(cmd1, stdout=subprocess.PIPE)
    >
    > cmd2 = ['xargs', '-I', '{}', 'sh', '-c', "test -f %s/'{}'" % mydir]
    > second_process = subprocess.Popen(cmd2, stdin=first_process.stdout)
    >


    Hmm - that's pretty much what I've been trying. I will have to
    experiment a bit more and post the results in a bit more detail.

    > > Apologies if I've made any howlers in this description - it's very
    > > likely...

    >


    > I think the second '-c' argument to tar should have been a '-C'.


    You are correct, thanks. Serves me right for typing the simplified
    version in by hand. I actually use the equivalent "--directory=..." in
    the actual code.

    > I'm not sure I understand what the second command is trying to
    > achieve.  On my system, nothing happens, because tar writes the
    > names of the files it is adding to stderr, so xargs receives no
    > input at all.  If I send the stderr from tar to the stdin of
    > xargs, then it still doesn't seem to do anything sensible.


    That's interesting ... on my system, and all others that I know about,
    the file list goes to stdout.
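
Which stream carries the `-v` listing seems to depend on the tar implementation. As an assumption that is not established in this thread: GNU tar writes the listing to stdout when the archive goes to a file, while BSD tar writes it to stderr. A minimal sketch that captures both streams, so the names are seen either way; `capture_listing` is a hypothetical helper name.

```python
import subprocess


def capture_listing(cmd):
    """Return the combined stdout and stderr of cmd as a list of lines.

    capture_listing is a hypothetical helper. Merging the two streams
    means any error messages end up mixed in with the file names.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT,
                            universal_newlines=True)
    output, _ = proc.communicate()
    return output.splitlines()
```

Because error text can end up in the same list, this is probably better suited to debugging the pipeline than to production use.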

    > Perhaps your real xargs command is more complicated and more
    > sensible.


    Yes, in fact the output from xargs is piped to a third process. But I
    realise this doesn't alter the result of your experiment; the xargs
    process should filter a subset of the files being fed to it.

    I will experiment a bit more and hopefully post some results. Thanks
    in the meantime...

    Regards
    Jon N
    jkn, Nov 12, 2012
    #3
  4. jkn

    jkn Guest

    slight followup ...

    I have made some progress; for now I'm using subprocess.communicate to
    read the output from the first subprocess, then writing it into the
    second subprocess. This way I at least get to see what is
    happening ...

    The reason 'we' weren't seeing any output from the second call (the
    'xargs') is that as mentioned I had simplified this. The actual shell
    command was more like (in python-speak):

    "xargs -I {} sh -c \"test -f %s/{} && md5sum %s/{}\"" % (mydir, mydir)

    ie. I am running md5sum on each tar-file entry which passes the 'is
    this a file' test.

    My next problem; how to translate the command-string clause

    "test -f %s/{} && md5sum %s/{}" # ...

    into a parameter to subprocess.Popen(). I think it's the command
    chaining '&&' which is tripping me up...

    Cheers
    J^n
    jkn, Nov 12, 2012
    #4
  5. jkn

    Hans Mulder Guest

    Re: how to simulate tar filename substitution across piped subprocess.Popen() calls?

    On 12/11/12 16:36:58, jkn wrote:
    > slight followup ...
    >
    > I have made some progress; for now I'm using subprocess.communicate to
    > read the output from the first subprocess, then writing it into the
    > second subprocess. This way I at least get to see what is
    > happening ...
    >
    > The reason 'we' weren't seeing any output from the second call (the
    > 'xargs') is that as mentioned I had simplified this. The actual shell
    > command was more like (in python-speak):
    >
    > "xargs -I {} sh -c \"test -f %s/{} && md5sum %s/{}\"" % (mydir, mydir)
    >
    > ie. I am running md5sum on each tar-file entry which passes the 'is
    > this a file' test.
    >
    > My next problem; how to translate the command-string clause
    >
    > "test -f %s/{} && md5sum %s/{}" # ...
    >
    > into a parameter to subprocess.Popen(). I think it's the command
    > chaining '&&' which is tripping me up...


    It is not really necessary to translate the '&&': you can
    just write:

    "test -f '%s/{}' && md5sum '%s/{}'" % (mydir, mydir)

    , and xargs will pass that to the shell, and then the shell
    will interpret the '&&' for you: you have shell=False in your
    subprocess.Popen call, but the arguments to xargs are -I {}
    sh -c "....", and this means that xargs ends up invoking the
    shell (after replacing the {} with the name of a file).

    Alternatively, you could translate it as:

    "if [ -f '%s/{}' ]; then md5sum '%s/{}'; fi" % (mydir, mydir)

    ; that might make the intent clearer to whoever gets to
    maintain your code.


    Hope this helps,

    -- HansM
    Hans Mulder, Nov 12, 2012
    #5
  6. jkn

    Rebelo Guest

    On Thursday, 8 November 2012 19:05:12 UTC+1, user jkn wrote:
    > Hi All
    > I am trying to build up a set of subprocess.Popen calls to
    > replicate the effect of a horribly long shell command. I'm not clear
    > how I can do one part of this and wonder if anyone can advise. I'm on
    > Linux, fairly obviously.
    >
    > J^n


    You should try to do it in pure python, avoiding shell altogether.
    The first step would be to actually write what it is you want to do.

    To filter the files you want to add to the tar file, check tarfile (http://docs.python.org/2/library/tarfile.html?highlight=tar#module-tarfile),
    specifically:
    TarFile.add(name, arcname=None, recursive=True, exclude=None, filter=None)
    which takes a filter parameter:
    "If filter is specified it must be a function that takes a TarInfo object argument and returns the changed TarInfo object. If it instead returns None the TarInfo object will be excluded from the archive."
    Rebelo, Nov 12, 2012
    #6
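
Rebelo's suggestion of `TarFile.add` with a `filter` callable could look roughly like the sketch below. The function names `regular_files_only` and `make_archive` are hypothetical, and the choice to keep only directories and regular files is an assumption about what the original shell command was meant to achieve.

```python
import os
import tarfile


def regular_files_only(tarinfo):
    """Keep directories and regular files; drop everything else."""
    if tarinfo.isdir() or tarinfo.isfile():
        return tarinfo
    return None  # returning None excludes the member from the archive


def make_archive(archive_path, mydir, subdir):
    """Roughly `tar -czf archive_path -C mydir subdir`, done with tarfile.

    Returns the member names actually stored in the archive.
    """
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(os.path.join(mydir, subdir), arcname=subdir,
                filter=regular_files_only)
    with tarfile.open(archive_path, "r:gz") as tar:
        return tar.getnames()
```

Passing `arcname=subdir` stores the members under the relative name, which plays the role of `-C` in the shell version.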
  7. jkn

    jkn Guest

    Hi Hans

    On Nov 12, 4:36 pm, Hans Mulder <> wrote:
    > On 12/11/12 16:36:58, jkn wrote:
    >
    > > slight followup ...

    >
    > > I have made some progress; for now I'm using subprocess.communicate to
    > > read the output from the first subprocess, then writing it into the
    > > second subprocess. This way I at least get to see what is
    > > happening ...

    >
    > > The reason 'we' weren't seeing any output from the second call (the
    > > 'xargs') is that as mentioned I had simplified this. The actual shell
    > > command was more like (in python-speak):

    >
    > > "xargs -I {} sh -c \"test -f %s/{} && md5sum %s/{}\"" % (mydir, mydir)

    >
    > > ie. I am running md5sum on each tar-file entry which passes the 'is
    > > this a file' test.

    >
    > > My next problem; how to translate the command-string clause

    >
    > >     "test -f %s/{} && md5sum %s/{}" # ...

    >
    > > into a parameter to subprocess.Popen(). I think it's the command
    > > chaining '&&' which is tripping me up...

    >
    > It is not really necessary to translate the '&&': you can
    > just write:
    >
    >     "test -f '%s/{}' && md5sum '%s/{}'" % (mydir, mydir)
    >
    > , and xargs will pass that to the shell, and then the shell
    > will interpret the '&&' for you: you have shell=False in your
    > subprocess.Popen call, but the arguments to xargs are -I {}
    > sh -c "....", and this means that xargs ends up invoking the
    > shell (after replacing the {} with the name of a file).
    >
    > Alternatively, you could translate it as:
    >
    >     "if [ -f '%s/{}' ]; then md5sum '%s/{}'; fi" % (mydir, mydir)
    >
    > ; that might make the intent clearer to whoever gets to
    > maintain your code.


    Yes to both points; turns out that my problem was in building up the
    command sequence to subprocess.Popen() - when to use, and not use,
    quotes etc. It has ended up as (spelled out in longhand...)


    xargsproc = ['xargs']

    xargsproc.append('-I')
    xargsproc.append("{}")

    xargsproc.append('sh')
    xargsproc.append('-c')

    xargsproc.append("test -f %s/{} && md5sum %s/{}" % (mydir,
    mydir))
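
One way to check where quotes belong when turning a shell command line into an argument list is to let `shlex.split` do the tokenising. This is only a sketch: the value of `mydir` is a placeholder chosen for illustration.

```python
import shlex

mydir = "/data"  # placeholder value, an assumption for illustration

# The command as it would be typed at a shell prompt.
shell_line = ('xargs -I {} sh -c "test -f %s/{} && md5sum %s/{}"'
              % (mydir, mydir))

# shlex.split applies POSIX-style quoting rules and drops the quotes,
# leaving one list element per argument.
argv = shlex.split(shell_line)
print(argv)
# ['xargs', '-I', '{}', 'sh', '-c', 'test -f /data/{} && md5sum /data/{}']
```

The whole `test ... && md5sum ...` clause ends up as a single list element, which is why it needs no extra quotes of its own when appended to the argument list.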


    As usual, breaking it all down for the purposes of clarification has
    helped a lot, as has your input. Thanks a lot.

    Cheers
    Jon N
    jkn, Nov 12, 2012
    #7
  8. jkn

    jkn Guest

    On Nov 12, 4:58 pm, Rebelo <> wrote:
    > On Thursday, 8 November 2012 19:05:12 UTC+1, user jkn wrote:
    >
    > > Hi All
    > > I am trying to build up a set of subprocess.Popen calls to
    > > replicate the effect of a horribly long shell command. I'm not clear
    > > how I can do one part of this and wonder if anyone can advise. I'm on
    > > Linux, fairly obviously.
    > >
    > > J^n

    >
    > You should try to do it in pure python, avoiding shell altogether.
    > The first step would be to actually write what it is you want to do.
    >


    Hi Rebelo
    FWIW I intend to do exactly this - but I wanted to duplicate the
    existing shell action beforehand, so that I could get rid of the shell
    command.

    After I've tidied things up, that will be my next step.

    Cheers
    Jon N
    jkn, Nov 12, 2012
    #8
  9. jkn

    Hans Mulder Guest

    Re: how to simulate tar filename substitution across piped subprocess.Popen() calls?

    On 12/11/12 18:22:44, jkn wrote:
    > Hi Hans
    >
    > On Nov 12, 4:36 pm, Hans Mulder <> wrote:
    >> On 12/11/12 16:36:58, jkn wrote:
    >>
    >>> slight followup ...

    >>
    >>> I have made some progress; for now I'm using subprocess.communicate to
    >>> read the output from the first subprocess, then writing it into the
    >>> second subprocess. This way I at least get to see what is
    >>> happening ...

    >>
    >>> The reason 'we' weren't seeing any output from the second call (the
    >>> 'xargs') is that as mentioned I had simplified this. The actual shell
    >>> command was more like (in python-speak):

    >>
    >>> "xargs -I {} sh -c \"test -f %s/{} && md5sum %s/{}\"" % (mydir, mydir)

    >>
    >>> ie. I am running md5sum on each tar-file entry which passes the 'is
    >>> this a file' test.

    >>
    >>> My next problem; how to translate the command-string clause

    >>
    >>> "test -f %s/{} && md5sum %s/{}" # ...

    >>
    >>> into a parameter to subprocess.Popen(). I think it's the command
    >>> chaining '&&' which is tripping me up...

    >>
    >> It is not really necessary to translate the '&&': you can
    >> just write:
    >>
    >> "test -f '%s/{}' && md5sum '%s/{}'" % (mydir, mydir)
    >>
    >> , and xargs will pass that to the shell, and then the shell
    >> will interpret the '&&' for you: you have shell=False in your
    >> subprocess.Popen call, but the arguments to xargs are -I {}
    >> sh -c "....", and this means that xargs ends up invoking the
    >> shell (after replacing the {} with the name of a file).
    >>
    >> Alternatively, you could translate it as:
    >>
    >> "if [ -f '%s/{}' ]; then md5sum '%s/{}'; fi" % (mydir, mydir)
    >>
    >> ; that might make the intent clearer to whoever gets to
    >> maintain your code.

    >
    > Yes to both points; turns out that my problem was in building up the
    > command sequence to subprocess.Popen() - when to use, and not use,
    > quotes etc. It has ended up as (spelled out in longhand...)
    >
    >
    > xargsproc = ['xargs']
    >
    > xargsproc.append('-I')
    > xargsproc.append("{}")
    >
    > xargsproc.append('sh')
    > xargsproc.append('-c')
    >
    > xargsproc.append("test -f %s/{} && md5sum %s/{}" % (mydir,
    > mydir))


    This will break if there are spaces in the file name, or other
    characters meaningful to the shell. If you change it to

    xargsproc.append("test -f '%s/{}' && md5sum '%s/{}'"
    % (mydir, mydir))

    , then it will only break if there are single quotes in the file name.

    As I understand, your plan is to rewrite this bit in pure Python, to
    get rid of any and all such problems.
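
Until the pure-Python rewrite exists, one common way to sidestep the quoting problem is to stop pasting names into the script text and pass them to `sh` as positional parameters instead. A sketch under that assumption; `build_xargs_cmd` is a hypothetical helper name.

```python
def build_xargs_cmd(mydir):
    """Build an xargs argv that hands each name to sh as a parameter.

    build_xargs_cmd is a hypothetical helper. The script text is
    constant: mydir reaches the shell as $1 and the substituted file
    name as $2, so neither is ever parsed as shell code.
    """
    script = 'test -f "$1/$2" && md5sum "$1/$2"'
    # The 'sh' after the script fills $0; the next two arguments
    # become $1 and $2.
    return ['xargs', '-I', '{}', 'sh', '-c', script, 'sh', mydir, '{}']
```

This only protects the `sh` step. xargs itself still gives quotes and backslashes in its input a special meaning unless it is told otherwise (GNU xargs has `-0` and `-d` for that), so names containing those characters may still need care.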

    > As usual, breaking it all down for the purposes of clarification has
    > helped a lot, as has your input. Thanks a lot.


    You're welcome.

    -- HansM
    Hans Mulder, Nov 12, 2012
    #9
  10. jkn

    jkn Guest

    Hi Hans

    [...]
    >
    > >         xargsproc.append("test -f %s/{} && md5sum %s/{}" % (mydir,
    > > mydir))

    >
    > This will break if there are spaces in the file name, or other
    > characters meaningful to the shell.  If you change it to
    >
    >         xargsproc.append("test -f '%s/{}' && md5sum '%s/{}'"
    >                              % (mydir, mydir))
    >
    > , then it will only break if there are single quotes in the file name.


    Fair point. As it happens, I know that there are no 'unhelpful'
    characters in the filenames ... but it's still worth doing.

    >
    > As I understand, your plan is to rewrite this bit in pure Python, to
    > get rid of any and all such problems.


    Yep - as mentioned in another reply I wanted first to have something
    which duplicated the current action (which has taken longer than I
    expected), and then rework in a more pythonic way.

    Still, I've learned some things about the subprocess module, and also
    about the shell, so it's been far from wasted time.

    Regards
    Jon N
    jkn, Nov 12, 2012
    #10
  11. Re: how to simulate tar filename substitution across piped subprocess.Popen() calls?

    On 09.11.2012 02:12, Hans Mulder wrote:

    > That's what 'xargs' will do for you. All you need to do, is invoke
    > xargs with arguments containing '{}'. I.e., something like:
    >
    > cmd1 = ['tar', '-czvf', 'myfile.tgz', '-c', mydir, 'mysubdir']
    > first_process = subprocess.Popen(cmd1, stdout=subprocess.PIPE)
    >
    > cmd2 = ['xargs', '-I', '{}', 'sh', '-c', "test -f %s/'{}'" % mydir]
    > second_process = subprocess.Popen(cmd2, stdin=first_process.stdout)


    After launching second_process, it might be useful to call
    first_process.stdout.close(). If you fail to do so, your own process
    remains a second reader of the pipe, which might break things.

    At least, I once had issues with it; I currently cannot recall
    what they were or how they arose; maybe it was just the open
    file descriptor that annoyed me.


    Thomas
    Thomas Rachel, Nov 13, 2012
    #11
  12. Re: how to simulate tar filename substitution across piped subprocess.Popen() calls?

    On 12.11.2012 19:30, Hans Mulder wrote:

    > This will break if there are spaces in the file name, or other
    > characters meaningful to the shell. If you change it to
    >
    > xargsproc.append("test -f '%s/{}'&& md5sum '%s/{}'"
    > % (mydir, mydir))
    >
    > , then it will only break if there are single quotes in the file name.


    And if you do mydir_q = mydir.replace("'", "'\\''") and use mydir_q, you
    should be safe...


    Thomas
    Thomas Rachel, Nov 13, 2012
    #12
  13. jkn

    Hans Mulder Guest

    Re: how to simulate tar filename substitution across piped subprocess.Popen() calls?

    On 13/11/12 22:36:47, Thomas Rachel wrote:
    > On 12.11.2012 19:30, Hans Mulder wrote:
    >
    >> This will break if there are spaces in the file name, or other
    >> characters meaningful to the shell. If you change it to
    >>
    >> xargsproc.append("test -f '%s/{}'&& md5sum '%s/{}'"
    >> % (mydir, mydir))
    >>
    >> , then it will only break if there are single quotes in the file name.

    >
    > And if you do mydir_q = mydir.replace("'", "'\\''") and use mydir_q, you
    > should be safe...


    The problem isn't single quotes in mydir, but single quotes in the
    file names that 'tar' generates and 'xargs' consumes. In the shell
    script, these names go directly from tar to xargs via a pipe. If the
    OP wants to do your replace, his script would have to read the output
    of tar and do the replace before passing the filenames down a second
    pipe to xargs.

    However, once he does that, it's simpler to cut out xargs and invoke
    "sh" directly. Or even cut out "sh" and "test" and instead use
    os.path.isfile and then call md5sum directly. And once he does that,
    he no longer needs to worry about single quotes.

    The OP has said he's going to do all that. One step at a time.
    That sounds like a sensible plan to me.
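
A minimal sketch of the last step described above, with xargs, sh and test all removed. Note one swap: `hashlib.md5` stands in for the external `md5sum` program here, which is a substitution made for this sketch and not something proposed in the thread. The function name and the `names` argument (the relative paths reported by tar) are hypothetical.

```python
import hashlib
import os


def md5_of_listed_files(mydir, names):
    """Return {name: md5 hex digest} for each name that is a regular file.

    names is assumed to be the list of paths, relative to mydir, that
    tar reported. hashlib.md5 replaces the external md5sum program.
    """
    digests = {}
    for name in names:
        path = os.path.join(mydir, name)
        if not os.path.isfile(path):
            continue  # same role as `test -f` in the shell version
        md5 = hashlib.md5()
        with open(path, 'rb') as handle:
            # Read in chunks so large files are not loaded whole.
            for chunk in iter(lambda: handle.read(65536), b''):
                md5.update(chunk)
        digests[name] = md5.hexdigest()
    return digests
```

Since no shell ever sees the names, spaces and quotes in them need no special treatment.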


    Hope this helps,

    -- HansM
    Hans Mulder, Nov 14, 2012
    #13
  14. jkn

    jkn Guest

    Hi Hans

    [...]
    >
    >
    > However, once he does that, it's simpler to cut out xargs and invoke
    >
    > "sh" directly. Or even cut out "sh" and "test" and instead use
    >
    > os.path.isfile and then call md5sum directly. And once he does that,
    >
    > he no longer needs to worry about single quotes.
    >


    Yes indeed, using os.path.isfile() and then calling md5sum directly is my plan ... for reasons of maintainability (by myself) more than anything else.

    >
    >
    > The OP has said he's going to do all that. One step at a time.
    >
    > That sounds like a sensible plan to me.
    >


    Thanks a lot.

    J^n
    jkn, Nov 18, 2012
    #14
