Tuesday, February 26, 2013

Interactive subprocess communication in #Python

The subprocess.Popen class comes with a communicate method, which sends input to a child process, waits for it to terminate, and returns its standard output and standard error to the calling process.

The limitation of the communicate method is that once it has finished writing input it closes the child's stdin descriptor, which the subprocess sees as end-of-file (the same as pressing Ctrl+D in a terminal). This is not always desired, for example when you automate a process that involves communication through an interactive shell, such as talking to a secured database or a hardware device via ssh through a bastion host.
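
To see the limitation concretely, here is a minimal sketch. It is written for Python 3 (unlike the Python 2.7 session shown later in this post), and using `cat` as the child process is only an illustrative assumption. It shows that after communicate() returns, the child's stdin is closed, so no follow-up message can be sent.

```python
import subprocess

# `cat` stands in for any interactive child; it echoes stdin to stdout.
child = subprocess.Popen(['cat'], stdin=subprocess.PIPE, stdout=subprocess.PIPE)

# communicate() writes the input, closes stdin, and waits for the child to exit.
out, _ = child.communicate(b'hello\n')

stdin_closed = child.stdin.closed      # the pipe was closed for us
exit_code = child.returncode           # cat saw end-of-file and exited

try:
    child.stdin.write(b'again\n')      # a second message is not possible
    second_write_failed = False
except ValueError:                     # writing to a closed file object
    second_write_failed = True
```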

Two problems to consider here:
  1. How to avoid blocking on I/O while reading the output?
  2. When the output is large (multiple blocks), how to decide when the transmission of the message has completed and when to return the output?
The first problem is addressed by making the file descriptors for stdout and stderr non-blocking. In Python, by default, reading from a descriptor that has no data ready blocks, and read() returns only when data finally arrives. A non-blocking descriptor returns data when it is available and otherwise raises IOError.
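
The effect of the non-blocking flag can be tried in isolation. The sketch below assumes Python 3 on a POSIX system and uses a plain os.pipe() in place of a child's stdout. In Python 3 the error raised is BlockingIOError, a subclass of OSError, of which IOError is an alias.

```python
import fcntl
import os

# An ordinary pipe stands in for a child's stdout.
read_fd, write_fd = os.pipe()

# Add O_NONBLOCK to the read end's existing status flags.
flags = fcntl.fcntl(read_fd, fcntl.F_GETFL)
fcntl.fcntl(read_fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)

try:
    os.read(read_fd, 1024)             # nothing has been written yet
    raised = False
except BlockingIOError:                # raised instead of blocking
    raised = True

os.write(write_fd, b'ready')
data = os.read(read_fd, 1024)          # data is available, returns at once

os.close(read_fd)
os.close(write_fd)
```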

One way to work around the second problem is to rely on a feature of most interactive shells: the prompt they print to signal the "ready>" state. Once the output ends with the prompt, you know the message is complete and can be returned.
Consider the following snippet of code:
# ipopen.py
import os
import time
import fcntl
import subprocess

class IPopen(subprocess.Popen):

    POLL_INTERVAL = 0.1
    def __init__(self, *args, **kwargs):
        """Construct interactive Popen."""
        keyword_args = {
            'stdin': subprocess.PIPE,
            'stdout': subprocess.PIPE,
            'stderr': subprocess.PIPE,
            'prompt': '> ',
        }
        keyword_args.update(kwargs)
        # 'prompt' is our own option; remove it before calling Popen.
        self.prompt = keyword_args.pop('prompt')
        subprocess.Popen.__init__(self, *args, **keyword_args)
        # Make stderr and stdout non-blocking.
        for outfile in (self.stdout, self.stderr):
            if outfile is not None:
                fd = outfile.fileno()
                fl = fcntl.fcntl(fd, fcntl.F_GETFL)
                fcntl.fcntl(fd, fcntl.F_SETFL, fl | os.O_NONBLOCK)

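    def correspond(self, text, sleep=POLL_INTERVAL):
        """Communicate with the child process without closing stdin."""
        self.stdin.write(text)
        self.stdin.flush()
        str_buffer = ''
        while not str_buffer.endswith(self.prompt):
            try:
                chunk = self.stdout.read()
            except IOError:
                # No data available yet; wait before polling again.
                time.sleep(sleep)
                continue
            if not chunk:
                # An empty read means EOF: the child closed its stdout.
                break
            str_buffer += chunk
        return str_buffer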

The above is a minimal extension to subprocess.Popen that lets you exchange messages with a process just as you would from its interactive shell:

$ ipython
Python 2.7.3 (default, Aug  1 2012, 05:14:39) 
Type "copyright", "credits" or "license" for more information.

IPython 0.12.1 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.

In [1]: from ipopen import IPopen

In [2]: gdb = IPopen(['gdb'], prompt='(gdb) ')

In [3]: print gdb.correspond('')
GNU gdb (Ubuntu/Linaro 7.4-2012.04-0ubuntu2.1) 7.4-2012.04
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:

(gdb) 

In [4]: print gdb.correspond('help\n')
List of classes of commands:

aliases -- Aliases of other commands
breakpoints -- Making program stop at certain points
data -- Examining data
files -- Specifying and examining files
internals -- Maintenance commands
obscure -- Obscure features
running -- Running the program
stack -- Examining the stack
status -- Status inquiries
support -- Support facilities
tracepoints -- Tracing of program execution without stopping the program
user-defined -- User-defined commands

Type "help" followed by a class name for a list of commands in that class.
Type "help all" for the list of all commands.
Type "help" followed by command name for full documentation.
Type "apropos word" to search for commands related to "word".
Command name abbreviations are allowed if unambiguous.
(gdb) 

In [5]: 
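
The IPopen class above targets Python 2, where pipes yield str. As a rough sketch of how the same prompt-based loop might look on Python 3, the block below reads bytes with os.read. The helper name read_until_prompt, the timeout parameter, and the small echo child in CHILD_SOURCE are all assumptions made for illustration and are not part of the original class.

```python
import fcntl
import os
import subprocess
import sys
import time

# Hypothetical stand-in for an interactive shell: prints a prompt,
# then answers each input line with its upper-cased form.
CHILD_SOURCE = r'''
import sys
sys.stdout.write("> ")
sys.stdout.flush()
for line in sys.stdin:
    sys.stdout.write(line.upper() + "> ")
    sys.stdout.flush()
'''

def read_until_prompt(fd, prompt, timeout=5.0, poll_interval=0.05):
    """Collect bytes from a non-blocking fd until they end with prompt."""
    deadline = time.monotonic() + timeout
    buf = b''
    while not buf.endswith(prompt):
        try:
            chunk = os.read(fd, 4096)
        except BlockingIOError:
            # No data yet: give up after the deadline, otherwise poll again.
            if time.monotonic() > deadline:
                raise TimeoutError('prompt not seen within %.1fs' % timeout)
            time.sleep(poll_interval)
            continue
        if not chunk:
            break                      # EOF: the child closed its stdout
        buf += chunk
    return buf

child = subprocess.Popen([sys.executable, '-c', CHILD_SOURCE],
                         stdin=subprocess.PIPE, stdout=subprocess.PIPE)

# Make the parent's end of the child's stdout non-blocking.
out_fd = child.stdout.fileno()
flags = fcntl.fcntl(out_fd, fcntl.F_GETFL)
fcntl.fcntl(out_fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)

banner = read_until_prompt(out_fd, b'> ')   # initial prompt
child.stdin.write(b'hello\n')
child.stdin.flush()
reply = read_until_prompt(out_fd, b'> ')    # response plus next prompt

child.stdin.close()                    # finished: now let the child see EOF
child.wait()
```

The timeout guards against a child that never prints the prompt, a case in which a loop without a deadline would poll forever.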
