14.5. Executing Other (Non-Python) Programs

We can also execute non-Python programs from within Python. These include binary executables, shell scripts, and so on. All that is required is a valid execution environment: permissions for file access and execution must be granted, shell scripts must be able to reach their interpreter (Perl, bash, etc.), and binaries must be accessible and built for the local machine's architecture.

Finally, the programmer must bear in mind whether the Python script needs to communicate with the program being executed. Some programs require input, others produce output and return an error code upon completion, and some do both. Python provides a variety of ways to execute non-Python programs to suit these circumstances. All of the functions discussed in this section can be found in the os module. Table 14.6 summarizes them (where appropriate, we annotate those that are available only on certain platforms) as an introduction to the remainder of this section.

Table 14.6. `os` Module Functions for External Program Execution (Unix only, Windows only)
`os` Module Function	Description
`system(cmd)`	Execute program `cmd` given as string, wait for program completion, and return the exit code (on Windows systems whose shell is command.com, such as Windows 95/98, the exit code is always 0)
`fork()`	Create a child process that runs in parallel to the parent process [usually used with `exec*()`]; return twice... once for the parent and once for the child
`execl(file, arg0, arg1, ...)`	Execute `file` with argument list `arg0`, `arg1`, etc.
`execv(file, arglist)`	Same as `execl()` except with argument vector (list or tuple) `arglist`
`execle(file, arg0, arg1, ..., env)`	Same as `execl()` but also providing environment variable dictionary `env`
`execve(file, arglist, env)`	Same as `execle()` except with argument vector `arglist`
`execlp(cmd, arg0, arg1, ...)`	Same as `execl()` but search for full file pathname of `cmd` in user search path
`execvp(cmd, arglist)`	Same as `execlp()` except with argument vector `arglist`
`execlpe(cmd, arg0, arg1, ..., env)`	Same as `execlp()` but also providing environment variable dictionary `env`
`execvpe(cmd, arglist, env)`	Same as `execvp()` but also providing environment variable dictionary `env`
`spawn*`^[a]`(mode, file, args[, env])`	`spawn*()` family executes `file` in a new process given `args` as arguments and possibly an environment variable dictionary `env`; `mode` is a magic number indicating various modes of operation
`wait()`	Wait for child process to complete [usually used with `fork()` and `exec*()`]
`waitpid(pid,` `options)`	Wait for specific child process to complete [usually used with `fork()` and `exec*()`]
`popen(cmd, mode='r', buffering=-1)`	Execute `cmd` string, returning a file-like object as a communication handle to the running program, defaulting to read `mode` and default system `buffering`
`startfile`^[b]`(path)`	Execute `path` with its associated application

^[a] spawn*() functions named similarly to exec*() (both families have eight members); spawnv() and spawnve() new in Python 1.5.2 and the other six spawn*() functions new in Python 1.6; also spawnlp(), spawnlpe(), spawnvp() and spawnvpe() are Unix-only.

^[b] New in Python 2.0.

As we get closer to the operating system layer of software, you will notice that the consistency of executing programs, even Python scripts, across platforms starts to get a little dicey. We mentioned above that the functions described in this section are in the os module. Truth is, there are multiple os modules. For example, the one for Unix-based systems (i.e., Linux, MacOS X, Solaris, *BSD, etc.) is the posix module. The one for Windows is nt (regardless of which version of Windows you are running; DOS users get the dos module), and the one for old MacOS is the mac module. Do not worry, Python will load the correct module when you call import os. You should never need to import a specific operating system module directly.
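As a small illustration of the point above, a script can ask the os module which platform it is running on and check that a platform-specific function exists before calling it. This is only a sketch: `os.name` and `hasattr()` are standard, while the variable names are made up for the example.

```python
import os

# os.name reports which platform-specific module backs os:
# 'posix' on Unix-like systems, 'nt' on Windows.
platform_name = os.name

# Platform-specific functions simply do not exist on the other
# platforms, so test for them before calling.
can_fork = hasattr(os, 'fork')            # Unix only
can_startfile = hasattr(os, 'startfile')  # Windows only

print("platform module: %s" % platform_name)
print("fork available: %s, startfile available: %s"
      % (can_fork, can_startfile))
```

Checking for the function itself, rather than comparing `os.name` against a list of known platforms, keeps the script working on systems the author never tried.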

Before we look at each of these module functions, note that Python 2.4 and newer provide a subprocess module that can substitute for nearly all of them. Later in this chapter we show how to use some of these functions, and then at the end give the equivalents using the subprocess.Popen class and the subprocess.call() function.

14.5.1. `os.system()`

The first function on our list is system(), a rather simplistic function that takes a system command as a string and executes it. Python execution is suspended while the command is being executed. When execution has completed, the exit status is given as the return value from system() and Python execution resumes.

system() preserves the current standard files, including standard output, meaning that any output from the program or command being executed is passed on to standard output. Be cautious here because certain applications such as common gateway interface (CGI) programs will cause Web browser errors if output other than valid Hypertext Markup Language (HTML) strings is sent back to the client via standard output. system() is generally used with commands producing no output, such as programs that compress or convert files or mount disks to the system, or any other command that performs a specific task and indicates success or failure via its exit status rather than communicating via input and/or output. The convention adopted is an exit status of 0 indicating success and non-zero for some sort of failure.

For the purpose of providing an example, we will execute two commands from the interactive interpreter that do produce output so that you can observe how system() works.

    >>> import os
    >>> result = os.system('cat /etc/motd')
    Have a lot of fun...
    >>> result
    0
    >>> result = os.system('uname -a')
    Linux solo 2.2.13 #1 Mon Nov 8 15:08:22 CET 1999 i586 unknown
    >>> result
    0

You will notice the output of both commands as well as the exit status of their execution, which we saved in the result variable. Here is an example executing a DOS command:

    >>> import os
    >>> result = os.system('dir')
     Volume in drive C has no label
     Volume Serial Number is 43D1-6C8A
     Directory of C:\WINDOWS\TEMP
    .              <DIR>        01-08-98  8:39a .
    ..             <DIR>        01-08-98  8:39a ..
             0 file(s)              0 bytes
             2 dir(s)     572,588,032 bytes free
    >>> result
    0
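Since the exit status is the whole point of system(), a script will usually test it rather than just display it. The following is a minimal sketch, not taken from the examples above: the two `exit` commands are stand-ins for real programs, chosen because both Unix shells and the Windows command interpreter understand them.

```python
import os

# 'exit N' makes the shell itself finish with status N, which lets us
# simulate one command that succeeds and one that fails.
ok = os.system('exit 0')
failed = os.system('exit 3')

if ok == 0:
    print("first command succeeded")
if failed != 0:
    print("second command failed (raw status %d)" % failed)
```

On Unix the value returned by system() is encoded in the same format that wait() uses, so the raw number for a failure is generally not the small exit code the program chose. Comparing against zero is the portable test.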

14.5.2. `os.popen()`

The popen() function is a combination of a file object and the system() function. It works in the same way as system() does, but in addition, it has the ability to establish a one-way connection to that program and then to access it like a file. If the program requires input, then you would call popen() with a mode of 'w' to "write" to that command. The data that you send to the program will then be received through its standard input. Likewise, a mode of 'r' will allow you to spawn a command, then as it writes to standard output, you can read that through your file-like handle using the familiar read*() methods of a file object. And just as with files, you will be a good citizen and close() the connection when you are finished.

In one of the system() examples we used above, we called the Unix uname program to give us some information about the machine and operating system we are using. That command produced a line of output that went directly to the screen. If we wanted to read that string into a variable and perform internal manipulation or store that string to a log file, we could, using popen(). In fact, the code would look like the following:

    >>> import os
    >>> f = os.popen('uname -a')
    >>> data = f.readline()
    >>> f.close()
    >>> print data,
    Linux solo 2.2.13 #1 Mon Nov 8 15:08:22 CET 1999 i586 unknown

As you can see, popen() returns a file-like object; also notice that readline(), as always, preserves the NEWLINE character found at the end of a line of input text.
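The example above uses the default read mode. A sketch of the opposite direction, mode 'w', might look like the following. It assumes a Unix shell with the `cat` command available, and the temporary file exists only so that we can see what the command received. It is written to run on current Python versions (note the `with` statement and `print()` call).

```python
import os
import tempfile

# Scratch file for the command to write into (illustration only).
fd, path = tempfile.mkstemp()
os.close(fd)

# Open the command for writing: whatever we write arrives on its
# standard input. 'cat > file' copies its standard input into the file.
f = os.popen('cat > "%s"' % path, 'w')
f.write('hello from Python\n')
f.close()    # closing also waits for the command to finish

with open(path) as result:
    received = result.read()
os.remove(path)
print(received.rstrip())
```

Because the connection is one-way, a command opened with 'w' sends its own output wherever standard output already points. That is why this sketch has to redirect it to a file in order to inspect it.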

14.5.3. `os.fork()`, `os.exec*()`, `os.wait*()`

Without a detailed introduction to operating systems theory, we present a light introduction to processes in this section. fork() takes your single executing flow of control, known as a process, and creates a "fork in the road," if you will. The interesting thing is that your system takes both forks, meaning that you will have two programs running concurrently (running the same code, no less, because both processes resume at the line of code immediately following the fork() call).

The original process that called fork() is called the parent process, and the new process created as a result of the call is known as the child process. When fork() returns in the child process, its return value is always zero; when it returns in the parent process, its return value is always the process identifier (aka process ID, or PID) of the child process (so the parent can keep tabs on all its children). The return value is the only way to tell them apart, too!

We mentioned that both processes will resume immediately after the call to fork(). Because the code is the same, we are looking at identical execution if no other action is taken at this time. This is usually not the intention. The main purpose for creating another process is to run another program, so we need to take divergent action as soon as parent and child return. As we stated above, the return values differ, so this is how we tell them apart.

The following snippet of code will look familiar to those who have experience managing processes. However, if you are new, it may be difficult to see how it works at first, but once you get it, you get it.

    ret = os.fork()        # spawn 2 processes, both return
    if ret == 0:           # child returns with PID of 0
        child_suite        # child code
    else:                  # parent returns with child's PID
        parent_suite       # parent code

The call to fork() is made in the first line of code. Now both child and parent processes exist and run simultaneously. The child process has its own copy of the virtual memory address space, which contains an exact replica of the parent's address space. Yes, both processes are nearly identical. Recall that fork() returns twice, meaning that both the parent and the child return. You might ask, how can you tell them apart if they both return? When the parent returns, it comes back with the PID of the child process. When the child returns, it has a return value of 0. This is how we can differentiate the two processes.

Using an if-else statement, we can direct code for the child to execute (i.e., the if clause) as well as the parent (the else clause). The code for the child is where we can make a call to any of the exec*() functions to run a completely different program or some function in the same program (as long as both child and parent take divergent paths of execution). The general convention is to let the children do all the dirty work while the parent either waits patiently for the child to complete its task or continues execution and checks later to see if the child finished properly.
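To make the if-else pattern concrete, here is a minimal runnable sketch for Unix systems. The function name and the exit code 7 are invented for illustration. The child does nothing except exit, and the parent collects the result with waitpid(), which is described later in this section.

```python
import os

def fork_and_report(exit_code):
    """Fork; the child exits with exit_code, the parent collects it."""
    pid = os.fork()
    if pid == 0:
        # Child: fork() returned 0. Do the child's work, then leave at
        # once with os._exit() so it never runs the parent's code.
        os._exit(exit_code)
    # Parent: fork() returned the child's PID.
    reaped_pid, status = os.waitpid(pid, 0)
    return pid, reaped_pid, os.WEXITSTATUS(status)

child_pid, reaped_pid, code = fork_and_report(7)
print("child %d finished with exit code %d" % (child_pid, code))
```

The child calls os._exit() rather than returning because a child that fell out of the if clause would carry on executing the rest of the script as a second copy of the parent.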

All of the exec*() functions load a file or command and execute it with its arguments (either given individually or collected in an argument vector). If applicable, an environment variable dictionary can be provided for the command. These variables are generally made available to programs to provide a more accurate description of the user's current execution environment. Some of the more well-known variables include the user name, search path, current shell, terminal type, localized language, machine type, operating system name, etc.

All versions of exec*() will replace the Python interpreter running in the current (child) process with the given file as the program to execute now. Unlike system(), there is no return to Python (since Python was replaced). An exception will be raised if exec*() fails because the program cannot execute for some reason.

The following code starts up a cute little game called "xbill" in the child process while the parent continues running the Python interpreter. Because the child process never returns, we do not have to worry about any code for the child after calling exec*(). Note that the command is also a required first argument of the argument list.

    import os

    ret = os.fork()
    if ret == 0:                   # child code
        os.execvp('xbill', ['xbill'])
    else:                          # parent code
        os.wait()

In this code, you also find a call to wait(). When child processes have completed, they need their parents to clean up after them. This task, known as "reaping a child," can be accomplished with the wait*() functions. Immediately following a fork(), a parent can wait for the child to complete and do the clean-up then and there. A parent can also continue processing and reap the child later, also using one of the wait*() functions.

Regardless of which method a parent chooses, the reaping must be performed. When a child has finished execution but has not been reaped yet, it enters a limbo state and becomes known as a zombie process. It is a good idea to minimize the number of zombie processes in your system because a child in this state still occupies an entry in the system's process table, which is not released until the child has been reaped by the parent.

A call to wait() suspends execution (i.e., waits) until a child process (any child process) has completed, terminating either normally or via a signal. wait() will then reap the child, releasing any resources. If the child has already completed, then wait() just performs the reaping procedure. waitpid() performs the same function as wait() but takes two additional arguments: pid, the process identifier of the specific child process to wait for, and options (normally zero or a set of optional flags logically OR'd together).
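Putting fork(), exec*(), an environment dictionary, and waitpid() together, a sketch for Unix systems could look like this. Several details are assumptions made for the sake of a self-contained example: the program being executed is simply another Python interpreter, GREETING is a made-up variable name, and a pipe is used so that the parent can see what the child printed.

```python
import os
import sys

# Environment for the new program: a copy of ours plus one extra
# variable. GREETING is a made-up name used only for this illustration.
env = dict(os.environ, GREETING='hello from the parent')

# The program the child will become: another Python interpreter that
# prints the variable. Note argv[0] repeats the command itself.
argv = [sys.executable, '-c', 'import os; print(os.environ["GREETING"])']

read_fd, write_fd = os.pipe()
pid = os.fork()
if pid == 0:
    # Child: point standard output at the pipe, then replace ourselves.
    try:
        os.close(read_fd)
        os.dup2(write_fd, 1)
        os.execvpe(argv[0], argv, env)   # does not return on success
    except OSError:
        pass
    os._exit(127)                        # reached only if exec failed

# Parent: read what the child printed, then reap it.
os.close(write_fd)
with os.fdopen(read_fd) as pipe:
    output = pipe.read()
reaped_pid, status = os.waitpid(pid, 0)
exit_code = os.WEXITSTATUS(status) if os.WIFEXITED(status) else None
print("child said %r and exited with %r" % (output.strip(), exit_code))
```

The status returned by waitpid() packs several facts into one integer, which is why the sketch decodes it with os.WIFEXITED() and os.WEXITSTATUS() instead of reading it directly.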

14.5.4. `os.spawn*()`

The spawn*() family of functions is similar to fork() and exec*() in that its members execute a command in a new process; however, you do not need to call two separate functions to create a new process and cause it to execute a command. You only need to make one call with the spawn*() family. In exchange for this simplicity, you give up the ability to "track" the execution of the parent and child processes; the model is more similar to that of starting a function in a thread. Another difference is that you have to know the magic mode parameter to pass to spawn*().

On some operating systems (especially embedded real-time operating systems [RTOSs]), spawn*() is much faster than fork(). (Those where this is not the case usually use copy-on-write tricks.) Refer to the Python Library Reference Manual for more details on the spawn*() functions (see the Process Management section of the manual on the os module). Various members of the spawn*() family were added to Python between 1.5 and 1.6 (inclusive).
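The "magic" mode values are available as named constants in the os module. The sketch below shows the two most common ones, os.P_WAIT and os.P_NOWAIT, with spawnv(). As an assumption for the example, the program being run is another Python interpreter that exits with status 5.

```python
import os
import sys

# The program to run: another Python interpreter that exits with 5.
program = sys.executable
args = [program, '-c', 'import sys; sys.exit(5)']

# P_WAIT: block until the program finishes; the return value is its
# exit code.
exit_code = os.spawnv(os.P_WAIT, program, args)
print("P_WAIT returned exit code %d" % exit_code)

# P_NOWAIT: return at once. On Unix the return value is the PID of the
# new process, which can be handed to waitpid() later to reap it.
handle = os.spawnv(os.P_NOWAIT, program, args)
finished, status = os.waitpid(handle, 0)

# For a normal exit, the exit code sits in the high byte of the status.
print("P_NOWAIT process finished with exit code %d" % (status >> 8))
```

With P_WAIT the call behaves much like system() without the intervening shell, while P_NOWAIT is the one-call counterpart of the fork() and exec*() pair.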

14.5.5. `subprocess` Module

After Python 2.3 came out, work was begun on a module named popen5. The naming continued the tradition of all the previous popen*() functions that came before, but rather than continuing this ominous trend, the module was eventually named subprocess, with a class named Popen that centralizes most of the process-oriented functions we have discussed so far in this chapter. There is also a convenience function named call() that can easily slide into where os.system() lives. The subprocess module made its debut in Python 2.4. Below are some examples of what it can do:

Replacing `os.system()`

Linux Example:

    >>> from subprocess import call
    >>> import os
    >>> res = call(('cat', '/etc/motd'))
    Linux starship 2.4.18-1-686 #4 Sat Nov 29 10:18:26 EST 2003 i686 GNU/Linux
    >>> res
    0

Win32 Example:

    >>> res = call(('dir', r'c:\windows\temp'), shell=True)
     Volume in drive C has no label.
     Volume Serial Number is F4C9-1C38
     Directory of c:\windows\temp
    03/11/2006  02:08 AM    <DIR>          .
    03/11/2006  02:08 AM    <DIR>          ..
    02/21/2006  08:45 PM               851 install.log
    02/21/2006  07:02 PM               444 tmp.txt
                   2 File(s)          1,295 bytes
                   3 Dir(s)  55,001,104,384 bytes free

Replacing `os.popen()`

The syntax for creating an instance of Popen is only slightly more complex than calling the os.popen() function:

    >>> from subprocess import Popen, PIPE
    >>> f = Popen(('uname', '-a'), stdout=PIPE).stdout
    >>> data = f.readline()
    >>> f.close()
    >>> print data,
    Linux starship 2.4.18-1-686 #4 Sat Nov 29 10:18:26 EST 2003 i686 GNU/Linux
    >>> f = Popen('who', stdout=PIPE).stdout
    >>> data = [ eachLine.strip() for eachLine in f ]
    >>> f.close()
    >>> for eachLine in data:
    ...  print eachLine
    ...
    wesc     console  Mar 11 12:44
    wesc     ttyp1    Mar 11 16:29
    wesc     ttyp2    Mar 11 16:40  (192.168.1.37)
    wesc     ttyp3    Mar 11 16:49  (192.168.1.37)
    wesc     ttyp4    Mar 11 17:51  (192.168.1.34)
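Unlike os.popen(), which gives a one-way connection, a Popen instance can be connected in both directions at once. The following sketch uses the communicate() method for that purpose. The child program is an assumption made to keep the example self-contained: another Python interpreter that upper-cases whatever it reads.

```python
import sys
from subprocess import Popen, PIPE

# Stand-in child program: upper-cases whatever arrives on its stdin.
child_code = 'import sys; sys.stdout.write(sys.stdin.read().upper())'

proc = Popen([sys.executable, '-c', child_code],
             stdin=PIPE, stdout=PIPE,
             universal_newlines=True)    # exchange text rather than bytes

# communicate() sends the input, closes stdin, reads all output,
# and waits for the process to exit.
output, _ = proc.communicate('hello\n')
print("child replied %r with exit status %d" % (output, proc.returncode))
```

Using communicate() rather than writing to proc.stdin and reading from proc.stdout by hand avoids the deadlock that can occur when both sides are waiting on a full pipe.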

14.5.6. Related Functions

Table 14.7 lists some of the functions (and their modules) that can perform some of the tasks described.

Table 14.7. Various Functions for File Execution
Function	Description
`os/popen2.popen2`^[a]`()`	Executes a file and opens read and write access from (`stdout`) and to (`stdin`) the newly created running program
`os/popen2.popen3`^[a]`()`	Executes a file and opens read and write access from (`stdout` and `stderr`) and to (`stdin`) the newly created running program
`os/popen2.popen4`^[b]`()`	Executes a file and opens read and write access from (`stdout` and `stderr` combined) and to (`stdin`) the newly created running program
`commands.getoutput()`	Executes a file in a subprocess, returns all output as a string
`subprocess.call`^[c]`()`	Convenience function that creates a `subprocess.Popen`, waits for the command to complete, then returns the status code; like `os.system()` but more flexible

^[a] New to os module in Python 2.0.

^[b] New (to os and popen2 modules) in Python 2.0.

^[c] New in Python 2.4.
