System Heaven
System Heaven
System Heaven is an easy and secure way of running subprocesses. It is more convenient that subprocess
, particularly when pipes are involved. And it’s more secure than os.system
because it never uses a shell underneath and it’s not vulnerable to shell injections.
The problems with os.system
While os.system
is simple and convenient, it has several downsides:
-
Insecure: As long as the command to execute is fixed,
os.system
, as its underlying callsystem
, just works. But as soon as some part of it is variable, there’s a risk of injection. Escaping helps, but it’s a headache, and it’ll break sooner or later. -
Inconsistent: The behavior of
os.system
depends of the underlying shell and the operating system it’s running on. -
Limited: The standard output and error output of the command can not be captured with
os.system
. The return value ofos.system
is usually the exit status of the command, but it again depends of the operating system and the interpreter.
The problem with subprocess
The subprocess
module actually does its job rather well. It solves successfully the issues with os.system
and even integrates the capabilites of os.spawn
and the old Popen
. Its main downside is that some things, particularly pipes, can get cumbersome very quickly.
For instance, these shell backquotes:
output=`dmesg | grep hda`
become the following:
from subprocess import Popen, PIPE
p1 = Popen(["dmesg"], stdout=PIPE)
p2 = Popen(["grep", "hda"], stdin=p1.stdout, stdout=PIPE)
p1.stdout.close() # Allow p1 to receive a SIGPIPE if p2 exits.
output = p2.communicate()[0]
The way of System Heaven
The core of System Heaven is the fuction sh
. The name of the function, and of the library, is intentionally based on that of the common Unix shells. But System Heaven’s sh
is not a shell, nor it ever uses one underneath.
The most basic call looks like this:
from systemheaven import sh
sh("wget http://example.com/something.zip")
We can, however, use variables, which add flexibility while avoiding shell injections:
sh("wget $url", url="http://example.com/something.zip")
You may wonder why isn’t that just as vulnerable to injections as os.system
. The difference stems from the fact that System Heaven does an intentionally limited parsing of its command line and variables become one single element of the list that is passed to Popen
.
Even if a variable has spaces or other characters that would be special to a shell, it’s safe to use it with System Heaven:
dir = "Some Docs ; rm -rf /"
sh("ls -l $dir", dir=dir) # totally safe
That would produce, at worst, a No such file or directory from ls
. This behavior is similar to what module sqlite3
does with the question marks to avoid SQL injections.
A variable can become more than one argument if, instead of a string, you pass an iterable. For instance:
sh("sha256sum $files", files=["file1.iso", "file2.iso"])
or better:
from glob import glob
sh("sha256sum $files", files=glob("*.iso"))
Pipelines can be built very easily with |
:
sh("ls $dir | grep txt", dir="/home/user")
The longer the pipeline, the better it feels to use sh
:
sh("ls dir | grep a | grep b | grep c")
If you want to read the output of a command, put a |
at the end:
count = sh("ls | wc -l |").stdout.read()
With this capability, the pipe of the previous section becomes:
output = sh('dmesg | grep hda |').stdout.read()
To send input to a command, put the |
in front of the line:
cmd = sh('| grep py$')
for filename in os.listdir():
cmd.stdin.write(filename.encode('utf8'))
You can put |
in both the beginning and the end of the line. But be careful of potential deadlocks.
Some design choices were driven by this goal. In particular, System Heaven avoids any string manipulation, as explained in the next sections.
Quoting & Escaping
There are no quoting mechanisms in System Heaven. Neither "
, nor '
, nor \
have any special meaning. Anywhere you feel inclined to use quoting or escaping, use a variable.
For instance, if you say:
sh('ls -l "Some Documents"') # may not do what you're thinking
that will become something like:
subprocess.Popen(['ls', '-l', '"Some', 'Documents"'], ...)
and you will be asking ls
to list two directories: "Some
and Documents"
. Probably not what you originally meant. A similar thing would happend with ls -l Some\ Documents
. That would list two directories, the first one of which happens to end with a backslash.
In any of those cases, say the following instead:
sh('ls -l $dir', dir='Some Documents')
Variables
Variables only work when used as a whole word. That means that a $
sign in the middle or the end of a word will be passed untouched to the subprocess. For instance, the following:
sh('ls -l /media/$user', user="ubuntu") # won't do what it seems
will become:
subprocess.call(['ls', '-l', '/media/$user'])
and the user
argument will just go unused.
Similarly there’s no ${var}
syntax. You could use it, but then your kwargs would need a "{var}"
key. System Heaven’s parsing is intentionally limited and transparent: a variable is a word that starts with $
and the rest of the word is the name of the variable –without restrictions– and will be looked up, untouched, in kwargs.
The right way to do the thing above is:
sh('ls -l $dir', dir="/media/" + user)
String manipulation is the responsibility of the caller.
Special Variables
Some keyword arguments, like _cwd
and _env
are reserved for tweaking the behavior of sh
. Avoid using variables starting with _
.
Examples
-
The following Bash pipeline:
count=$(ls | wc -l)
would become this System Heaven invocation:
count = sh("ls | wc -l |").stdout.read()
-
System Heaven variables can be iterables. This Bash line:
count=$(ls *.txt | wc -l)
becomes
from glob import glob count = sh("ls $files | wc -l |", files=glob("*.txt")).stdout.read()
-
You can change the current working directory with
_cwd
. This Bash line:cd /home/user count=$(ls *.txt | wc -l)
becomes:
from glob import glob count = sh("ls $files | wc -l |", files=glob("*.txt"), _cwd="/home/user" ).stdout.read()