Python 3 Scripting for System Administrators - Part 3

Adding Optional Parameters

We’ve already handled two of the five requirements we set for this script; let’s continue by adding the optional flags to our parser and then we’ll finish by implementing the real script logic. We need to add a --limit flag with a -l alias.

~/bin/reverse-file

#!/usr/bin/env python3.6

import argparse

parser = argparse.ArgumentParser(description='Read a file in reverse')

parser.add_argument('filename', help='the file to read')

parser.add_argument('--limit', '-l', type=int, help='the number of lines to read')

args = parser.parse_args()

print(args)

To specify that an argument is a flag, we need to place two hyphens at the beginning of the flag’s name. We’ve used the type option for add_argument to state that we want the value converted to an integer, and we specified a shorter version of the flag as our second argument.

Here is what args now looks like:

$ reverse-file --limit 5 testing.txt

Namespace(filename='testing.txt', limit=5)
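If you want to experiment without invoking the script from the shell, parse_args also accepts an explicit list of argument strings. Here's a small sketch (the notes.txt filename is made up for the demo) showing the type=int conversion in action:

```python
import argparse

parser = argparse.ArgumentParser(description='Read a file in reverse')
parser.add_argument('filename', help='the file to read')
parser.add_argument('--limit', '-l', type=int, help='the number of lines to read')

# Passing a list to parse_args sidesteps sys.argv entirely
args = parser.parse_args(['notes.txt', '--limit', '3'])

print(args.filename, args.limit)  # notes.txt 3
```

Because of type=int, args.limit is a real integer rather than the string "3".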

Next, we’ll add a --version flag. This one will be a little different because we’re going to use the action option to specify a string to print out when this flag is received:

~/bin/reverse-file

#!/usr/bin/env python3.6

import argparse

parser = argparse.ArgumentParser(description='Read a file in reverse')

parser.add_argument('filename', help='the file to read')

parser.add_argument('--limit', '-l', type=int, help='the number of lines to read')

parser.add_argument('--version', '-v', action='version', version='%(prog)s 1.0')

args = parser.parse_args()

print(args)

This uses the built-in version action, which we found in the argparse documentation.

Here’s what we get when we test out the --version flag:

$ reverse-file --version

reverse-file 1.0

Note: Notice that it carried out the version action and exited without running the rest of the script.

Adding Our Business Logic

We finally get a chance to use our file IO knowledge in a script:

~/bin/reverse-file

#!/usr/bin/env python3.6

import argparse

parser = argparse.ArgumentParser(description='Read a file in reverse')

parser.add_argument('filename', help='the file to read')

parser.add_argument('--limit', '-l', type=int, help='the number of lines to read')

parser.add_argument('--version', '-v', action='version', version='%(prog)s 1.0')

args = parser.parse_args()

with open(args.filename) as f:
    lines = f.readlines()
    lines.reverse()

    if args.limit:
        lines = lines[:args.limit]

    for line in lines:
        print(line.strip()[::-1])

Here’s what we get when we test this out on the xmen_base.txt file from our working with files video:

$ reverse-file xmen_base.txt

gnihtemoS

reivaX rosseforP

relwarcthgiN

pohsiB

spolcyC

enirevloW

mrotS

$ reverse-file -l 2 xmen_base.txt

gnihtemoS

reivaX rosseforP
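The [::-1] slice used in the loop above is plain Python slicing with a negative step, so you can try it on any string:

```python
line = 'Wolverine\n'

# strip() removes the trailing newline, then the [::-1] slice
# walks the string from end to start, reversing it
reversed_line = line.strip()[::-1]

print(reversed_line)  # enirevloW
```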

Handling Errors with try/except/else/finally

In our reverse-file script, what happens if the filename doesn’t exist? Let’s give it a shot:

$ reverse-file fake.txt

Traceback (most recent call last):

File "/home/user/bin/reverse-file", line 11, in <module>

with open(args.filename) as f:

FileNotFoundError: [Errno 2] No such file or directory: 'fake.txt'

This FileNotFoundError is something that we can expect to happen quite often and our script should handle this situation. Our parser isn’t going to catch this because we’re technically using the CLI properly, so we need to handle this ourselves. To handle these errors we’re going to utilize the keywords try, except, and else.

~/bin/reverse-file

#!/usr/bin/env python3.6

import argparse

parser = argparse.ArgumentParser(description='Read a file in reverse')

parser.add_argument('filename', help='the file to read')

parser.add_argument('--limit', '-l', type=int, help='the number of lines to read')

parser.add_argument('--version', '-v', action='version', version='%(prog)s version 1.0')

args = parser.parse_args()

try:
    f = open(args.filename)
    limit = args.limit
except FileNotFoundError as err:
    print(f"Error: {err}")
else:
    with f:
        lines = f.readlines()
        lines.reverse()

        if limit:
            lines = lines[:limit]

        for line in lines:
            print(line.strip()[::-1])

We utilize the try statement to denote that it’s quite possible for an error to happen within it. From there we can handle specific types of errors using the except keyword (and we can have more than one except clause). In the event that there isn’t an error, we carry out the code in the else block. If we want to execute some code regardless of whether there was an error, we can put that in a finally block at the very end of our try/except workflow.
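Here's a small standalone sketch (using a hypothetical read_first_line helper, not part of our script) showing all four keywords together:

```python
def read_first_line(path):
    """Return the first line of path, or None if the file doesn't exist."""
    try:
        f = open(path)
    except FileNotFoundError as err:
        print(f"Error: {err}")
        return None
    else:
        # Only runs when open() succeeded
        with f:
            return f.readline().strip()
    finally:
        # Runs whether or not open() raised
        print(f"finished attempting to read {path}")

result = read_first_line('definitely-not-here.txt')
print(result)  # None, after the error message
```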

Now when we try our script with a fake file, we get a much better response:

$ reverse-file fake.txt

Error: [Errno 2] No such file or directory: 'fake.txt'

Adding Error Exit Status to reverse-file

When our reverse-file script receives a file that doesn’t exist, we show an error message, but we don’t set the exit status to 1 to be indicative of an error.

$ reverse-file -l 2 fake.txt

Error: [Errno 2] No such file or directory: 'fake.txt'

$ echo $?

0

Let’s use the sys.exit function to accomplish this:

~/bin/reverse-file

#!/usr/bin/env python3.6

import argparse
import sys

parser = argparse.ArgumentParser(description='Read a file in reverse')
parser.add_argument('filename', help='the file to read')
parser.add_argument('--limit', '-l', type=int, help='the number of lines to read')
parser.add_argument('--version', '-v', action='version', version='%(prog)s version 1.0')

args = parser.parse_args()

try:
    f = open(args.filename)
    limit = args.limit
except FileNotFoundError as err:
    print(f"Error: {err}")
    sys.exit(1)
else:
    with f:
        lines = f.readlines()
        lines.reverse()

        if limit:
            lines = lines[:limit]

        for line in lines:
            print(line.strip()[::-1])

Now, if we try our script with a missing file, we will exit with the proper code:

$ reverse-file -l 2 fake.txt

Error: [Errno 2] No such file or directory: 'fake.txt'

$ echo $?

1
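Under the hood, sys.exit raises a SystemExit exception whose code becomes the process exit status; you can see that from a sketch like this without actually killing the interpreter:

```python
import sys

try:
    sys.exit(1)
except SystemExit as exc:
    # sys.exit raises SystemExit; when it isn't caught, the interpreter
    # exits and the shell sees exc.code as the exit status
    code = exc.code

print(code)  # 1
```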

Executing Shell Commands With subprocess.run

For working with external processes, we’re going to experiment with the subprocess module from the REPL. The main function that we’re going to work with is the subprocess.run function, and it provides us with a lot of flexibility:

>>> import subprocess

>>> proc = subprocess.run(['ls', '-l'])

total 20

drwxrwxr-x. 2 user user 54 Jan 28 15:36 bin

drwxr-xr-x. 2 user user 6 Jan 7 2015 Desktop

-rw-rw-r--. 1 user user 44 Jan 26 22:16 new_xmen.txt

-rw-rw-r--. 1 user user 98 Jan 26 21:39 read_file.py

-rw-rw-r--. 1 user user 431 Aug 6 2015 VNCHOWTO

-rw-rw-r--. 1 user user 61 Jan 28 14:11 xmen_base.txt

-rw-------. 1 user user 68 Mar 18 2016 xrdp-chansrv.log

>>> proc

CompletedProcess(args=['ls', '-l'], returncode=0)

Our proc variable is a CompletedProcess object. We have access to its returncode attribute to ensure that the command succeeded and returned a 0 to us. Notice that the ls command was executed and its output printed to the screen without us specifying to print anything. We can get around this by capturing STDOUT using subprocess.PIPE.

>>> proc = subprocess.run(
...     ['ls', '-l'],
...     stdout=subprocess.PIPE,
...     stderr=subprocess.PIPE,
... )

>>> proc

CompletedProcess(args=['ls', '-l'], returncode=0, stdout=b'total 20\ndrwxrwxr-x. 2 user user 54 Jan 28 15:36 bin\ndrwxr-xr-x. 2 user user 6 Jan 7 2015 Desktop\n-rw-rw-r--. 1 user user 44 Jan 26 22:16 new_xmen.txt\n-rw-rw-r--. 1 user user 98 Jan 26 21:39 read_file.py\n-rw-rw-r--. 1 user user 431 Aug 6 2015 VNCHOWTO\n-rw-rw-r--. 1 user user 61 Jan 28 14:11 xmen_base.txt\n-rw-------. 1 user user 68 Mar 18 2016 xrdp-chansrv.log\n', stderr=b'')

>>> proc.stdout

b'total 20\ndrwxrwxr-x. 2 user user 54 Jan 28 15:36 bin\ndrwxr-xr-x. 2 user user 6 Jan 7 2015 Desktop\n-rw-rw-r--. 1 user user 44 Jan 26 22:16 new_xmen.txt\n-rw-rw-r--. 1 user user 98 Jan 26 21:39 read_file.py\n-rw-rw-r--. 1 user user 431 Aug 6 2015 VNCHOWTO\n-rw-rw-r--. 1 user user 61 Jan 28 14:11 xmen_base.txt\n-rw-------. 1 user user 68 Mar 18 2016 xrdp-chansrv.log\n'

Now that we’ve captured the output in attributes on our proc variable, we can work with it from within our script and decide whether or not it should ever be printed. Notice that the string is prefixed with a b character: this is a bytes object, not a str. Printing a bytes object won’t do anything special with escape sequences, so if we want to utilize this value as a string, we need to explicitly convert it using the bytes.decode method.

>>> print(proc.stdout)

b'total 20\ndrwxrwxr-x. 2 user user 54 Jan 28 15:36 bin\ndrwxr-xr-x. 2 user user 6 Jan 7 2015 Desktop\n-rw-rw-r--. 1 user user 44 Jan 26 22:16 new_xmen.txt\n-rw-rw-r--. 1 user user 98 Jan 26 21:39 read_file.py\n-rw-rw-r--. 1 user user 431 Aug 6 2015 VNCHOWTO\n-rw-rw-r--. 1 user user 61 Jan 28 14:11 xmen_base.txt\n-rw-------. 1 user user 68 Mar 18 2016 xrdp-chansrv.log\n'

>>> print(proc.stdout.decode())

total 20

drwxrwxr-x. 2 user user 54 Jan 28 15:36 bin

drwxr-xr-x. 2 user user 6 Jan 7 2015 Desktop

-rw-rw-r--. 1 user user 44 Jan 26 22:16 new_xmen.txt

-rw-rw-r--. 1 user user 98 Jan 26 21:39 read_file.py

-rw-rw-r--. 1 user user 431 Aug 6 2015 VNCHOWTO

-rw-rw-r--. 1 user user 61 Jan 28 14:11 xmen_base.txt

-rw-------. 1 user user 68 Mar 18 2016 xrdp-chansrv.log

>>>

Intentionally Raising Errors

The subprocess.run function will not raise an error by default if you execute something that returns a non-zero exit status. Here’s an example of this:

>>> new_proc = subprocess.run(['cat', 'fake.txt'])

cat: fake.txt: No such file or directory

>>> new_proc

CompletedProcess(args=['cat', 'fake.txt'], returncode=1)

In this situation, we might want to raise an error, and if we pass the check argument to the function, it will raise a subprocess.CalledProcessError if something goes wrong:

>>> error_proc = subprocess.run(['cat', 'fake.txt'], check=True)

cat: fake.txt: No such file or directory

Traceback (most recent call last):

File "<stdin>", line 1, in <module>

File "/usr/local/lib/python3.6/subprocess.py", line 418, in run

    output=stdout, stderr=stderr)

subprocess.CalledProcessError: Command '['cat', 'fake.txt']' returned non-zero exit status 1.

>>>
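To handle the failure instead of letting the traceback escape, we can wrap the run call in try/except. This sketch (the run_checked helper is made up for the demo) uses sys.executable so the failing command works anywhere Python does:

```python
import subprocess
import sys

def run_checked(args):
    """Run a command with check=True; return True on success, False on failure."""
    try:
        subprocess.run(args, check=True)
        return True
    except subprocess.CalledProcessError as err:
        print(f"command failed with status {err.returncode}")
        return False

# A child process that deliberately exits with a non-zero status
failed = run_checked([sys.executable, '-c', 'import sys; sys.exit(3)'])
print(failed)  # False
```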

Python 2 Compatible Functions

If you’re interested in writing code with the subprocess module that will still work with Python 2, then you cannot use the subprocess.run function because it’s only in Python 3. For this situation, you’ll want to look into using subprocess.call and subprocess.check_output.
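As a quick sketch of those two functions (using sys.executable as a portable command to run):

```python
import subprocess
import sys

# subprocess.call runs the command and returns its exit status,
# much like CompletedProcess.returncode
status = subprocess.call([sys.executable, '-c', 'print("hi")'])

# subprocess.check_output returns STDOUT as bytes and raises
# CalledProcessError if the command exits non-zero
output = subprocess.check_output([sys.executable, '-c', 'print("hi")'])

print(status, output.decode().strip())  # 0 hi
```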

Note: we need the words file to exist at /usr/share/dict/words for this video. This can be installed via:

$ sudo yum install -y words

Our contains Script

To dig into list comprehensions, we’re going to write a script that takes a snippet and returns all of the values in the “words” file on our machine that contain that snippet. Our first step will be writing the script using standard iteration, and then we’ll refactor it to use a list comprehension.

~/bin/contains

#!/usr/bin/env python3.6

import argparse

parser = argparse.ArgumentParser(description='Search for words including partial word')
parser.add_argument('snippet', help='partial (or complete) string to search for in words')

args = parser.parse_args()
snippet = args.snippet.lower()

with open('/usr/share/dict/words') as f:
    words = f.readlines()

matches = []

for word in words:
    if snippet in word.lower():
        matches.append(word)

print(matches)

Let’s test out our first draft of the script to make sure that it works:

$ chmod u+x bin/contains

$ contains Keith

['Keith\n', 'Keithley\n', 'Keithsburg\n', 'Keithville\n']

Utilizing a List Comprehension

This portion of our script is pretty standard:

~/bin/contains (partial)

words = open('/usr/share/dict/words').readlines()

matches = []

for word in words:
    if snippet in word.lower():
        matches.append(word)

print(matches)

We can rewrite that chunk of our script as one or two lines using a list comprehension:

~/bin/contains (partial)

words = open('/usr/share/dict/words').readlines()

print([word for word in words if snippet in word.lower()])

We can take this even further by removing the '\n' from the end of each "word" we return:

~/bin/contains (partial)

words = open('/usr/share/dict/words').readlines()

print([word.strip() for word in words if snippet in word.lower()])

Final Version

Here’s the final version of our script that works (nearly) the same as our original version:

~/bin/contains

#!/usr/bin/env python3.6

import argparse

parser = argparse.ArgumentParser(description='Search for words including partial word')

parser.add_argument('snippet', help='partial (or complete) string to search for in words')

args = parser.parse_args()

snippet = args.snippet.lower()

words = open('/usr/share/dict/words').readlines()

print([word.strip() for word in words if snippet in word.lower()])

Here’s our output:

$ contains Keith

['Keith', 'Keithley', 'Keithsburg', 'Keithville']

Generating Random Test Data

To write our receipt reconciliation tool, we need to have some receipts to work with as we’re testing out our implementation. We’re expecting receipts to be JSON files that contain some specific data and we’re going to write a script that will create some receipts for us.

We’re working on a system that requires some local paths, so let’s put what we’re doing in a receipts directory:

$ mkdir -p receipts/new

$ cd receipts

The receipts that haven’t been reconciled will go in the new directory, so we’ve already created that. Let’s create a gen_receipts.py file to create some unreconciled receipts when we run it:

~/receipts/gen_receipts.py

import random
import os
import json

count = int(os.getenv("FILE_COUNT") or 100)

words = [word.strip() for word in open('/usr/share/dict/words').readlines()]

for identifier in range(count):
    amount = random.uniform(1.0, 1000)
    content = {
        'topic': random.choice(words),
        'value': "%.2f" % amount
    }
    with open(f'./new/receipt-{identifier}.json', 'w') as f:
        json.dump(content, f)

We’re using the json.dump function to ensure that we’re writing out valid JSON (we’ll read it back in later). random.choice allows us to select one item from a sequence (a str, tuple, or list), and random.uniform gives us a float between the two bounds specified. This code also shows how to use range, which lets us iterate over the integers from zero up to (but not including) the number we pass it.
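Those pieces can be tried in isolation; this sketch writes the JSON to an in-memory buffer instead of a file (the small words list is made up for the demo):

```python
import io
import json
import random

words = ['storm', 'bishop', 'cyclops']

content = {
    'topic': random.choice(words),                # one item from the sequence
    'value': '%.2f' % random.uniform(1.0, 1000),  # float between the bounds
}

buffer = io.StringIO()
json.dump(content, buffer)  # serialize the dict as valid JSON

# json.loads can read it right back into a dict
parsed = json.loads(buffer.getvalue())
print(parsed['topic'], parsed['value'])
```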

Now we can run our script using the python3.6 command:

$ FILE_COUNT=10 python3.6 gen_receipts.py

$ ls new/

receipt-0.json receipt-2.json receipt-4.json receipt-6.json receipt-8.json

receipt-1.json receipt-3.json receipt-5.json receipt-7.json receipt-9.json

$ cat new/receipt-0.json

{"topic": "microceratous", "value": "918.67"}

Creating a Directory If It Doesn’t Exist
Before we start doing anything with the receipts, we want to have a processed directory to move them to so that we don’t try to process the same receipt twice. Our script can be smart enough to create this directory for us if it doesn’t exist when we first run the script. We’ll use the os.mkdir function; if the directory already exists we can catch the OSError that is thrown:
~/receipts/process_receipts.py
import os

try:
    os.mkdir("./processed")
except OSError:
    print("'processed' directory already exists")
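As an aside (not used in the video), os.makedirs accepts an exist_ok flag that makes repeat runs a no-op, which avoids the try/except entirely:

```python
import os
import tempfile

base = tempfile.mkdtemp()  # throwaway directory for the demo
target = os.path.join(base, 'processed')

# exist_ok=True means an already-existing directory is not an error
os.makedirs(target, exist_ok=True)
os.makedirs(target, exist_ok=True)  # second call silently succeeds

print(os.path.isdir(target))  # True
```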
Collecting the Receipts to Process
From the shell, we’re able to collect files based on patterns, and that’s useful. For our purposes, we want to get every receipt from the new directory that matches this pattern:
receipt-[0-9]*.json
That pattern translates to receipt-, followed by a digit, then any other characters, and ending with the .json file extension. We can achieve this exact result using the glob.glob function.

~/receipts/process_receipts.py (partial)

receipts = glob.glob('./new/receipt-[0-9]*.json')
subtotal = 0.0
Part of processing the receipts will entail adding up all of the values, so we’re going to start our script with a subtotal of 0.0.
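A quick way to convince yourself of what the pattern matches is to run glob against a scratch directory (the filenames below are made up for the demo):

```python
import glob
import os
import tempfile

base = tempfile.mkdtemp()
for name in ['receipt-0.json', 'receipt-12.json', 'receipt-x.json', 'notes.txt']:
    open(os.path.join(base, name), 'w').close()

# [0-9] requires one digit, then * matches any remaining characters
matches = sorted(glob.glob(os.path.join(base, 'receipt-[0-9]*.json')))
print([os.path.basename(m) for m in matches])  # ['receipt-0.json', 'receipt-12.json']
```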
Reading JSON, Totaling Values, and Moving Files
The remainder of our script is going to require us to do the following:
1. Iterate over the receipts
2. Read each receipt’s JSON
3. Total the values of the receipts
4. Move each receipt file to the processed directory after we’re finished with it
We used the json.dump function to write out a JSON file, and we can use the opposite function, json.load, to read a JSON file. The contents of the file will be turned into a dictionary that we can use to access its values by key. We’ll add the value to the subtotal before finally moving the file using shutil.move. Here’s our final script:

~/receipts/process_receipts.py
import glob
import os
import shutil
import json

try:
    os.mkdir("./processed")
except OSError:
    print("'processed' directory already exists")

# Get a list of receipts
receipts = glob.glob('./new/receipt-[0-9]*.json')
subtotal = 0.0

for path in receipts:
    with open(path) as f:
        content = json.load(f)
        subtotal += float(content['value'])
        name = path.split('/')[-1]
        destination = f"./processed/{name}"
        shutil.move(path, destination)
        print(f"moved '{path}' to '{destination}'")

print("Receipt subtotal: $%.2f" % subtotal)
Let’s add some files that don’t match our pattern to the new directory before running our script:
$ touch new/receipt-other.json new/receipt-14.txt new/random.txt
Finally, let’s run our script twice and see what we get:

$ python3.6 process_receipts.py

moved './new/receipt-0.json' to './processed/receipt-0.json'

moved './new/receipt-1.json' to './processed/receipt-1.json'

moved './new/receipt-2.json' to './processed/receipt-2.json'

moved './new/receipt-3.json' to './processed/receipt-3.json'

moved './new/receipt-4.json' to './processed/receipt-4.json'

moved './new/receipt-5.json' to './processed/receipt-5.json'

moved './new/receipt-6.json' to './processed/receipt-6.json'

moved './new/receipt-7.json' to './processed/receipt-7.json'

moved './new/receipt-8.json' to './processed/receipt-8.json'

moved './new/receipt-9.json' to './processed/receipt-9.json'

Receipt subtotal: $6932.04

$ python3.6 process_receipts.py

‘processed’ directory already exists

Receipt subtotal: $0.00

More Specific Patterns Using Regular Expressions (The re Module)

Occasionally, we need to be very specific about the string patterns that we use, and sometimes that just isn’t doable with basic globbing. As an exercise, let’s change our process_receipts.py file to only return even-numbered receipt files (regardless of how many digits are in the number). Let’s generate some more receipts and try to accomplish this from the REPL:

$ FILE_COUNT=20 python3.6 gen_receipts.py

$ python3.6

>>> import glob

>>> receipts = glob.glob('./new/receipt-[0-9]*[24680].json')

>>> receipts.sort()

>>> receipts

['./new/receipt-10.json', './new/receipt-12.json', './new/receipt-14.json', './new/receipt-16.json', './new/receipt-18.json']

That glob was pretty close, but it didn’t give us the single-digit even numbers. Let’s try again using the re (Regular Expression) module’s match function, the glob.iglob function, and a list comprehension:

>>> import re

>>> receipts = [f for f in glob.iglob('./new/receipt-[0-9]*.json') if re.match(r'./new/receipt-[0-9]*[02468].json', f)]

>>> receipts

['./new/receipt-0.json', './new/receipt-2.json', './new/receipt-4.json', './new/receipt-6.json', './new/receipt-8.json', './new/receipt-10.json', './new/receipt-12.json', './new/receipt-14.json', './new/receipt-16.json', './new/receipt-18.json']

We’re using the glob.iglob function instead of the standard glob function because we knew we were going to iterate through it and make modifications at the same time. This iterator allows us to avoid fitting the whole expanded glob.glob list into memory at one time.

Regular Expressions are a pretty big topic, but once you’ve learned them, they are incredibly useful in scripts and also when working with tools like grep. The re module gives us quite a few powerful ways to use regular expressions in our Python code.
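For the even-numbered receipt problem, a regular expression can express "any number of digits ending in an even digit" directly. A standalone sketch against a list of made-up names:

```python
import re

names = ['receipt-0.json', 'receipt-7.json', 'receipt-14.json', 'receipt-other.json']

# \d* allows any number of leading digits, [02468] forces the final
# digit to be even, and the escaped dot pins the .json extension
pattern = re.compile(r'receipt-\d*[02468]\.json$')

evens = [name for name in names if pattern.match(name)]
print(evens)  # ['receipt-0.json', 'receipt-14.json']
```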

Improved String Replacement

One actual improvement that we can make to our process_receipts.py file is that we can use a single function call to go from our path variable to the destination that we want. This section:

~/receipts/process_receipts.py (partial)

name = path.split('/')[-1]

destination = f"./processed/{name}"

Becomes this using the str.replace method:

destination = path.replace('new', 'processed')

This is a useful refactoring to make because it makes the intention of our code more clear.
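One thing to keep in mind: str.replace substitutes every occurrence of the substring, so a receipt whose own name happened to contain "new" would also be altered (the tricky filename below is made up to show this):

```python
path = './new/receipt-3.json'
destination = path.replace('new', 'processed')
print(destination)  # ./processed/receipt-3.json

# Every occurrence is replaced, which can surprise you:
tricky = './new/receipt-newest.json'
print(tricky.replace('new', 'processed'))  # ./processed/receipt-processedest.json
```

Our generated receipt names only contain digits, so the simple replace is safe here.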

Working With Numbers Using math

Depending on how we want to process the values of our receipts, we might want to manipulate the numbers we’re working with by rounding, going to the next highest integer, or going to the next lowest integer. These sorts of “rounding” actions are pretty common, and some of them require the math module:

>>> import math

>>> math.ceil(1.1)

2

>>> math.floor(1.1)

1

>>> round(1.1111111111, 2)

1.11

We can utilize the built-in round function to clean up the printing of the subtotal at the end of the script. Here’s the final version of process_receipts.py:

~/receipts/process_receipts.py

import glob
import os
import shutil
import json

try:
    os.mkdir("./processed")
except OSError:
    print("'processed' directory already exists")

subtotal = 0.0

for path in glob.iglob('./new/receipt-[0-9]*.json'):
    with open(path) as f:
        content = json.load(f)
        subtotal += float(content['value'])
        destination = path.replace('new', 'processed')
        shutil.move(path, destination)
        print(f"moved '{path}' to '{destination}'")

print(f"Receipt subtotal: ${round(subtotal, 2)}")

BONUS: Truncate Float Without Rounding

I mentioned in the video that you can do some more complicated math to print a number to a specified number of digits without rounding. Here’s an example of a function that would do the truncation (for those curious):

>>> import math

>>> def ftruncate(f, ndigits=None):
...     if ndigits and (ndigits > 0):
...         multiplier = 10 ** ndigits
...         num = math.floor(f * multiplier) / multiplier
...     else:
...         num = math.floor(f)
...     return num

>>> num = 1.5441020468646993

>>> ftruncate(num)

1

>>> ftruncate(num, 2)

1.54

>>> ftruncate(num, 8)

1.54410204

Viewing Installed Packages

We can check out our installed packages using the list subcommand:

$ pip3.6 list

DEPRECATION: The default format will switch to columns in the future. You can use --format=(legacy|columns) (or define a format=(legacy|columns) in your pip.conf under the [list] section) to disable this warning.

pip (9.0.1)

setuptools (28.8.0)

You may have gotten a deprecation warning. To fix that, let’s create a $HOME/.config/pip/pip.conf file:

$ mkdir -p ~/.config/pip

$ vim ~/.config/pip/pip.conf

~/.config/pip/pip.conf

[list]

format=columns

Now if we use list we’ll get a slightly different result:
$ pip3.6 list
Package    Version
---------- -------
pip        9.0.1
setuptools 28.8.0