Installing New Packages
Later in this course, we’ll be using the boto3 package to interact with AWS S3. Let’s use that as an example package to install using the install subcommand:
$ pip3.6 install boto3
…
PermissionError: [Errno 13] Permission denied: '/usr/local/lib/python3.6/site-packages/jmespath'
Since we installed Python 3.6 into /usr/local, it’s meant to be usable by all users, but we can only add or remove packages if we’re root (or via sudo).
$ sudo pip3.6 install boto3
Managing Required Packages with requirements.txt
If we have a project that relies on boto3, we probably want to keep track of that dependency somewhere, and pip can facilitate this through a "requirements file", traditionally called requirements.txt. If we've already installed everything manually, then we can dump the current dependency state using the freeze subcommand that pip provides.
$ pip3.6 freeze
boto3==1.5.22
botocore==1.8.36
docutils==0.14
jmespath==0.9.3
python-dateutil==2.6.1
s3transfer==0.1.12
six==1.11.0
$ pip3.6 freeze > requirements.txt
Now we can use this file to tell pip what to install (or uninstall) using the -r flag to either command. Let’s uninstall these packages from the global site-packages:
$ sudo pip3.6 uninstall -y -r requirements.txt
Installing Packages Local To Our User
We need to use sudo to install packages globally, but sometimes we only want to install a package for ourselves, and we can do that by using the --user flag to the install command. Let’s reinstall boto3 so that it’s local to our user by using our requirements.txt file:
$ pip3.6 install --user -r requirements.txt
$ pip3.6 list --user
$ pip3.6 uninstall boto3
Virtualenv or Venv
Virtualenvs allow you to create sandboxed Python environments. In Python 2, you need to install the virtualenv package to do this, but in Python 3 it's built into the standard library under the module name venv.
To create a virtualenv, we’ll use the following command:
$ python3.6 -m venv [PATH FOR VIRTUALENV]
The -m flag loads a module as a script, so it looks a little weird, but "python3.6 -m venv" is a stand-alone tool. This tool can even handle its own flags.
Let’s create a directory to store our virtualenvs called venvs. From here we create an experiment virtualenv to see how they work.
$ mkdir venvs
$ python3.6 -m venv venvs/experiment
Virtualenvs are local Python installations with their own site-packages, and they do absolutely nothing for us by default. To use a virtualenv, we need to activate it. We do this by sourcing an activate file in the virtualenv’s bin directory:
$ source venvs/experiment/bin/activate
(experiment) ~ $
Notice that our prompt changed to indicate which virtualenv is active. This is part of what the activate script does. It also changes our $PATH:
(experiment) ~ $ echo $PATH
/home/user/venvs/experiment/bin:/home/user/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/user/.local/bin:/home/user/bin
(experiment) ~ $ which python
~/venvs/experiment/bin/python
(experiment) ~ $ python --version
Python 3.6.4
(experiment) ~ $ pip list
Package    Version
---------- -------
pip        9.0.1
setuptools 28.8.0
(experiment) ~ $ deactivate
$ which python
/usr/bin/python
With the virtualenv activated, the python and pip binaries point to the local Python 3 variations, so we don’t need to append the 3.6 to all of our commands. To remove the virtualenv’s contents from our $PATH, we will utilize the deactivate script that the virtualenv provided.
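We can also see the effect of activation from inside Python itself. Here's a minimal sketch; the in_virtualenv helper is a name we're inventing for illustration, not part of the standard library:

```python
import sys


def in_virtualenv():
    # Inside a virtualenv, sys.prefix points at the environment,
    # while sys.base_prefix still points at the base installation.
    # Outside a virtualenv the two are equal.
    return sys.prefix != sys.base_prefix


print(in_virtualenv())
```

Running this with the experiment virtualenv active prints True; after deactivate it prints False.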
Creating a Weather Script
We’re going to write up the start of a script that can provide us with weather information using data from openweathermap.org. For this video, we’re going to be installing another package called requests. This is a nice package for making web requests from Python and one of the most used Python packages. You will need to get your API key from OpenWeatherMap to follow along with this video.
Let’s start off by activating the experiment virtualenv that we created in the previous video. Install the package and set an environment variable with an API key:
$ source ~/venvs/experiment/bin/activate
(experiment) $ pip install requests
(experiment) $ export OWM_API_KEY=[YOUR API KEY]
Create a new script called weather:
~/bin/weather
#!/usr/bin/env python3.6

import os
import requests
import sys

from argparse import ArgumentParser

parser = ArgumentParser(description='Get the current weather information')
parser.add_argument('zip', help='zip/postal code to get the weather for')
parser.add_argument('--country', default='us', help='country zip/postal belongs to, default is "us"')

args = parser.parse_args()

api_key = os.getenv('OWM_API_KEY')

if not api_key:
    print("Error: no 'OWM_API_KEY' provided")
    sys.exit(1)

url = f"http://api.openweathermap.org/data/2.5/weather?zip={args.zip},{args.country}&appid={api_key}"

res = requests.get(url)

if res.status_code != 200:
    print(f"Error talking to weather provider: {res.status_code}")
    sys.exit(1)

print(res.json())
Notice that we were able to use the requests package in the same way that we would any package from the standard library.
Let’s try it out:
(experiment) $ chmod u+x ~/bin/weather
(experiment) ~ $ weather 45891
{'coord': {'lon': -84.59, 'lat': 40.87}, 'weather': [{'id': 801, 'main': 'Clouds', 'description': 'few clouds', 'icon': '02d'}], 'base': 'stations', 'main': {'temp': 282.48, 'pressure': 1024, 'humidity': 84, 'temp_min': 282.15, 'temp_max': 283.15}, 'visibility': 16093, 'wind': {'speed': 1.5, 'deg': 210}, 'clouds': {'all': 20}, 'dt': 1517169240, 'sys': {'type': 1, 'id': 1029, 'message': 0.0043, 'country': 'US', 'sunrise': 1517143892, 'sunset': 1517179914}, 'id': 0, 'name': 'Van Wert', 'cod': 200}
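As a quick aside, OpenWeatherMap reports temperatures in Kelvin by default. Converting a value like the temp above to Fahrenheit is simple arithmetic; the data dict below is a trimmed, hypothetical slice of the response:

```python
# A trimmed, hypothetical slice of the JSON response above.
data = {'main': {'temp': 282.48}, 'name': 'Van Wert'}

# Kelvin -> Fahrenheit
temp_f = (data['main']['temp'] - 273.15) * 9 / 5 + 32
print(f"{data['name']}: {round(temp_f)}F")  # Van Wert: 49F
```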
Making weather Work Regardless of the Active Virtualenv
Currently, our weather script will only work if the experiment virtualenv is active, since no other Python has requests installed. We can get around this by changing the shebang to point to the specific python within our virtualenv:
#!/home/$USER/venvs/experiment/bin/python
You’ll need to substitute in your actual username for $USER. Here’s what the script looks like on a cloud server with the username of user:
~/bin/weather
#!/home/user/venvs/experiment/bin/python
import os
import requests
import sys

from argparse import ArgumentParser

parser = ArgumentParser(description='Get the current weather information')
parser.add_argument('zip', help='zip/postal code to get the weather for')
parser.add_argument('--country', default='us', help='country zip/postal belongs to, default is "us"')

args = parser.parse_args()

api_key = os.getenv('OWM_API_KEY')

if not api_key:
    print("Error: no 'OWM_API_KEY' provided")
    sys.exit(1)

url = f"http://api.openweathermap.org/data/2.5/weather?zip={args.zip},{args.country}&appid={api_key}"

res = requests.get(url)

if res.status_code != 200:
    print(f"Error talking to weather provider: {res.status_code}")
    sys.exit(1)

print(res.json())
Now if we deactivate and use the script it will still work:
(experiment) $ deactivate
$ weather 45891
{'coord': {'lon': -84.59, 'lat': 40.87}, 'weather': [{'id': 801, 'main': 'Clouds', 'description': 'few clouds', 'icon': '02d'}], 'base': 'stations', 'main': {'temp': 282.48, 'pressure': 1024, 'humidity': 84, 'temp_min': 282.15, 'temp_max': 283.15}, 'visibility': 16093, 'wind': {'speed': 1.5, 'deg': 210}, 'clouds': {'all': 20}, 'dt': 1517169240, 'sys': {'type': 1, 'id': 1029, 'message': 0.0035, 'country': 'US', 'sunrise': 1517143892, 'sunset': 1517179914}, 'id': 0, 'name': 'Van Wert', 'cod': 200}
Take it as a challenge to build on this example to make a more full-featured weather CLI.
The Project
We have many database servers that we manage, and we want to create a single tool that we can use to easily back up the databases to either AWS S3 or locally. We would like to be able to:
1. Specify the database URL to back up.
2. Specify a "driver" (local or s3).
3. Specify the backup "destination". This will be a file path for local and a bucket name for s3.
4. Depending on the "driver", create a local backup of the database or upload the backup to an S3 bucket.
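Before we build anything, it can help to sketch the shape that this "driver" idea suggests: pick a function based on the driver name and hand it the destination. The names below are illustrative assumptions, not the project's final API:

```python
# Illustrative sketch only: these function and dict names are
# assumptions, not the API we'll actually end up with.
def local_driver(backup_file, path):
    return f"would write backup to {path}"


def s3_driver(backup_file, bucket):
    return f"would upload backup to bucket {bucket}"


DRIVERS = {'local': local_driver, 's3': s3_driver}


def store_backup(driver, destination, backup_file=None):
    # Dispatch on the requested driver name.
    return DRIVERS[driver](backup_file, destination)


print(store_backup('local', '/var/local/db_one/backups'))
```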
Setting up PostgreSQL Lab Server
Before we begin, we're going to need a PostgreSQL database to work with. The code repository for this course contains a db_setup.sh script that we'll use on a CentOS 7 cloud server to create and run our database. Create a "CentOS 7" cloud server and run the following on it:
$ curl -o db_setup.sh https://raw.githubusercontent.com/linuxacademy/content-python3-sysadmin/master/helpers/db_setup.sh
$ chmod +x db_setup.sh
$ ./db_setup.sh
You will be prompted for your sudo password and for the username and password you’d like to use to access the database.
Installing The Postgres 9.6 Client
On our development machines, we’ll need to make sure that we have the Postgres client installed. The version needs to be 9.6.6.
On Red Hat systems we'll use the following:
$ sudo yum install pgdg-centos96-9.6-3.noarch.rpm epel-release
$ sudo yum update
$ sudo yum install postgresql96
On Debian systems, the equivalent would be:
$ sudo apt-get install postgresql-client-9.6
Test connection from Workstation
Let’s make sure that we can connect to the PostgreSQL server from our development machine by running the following command:
Note: You'll need to substitute in your database user's values for [USERNAME], [PASSWORD], and [SERVER_IP].
$ psql postgres://[USERNAME]:[PASSWORD]@[SERVER_IP]:80/sample -c "SELECT count(id) FROM employees;"
Creating the Repo and Virtualenv
Since we’re building a project that will likely be more than a single file, we’re going to create a full project complete with source control and dependencies. We’ll start by creating the directory to hold our project, and we’re going to place this in a code directory:
$ rm ~/requirements.txt
$ mkdir -p ~/code/pgbackup
$ cd ~/code/pgbackup
We’ve talked about pip and virtualenvs, and how they allow us to manage our dependency versions. For a development project, we will leverage a new tool to manage our project’s virtualenv and install dependencies. This tool is called pipenv. Let’s install pipenv for our user and create a Python 3 virtualenv for our project:
$ pip3.6 install --user pipenv
$ pipenv --python $(which python3.6)
Rather than creating a requirements.txt file for us, pipenv has created a Pipfile that it will use to store virtualenv and dependency information. To activate our new virtualenv, we use the command pipenv shell, and to deactivate it we use exit instead of deactivate.
Next, let’s set up git as our source control management tool by initializing our repository. We’ll also add a .gitignore file from GitHub so that we don’t later track files that we don’t mean to.
$ git init
$ curl https://raw.githubusercontent.com/github/gitignore/master/Python.gitignore -o .gitignore
Sketch out the README.rst
One great way to start planning out a project is to document it from the top level. This is the documentation that we would give to someone who wants to know how to use the tool but doesn't care about creating it. This approach is sometimes called "README Driven Development". Whenever we write documentation in a Python project, we should use reStructuredText. We use this specific markup format because there are tools in the Python ecosystem that can read it and render documentation in a standardized way. Here's our README.rst file:
~/code/pgbackup/README.rst
pgbackup
========
CLI for backing up remote PostgreSQL databases locally or to AWS S3.
Preparing for Development
-------------------------

1. Ensure ``pip`` and ``pipenv`` are installed
2. Clone repository: ``git clone git@github.com:example/pgbackup``
3. ``cd`` into the repository
4. Fetch development dependencies: ``make install``
5. Activate virtualenv: ``pipenv shell``

Usage
-----

Pass in a full database URL, the storage driver, and destination.

S3 Example w/ bucket name:

::

    $ pgbackup postgres://bob@example.com:5432/db_one --driver s3 backups

Local Example w/ local path:

::

    $ pgbackup postgres://bob@example.com:5432/db_one --driver local /var/local/db_one/backups

Running Tests
-------------

Run tests locally using ``make`` if virtualenv is active:

::

    $ make

If virtualenv isn't active then use:

::

    $ pipenv run make
Our Initial Commit
Now that we’ve created our README.rst file to document what we plan on doing with this project, we’re in a good position to stage our changes and make our first commit:
$ git add --all .
$ git commit -m 'Initial commit'
Create Package Layout
There are a few specific places that we’re going to put code in this project:
1. In a src/pgbackup directory. This is where our project’s business logic will go.
2. In a tests directory. We’ll put automated tests here.
We're not going to write the code that goes in these directories just yet, but we are going to create them and put some empty files in so that we can make a git commit that contains these directories. In our src/pgbackup directory, we'll use a special file called __init__.py, but in our tests directory, we'll use a generically named, hidden file.
(pgbackup-E7nj_BsO) $ mkdir -p src/pgbackup tests
(pgbackup-E7nj_BsO) $ touch src/pgbackup/__init__.py tests/.keep
Writing Our setup.py
One of the requirements for an installable Python package is a setup.py file at the root of the project. In this file, we’ll utilize setuptools to specify how our project is to be installed and define its metadata. Let’s write out this file now:
~/code/pgbackup/setup.py
from setuptools import setup, find_packages

with open('README.rst', encoding='UTF-8') as f:
    readme = f.read()

setup(
    name='pgbackup',
    version='0.1.0',
    description='Database backups locally or to AWS S3.',
    long_description=readme,
    author='Keith',
    author_email='[email protected]',
    packages=find_packages('src'),
    package_dir={'': 'src'},
    install_requires=[]
)
For the most part, this file is metadata, but the packages, package_dir, and install_requires parameters of the setup function define where setuptools will look for our source code and what other packages need to be installed for our package to work.
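To see what find_packages('src') actually returns, here's a small sketch that rebuilds our src/ layout in a throwaway temporary directory (assuming setuptools is available, which pip ships with):

```python
import os
import tempfile

from setuptools import find_packages

with tempfile.TemporaryDirectory() as tmp:
    # Recreate the src/pgbackup layout with an empty __init__.py.
    pkg_dir = os.path.join(tmp, 'src', 'pgbackup')
    os.makedirs(pkg_dir)
    open(os.path.join(pkg_dir, '__init__.py'), 'w').close()

    # find_packages only reports directories that look like packages,
    # which is why the __init__.py file matters.
    found = find_packages(os.path.join(tmp, 'src'))
    print(found)  # ['pgbackup']
```

The package_dir={'': 'src'} mapping then tells setuptools that those package names live under src/ rather than the project root.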
To make sure that we didn’t mess up anything in our setup.py, we’ll install our package as a development package using pip.
(pgbackup-E7nj_BsO) $ pip install -e .
Obtaining file:///home/user/code/pgbackup
Installing collected packages: pgbackup
Running setup.py develop for pgbackup
Successfully installed pgbackup
It looks like everything worked, and we won't need to change our setup.py for a while. For the time being, let's uninstall pgbackup since it doesn't do anything yet:
(pgbackup-E7nj_BsO) $ pip uninstall pgbackup
Uninstalling pgbackup-0.1.0:
/home/user/.local/share/virtualenvs/pgbackup-E7nj_BsO/lib/python3.6/site-packages/pgbackup.egg-link
Proceed (y/n)? y
Successfully uninstalled pgbackup-0.1.0
Makefile
In our README.rst file, we mentioned that to run tests we wanted to be able to simply run make from our terminal. To do that, we need a Makefile. We'll also create a second make task that sets up the virtualenv and installs dependencies using pipenv. Here's our Makefile:
~/code/pgbackup/Makefile
.PHONY: default install test

default: test

install:
	pipenv install --dev --skip-lock

test:
	PYTHONPATH=./src pytest
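The PYTHONPATH=./src prefix is what lets the tests import pgbackup without installing it: directories listed in PYTHONPATH are prepended to the interpreter's sys.path. A quick way to verify that behavior from Python itself (the /tmp/not-a-real-dir path is just a made-up marker):

```python
import os
import subprocess
import sys

# Run a child interpreter with PYTHONPATH set and inspect its sys.path.
marker = '/tmp/not-a-real-dir'
env = dict(os.environ, PYTHONPATH=marker)
result = subprocess.run(
    [sys.executable, '-c', 'import sys; print(sys.path)'],
    env=env, capture_output=True, text=True)

# PYTHONPATH entries show up in sys.path even if they don't exist on disk.
print(marker in result.stdout)  # True
```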
This is a great spot for us to make a commit:
(pgbackup-E7nj_BsO) $ git add --all .
(pgbackup-E7nj_BsO) $ git commit -m 'Structure project with setup.py and Makefile'
[master 1c0ed72] Structure project with setup.py and Makefile
4 files changed, 26 insertions(+)
create mode 100644 Makefile
create mode 100644 setup.py
create mode 100644 src/pgbackup/__init__.py
create mode 100644 tests/.keep
Installing pytest
For this course, we’re using pytest as our testing framework. It’s a simple tool, and although there is a unit testing framework built into Python, I think that pytest is a little easier to understand. Before we can use it though, we need to install it. We’ll use pipenv and specify that this is a “dev” dependency:
(pgbackup-E7nj_BsO) $ pipenv install --dev pytest
…
Adding pytest to Pipfile's [dev-packages]…
Locking [dev-packages] dependencies…
Locking [packages] dependencies…
Updated Pipfile.lock (5c8539)!
Now the line that we wrote in our Makefile that utilizes the pytest CLI will work.
Writing Our First Tests
The first step of TDD is writing a failing test. In our case, we’re going to go ahead and write a few failing tests. Using pytest, our tests will be functions with names that start with test_. As long as we name the functions properly, the test runner should find and run them.
We’re going to write three tests to start:
1. A test that shows that the CLI fails if no driver is specified.
2. A test that shows that the CLI fails if there is no destination value given.
3. A test that shows, given a driver and a destination, that the CLI’s returned Namespace has the proper values set.
At this point, we don’t even have any source code files, but that doesn’t mean that we can’t write code that demonstrates how we would like our modules to work. The module that we want is called cli, and it should have a create_parser function that returns an ArgumentParser configured for our desired use.
Let’s write some tests that exercise cli.create_parser and ensure that our ArgumentParser works as expected. The name of our test file is important; make sure that the file starts with test_. This file will be called test_cli.py.
~/code/pgbackup/tests/test_cli.py
import pytest

from pgbackup import cli

url = "postgres://bob:password@example.com:5432/db_one"

def test_parser_without_driver():
    """
    Without a specified driver the parser will exit
    """
    with pytest.raises(SystemExit):
        parser = cli.create_parser()
        parser.parse_args([url])

def test_parser_with_driver():
    """
    The parser will exit if it receives a driver
    without a destination
    """
    parser = cli.create_parser()

    with pytest.raises(SystemExit):
        parser.parse_args([url, "--driver", "local"])

def test_parser_with_driver_and_destination():
    """
    The parser will not exit if it receives a driver
    with a destination
    """
    parser = cli.create_parser()

    args = parser.parse_args([url, "--driver", "local", "/some/path"])

    assert args.driver == "local"
    assert args.destination == "/some/path"
Running Tests
Now that we’ve written a few tests, it’s time to run them. We’ve created our Makefile already, so let’s make sure our virtualenv is active and run them:
$ pipenv shell
(pgbackup-E7nj_BsO) $ make
PYTHONPATH=./src pytest
======================================= test session starts =======================================
platform linux -- Python 3.6.4, pytest-3.3.2, py-1.5.2, pluggy-0.6.0
rootdir: /home/user/code/pgbackup, inifile:
collected 0 items / 1 errors

============================================= ERRORS ==============================================
_______________________________ ERROR collecting tests/test_cli.py ________________________________
ImportError while importing test module '/home/user/code/pgbackup/tests/test_cli.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
tests/test_cli.py:3: in <module>
    from pgbackup import cli
E   ImportError: cannot import name 'cli'
!!! Interrupted: 1 errors during collection !!!
===================================== 1 error in 0.11 seconds =====================================
make: *** [test] Error 2
We get an ImportError from our test file because there is no module in pgbackup named cli. This is awesome because it tells us what our next step is. We need to create that file.
Moving Through Failing Tests
Our current test failure is from there not being a cli.py file within the src/pgbackup directory. Let’s do just enough to move onto the next error:
(partial make output)
(pgbackup-E7nj_BsO) $ touch src/pgbackup/cli.py
(pgbackup-E7nj_BsO) $ make
PYTHONPATH=./src pytest
======================================= test session starts =======================================
platform linux -- Python 3.6.4, pytest-3.3.2, py-1.5.2, pluggy-0.6.0
rootdir: /home/user/code/pgbackup, inifile:
collected 3 items
tests/test_cli.py FFF [100%]
============================================ FAILURES =============================================
___________________________________ test_parser_without_driver ____________________________________
    def test_parser_without_driver():
        """
        Without a specified driver the parser will exit
        """
        with pytest.raises(SystemExit):
>           parser = cli.create_parser()
E           AttributeError: module 'pgbackup.cli' has no attribute 'create_parser'

tests/test_cli.py:12: AttributeError
…
Now we’re getting an AttributeError because there is not an attribute/function called create_parser. Let’s implement a version of that function that creates an ArgumentParser that hasn’t been customized:
~/code/pgbackup/src/pgbackup/cli.py
from argparse import ArgumentParser

def create_parser():
    parser = ArgumentParser()
    return parser
Once again, let’s run our tests:
(partial make output)
(pgbackup-E7nj_BsO) $ make
…
self = ArgumentParser(prog='pytest', usage=None, description=None, formatter_class=<class 'argparse.HelpFormatter'>, conflict_handler='error', add_help=True)
status = 2
message = 'pytest: error: unrecognized arguments: postgres://bob:password@example.com:5432/db_one --driver local /some/path\n'

    def exit(self, status=0, message=None):
        if message:
            self._print_message(message, _sys.stderr)
>       _sys.exit(status)
E       SystemExit: 2

/usr/local/lib/python3.6/argparse.py:2376: SystemExit
-------------------------------------- Captured stderr call ---------------------------------------
usage: pytest [-h]
pytest: error: unrecognized arguments: postgres://bob:password@example.com:5432/db_one --driver local /some/path
=============================== 1 failed, 2 passed in 0.14 seconds ================================
Interestingly, two of the tests succeeded: the two that expected a SystemExit. Our tests passed arguments that the parser wasn't configured to accept, and that caused the parser to error out. This demonstrates why it's important to write tests that cover a wide variety of use cases. If we hadn't implemented the third test to ensure that we get the expected values on success, then our test suite would already be green!
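For reference, here is one way a create_parser could be written to satisfy all three tests. This is a sketch of the idea, not necessarily how we'll configure the parser in the next step:

```python
from argparse import ArgumentParser


def create_parser():
    # url and destination are required positionals; --driver is a
    # required option. Omitting any of them makes argparse print an
    # error and exit, which is what the first two tests expect.
    parser = ArgumentParser(description='Back up PostgreSQL databases')
    parser.add_argument('url', help='URL of the database to back up')
    parser.add_argument('--driver', required=True,
                        help='storage driver to use (local or s3)')
    parser.add_argument('destination', help='file path or bucket name')
    return parser


args = create_parser().parse_args(
    ['postgres://example/db', '--driver', 'local', '/some/path'])
print(args.driver, args.destination)  # local /some/path
```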