- Application structure
- Third-party packages
This part is about
- how to organize your code into a package structure
- the installation of third-party packages
- preparing to give your own code away to others.
Since the landscape of Python packaging tools is evolving, the main focus of this section is on some general code organization principles that will prove useful no matter what tools you later use to give code away or manage dependencies.
If you are writing a larger program, you don’t really want to organize it as a large collection of standalone files at the top level. Here is how you can organize the files in a hierarchy.
Any Python source file is a module.
# foo.py
def grok(a):
    ...

def spam(b):
    ...

The import statement loads and executes a module.

# program.py
import foo

a = foo.grok(2)
b = foo.spam('Hello')
...
Packages vs Modules
For larger collections of code, it is common to organize modules into a package.
# From this
pcost.py
report.py
fileparse.py

# To this
porty/
    __init__.py
    pcost.py
    report.py
    fileparse.py

- Pick a name and make a top-level directory: porty in the example above (picking this name is the most important first step).
- Add an __init__.py file to the directory. It may be empty.
- Put your source files into the directory.
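The three steps above can be sketched in Python itself. This is only an illustration, not part of the original material: it builds the porty layout in a temporary directory, uses a stand-in body for portfolio_cost (an assumption), and checks that a fresh interpreter started in that directory can import porty.pcost.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp) / "porty"
    pkg.mkdir()                              # 1. top-level directory
    (pkg / "__init__.py").write_text("")     # 2. __init__.py, left empty
    (pkg / "pcost.py").write_text(           # 3. a source file (stand-in body)
        "def portfolio_cost(filename):\n"
        "    return 0.0\n"
    )
    layout = sorted(p.name for p in pkg.iterdir())

    # Start a new interpreter in the directory that contains porty/
    # and import the submodule through the package name.
    proc = subprocess.run(
        [sys.executable, "-c", "import porty.pcost; print(porty.pcost.__name__)"],
        cwd=tmp, capture_output=True, text=True,
    )
    imported_name = proc.stdout.strip()

print(layout)
print(imported_name)
```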
Using a package
A package serves as a namespace for imports.
This means that there are now multilevel imports.
import porty.report
port = porty.report.read_portfolio('port.csv')

# Or
from porty import report
port = report.read_portfolio('portfolio.csv')

from porty.report import read_portfolio
port = read_portfolio('portfolio.csv')
There are two main problems with this approach.
- imports between files in the same package break.
- main scripts placed inside the package break.
Imports between files in the same package must now include the package name in the import.
Remember the structure.
porty/
    __init__.py
    pcost.py
    report.py
    fileparse.py
Modified import example:
# report.py
from porty import fileparse

def read_portfolio(filename):
    return fileparse.parse_csv(...)

Imports are absolute, not relative: a bare import is looked up on sys.path, not in the directory of the importing file.

# report.py
import fileparse    # BREAKS. fileparse not found
Use relative imports inside a package
Instead of directly using the package name, you can use . to refer to the current package.

# report.py
from . import fileparse

def read_portfolio(filename):
    return fileparse.parse_csv(...)

Writing from . import modname makes it easy to rename the package, because the package name never appears in its own source files.
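As a rough check of the renaming claim, the sketch below writes the same two source files into two differently named packages and imports each one in a fresh interpreter. The second package name porty2 and the body of parse_csv are assumptions made for the demonstration.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

REPORT_SRC = (
    "from . import fileparse\n"              # relative import: no package name
    "def read_portfolio(filename):\n"
    "    return fileparse.parse_csv(filename)\n"
)
FILEPARSE_SRC = (
    "def parse_csv(filename):\n"
    "    return 'parsed ' + filename\n"      # stand-in for real parsing
)

def try_package(name):
    """Build a package called `name` from the sources above and use it."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / name
        pkg.mkdir()
        (pkg / "__init__.py").write_text("")
        (pkg / "report.py").write_text(REPORT_SRC)
        (pkg / "fileparse.py").write_text(FILEPARSE_SRC)
        code = f"from {name} import report; print(report.read_portfolio('portfolio.csv'))"
        proc = subprocess.run(
            [sys.executable, "-c", code],
            cwd=tmp, capture_output=True, text=True,
        )
        return proc.stdout.strip()

original = try_package("porty")
renamed = try_package("porty2")   # same files, different package name
print(original, "|", renamed)
```

Neither source file had to change when the directory name changed, which is the practical benefit of the relative form.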
Problem: Main Scripts
Running a package submodule as a main script breaks.
bash $ python porty/pcost.py # BREAKS
Reason: You are running Python on a single file and Python doesn’t see the rest of the package structure correctly (sys.path is wrong).
All imports break.
Solution: run your program in a different way, using the -m option.
bash $ python -m porty.pcost # WORKS
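The difference between the two invocations can be reproduced with a throwaway package. The sketch below is an illustration under assumptions: the file contents are stand-ins, and pcost.py uses a relative import, so the failure shows up as an ImportError about a missing parent package.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp) / "porty"
    pkg.mkdir()
    (pkg / "__init__.py").write_text("")
    (pkg / "fileparse.py").write_text(
        "def parse_csv(filename):\n"
        "    return 'parsed ' + filename\n"
    )
    (pkg / "pcost.py").write_text(
        "from . import fileparse\n"
        "if __name__ == '__main__':\n"
        "    print(fileparse.parse_csv('portfolio.csv'))\n"
    )

    # Equivalent of: python porty/pcost.py
    as_file = subprocess.run(
        [sys.executable, "porty/pcost.py"],
        cwd=tmp, capture_output=True, text=True,
    )
    # Equivalent of: python -m porty.pcost
    as_module = subprocess.run(
        [sys.executable, "-m", "porty.pcost"],
        cwd=tmp, capture_output=True, text=True,
    )

print("as file  :", as_file.returncode)
print("as module:", as_module.returncode, as_module.stdout.strip())
```

Run as a plain file, the module has no parent package, so the relative import fails. Run with -m, it is loaded as porty.pcost and the import resolves.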
The primary purpose of __init__.py files is to stitch modules together.
Example: consolidating functions
# porty/__init__.py
from .pcost import portfolio_cost
from .report import portfolio_report
This makes names appear at the top-level when importing.
from porty import portfolio_cost
portfolio_cost('portfolio.csv')
instead of using the multilevel imports.
from porty import pcost
pcost.portfolio_cost('portfolio.csv')
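To see the consolidation at work, the following sketch builds a small porty package whose __init__.py re-exports two functions, then imports them from the top level in a fresh interpreter. The function bodies are placeholders, not the real implementations.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp) / "porty"
    pkg.mkdir()
    (pkg / "pcost.py").write_text(
        "def portfolio_cost(filename):\n"
        "    return 0.0\n"                         # placeholder result
    )
    (pkg / "report.py").write_text(
        "def portfolio_report(filename):\n"
        "    return 'report for ' + filename\n"    # placeholder result
    )
    # __init__.py pulls selected names up to the package level.
    (pkg / "__init__.py").write_text(
        "from .pcost import portfolio_cost\n"
        "from .report import portfolio_report\n"
    )

    code = (
        "from porty import portfolio_cost, portfolio_report\n"
        "print(portfolio_cost('portfolio.csv'))\n"
        "print(portfolio_report('portfolio.csv'))\n"
    )
    proc = subprocess.run(
        [sys.executable, "-c", code],
        cwd=tmp, capture_output=True, text=True,
    )
    top_level_output = proc.stdout.splitlines()

print(top_level_output)
```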
Another solution for scripts
As mentioned, you now need to use -m package.module to run scripts within your package.
bash % python3 -m porty.pcost portfolio.csv
There is another alternative: Write a new top-level script.
#!/usr/bin/env python3
# pcost.py
import porty.pcost
import sys
porty.pcost.main(sys.argv)
This script lives outside the package. For example, looking at the directory structure:
pcost.py          # top-level script
porty/            # package directory
    __init__.py
    pcost.py
    ...

Code organization and file structure are key to the maintainability of an application.
There is no “one-size-fits-all” approach for Python.
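The top-level script relies on porty/pcost.py exposing a main function that accepts an argument list. The original does not show that function, so the following is only one plausible shape for it, with a stand-in portfolio_cost.

```python
def portfolio_cost(filename):
    # Stand-in: a real version would read the file and total the cost.
    return 0.0

def main(argv):
    """Entry point taking an argument list shaped like sys.argv."""
    if len(argv) != 2:
        raise SystemExit(f"Usage: {argv[0]} portfile")
    cost = portfolio_cost(argv[1])
    print("Total cost:", cost)
    return cost

# What the top-level script effectively does, with a fixed list
# standing in for sys.argv:
total = main(["pcost.py", "portfolio.csv"])
```

Taking argv as a parameter, rather than reading sys.argv inside the function, lets the same main be called from the top-level script, from -m, or from a test.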
However, one structure that works for a lot of problems is something like this.
porty-app/
    README.txt
    script.py         # SCRIPT
    porty/            # LIBRARY CODE
        __init__.py
        pcost.py
        report.py
        fileparse.py

porty-app is a container for everything else: documentation, top-level scripts, examples, etc.
Again, top-level scripts (if any) need to exist outside the code package, one level up.
Python has a large library of built-in modules.
There are even more third-party modules. Look for them in the Python Package Index (PyPI), or just do a Google search for a specific topic.
How to handle third-party dependencies is an ever-evolving topic with Python. This section merely covers the basics to help you wrap your brain around how it works.
The Module Search Path
sys.path is a list of all directories checked by the import statement. Look at it:

>>> import sys
>>> sys.path
... look at the result ...
If you import something and it’s not located in one of those directories, you will get an ImportError exception.
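The effect of sys.path can be observed directly. In this sketch a module with the hypothetical name grok_demo is written to a temporary directory; importing it fails until that directory is added to sys.path.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "grok_demo.py").write_text(
        "def grok(a):\n"
        "    return a * 2\n"
    )

    # The directory is not on sys.path yet, so the import fails.
    try:
        importlib.import_module("grok_demo")
        found_before = True
    except ImportError:   # raised as its subclass ModuleNotFoundError
        found_before = False

    # After adding the directory, the same import succeeds.
    sys.path.append(tmp)
    importlib.invalidate_caches()
    try:
        grok_demo = importlib.import_module("grok_demo")
        value = grok_demo.grok(2)
    finally:
        sys.path.remove(tmp)

print(found_before, value)
```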
Standard Library Modules
Modules from Python’s standard library usually come from a location such as /usr/local/lib/python3.6. You can find out for certain by trying a short test:

>>> import re
>>> re
<module 're' from '/usr/local/lib/python3.6/re.py'>
Simply looking at a module in the REPL is a good debugging tip to know about. It will show you the location of the file.
Third-party modules are usually located in a dedicated site-packages directory. You’ll see it if you perform the same steps as above:

>>> import numpy
>>> numpy
<module 'numpy' from '/usr/local/lib/python3.6/site-packages/numpy/__init__.py'>

Again, looking at a module is a good debugging tip if you’re trying to figure out why something related to import isn’t working as expected.
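The same check can be done in a script. The sketch below uses the module’s __file__ attribute and importlib.util.find_spec from the standard library. The helper module_location is a name made up for this example.

```python
import importlib.util
import re

def module_location(name):
    """Return where a top-level module would be loaded from, or None if not found."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None

print(re.__file__)                            # location of an imported module
print(module_location("re"))                  # same question, asked by name
print(module_location("no_such_module_xyz"))  # not found, so None
```

The paths printed will differ from machine to machine. What matters is whether they point at the installation you expected.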
The most common technique for installing a third-party module is to use pip. For example:
bash % python3 -m pip install packagename
This command will download the package and install it in the site-packages directory. However, this approach can run into problems:
- You may be using an installation of Python that you don’t directly control.
  - A corporate-approved installation.
  - The Python version that comes with the OS.
- You might not have permission to install global packages on the computer.
- There might be other dependencies.
Use a virtual environment!
A common solution to package installation issues is to create a so-called “virtual environment” for yourself. Naturally, there is no “one way” to do this; in fact, there are several competing tools and techniques. However, if you are using a standard Python installation, you can try typing this:
bash % python -m venv mypython
After a few moments of waiting, you will have a new directory mypython that’s your own little Python install. Within that directory you’ll find a bin/ directory (Unix) or a Scripts/ directory (Windows). If you run the activate script found there, it will “activate” this version of Python, making it the default python command for the shell. For example:

bash % source mypython/bin/activate
(mypython) bash %
From here, you can now start installing Python packages for yourself. For example:
(mypython) bash % python -m pip install pandas
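If you are unsure whether the environment is active or whether a package actually landed in it, a short check from Python can help. This is a sketch using only the standard library (importlib.metadata needs Python 3.8 or later). The two helper names are made up for this example.

```python
import sys
from importlib import metadata

def in_virtual_environment():
    """True when the running interpreter belongs to a venv-style environment."""
    return sys.prefix != sys.base_prefix

def installed_version(package):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

print("virtual environment active:", in_virtual_environment())
print("pandas version:", installed_version("pandas"))
```

A result of None for pandas suggests that the install went to a different Python than the one you are running.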
For the purposes of experimenting and trying out different packages, a virtual environment will usually work fine. If, on the other hand, you’re creating an application and it has specific package dependencies, that is a slightly different problem.
Handling Third-Party Dependencies in Your Application
If you have written an application and it has specific third-party dependencies, one challenge concerns the creation and preservation of the environment that includes your code and the dependencies.
The current (2020) recommendation is to use Poetry.
Refer to the Python Packaging User Guide for the most up-to-date guide.
At some point you might want to give your code to someone else, possibly just a co-worker. This section gives the most basic technique of doing that. For more detailed information, consult the Python Packaging User Guide.
Add a setup.py file to the top level of your project directory.

# setup.py
import setuptools

setuptools.setup(
    name="porty",
    version="0.0.1",
    author="Your Name",
    author_email="email@example.com",
    description="Practical Python Code",
    packages=setuptools.find_packages(),
)

If there are additional files associated with your project, specify them with a MANIFEST.in file. For example:

# MANIFEST.in
include *.csv

Put the MANIFEST.in file in the same directory as setup.py.
Creating a source distribution
To create a distribution of your code, use the setup.py file. For example:
bash % python setup.py sdist
This will create a .tar.gz or .zip file in the directory dist/. That file is something that you can now give away to others.
Installing your code
Others can install your Python code using pip in the same way that they do for other packages. They simply need to supply the file created in the previous step. For example:
bash % python -m pip install porty-0.0.1.tar.gz
The steps above describe the absolute most minimal basics of creating a package of Python code that you can give to another person. In reality, it can be much more complicated depending on third-party dependencies, whether or not your application includes foreign code (i.e., C/C++), and so forth. We’ve only taken a tiny first step.
Refer to the official guide to see how to upload your package to PyPI.
For a deeper discussion of how to choose virtual environment and application dependency management tools, see another post dedicated to this topic.