In this post you will learn how to import code from repositories. We will review the purpose of an init file and tricks to standardize imports across every notebook.
Create a Repository
To learn how to write import statements, let’s start by creating an example repository of modules with functions that you want to use across many projects.
In your home folder create the following directory structure:
testing
└── myrepo
    └── modules
        └── background_info.py
background_info.py is your first module. Within that file add the function myname.
def myname(name):
    print(f"My name is {name}")
If you want to run that function in Python you have to import it.
From your testing directory (the directory that contains myrepo), start python and run:
import myrepo.modules.background_info as info
info.myname('Stephanie')
- import statement: The structure of the import is the same as writing a path to the module. Since I am running python where the myrepo folder exists, the path to the file background_info.py is myrepo/modules/background_info.py. When I write an import statement, I can take the path, replace each / with a . and remove the .py extension.
- as: Including as after an import statement allows me to reference the module with a shorter name, or alias. Without the as in this example, my next line of code would have to be myrepo.modules.background_info.myname('Stephanie')
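The path-to-import rule described above can be sketched as a small helper. The function name path_to_module_name is hypothetical and only illustrates the rule; Python itself does not expose a function with this name.

```python
from pathlib import PurePosixPath


def path_to_module_name(relative_path: str) -> str:
    """Convert a path such as 'pkg/sub/mod.py' into the dotted name 'pkg.sub.mod'."""
    path = PurePosixPath(relative_path)
    if path.suffix != ".py":
        raise ValueError(f"expected a .py file, got {relative_path!r}")
    # Drop the .py extension, then join the remaining path parts with dots.
    return ".".join(path.with_suffix("").parts)


print(path_to_module_name("myrepo/modules/background_info.py"))
# myrepo.modules.background_info
```

The path has to be written relative to the directory where python is running (here, testing), which is why the dotted name starts with myrepo.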
Init files
Now let’s say you want to add another module to your repo called simple_math.py so your directory structure is now:
testing
└── myrepo
    └── modules
        ├── background_info.py
        └── simple_math.py
In the simple_math.py file you include the function add_one:
def add_one(number):
    return number + 1
Let’s say you want to import all of the modules in myrepo with one import statement, so you try this:
import myrepo.modules as myrepo
myrepo.background_info.myname('Stephanie')
AttributeError: module 'myrepo.modules' has no attribute 'background_info'
Unfortunately you get an AttributeError because importing a directory does not automatically import the modules inside it. An __init__.py file fixes this: it can include import statements that will be run each time the directory is imported. For example, create an __init__.py file here:
testing
└── myrepo
    └── modules
        ├── __init__.py
        ├── background_info.py
        └── simple_math.py
In the __init__.py file include these import statements:
from myrepo.modules import background_info
from myrepo.modules import simple_math
Now if you re-run this import statement:
import myrepo.modules as myrepo
myrepo.background_info.myname('Stephanie')
myrepo.simple_math.add_one(2)
You will get My name is Stephanie and 3. To reiterate, if you did not have the init file you would have to run:
import myrepo.modules as myrepo
from myrepo.modules import background_info
from myrepo.modules import simple_math
myrepo.background_info.myname('Stephanie')
myrepo.simple_math.add_one(2)
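To see the effect of the init file in isolation, here is a minimal sketch that builds a throwaway copy of the layout in a temporary directory and imports it before and after adding __init__.py. The package name demo_repo and the helper import_fresh are assumptions made for this illustration, chosen so the sketch does not collide with your real myrepo.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def import_fresh(name):
    """Import `name` after discarding any cached copies of the demo package."""
    for cached in [m for m in sys.modules if m.split(".")[0] == "demo_repo"]:
        del sys.modules[cached]
    importlib.invalidate_caches()
    return importlib.import_module(name)


# Build demo_repo/modules/simple_math.py in a temporary directory.
root = Path(tempfile.mkdtemp())
modules_dir = root / "demo_repo" / "modules"
modules_dir.mkdir(parents=True)
(modules_dir / "simple_math.py").write_text(
    "def add_one(number):\n    return number + 1\n"
)
sys.path.insert(0, str(root))

# Without __init__.py the directory imports, but simple_math is not loaded.
pkg = import_fresh("demo_repo.modules")
has_submodule_before = hasattr(pkg, "simple_math")

# With __init__.py the import statement inside it runs on every import.
(modules_dir / "__init__.py").write_text(
    "from demo_repo.modules import simple_math\n"
)
pkg = import_fresh("demo_repo.modules")
has_submodule_after = hasattr(pkg, "simple_math")
result = pkg.simple_math.add_one(2)

print(has_submodule_before, has_submodule_after, result)
# False True 3
```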
Running your repo code at any destination
Up until now you have run python from the testing directory where your myrepo folder exists. If you try to run python in a different folder you will get the error: ModuleNotFoundError: No module named 'myrepo'. If you want to access your code from anywhere you need to add the location of the testing directory (where your myrepo lives) to your python path.
To include the testing directory in your python path, find the place where you define your python path and add it. In my .bash_profile (located in my home directory) I will add:
export PYTHONPATH="$PYTHONPATH:/Users/stephanie/testing"
After you open a new terminal (or run source ~/.bash_profile), you should be able to run the myrepo code from anywhere on your computer:
import myrepo.modules as myrepo
myrepo.background_info.myname('Stephanie')
myrepo.simple_math.add_one(2)
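If you would rather not edit your shell profile, a directory can also be added to the search path for a single Python session through sys.path. The sketch below is one way to do that; the helper name add_to_python_path is hypothetical, and the change lasts only until the interpreter exits.

```python
import sys
from pathlib import Path


def add_to_python_path(directory):
    """Append `directory` to sys.path for this session; return True if it was added."""
    resolved = str(Path(directory).expanduser().resolve())
    if resolved in sys.path:
        # Already searchable, so avoid adding a duplicate entry.
        return False
    sys.path.append(resolved)
    return True
```

Calling add_to_python_path("~/testing") before import myrepo.modules as myrepo would have the same effect as the PYTHONPATH entry, but only for the current session or notebook.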
Tips and Tricks
All packages are modules, but not all modules are packages (type(myrepo) reports module either way). Any module that has a __path__ attribute is a package. In the example above, run myrepo.__path__ and it will print out the full path to the code.
To examine the contents of your python path you can run import sys and then sys.path.
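Both tips can be tried with the standard library alone. In this short sketch, is_package is a hypothetical helper name; json and os are used as examples because, in CPython, json ships as a directory with an __init__.py while os is a single file.

```python
import json
import os
import sys


def is_package(module):
    """A module counts as a package when it has a __path__ attribute."""
    return hasattr(module, "__path__")


print(is_package(json))  # True
print(is_package(os))    # False

# Every directory Python searches when resolving an import, in order.
for entry in sys.path:
    print(entry)
```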
If you run the same import statements for every project, you can create a .txt file with all the imports and load the file at the top of every notebook. For example if you always run:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import myrepo.modules as myrepo
You can put all of these imports in an import_setup.txt file. At the top of your notebook you can run %load import_setup.txt. The first time you run the cell it will load the contents. In order to run the import statements you must run the cell a second time.
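If running the cell twice is inconvenient, one possible alternative is to execute the file's contents directly with exec. This is a different technique from %load, shown here only as a sketch: the file is written to a temporary directory, and it contains standard-library imports (an assumption for this example) so that it runs without pandas, matplotlib, or seaborn installed.

```python
import tempfile
from pathlib import Path

# Hypothetical stand-in for import_setup.txt.
setup_file = Path(tempfile.mkdtemp()) / "import_setup.txt"
setup_file.write_text("import json as js\nimport math as m\n")

# Run the import statements in the current global namespace in one step.
exec(setup_file.read_text(), globals())

print(js.dumps({"sqrt": m.sqrt(16)}))
# {"sqrt": 4.0}
```

Only use exec on files you wrote yourself, since it runs whatever the file contains.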


