How to create a test database: Pytest + pytest-postgresql + SQLAlchemy
We need to interactively test our code in the development cycle and don’t want to accidentally wipe out a production database. Here a test database comes to the rescue. This article walks you through the basics step by step.
A few words about SQLAlchemy
If you use python and need to insert data into databases, this is a great library to make your life easy. For example, when you write data to different tables with foreign key relationships, you need to make sure the sequence of SQL statements is right. Using SQLAlchemy, the library will handle it for you.
Two approaches
There are two ways of creating a test database. You can either build it in:
- A docker image.
- A local virtual environment.
For the first option, here is a great example. This post will focus on the second approach.
Steps by Steps (Examples based on Mac)
- Dependency requirements. First, make sure you install
PostgreSQL
usingbrew install postgresql
. Thepytest-postgresql
library requires PostgreSQL versions >= 10. - Now install all the other dependencies. Here I am using poetry to manage the virtual environment. If you deploy the code to production, you need to build
psycopg2
from source. The binary file is only for test and development as recommend by the library.
[tool.poetry.dependencies]
python = "^3.8"
pytest = "^7.1.3"
pytest-postgresql = "^4.1.1"
psycopg = "^3.1.3"
SQLAlchemy = "^1.4.42"
black = "^22.10.0"
isort = "^5.10.1"
psycopg2-binary = "^2.9.3"
3. Once you set up the virtual environment, now we can create some code. Here is the code structure.
my_postgresql/
pyproject.toml
poetry.lock
my_module/
__init__.py
sql_schema.py
tests/
__init__.py
test_sql_schema.py
4. Let’s define the schema in the sql_schema.py
module. Here we have a simple person class that represents a table in the database.
from sqlalchemy import INTEGER, Column
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.types import VARCHAR
Base = declarative_base()
class Person(Base):
__tablename__ = "person"
ID = Column("id", INTEGER, primary_key=True)
Name = Column("name", VARCHAR(20), nullable=False)
def __repr__(self):
return "<Person(Name='%s')>" % (self.Name)
5. Next, we can create a test file named test_sql_schema.py
. Since Halloween is around the corner, let’s generated some test data using the three witches’ names from HocusPocus.
import logging
import pytest
from pytest_postgresql import factories
from pytest_postgresql.janitor import DatabaseJanitor
from sqlalchemy import create_engine, select
from sqlalchemy.orm.session import sessionmaker
from my_module import sql_schema
logging.basicConfig()
logger = logging.getLogger()
logger.setLevel(logging.DEBUG)
test_db = factories.postgresql_proc(port=None, dbname="test_db")
@pytest.fixture(scope="session")
def db_session(test_db):
"""Session for SQLAlchemy."""
pg_host = test_db.host
pg_port = test_db.port
pg_user = test_db.user
pg_password = test_db.password
pg_db = test_db.dbname
with DatabaseJanitor(
pg_user, pg_host, pg_port, pg_db, test_db.version, pg_password
):
connection_str = f"postgresql+psycopg2://{pg_user}:@{pg_host}:{pg_port}/{pg_db}"
engine = create_engine(connection_str)
with engine.connect() as con:
sql_schema.Base.metadata.create_all(con)
logger.info("yielding a sessionmaker against the test postgres db.")
yield sessionmaker(bind=engine, expire_on_commit=False)
@pytest.fixture(scope="module")
def create_test_data():
"""Let's create the test data with the three witches names."""
names = ["Winifred", "Sarah", "Mary"]
test_objs = []
for idx, name in zip(range(3), names):
test_objs.append(sql_schema.Person(ID=idx, Name=name))
return test_objs
def test_persons(db_session, create_test_data):
s = db_session()
for obj in create_test_data:
s.add(obj)
s.commit()
logger.info("Added test data to the database.")
query_result = s.execute(select(sql_schema.Person)).all()
s.close()
assert create_test_data[0].Name in str(query_result)
What just happened in this module? The function db_session()
creates a temporary Postgres database during the pytest run time. You need to use DatabaseJanitor in order to work with SQLAlchemy. The function create_test_data
generates the python Person objects, and the last function test_persons
inserts the objects into the test database and queries the data back.
5. Now with everything in place, you can run pytest
in your terminal and see all the test cases pass.
A few departing thoughts
- In production, you should use
scoped_session
from sqlalchemy.orm to control thread safety. Here I am using sessionmaker directly for simple test purposes. - If
sessionmaker
is instantiated in other modules, use the pytestmonkeypatch
to overwrite it for testing. Be careful to align both themonkeypatch
andpostgresql_proc
pytest fixture scope to thesession
level. Otherwise, it won’t work. - Testing is important and worthy of spending efforts.