Photo by Alexander Sinn on Unsplash

How to create a test database: Pytest + pytest-postgresql + SQLAlchemy

Mingzhi Yu

--

We need to interactively test our code in the development cycle and don’t want to accidentally wipe out a production database. Here a test database comes to the rescue. This article walks you through the basics step by step.

A few words about SQLAlchemy

If you use python and need to insert data into databases, this is a great library to make your life easy. For example, when you write data to different tables with foreign key relationships, you need to make sure the sequence of SQL statements is right. Using SQLAlchemy, the library will handle it for you.

Two approaches

There are two ways of creating a test database. You can either build it in:

  1. A docker image.
  2. A local virtual environment.

For the first option, here is a great example. This post will focus on the second approach.

Steps by Steps (Examples based on Mac)

  1. Dependency requirements. First, make sure you install PostgreSQL using brew install postgresql. The pytest-postgresql library requires PostgreSQL versions >= 10.
  2. Now install all the other dependencies. Here I am using poetry to manage the virtual environment. If you deploy the code to production, you need to build psycopg2 from source. The binary file is only for test and development as recommend by the library.
[tool.poetry.dependencies]
python = "^3.8"
pytest = "^7.1.3"
pytest-postgresql = "^4.1.1"
psycopg = "^3.1.3"
SQLAlchemy = "^1.4.42"
black = "^22.10.0"
isort = "^5.10.1"
psycopg2-binary = "^2.9.3"

3. Once you set up the virtual environment, now we can create some code. Here is the code structure.

my_postgresql/
pyproject.toml
poetry.lock
my_module/
__init__.py
sql_schema.py
tests/
__init__.py
test_sql_schema.py

4. Let’s define the schema in the sql_schema.py module. Here we have a simple person class that represents a table in the database.

from sqlalchemy import INTEGER, Column
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.types import VARCHAR

Base = declarative_base()


class Person(Base):
__tablename__ = "person"

ID = Column("id", INTEGER, primary_key=True)
Name = Column("name", VARCHAR(20), nullable=False)

def __repr__(self):
return "<Person(Name='%s')>" % (self.Name)

5. Next, we can create a test file named test_sql_schema.py. Since Halloween is around the corner, let’s generated some test data using the three witches’ names from HocusPocus.

import logging

import pytest
from pytest_postgresql import factories
from pytest_postgresql.janitor import DatabaseJanitor
from sqlalchemy import create_engine, select
from sqlalchemy.orm.session import sessionmaker

from my_module import sql_schema

logging.basicConfig()
logger = logging.getLogger()
logger.setLevel(logging.DEBUG)

test_db = factories.postgresql_proc(port=None, dbname="test_db")


@pytest.fixture(scope="session")
def db_session(test_db):
"""Session for SQLAlchemy."""
pg_host = test_db.host
pg_port = test_db.port
pg_user = test_db.user
pg_password = test_db.password
pg_db = test_db.dbname

with DatabaseJanitor(
pg_user, pg_host, pg_port, pg_db, test_db.version, pg_password
):
connection_str = f"postgresql+psycopg2://{pg_user}:@{pg_host}:{pg_port}/{pg_db}"
engine = create_engine(connection_str)
with engine.connect() as con:
sql_schema.Base.metadata.create_all(con)
logger.info("yielding a sessionmaker against the test postgres db.")

yield sessionmaker(bind=engine, expire_on_commit=False)


@pytest.fixture(scope="module")
def create_test_data():
"""Let's create the test data with the three witches names."""
names = ["Winifred", "Sarah", "Mary"]
test_objs = []
for idx, name in zip(range(3), names):
test_objs.append(sql_schema.Person(ID=idx, Name=name))

return test_objs


def test_persons(db_session, create_test_data):
s = db_session()
for obj in create_test_data:
s.add(obj)
s.commit()
logger.info("Added test data to the database.")

query_result = s.execute(select(sql_schema.Person)).all()
s.close()

assert create_test_data[0].Name in str(query_result)

What just happened in this module? The function db_session() creates a temporary Postgres database during the pytest run time. You need to use DatabaseJanitor in order to work with SQLAlchemy. The function create_test_data generates the python Person objects, and the last function test_persons inserts the objects into the test database and queries the data back.

5. Now with everything in place, you can run pytest in your terminal and see all the test cases pass.

A few departing thoughts

  1. In production, you should use scoped_session from sqlalchemy.orm to control thread safety. Here I am using sessionmaker directly for simple test purposes.
  2. If sessionmaker is instantiated in other modules, use the pytest monkeypatch to overwrite it for testing. Be careful to align both the monkeypatch and postgresql_proc pytest fixture scope to the session level. Otherwise, it won’t work.
  3. Testing is important and worthy of spending efforts.

Happy testing!

--

--

Mingzhi Yu
Mingzhi Yu

Responses (2)