Writing Playwright tests for a Datasette Plugin

I really like Playwright for writing automated tests for web applications using a headless browser. It’s pretty easy to install and run, and it works well in GitHub Actions.

Today I integrated Playwright into the tests for one of my Datasette plugins for the first time. I based my work on Alex Garcia’s tests for datasette-comments.

I added Playwright to my datasette-search-all plugin as part of issue #19. Here’s what I did.

Playwright as a test dependency

I ended up needing two new test dependencies to get Playwright running: pytest-playwright and nest-asyncio (for reasons explained later).

I added those to my setup.py file like this:

    extras_require={
        "test": ["pytest", "pytest-asyncio", "sqlite-utils", "nest-asyncio"],
        "playwright": ["pytest-playwright"]
    },

I decided to put pytest-playwright in its own playwright group. The extra browser dependency is large, so I wanted the Playwright tests to be opt-in rather than run by default.

If I were using pyproject.toml for this project I would add this instead:

[project.optional-dependencies]
test = ["pytest", "pytest-asyncio", "sqlite-utils", "nest-asyncio"]
playwright = ["pytest-playwright"]

With either of these patterns in place, the new dependencies can be installed like this:

pip install -e '.[test,playwright]'

Running a localhost server for the tests

I decided to use a pytest fixture to start a localhost server that keeps running for the duration of the test session. The simplest version of that, using the wait_until_responds() helper from Alex’s datasette-comments, looks like this:

import pytest
import sqlite3
from subprocess import Popen, PIPE
import sys
import time
import httpx

@pytest.fixture(scope="session")
def ds_server(tmp_path_factory):
    tmpdir = tmp_path_factory.mktemp("tmp")
    db_path = str(tmpdir / "data.db")
    db = sqlite3.connect(db_path)
    db.execute("""
        create table foo (
            id integer primary key,
            bar text
        )
    """)
    process = Popen(
        [
            sys.executable,
            "-m",
            "datasette",
            "--port",
            "8126",
            str(db_path),
        ],
        stdout=PIPE,
    )
    wait_until_responds(
        "http://localhost:8126/"
    )
    yield "http://localhost:8126"
    process.terminate()
    process.wait()


def wait_until_responds(url, timeout=5.0):
    start = time.time()
    while time.time() - start < timeout:
        try:
            httpx.get(url)
            return
        except httpx.ConnectError:
            time.sleep(0.1)
    raise AssertionError("Timed out waiting for {} to respond".format(url))

The ds_server fixture creates a SQLite database in a temporary directory, runs Datasette against it using subprocess.Popen() and then waits for the server to respond to a request. Then it yields the URL to that server - that yielded value will become available to any test that uses that fixture.
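The wait-for-the-server step is the part most likely to go wrong, so here is a minimal, self-contained sketch of the same polling idea. It makes a few assumptions that are not part of the fixture above: it uses only the standard library (urllib instead of httpx), and OkHandler is a hypothetical stand-in for Datasette that answers every request with a 200.

```python
import threading
import time
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class OkHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for Datasette: answers every GET with 200."""

    def do_GET(self):
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"ok")

    def log_message(self, format, *args):
        pass  # keep the output quiet


# An opener with no proxies, so local requests never go via HTTP_PROXY.
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def wait_until_responds(url, timeout=5.0, interval=0.1):
    """Poll url until it answers, or raise once timeout seconds have passed."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with opener.open(url, timeout=1.0):
                return
        except urllib.error.HTTPError:
            return  # the server is up, it just returned an error status
        except OSError:
            # Covers refused connections and socket timeouts: try again.
            time.sleep(interval)
    raise AssertionError("Timed out waiting for {} to respond".format(url))


# Port 0 asks the operating system for any free port.
server = HTTPServer(("127.0.0.1", 0), OkHandler)
url = "http://127.0.0.1:{}/".format(server.server_port)
threading.Thread(target=server.serve_forever, daemon=True).start()

wait_until_responds(url)  # returns as soon as the server answers
server_came_up = True

# Stop the server, then show the failure path with a short timeout.
server.shutdown()
server.server_close()
try:
    wait_until_responds(url, timeout=0.3)
    timed_out = False
except AssertionError:
    timed_out = True
```

Binding to port 0 is one way to avoid clashing with something else that is already using a fixed port such as 8126, at the cost of having to pass the chosen port on to whatever starts the real server.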

Note that ds_server is marked as @pytest.fixture(scope="session"). This means that the fixture will be executed just once per test session and reused by each test. Without scope="session" the server would be started and then terminated once per test, which is a lot slower.

See Session-scoped temporary directories in pytest for an explanation of the tmp_path_factory fixture.

Here’s what a basic test then looks like (in tests/test_playwright.py):

try:
    from playwright import sync_api
except ImportError:
    sync_api = None
import pytest

@pytest.mark.skipif(sync_api is None, reason="playwright not installed")
def test_homepage(ds_server):
    with sync_api.sync_playwright() as playwright:
        browser = playwright.chromium.launch()
        page = browser.new_page()
        page.goto(ds_server + "/")
        assert page.title() == "Datasette: data"

Within that test, the full Python Playwright API is available for interacting with the server and running assertions. Since it’s running in a real headless Chromium instance all of the JavaScript will be executed as well.

I’m using an except ImportError pattern here so that my tests won’t fail if Playwright has not been installed. The @pytest.mark.skipif decorator causes the test to be marked as skipped if the module could not be imported.
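The same optional-import idea can be wrapped in a small helper. This is only a sketch: optional_import() is a hypothetical name, not something Playwright or pytest provides, and the example imports json purely because it is always available.

```python
import importlib


def optional_import(name):
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# A standard library module is always importable.
json_module = optional_import("json")

# A made-up package name, standing in for a dependency that is missing.
missing = optional_import("not_a_real_package_xyz")
```

In a test module the result would feed a skipif condition, for example sync_api = optional_import("playwright.sync_api"). pytest also ships pytest.importorskip(), which may be worth a look as an alternative.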

Running the tests

With this module in place, running the tests is like any other pytest invocation:

pytest

Or run them specifically like this:

pytest tests/test_playwright.py
# or
pytest -k test_homepage

Refactoring for cleaner code

After some experimentation I ended up with this pattern instead:

try:
    from playwright import sync_api
except ImportError:
    sync_api = None
import pytest
import nest_asyncio

nest_asyncio.apply()

pytestmark = pytest.mark.skipif(sync_api is None, reason="playwright not installed")


def test_ds_server(ds_server, page):
    page.goto(ds_server + "/")
    assert page.title() == "Datasette: data"
    # It should have a search form
    assert page.query_selector('form[action="/-/search"]')

def test_search(ds_server, page):
    page.goto(ds_server + "/-/search?q=cleo")
    # Should show search results, after fetching them
    assert page.locator("table tr th:nth-child(1)").inner_text() == "rowid"
    # ... assertions continue

There are two new tricks in here:

I’m using the pytestmark = pytest.mark.skipif() pattern to apply that skipif decorator to every test in this file, without needing to repeat it.
I’m using the page fixture provided by pytest-playwright. This gives me a new page object for each test, without needing to repeat the with sync_api.sync_playwright() as playwright boilerplate every time.

One catch: when I first started using the page fixture I got this error:

This event loop is already running

After some digging around I found a solution in this issue, which was to call nest_asyncio.apply() at the start of the module.
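For context on that message: asyncio raises it when code tries to run an event loop that is already running. The sketch below reproduces it with the standard library alone. It illustrates the general mechanism and is not a confirmed diagnosis of what happens inside pytest-playwright; inner() and outer() are hypothetical names.

```python
import asyncio


async def inner():
    return 42


async def outer():
    # We are already inside a running event loop here.
    loop = asyncio.get_running_loop()
    coro = inner()
    try:
        # Asking the running loop to run again is not allowed.
        loop.run_until_complete(coro)
    except RuntimeError as exc:
        coro.close()  # avoid a "coroutine was never awaited" warning
        return str(exc)
    return "no error"


message = asyncio.run(outer())
print(message)
```

nest_asyncio patches asyncio to permit this kind of nested use, which is presumably why calling nest_asyncio.apply() makes the error go away.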

Running this in GitHub Actions

I updated my .github/workflows/test.yml workflow to look like this:

name: Test

on: [push, pull_request]

permissions:
  contents: read

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.8", "3.9", "3.10", "3.11", "3.12"]
    steps:
    - uses: actions/checkout@v3
    - name: Set up Python ${{ matrix.python-version }}
      uses: actions/setup-python@v4
      with:
        python-version: ${{ matrix.python-version }}
        cache: pip
        cache-dependency-path: setup.py
    - name: Cache Playwright browsers
      uses: actions/cache@v3
      with:
        path: ~/.cache/ms-playwright/
        key: ${{ runner.os }}-browsers
    - name: Install dependencies
      run: |
        pip install '.[test,playwright]'
        playwright install
    - name: Run tests
      run: |
        pytest

This workflow configures caching for Playwright browsers, to ensure that playwright install only downloads the browser binaries the first time the workflow is executed.