Check spelling using codespell
Today I discovered codespell via this Rich commit. codespell is a really simple spell checker that can be run locally or incorporated into a CI flow.
codespell is designed to run against source code. Rather than a dictionary of correctly spelled words, it uses a dictionary of known common spelling mistakes, derived from English Wikipedia (defined here). This makes it less likely to be confused by variable or function names, while still catching spelling mistakes in comments.
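To make that idea concrete, here is a minimal sketch of the misspelling-dictionary approach. It is not codespell's actual implementation: the `MISSPELLINGS` table is a tiny hand-written stand-in for the real dictionary, and `find_misspellings` is a hypothetical helper name.

```python
import re

# Tiny stand-in for a dictionary of known misspellings -> suggested fix.
# Sample entries only; the real dictionary is far larger.
MISSPELLINGS = {
    "perfom": "perform",
    "databse": "database",
    "similiar": "similar",
}

# Treat any run of ASCII letters as a candidate word (an assumption for this sketch).
WORD_RE = re.compile(r"[A-Za-z]+")


def find_misspellings(text):
    """Return (line_number, word, suggestion) for each known misspelling."""
    found = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for word in WORD_RE.findall(line):
            suggestion = MISSPELLINGS.get(word.lower())
            if suggestion is not None:
                found.append((line_number, word, suggestion))
    return found


sample = "conn = db_pool.acquire()  # perfom the databse lookup"
print(find_misspellings(sample))
# [(1, 'perfom', 'perform'), (1, 'databse', 'database')]
```

Identifiers such as `conn` and `db_pool` are not in the list of known mistakes, so they pass without comment. A checker built on a dictionary of correct words would have flagged them.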
Basic usage:
    pip install codespell
    codespell
    # Or point it at a folder, or files in that folder:
    codespell docs/*.rst
This outputs any spelling errors it finds in those files. I got this the first time I ran it against the Datasette documentation:
    docs/authentication.rst:63: perfom ==> perform
    docs/authentication.rst:76: perfom ==> perform
    docs/changelog.rst:429: repsonse ==> response
    docs/changelog.rst:503: permissons ==> permissions
    docs/changelog.rst:717: compatibilty ==> compatibility
    docs/changelog.rst:1172: browseable ==> browsable
    docs/deploying.rst:191: similiar ==> similar
    docs/internals.rst:434: Respons ==> Response, respond
    docs/internals.rst:440: Respons ==> Response, respond
    docs/internals.rst:717: tha ==> than, that, the
    docs/performance.rst:42: databse ==> database
    docs/plugin_hooks.rst:667: utilites ==> utilities
    docs/publish.rst:168: countainer ==> container
    docs/settings.rst:352: inalid ==> invalid
    docs/sql_queries.rst:406: preceeded ==> preceded, proceeded
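Each line of that output follows a `path:line: typo ==> suggestion(s)` shape, which makes it easy to post-process. Here is a minimal sketch of a parser, assuming the format stays as shown in the output above. `parse_codespell_line` is a hypothetical helper, not part of codespell.

```python
import re

# Pattern inferred from the sample output: path:line: typo ==> fix1, fix2
LINE_RE = re.compile(
    r"^(?P<path>.+?):(?P<line>\d+): (?P<typo>\S+) ==> (?P<fixes>.+)$"
)


def parse_codespell_line(raw):
    """Split one output line into (path, line_number, typo, [suggestions]).

    Returns None when the line does not match the assumed format.
    """
    match = LINE_RE.match(raw.strip())
    if match is None:
        return None
    fixes = [fix.strip() for fix in match.group("fixes").split(",")]
    return (
        match.group("path"),
        int(match.group("line")),
        match.group("typo"),
        fixes,
    )


print(parse_codespell_line("docs/internals.rst:717: tha ==> than, that, the"))
# ('docs/internals.rst', 717, 'tha', ['than', 'that', 'the'])
```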
You can create a file of additional words that it should ignore and pass that using the --ignore-words option:
    codespell docs/*.rst --ignore-words docs/codespell-ignore-words.txt
Since I don’t have any real words to ignore yet, I added one fake word, so my file looks like this:
    AddWordsToIgnoreHere
Each ignored word should be on a separate line.
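As a rough illustration of what a one-word-per-line ignore file does, here is a minimal sketch that loads such a file and filters findings against it. This is an assumption-laden model, not codespell's code: `load_ignore_words` and `drop_ignored` are hypothetical helpers, matching here is exact, and codespell's own rules (for example around letter case) may differ. The `browseable` entry is an invented example of a word someone might choose to ignore.

```python
import io


def load_ignore_words(handle):
    """Read one ignored word per line, skipping blank lines."""
    return {line.strip() for line in handle if line.strip()}


def drop_ignored(findings, ignore_words):
    """Keep only (typo, suggestion) pairs whose typo is not ignored."""
    return [(typo, fix) for typo, fix in findings if typo not in ignore_words]


# io.StringIO stands in for an open ignore-words file.
ignore_file = io.StringIO("AddWordsToIgnoreHere\nbrowseable\n")
findings = [("browseable", "browsable"), ("inalid", "invalid")]

print(drop_ignored(findings, load_ignore_words(ignore_file)))
# [('inalid', 'invalid')]
```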
I added it to my GitHub Actions CI like this:
    name: Check spelling in documentation

    on: [push, pull_request]

    jobs:
      spellcheck:
        runs-on: ubuntu-latest
        steps:
        - uses: actions/checkout@v2
        - name: Set up Python ${{ matrix.python-version }}
          uses: actions/setup-python@v2
          with:
            python-version: 3.9
        - uses: actions/cache@v2
          name: Configure pip caching
          with:
            path: ~/.cache/pip
            key: ${{ runner.os }}-pip-spellcheck
            restore-keys: |
              ${{ runner.os }}-pip-spellcheck
        - name: Install dependencies
          run: |
            pip install codespell
        - name: Check spelling
          run: codespell docs/*.rst --ignore-words docs/codespell-ignore-words.txt
Now any push or pull request will have the spell checker applied to it, and will fail if any new incorrectly spelled words are detected.
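The failure works through exit statuses: a CI step fails when its command exits with a non-zero status, which is what codespell does when it reports misspellings. Here is a minimal sketch of the same pass/fail check for a local script. `check_passes` is a hypothetical helper, and the two stand-in commands exist only so the sketch runs without codespell installed.

```python
import subprocess
import sys


def check_passes(command):
    """Run a command and report whether it exited with status 0."""
    completed = subprocess.run(command)
    return completed.returncode == 0


# Stand-ins for a clean run and a run that found problems.
clean = [sys.executable, "-c", "raise SystemExit(0)"]
dirty = [sys.executable, "-c", "raise SystemExit(1)"]

print(check_passes(clean), check_passes(dirty))
# True False
```

With codespell installed, passing something like `["codespell", "docs/"]` as the command should give the same pass/fail signal locally that the workflow relies on.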
Here’s the full PR where I added this to Datasette, and the commit where I added this to sqlite-utils.