Fixing broken text encodings with sqlite-transform and ftfy

I was working with a database table whose values were clearly in the wrong character encoding, such as this one:

Rue Léopold I
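A value like this is typical of UTF-8 bytes that were decoded as Latin-1 somewhere along the way. The sketch below reproduces the damage and reverses it using only the standard library. The `undo_latin1_mojibake` helper is a hypothetical name for illustration and covers just this one mix-up; it is not how ftfy works internally.

```python
# The correctly encoded value we assume the column was meant to hold.
intended = "Rue Léopold I"

# Simulate the damage: encode as UTF-8, then misread the bytes as Latin-1.
broken = intended.encode("utf-8").decode("latin-1")
print(broken)


def undo_latin1_mojibake(text: str) -> str:
    """Reverse a UTF-8-read-as-Latin-1 mix-up; return the text unchanged if it does not apply."""
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # The text was not produced by this particular mistake, so leave it alone.
        return text


repaired = undo_latin1_mojibake(broken)
print(repaired)
```

Text that is already correct fails the round trip and comes back untouched. ftfy detects and repairs a broader range of such mix-ups than this single round trip.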

I used my sqlite-transform tool with the ftfy Python library to fix that by running the following:

sqlite-transform lambda chiens.db espaces-pour-chiens-et-espaces-interdits-aux-chiens namefr \
--code 'ftfy.fix_encoding(value)' \
--import ftfy

The positional arguments are the database file, the table and the column. --code supplies the Python expression to run against each value in that column, and --import names the module that expression needs.
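To make the "run a Python expression against every value in a column" idea concrete, here is a minimal sketch using only the standard library `sqlite3` module. It is not necessarily how sqlite-transform is implemented. The `places` table and the `fix_value` function are hypothetical stand-ins: `fix_value` takes the place of `ftfy.fix_encoding(value)` and only handles UTF-8 that was misread as Latin-1.

```python
import sqlite3


def fix_value(value):
    """Hypothetical stand-in for ftfy.fix_encoding(value)."""
    if not isinstance(value, str):
        return value  # leave NULLs and non-text values untouched
    try:
        return value.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return value  # not this kind of damage, keep as-is


# Build the damaged value programmatically so the example is unambiguous.
damaged = "Rue Léopold I".encode("utf-8").decode("latin-1")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE places (id INTEGER PRIMARY KEY, namefr TEXT)")
conn.executemany(
    "INSERT INTO places (namefr) VALUES (?)",
    [(damaged,), ("Avenue Louise",), (None,)],
)

# Expose the Python function to SQL, then apply it to the whole column.
conn.create_function("fix_value", 1, fix_value)
conn.execute("UPDATE places SET namefr = fix_value(namefr)")
conn.commit()

rows = [row[0] for row in conn.execute("SELECT namefr FROM places ORDER BY id")]
print(rows)
```

Because the update rewrites the column in place, it is worth trying a transformation like this on a copy of the database first.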

Since I had installed sqlite-transform using pipx install sqlite-transform, I first needed to install the ftfy library into the same virtual environment. The recipe for doing that is:

pipx inject sqlite-transform ftfy
https://mranv.pages.dev/posts/fixing-broken-text-encodings-with-sqlite-transform-and-ftfy/
Author
Anubhav Gain
Published at
2024-06-23
License
CC BY-NC-SA 4.0