Seeing which functions are unique to a specific SQLite / Datasette instance
In reading Scraping JSON, HTML, and ZIP Files with Pure SQLite by Alex Garcia I got curious to see a full list of functions he had registered in his sqlite-extension-examples.fly.dev Datasette instance that weren’t available in a regular Datasette.
Here’s how I figured that out.
pragma_function_list()
You can list all of the functions available to SQLite using PRAGMA function_list
.
Datasette doesn’t allow non-SELECT queries, but it does allow-list a set of pragma_x()
functions which you can instead call like this:
select * from pragma_function_list()
Here’s that for latest.datasette.io (truncated):
name | builtin | type | enc | narg | flags |
---|---|---|---|---|---|
pow | 1 | s | utf8 | 2 | 2099200 |
group_concat | 1 | w | utf8 | 1 | 2097152 |
group_concat | 1 | w | utf8 | 2 | 2097152 |
json_type | 1 | s | utf8 | 1 | 2048 |
json_type | 1 | s | utf8 | 2 | 2048 |
julianday | 1 | s | utf8 | -1 | 2099200 |
I decided to create a comma-separated list of quoted names. I used this query:
select "'" || group_concat(name, "', '") || "'" from pragma_function_list()
Run against latest.datasette.io this returned the following:
'pow', 'group_concat', 'group_concat', 'json_type', 'json_type', 'julianday', 'ntile', 'nullif', 'sqlite_compileoption_get', 'json_valid', 'json_quote', 'json_patch', '->', 'json_array', 'current_timestamp', 'power', 'sqlite_compileoption_used', 'json_remove', 'json_object', 'json_insert', '->>', 'sin', 'sum', 'quote', 'printf', 'likelihood', 'json_replace', 'json_extract', 'last_value', 'rank', 'sign', 'sqrt', 'sinh', 'tan', 'round', 'round', 'rtrim', 'rtrim', 'nth_value', 'tanh', 'random', 'trim', 'trim', 'time', 'radians', 'trunc', 'total', 'substr', 'substr', 'replace', 'upper', 'subtype', 'typeof', 'load_extension', 'load_extension', 'soundex', 'json_group_array', 'avg', 'abs', 'json_group_object', 'json_array_length', 'json_array_length', 'strftime', 'atan', 'asin', 'acos', 'substring', 'substring', 'randomblob', 'unicode', 'percent_rank', 'row_number', 'atanh', 'asinh', 'acosh', 'cos', 'atan2', 'last_insert_rowid', 'sqlite_log', 'unlikely', 'cosh', 'ceil', 'char', 'unixepoch', 'exp', 'count', 'count', 'date', 'ceiling', 'total_changes', 'changes', 'sqlite_version', 'degrees', 'floor', 'coalesce', 'glob', 'zeroblob', 'hex', 'iif', 'sqlite_source_id', 'format', 'datetime', 'cume_dist', 'ln', 'instr', 'json', 'dense_rank', 'log', 'log', 'ifnull', 'current_date', 'current_time', 'lag', 'lag', 'lag', 'mod', 'log2', 'like', 'like', 'max', 'max', 'min', 'min', 'lead', 'lead', 'lead', 'log10', 'lower', 'ltrim', 'ltrim', 'first_value', 'pi', 'length', 'likely', 'json_set', 'escape_fts', 'prepare_connection_args', 'convert_units', 'sleep', 'rtreedepth', 'match', 'snippet', 'fts5_source_id', 'offsets', 'matchinfo', 'matchinfo', 'optimize', 'rtreecheck', 'rtreenode', 'highlight', 'bm25', 'fts3_tokenizer', 'fts3_tokenizer', 'fts5'
Comparing via copy-and-paste
To see the functions that were registered for https://sqlite-extension-examples.fly.dev/ but not for https://latest.datasette.io/ I used the above output to construct the following query:
select name from pragma_function_list() where name not in ('pow', 'group_concat', 'group_concat', 'json_type', 'json_type', 'julianday', 'ntile', 'nullif', 'sqlite_compileoption_get', 'json_valid', 'json_quote', 'json_patch', '->', 'json_array', 'current_timestamp', 'power', 'sqlite_compileoption_used', 'json_remove', 'json_object', 'json_insert', '->>', 'sin', 'sum', 'quote', 'printf', 'likelihood', 'json_replace', 'json_extract', 'last_value', 'rank', 'sign', 'sqrt', 'sinh', 'tan', 'round', 'round', 'rtrim', 'rtrim', 'nth_value', 'tanh', 'random', 'trim', 'trim', 'time', 'radians', 'trunc', 'total', 'substr', 'substr', 'replace', 'upper', 'subtype', 'typeof', 'load_extension', 'load_extension', 'soundex', 'json_group_array', 'avg', 'abs', 'json_group_object', 'json_array_length', 'json_array_length', 'strftime', 'atan', 'asin', 'acos', 'substring', 'substring', 'randomblob', 'unicode', 'percent_rank', 'row_number', 'atanh', 'asinh', 'acosh', 'cos', 'atan2', 'last_insert_rowid', 'sqlite_log', 'unlikely', 'cosh', 'ceil', 'char', 'unixepoch', 'exp', 'count', 'count', 'date', 'ceiling', 'total_changes', 'changes', 'sqlite_version', 'degrees', 'floor', 'coalesce', 'glob', 'zeroblob', 'hex', 'iif', 'sqlite_source_id', 'format', 'datetime', 'cume_dist', 'ln', 'instr', 'json', 'dense_rank', 'log', 'log', 'ifnull', 'current_date', 'current_time', 'lag', 'lag', 'lag', 'mod', 'log2', 'like', 'like', 'max', 'max', 'min', 'min', 'lead', 'lead', 'lead', 'log10', 'lower', 'ltrim', 'ltrim', 'first_value', 'pi', 'length', 'likely', 'json_set', 'escape_fts', 'prepare_connection_args', 'convert_units', 'sleep', 'rtreedepth', 'match', 'snippet', 'fts5_source_id', 'offsets', 'matchinfo', 'matchinfo', 'optimize', 'rtreecheck', 'rtreenode', 'highlight', 'bm25', 'fts3_tokenizer', 'fts3_tokenizer', 'fts5')
This returned the following list:
http_cookieshttp_headers_datehttp_headershttp_debughtml_counthtml_texthtml_texthtml_extracthtml_group_element_spanhtml_group_element_divhtml_elementhttp_versionhtml_escapehtmlfts5_rowidfts5_decode_nonehtml_unescapehtml_trimfts5_decodehttp_headers_gethtml_attr_hasfts5_expr_tclhtml_attr_getfts5_exprfts5_isalnumhtml_versionhttp_headers_hasfts5_foldhtml_validhtml_tablehtml_debughtml_attribute_gethtml_attribute_has
A better solution using json_group_array() and json_each()
Alex pointed out an alternative solution using SQLite’s JSON functions, which is actually better because it avoids any risk of commas or quotation marks in the values breaking the resulting string.
select json_group_array(distinct name)from pragma_function_list()
Try that against latest.datasette.io
Output:
["pow","group_concat","json_type","julianday","ntile","nullif","sqlite_compileoption_get","json_valid","json_quote","json_patch","->","json_array","current_timestamp","power","sqlite_compileoption_used","json_remove","json_object","json_insert","->>","sin","sum","quote","printf","likelihood","json_replace","json_extract","last_value","rank","sign","sqrt","sinh","tan","round","rtrim","nth_value","tanh","random","trim","time","radians","trunc","total","substr","replace","upper","subtype","typeof","load_extension","soundex","json_group_array","avg","abs","json_group_object","json_array_length","strftime","atan","asin","acos","substring","randomblob","unicode","percent_rank","row_number","atanh","asinh","acosh","cos","atan2","last_insert_rowid","sqlite_log","unlikely","cosh","ceil","char","unixepoch","exp","count","date","ceiling","total_changes","changes","sqlite_version","degrees","floor","coalesce","glob","zeroblob","hex","iif","sqlite_source_id","format","datetime","cume_dist","ln","instr","json","dense_rank","log","ifnull","current_date","current_time","lag","mod","log2","like","max","min","lead","log10","lower","ltrim","first_value","pi","length","likely","json_set","escape_fts","prepare_connection_args","convert_units","sleep","rtreedepth","match","snippet","fts5_source_id","offsets","matchinfo","optimize","rtreecheck","rtreenode","highlight","bm25","fts3_tokenizer","fts5"]
Then use this query and run it on the other instance:
select namefrom pragma_function_list()where name not in ( select value from json_each(:json) )
A neat thing about this alternative is you can pass the single JSON string as a named parameter, rather than needing to paste the list of IN terms into the SQL query itself.