Custom Jinja template tags with attributes

I decided to implement a custom Jinja template block tag for my datasette-render-markdown plugin. I wanted the tag to work like this:

{% markdown %}
# This will be rendered as markdown

- Bulleted
- List
{% endmarkdown %}

A basic Jinja extension

After some fiddling around with GitHub Code Search and ChatGPT I settled on this as the simplest possible skeleton for a custom Jinja tag:

from jinja2 import nodes
from jinja2.exceptions import TemplateSyntaxError
from jinja2.ext import Extension

# render_markdown() is the plugin's own rendering function, defined elsewhere


class MarkdownExtension(Extension):
    # The tag name(s) that trigger this extension's parse() method
    tags = {"markdown"}

    def __init__(self, environment):
        super().__init__(environment)

    def parse(self, parser):
        # We need this for reporting errors
        lineno = next(parser.stream).lineno
        # Consume everything up to {% endmarkdown %}, dropping that end tag
        body = parser.parse_statements(
            ["name:endmarkdown"], drop_needle=True
        )
        return nodes.CallBlock(
            self.call_method("_render_markdown"),
            [],
            [],
            body,
        ).set_lineno(lineno)

    async def _render_markdown(self, caller):
        # caller() renders the body of the block
        return render_markdown(await caller())

Then add it to the Jinja environment like this:

env.add_extension(MarkdownExtension)

Note that my _render_markdown() method there is async def. This appeared to be necessary because I run Jinja for Datasette in async mode. Without async mode, I think the method would look like this instead:

    def _render_markdown(self, caller):
        return render_markdown(caller())

I’m not sure of the best way to build an extension that works in both async and regular modes.
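Here is a minimal end-to-end sketch of the synchronous variant, assuming a regular (non-async) Jinja Environment. The render_markdown() function in it is a hypothetical stand-in that only wraps the text in a div, so the example runs without a Markdown library; it is not the plugin's real implementation.

```python
from jinja2 import Environment, nodes
from jinja2.ext import Extension


def render_markdown(text):
    # Hypothetical stand-in for the plugin's render_markdown():
    # wraps the text instead of actually converting Markdown.
    return "<div class='md'>{}</div>".format(text.strip())


class MarkdownExtension(Extension):
    tags = {"markdown"}

    def parse(self, parser):
        # Consume the "markdown" name token and remember its line number
        lineno = next(parser.stream).lineno
        body = parser.parse_statements(["name:endmarkdown"], drop_needle=True)
        return nodes.CallBlock(
            self.call_method("_render_markdown"), [], [], body
        ).set_lineno(lineno)

    def _render_markdown(self, caller):
        # In sync mode caller() returns the rendered block body directly
        return render_markdown(caller())


env = Environment()
env.add_extension(MarkdownExtension)
template = env.from_string("{% markdown %}\n# Hello\n{% endmarkdown %}")
output = template.render()
print(output)
```

Everything between the opening and closing tags reaches render_markdown() as a single string, leading and trailing newlines included, which is why the stand-in strips it.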

Adding support for attributes

My render_markdown() function takes optional arguments for specifying if certain Markdown extensions should be enabled, or which additional tags and attributes should be allowed rather than being stripped by Bleach.

I decided to use the following syntax for that:

{% markdown
  extensions="tables"
  extra_tags="table thead tr th td tbody"
  extra_attrs="p:id,class a:name,href" %}
## Markdown table

First Header  | Second Header
------------- | -------------
Content Cell  | Content Cell
Content Cell  | Content Cell

<a href="https://www.example.com/" name="namehere">Example</a>
<p id="paragraph" class="klass">Paragraph</p>
{% endmarkdown %}

Adding key="value" attribute support to a custom Jinja tag was trickier than I expected!

You have to work directly with the parser.

After spending some time in the Python debugger, I figured out that the tokens in my test document looked something like this:

[Token(lineno=6, type='name', value='markdown'),
 Token(lineno=6, type='name', value='foo'),
 Token(lineno=6, type='assign', value='='),
 Token(lineno=6, type='string', value='bar'),
 Token(lineno=6, type='name', value='baz'),
 Token(lineno=6, type='assign', value='='),
 Token(lineno=6, type='string', value='bar2'),
 Token(lineno=6, type='block_end', value='%}'),
 Token(lineno=6, type='data', value='\n# This is markdown'),
 Token(lineno=11, type='block_begin', value='{%'),
 Token(lineno=11, type='name', value='endmarkdown'),
 Token(lineno=11, type='block_end', value='%}'),
 Token(lineno=11, type='data', value='\n')]

To turn key="value" syntax into a dictionary of attributes, I would need to read every token up to the block_end ("%}") token, then look for sequences of three tokens - a name, an assign (=) and a string.

I ended up writing this code to do that:

    def parse(self, parser):
        # We need this for reporting errors
        lineno = next(parser.stream).lineno

        # Gather tokens up to the next block_end ('%}')
        gathered = []
        while parser.stream.current.type != "block_end":
            gathered.append(next(parser.stream))

        # If all has gone well, we will have a sequence of triples of tokens:
        #   (type='name', value='attribute name'),
        #   (type='assign', value='='),
        #   (type='string', value='attribute value')
        # Anything else is a parse error

        if len(gathered) % 3 != 0:
            raise TemplateSyntaxError("Invalid syntax for markdown tag", lineno)
        attrs = {}
        for i in range(0, len(gathered), 3):
            if (
                gathered[i].type != "name"
                or gathered[i + 1].type != "assign"
                or gathered[i + 2].type != "string"
            ):
                raise TemplateSyntaxError(
                    (
                        "Invalid syntax for markdown attribute - got "
                        "'{}', should be name=\"value\"".format(
                            "".join([str(t.value) for t in gathered[i : i + 3]]),
                        )
                    ),
                    lineno,
                )
            attrs[gathered[i].value] = gathered[i + 2].value

This did the trick! At the end of that block, attrs is a dictionary of all of the key="value" attributes that were included in that open tag.
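To try that grouping logic without a Jinja parser, here is a small standalone sketch. The Token namedtuple and the tokens_to_attrs() helper are both hypothetical names invented for this illustration. They mimic the lineno, type and value fields shown in the token listing above, and raise ValueError where the real code raises TemplateSyntaxError.

```python
from collections import namedtuple

# Hypothetical stand-in for Jinja's token objects (same three fields)
Token = namedtuple("Token", ["lineno", "type", "value"])


def tokens_to_attrs(tokens):
    """Turn name / assign / string token triples into a dict."""
    if len(tokens) % 3 != 0:
        raise ValueError("Expected a sequence of name='value' triples")
    attrs = {}
    for i in range(0, len(tokens), 3):
        name, assign, value = tokens[i : i + 3]
        if (name.type, assign.type, value.type) != ("name", "assign", "string"):
            got = "".join(str(t.value) for t in (name, assign, value))
            raise ValueError("Invalid attribute syntax: {}".format(got))
        attrs[name.value] = value.value
    return attrs


# The tokens between 'markdown' and '%}' from the listing above
tokens = [
    Token(6, "name", "foo"),
    Token(6, "assign", "="),
    Token(6, "string", "bar"),
    Token(6, "name", "baz"),
    Token(6, "assign", "="),
    Token(6, "string", "bar2"),
]
attrs = tokens_to_attrs(tokens)
print(attrs)
```

Comparing a tuple of the three types in one go is just a more compact way of writing the three `!=` checks.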

Validating the attributes

For my particular template tag, I only wanted three optional attributes to be supported. I added some code to validate them (and handle their slightly weird custom syntax):

        # Validate the attributes
        kwargs = {}
        for attr, value in attrs.items():
            if attr in ("extensions", "extra_tags"):
                # Space-separated list of names
                kwargs[attr] = value.split()
            elif attr == "extra_attrs":
                # Custom syntax: tag:attr1,attr2 tag2:attr3,attr4
                extra_attrs = {}
                for tag_attrs in value.split():
                    # Named attr_names so it does not shadow the attrs dict
                    tag, attr_names = tag_attrs.split(":")
                    extra_attrs[tag] = attr_names.split(",")
                kwargs["extra_attrs"] = extra_attrs
            else:
                raise TemplateSyntaxError("Unknown attribute '{}'".format(attr), lineno)

Raising TemplateSyntaxError is a clean way to report errors in Jinja - and you pass the current template lineno to that exception to ensure it is reported back to the user.
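The extra_attrs mini-syntax is easier to follow when pulled out into a function of its own. The sketch below uses a hypothetical helper name, parse_extra_attrs(), and raises ValueError for malformed input; inside a Jinja extension you would probably convert that into a TemplateSyntaxError with the current lineno.

```python
def parse_extra_attrs(value):
    """Parse 'tag:attr1,attr2 tag2:attr3' into {'tag': ['attr1', 'attr2'], ...}."""
    extra_attrs = {}
    for tag_attrs in value.split():
        # partition() never raises, so a missing colon can be reported clearly
        tag, sep, attr_names = tag_attrs.partition(":")
        if not sep or not tag or not attr_names:
            raise ValueError(
                "Expected tag:attr1,attr2 but got '{}'".format(tag_attrs)
            )
        extra_attrs[tag] = attr_names.split(",")
    return extra_attrs


parsed = parse_extra_attrs("p:id,class a:name,href")
print(parsed)
```

Using partition() instead of split(":") means an entry with no colon produces a descriptive error rather than a tuple-unpacking failure.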

Passing attributes to the render method

At the end of this block I had kwargs, ready to be passed to my own render_markdown(value, **kwargs) function.

But there was one last problem: I needed to call this code:

return nodes.CallBlock(
    self.call_method("_render_markdown"),
    [],
    [],
    body,
).set_lineno(lineno)

while also passing the kwargs I had collected through to that _render_markdown() method.

I eventually found a pattern that worked, but it’s kind of gross:

        # Requires "import json" at the top of the module
        body = parser.parse_statements(["name:endmarkdown"], drop_needle=True)

        return nodes.CallBlock(
            # I couldn't figure out how to send attrs to the _render_markdown
            # method other than json.dumps and then passing as a nodes.Const
            self.call_method("_render_markdown", [nodes.Const(json.dumps(kwargs))]),
            [],
            [],
            body,
        ).set_lineno(lineno)

    async def _render_markdown(self, kwargs_json, caller):
        kwargs = json.loads(kwargs_json)
        return render_markdown(await caller(), **kwargs)

I’m serializing the kwargs dictionary to a JSON string, then wrapping that in nodes.Const(). I can then pass that node, inside a list, as the second argument to .call_method().

Anything passed in that list becomes available to that _render_markdown() method as a positional argument - so I can take kwargs_json and json.loads() it to get the data back.

I don’t know why I had to do it this way, and I’d be delighted to find a cleaner mechanism for this - but it does work.
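The round trip itself is easy to check in isolation. This sketch uses example values shaped like the kwargs built by the validation step (the specific values are made up for illustration) and shows that a dictionary of lists and strings survives json.dumps() followed by json.loads() unchanged.

```python
import json

# Example kwargs shaped like the output of the validation step
kwargs = {
    "extensions": ["tables"],
    "extra_tags": ["table", "thead", "tr"],
    "extra_attrs": {"p": ["id", "class"], "a": ["name", "href"]},
}

# This is the string that parse() wraps in nodes.Const(...)
kwargs_json = json.dumps(kwargs)

# This is what _render_markdown() does with its first positional argument
restored = json.loads(kwargs_json)
print(restored == kwargs)
```

This only holds for JSON-friendly types: a tuple would come back as a list, and non-string dictionary keys would come back as strings. That is fine here because the validation step only produces strings, lists and dictionaries.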

The finished code

You can see the finished code here in the datasette-render-markdown GitHub repository.