Fancy ASCII with svgbob
- Description
- Using svgbob to transform ascii art in HTML documents
- Date
- tags
- rust
- meta
prelude
The SSG I use, Zola, does not support any extension points beyond templates, shortcodes, and a small set of configuration options. You can get pretty far with just those, but I wanted to be able to make nice-looking diagrams without needing JavaScript on the client.
One approach is to just use ASCII art in a triple-backtick code block:
+-----------+
|    box    |
|           |
+-----------+
This works, but isn't pretty. There's a tool called svgbob which can take ASCII art and convert
it into nice-looking graphics. This is perfect, but Zola doesn't support it at the moment.
I made bobifier, a simple Rust program which processes HTML
to find ASCII art and convert it into fancy SVG.
concept
We need a post-processor that takes HTML files, finds a specific structure that we can create in markdown, extracts the text from inside that structure, processes it, and then adds it back in.
Ideally the program is written in Rust to minimize dependencies.
Creating a dummy grammar
Zola supports adding custom language grammars in the TextMate format. Our grammar is basically a no-op,
but we want a custom one so that the data-lang attribute of the code block is set to a special value.
This is how we will find it when processing an HTML document.
{
  "name": "ascii",
  "scopeName": "source.ascii",
  "fileTypes": ["ascii"]
}
This makes a new ascii grammar that doesn't do any processing. Now I can use the ascii language
in my code blocks and target them later.
Processing HTML in Rust
The next step is to ingest the HTML document and find the <code> tags with the correct attribute.
Unfortunately, the state of HTML DOM manipulation in Rust is subpar. There are a few libraries designed
for parsing HTML (html5ever, and some derivative libraries), but many are read-only
or outdated. Cloudflare maintains a library called lol_html, which they use for HTML rewriting in their
Workers platform. Unlike traditional DOM-based libraries, lol_html is streaming, meaning it
doesn't hold the entire document in memory and can modify streams of HTML in place.
For my use case the streaming model is more of a nuisance than a benefit, because a handler only sees
the document one piece at a time. On the other hand, modifying HTML is an explicit design goal of the library.
They go into more detail in their blog post. I think with some tomfoolery we can express the desired transformation as a sequence of streaming steps:
- Find a pre.giallo block, store it, and remove it from the HTML, keeping the inner content.
- If we encounter a code block and have a pre.giallo stored, check the data-lang attribute. If it is ascii, discard the pre.giallo and start capturing the inner text. Otherwise, restore the pre.giallo block surrounding the code.
- If we were capturing the inner text, and we've reached the end of the code block, take the text and generate the SVG.
- Insert the SVG, as well as a hidden copy of the original ascii, into the HTML stream.
I made a small finite state machine (FSM) that tracks where we are in the document and decides what each handler should do.
                      +---------+
                      |  Idle   |
                 +--->|         |<------+
                 |    +--+------+       |
                 |       | Found pre    |
                 |       | Take tag     | "Non-Ascii code tag"
                 |       v              | replace the removed pre
End of code tag  |    +-------------+   |
insert SVG and   |    |  InsidePre  |   |
original content |    |             +---+
                 |    +--+----------+
                 |       | Found an ascii
                 |       | start capturing text
                 |       v
                 |    +-------------+
                 +----+  Capturing  |
                      |             |
                      +-------------+
The FSM is a little weird. I have to wrap it in the Rc<RefCell<T>> special,
because it needs to be moved into several closures which all mutate it. It does
not need to be Arc<Mutex<T>> because the execution is single-threaded.
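The diagram and the Rc<RefCell<T>> wrapping can be sketched with the standard library alone. This is a minimal illustration, not the actual bobifier code: only the `pre` method name comes from the handler shown in this post, while `State`, `Action`, `Attrs`, `code`, `text`, `end_code`, and `run_ascii_block` are hypothetical names, and plain closures stand in for the lol_html handlers.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Attributes taken from a removed `<pre>` tag, as (name, value) pairs.
type Attrs = Vec<(String, String)>;

/// Hypothetical states mirroring the diagram.
#[derive(Debug, PartialEq)]
enum State {
    Idle,
    InsidePre(Attrs),
    Capturing(String),
}

/// What the caller should write to the output after a transition.
#[derive(Debug, PartialEq)]
enum Action {
    Nothing,
    RestorePre(Attrs),
    EmitSvg(String),
}

impl State {
    /// Found a `<pre>`: remember its attributes and move to InsidePre.
    fn pre(&mut self, attrs: Attrs) {
        *self = State::InsidePre(attrs);
    }

    /// Found a `<code>`: start capturing for `ascii`, otherwise restore the `<pre>`.
    fn code(&mut self, lang: Option<&str>) -> Action {
        match std::mem::replace(self, State::Idle) {
            State::InsidePre(_) if lang == Some("ascii") => {
                *self = State::Capturing(String::new());
                Action::Nothing
            }
            State::InsidePre(attrs) => Action::RestorePre(attrs),
            other => {
                *self = other;
                Action::Nothing
            }
        }
    }

    /// Text chunk: only recorded while capturing.
    fn text(&mut self, chunk: &str) {
        if let State::Capturing(buf) = self {
            buf.push_str(chunk);
        }
    }

    /// End of `<code>`: hand back the captured art and return to Idle.
    fn end_code(&mut self) -> Action {
        match std::mem::replace(self, State::Idle) {
            State::Capturing(buf) => Action::EmitSvg(buf),
            other => {
                *self = other;
                Action::Nothing
            }
        }
    }
}

/// Drives the machine from several closures that all share and mutate it.
fn run_ascii_block() -> Action {
    let state = Rc::new(RefCell::new(State::Idle));

    // Each closure owns its own Rc handle; RefCell checks the borrows at runtime.
    let on_pre = { let s = Rc::clone(&state); move |attrs: Attrs| s.borrow_mut().pre(attrs) };
    let on_code = { let s = Rc::clone(&state); move |lang: Option<&str>| s.borrow_mut().code(lang) };
    let on_text = { let s = Rc::clone(&state); move |t: &str| s.borrow_mut().text(t) };
    let on_end = { let s = Rc::clone(&state); move || s.borrow_mut().end_code() };

    on_pre(vec![("class".to_string(), "giallo".to_string())]);
    on_code(Some("ascii"));
    on_text("+--+\n");
    on_text("+--+\n");
    on_end()
}
```

Returning an `Action` from each transition keeps the state machine free of any HTML-writing code, so the handlers decide how to act on it.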
lol_html is built around the concept of "Handlers", the most common of which is
an element handler. For example:
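element!("pre.giallo", |el: &mut Element| {
    // Copy every attribute as an owned (name, value) pair so it outlives the tag.
    let attrs = el
        .attributes()
        .iter()
        .map(|attr| (attr.name(), attr.value()))
        .collect();
    // Drop the <pre> tag itself but leave its children in the stream.
    el.remove_and_keep_content();
    // Remember the attributes in case the <pre> has to be restored later.
    state.borrow_mut().pre(attrs);
    Ok(())
}),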
In this example, we use the pre.giallo CSS selector (a <pre> tag with class=giallo) to
find the exterior wrapping around normal code blocks in Zola. We extract all of the attributes of the tag,
and then remove the tag itself, but keep the content. We store the attributes in the state variable, which
is read from later if we want to keep the original content.
The relevant source is here. There are a few
other incantations and tricks, like needing to un-escape the HTML-escaped text, which would otherwise break svgbob. On top of this, svgbob only
emits the width and height attributes, not the SVG-specific viewBox attribute. viewBox is required
to make the SVG scale when embedded in an HTML document. To fix this, I use another HTML rewriter that
finds the SVG tag and adds the viewBox attribute as viewBox="0 0 <width> <height>".
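Both fix-ups are small enough to sketch with plain string handling. The helper names `unescape_html` and `view_box` are hypothetical, the set of entities is an assumption about what the highlighter escapes, and the real tool sets the attribute through an HTML rewriter rather than building a string like this.

```rust
/// Reverses HTML escaping for the characters that commonly appear in ASCII art.
/// `&amp;` is handled last so that `&amp;lt;` becomes `&lt;` rather than `<`.
fn unescape_html(text: &str) -> String {
    text.replace("&lt;", "<")
        .replace("&gt;", ">")
        .replace("&quot;", "\"")
        .replace("&#39;", "'")
        .replace("&amp;", "&")
}

/// Builds a `viewBox` value from the `width` and `height` attribute values.
/// Returns `None` if either one is not a plain number (an assumption about the input).
fn view_box(width: &str, height: &str) -> Option<String> {
    let w: f32 = width.trim().parse().ok()?;
    let h: f32 = height.trim().parse().ok()?;
    Some(format!("0 0 {w} {h}"))
}
```

With these, `unescape_html("--&gt;")` gives back `-->`, and `view_box("120", "64")` produces the value to set on the SVG tag.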
This process is run for all of the HTML in a directory. I added a simple configuration file that controls some of the
parameters used for processing, as well as the svgbob settings. The one issue is that zola serve will obviously not call
the bobifier command, but I can set up my own equivalent using zola build, live-server, and inotify-tools, which
I store in the nix shell so I can have all the dependencies.
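Running the rewrite over every HTML file in a directory could look roughly like the sketch below. It uses only the standard library; `html_files` and `process_dir` are hypothetical names, and the `rewrite` closure stands in for the actual HTML processing.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Recursively collects every `.html` file under `dir`, in sorted order.
fn html_files(dir: &Path) -> io::Result<Vec<PathBuf>> {
    let mut found = Vec::new();
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_dir() {
            found.extend(html_files(&path)?);
        } else if path.extension().and_then(|ext| ext.to_str()) == Some("html") {
            found.push(path);
        }
    }
    found.sort();
    Ok(found)
}

/// Rewrites each HTML file in place and returns how many files were processed.
fn process_dir(dir: &Path, rewrite: impl Fn(&str) -> String) -> io::Result<usize> {
    let files = html_files(dir)?;
    for path in &files {
        let html = fs::read_to_string(path)?;
        fs::write(path, rewrite(&html))?;
    }
    Ok(files.len())
}
```

Pointing `process_dir` at the output directory after each `zola build` would be enough for a rebuild-on-change loop.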
If you want to use this tool, you should probably just copy it and make whatever changes you need to fit it to your environment.