Even More Tagged Union Subsets with Comptime
May 18, 2026
One of the cooler things you can do with Zig's comptime is create what Mitchell Hashimoto calls "tagged union subsets." Hashimoto shows how he can take an existing tagged union, one representing all possible keyboard shortcut actions in Ghostty, and use comptime to derive more specific unions representing just those actions affecting a particular terminal window or just those actions affecting the entire application.
Having these more specific union types available means that he can write
functions accepting only terminal-specific actions or only application-wide
actions. These functions look something like this, where ScopedAction(.app)
is a call to a function that runs at comptime and returns the more specific
union type:
// This function handles application-wide keyboard shortcuts.
pub fn performAppAction(action: ScopedAction(.app)) void {
    switch (action) {
        .quit => ...,
        .close_all_windows => ...,
        .open_config => ...,
        .reload_config => ...,
    }
}

// This function handles shortcuts scoped to a terminal window.
pub fn performTerminalAction(action: ScopedAction(.terminal)) void {
    switch (action) {
        .new_window => ...,
        .close_window => ...,
        .scroll_lines => ...,
    }
}
This pattern is useful mainly because Zig has exhaustive switching. Given a function like performAppAction() above, Zig will yell at you
(helpfully) if you add a new application-wide action but forget to add a
matching case to the switch. If performAppAction() instead took an
action of type Action, Zig's yelling would lose specificity: Zig
would also yell at you whenever you added a terminal-specific action without a
matching case. That's much less helpful, since in the context of
performAppAction() we don't care about terminal-specific actions.
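To make the exhaustiveness check concrete, here is a minimal sketch using a hypothetical `AppAction` union. It is a hand-written stand-in for `ScopedAction(.app)`, not Ghostty's actual type, and `describe()` is an illustrative name. Because the switch has no `else` prong, adding a fourth field to `AppAction` without a matching prong should be rejected at compile time.

```zig
const std = @import("std");

// Hypothetical stand-in for the app-scoped subset; not Ghostty's actual type.
const AppAction = union(enum) {
    quit,
    open_config,
    reload_config,
};

// No `else` prong, so the compiler checks that every field is handled.
fn describe(action: AppAction) []const u8 {
    return switch (action) {
        .quit => "quit",
        .open_config => "open config",
        .reload_config => "reload config",
    };
}

test "every app action is described" {
    try std.testing.expectEqualStrings("quit", describe(.quit));
    try std.testing.expectEqualStrings("reload config", describe(.reload_config));
}
```

Running this with `zig test` exercises the happy path; the interesting behavior is the compile error you get after adding a field and not touching `describe()`.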
A Similar Problem with Trees
I recently ran into a situation that seemed to call for using tagged union subsets. I ended up with a solution heavily inspired by Hashimoto's but different in a few ways that I think are interesting.
I am working on a parser for MyST, a new Markdown format. Markdown parsers typically read an input document and produce HTML directly, but MyST has a spec that defines an AST for parsed MyST documents. Much of the code in my parser is concerned with constructing and traversing this AST.
The MyST AST has many different types of nodes. To represent this, I use a large tagged union of node types.
Some nodes (headings, inline emphasis, MyST directives) are allowed to have children while others (inline code, images) are not. While working on the parser, I have often found myself writing functions that transform the AST using a recursive operation on nodes in the tree. Functions in this vein want to do one thing with nodes of a particular type, but, for all other nodes, provided they have children, want to just recurse through to all descendants.
My first attempt at implementing these functions looked something like the
following. In this example, I'm traversing the tree looking for
myst_directive nodes to hand off to the code that implements MyST directives.
fn transformDirectives(alloc: Allocator, node: *ast.Node) !*ast.Node {
    switch (node.*) {
        // The nodes we're looking for. We pass the node to a function that
        // will implement the directive by adding children to the node.
        .myst_directive => return try transformBuiltinDirective(alloc, node),
        // Nodes with children. We need to recurse to look for more directives.
        // `inline` is a fancy Zig keyword that allows the below switch
        // prong to be polymorphic across all our node types with children.
        inline .block,
        .heading,
        .paragraph,
        .emphasis,
        .strong,
        ...,
        => |node_payload| {
            for (0..node_payload.children.len) |i| {
                node_payload.children[i] = try transformDirectives(
                    alloc,
                    node_payload.children[i],
                );
            }
            return node;
        },
        // Leaf nodes, nothing to do.
        .text,
        .image,
        .code,
        .inline_code,
        ...,
        => return node,
    }
}
This switch has three cases. In order to correctly partition all the various node types into the three cases, I need to list them all out. This is verbose (especially because in real life there are many more types than shown above). Worse, since I have many functions like this one, every time I add a new node type to the AST, I need to update dozens of switch statements across my codebase to make sure every node type is correctly identified as either a node with children or a leaf node.
We might be tempted to make this better by eliding the leaf node types with
an else case:
fn transformDirectives(alloc: Allocator, node: *ast.Node) !*ast.Node {
    switch (node.*) {
        // The nodes we're looking for. We pass the node to a function that
        // will implement the directive by adding children to the node.
        .myst_directive => return try transformBuiltinDirective(alloc, node),
        // Nodes with children. We need to recurse to look for more directives.
        // `inline` is a fancy Zig keyword that allows the below switch
        // prong to be polymorphic across all our node types with children.
        inline .block,
        .heading,
        .paragraph,
        .emphasis,
        .strong,
        ...,
        => |node_payload| {
            for (0..node_payload.children.len) |i| {
                node_payload.children[i] = try transformDirectives(
                    alloc,
                    node_payload.children[i],
                );
            }
            return node;
        },
        else => return node, // Remaining nodes are leaf nodes.
    }
}
This makes things less verbose. But, even if we no longer have to worry about the leaf nodes, we still have to list out the nodes with children. More importantly, if we later add a node type that has children but forget to update all our switch statements, we will unknowingly be skipping some subtrees of the AST in our recursion because we will be treating a node with children like a leaf node. A disaster!
Reversing things so that we list out the leaf nodes and put the nodes with
children in an inline else case is better. If we were to add a new leaf node
type and forget to update the switch, we would get a compile error (from
trying to access the children of a node that doesn't have them in our inline else block). But we're still stuck maintaining a list of leaf node types in every similar switch in our codebase.
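The reversed arrangement can be sketched as follows. The four-variant `Node` union and the `countNodes()` function are hypothetical, chosen only to keep the example small; the real MyST node set is much larger. Leaf types are listed explicitly and every other type falls into the inline else prong, which assumes a `children` field.

```zig
const std = @import("std");

// Hypothetical miniature AST; the real MyST node set is much larger.
const Node = union(enum) {
    text: struct { value: []const u8 },
    inline_code: struct { value: []const u8 },
    paragraph: struct { children: []*Node },
    emphasis: struct { children: []*Node },
};

/// Counts the nodes in the tree rooted at `node`.
fn countNodes(node: *const Node) usize {
    switch (node.*) {
        // Leaf node types are listed explicitly.
        .text, .inline_code => return 1,
        // Everything else is assumed to have a `children` field. A new leaf
        // type missing from the list above would land here and fail to
        // compile, because its payload has no `children`.
        inline else => |payload| {
            var total: usize = 1;
            for (payload.children) |child| total += countNodes(child);
            return total;
        },
    }
}

test "counts a small tree" {
    var hello = Node{ .text = .{ .value = "hello" } };
    var code = Node{ .inline_code = .{ .value = "x" } };
    var em_children = [_]*Node{&code};
    var em = Node{ .emphasis = .{ .children = &em_children } };
    var p_children = [_]*Node{ &hello, &em };
    const p = Node{ .paragraph = .{ .children = &p_children } };
    try std.testing.expectEqual(@as(usize, 4), countNodes(&p));
}
```

The safety net here only catches forgotten leaf types. The list of leaf types still has to be repeated in every function written this way.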
What we really want is some way to have both an else covering all the leaf
nodes and an inline else covering all the nodes with children, leaving just
the .myst_directive case as the one that cares about a specific node type.
Using Comptime to Create Subsets
We can use tagged union subsets to solve this problem. What makes this case
different from Hashimoto's is that we don't want to limit the types of nodes
transformDirectives() can accept. Instead, we want some kind of smarter
switch statement that knows how to handle both nodes with children and leaf
nodes without us having to tend these enormous brittle lists of node types
everywhere.
We start by defining a function that determines whether a node can have children or is a leaf node:
pub const HasChildren = enum {
    yes,
    no,

    /// Maps node types onto a value in the HasChildren enum.
    ///
    /// In other words, answers whether a type of node has children.
    ///
    /// `NodeType` is the backing enum for the `Node` tagged union.
    pub fn fromNodeType(node_type: NodeType) HasChildren {
        return switch (node_type) {
            .block,
            .heading,
            .paragraph,
            .emphasis,
            .strong,
            ...,
            => .yes,
            .text,
            .image,
            .code,
            .inline_code,
            ...,
            => .no,
        };
    }
};
This is similar to Hashimoto's scope() function, except I've chosen to
declare the function on the enum itself. It's verbose because we need to list
out all our node types, but this is the one and only place we'll have to do
this from now on.
Next, we need a function like Hashimoto's ScopedAction() that creates a
tagged union type for each subset we want of our main Node union.
 1  /// Returns a tagged union with only those fields matching node types that
 2  /// either have or don't have children (depending on `choice`).
 3  fn HasChildrenNode(comptime choice: HasChildren) type {
 4      const all_fields = @typeInfo(Node).@"union".fields;
 5
 6      var i: usize = 0;
 7      var fields: [all_fields.len]std.builtin.Type.UnionField = undefined;
 8      for (all_fields) |field| {
 9          const node = @unionInit(Node, field.name, undefined);
10          if (HasChildren.fromNodeType(node) == choice) {
11              fields[i] = .{
12                  .name = field.name,
13                  .type = *field.type, // Use pointer type so we can modify node
14                  .alignment = field.alignment,
15              };
16              i += 1;
17          }
18      }
19
20      return @Type(.{
21          .@"union" = .{
22              .layout = .auto,
23              // Can't just use `null` here, at least not in Zig 0.15.2
24              .tag_type = HasChildrenNodeType(choice),
25              .fields = fields[0..i],
26              .decls = &.{},
27          },
28      });
29  }
This function is more or less identical to Hashimoto's. We are copying over the
fields from the Node union to our new union subset type, skipping those
fields that don't match the choice we've specified for whether the node
should have children or not. On line 13, we modify the field type to make it a
pointer so that we can still mutate the node union payload; this isn't
necessary in Hashimoto's example. On line 24, we also have a call to
HasChildrenNodeType() where Hashimoto just had null. I discovered that Zig
would complain if I just set the tag_type of the returned union to null; I
suspect this has been disallowed in more recent versions of Zig.
So we also need a tag type for our union. The definition of
HasChildrenNodeType() is below. This function employs a similar method to
create an enum subset that will be the backing enum / tag type for our tagged
union subset.
/// Returns an enum containing only members from the NodeType enum that either
/// have or don't have children (depending on `choice`).
fn HasChildrenNodeType(comptime choice: HasChildren) type {
    const e_info = @typeInfo(NodeType);
    const all_fields = e_info.@"enum".fields;

    var i: usize = 0;
    var fields: [all_fields.len]std.builtin.Type.EnumField = undefined;
    for (all_fields) |field| {
        // Here we can figure out the node type from the field directly using
        // @enumFromInt() instead of initializing a Node just to pass to
        // `fromNodeType()`.
        const node_type: NodeType = @enumFromInt(field.value);
        if (HasChildren.fromNodeType(node_type) == choice) {
            fields[i] = field;
            i += 1;
        }
    }

    return @Type(.{ .@"enum" = .{
        .tag_type = e_info.@"enum".tag_type,
        .fields = fields[0..i],
        .decls = &.{},
        .is_exhaustive = true,
    } });
}
With all these pieces in place, we now have access to two new types,
HasChildrenNode(.yes) and HasChildrenNode(.no). We can define functions
that accept only nodes with children by using a parameter of type
HasChildrenNode(.yes), or only leaf nodes by using a parameter of type
HasChildrenNode(.no). This isn't where we want to stop though.
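As a rough end-to-end check of the pieces so far, here is a condensed sketch against a hypothetical three-node AST. The node set and the `childCount()` function are assumptions made for illustration. The sketch also uses `@field(NodeType, field.name)` and `@alignOf(*field.type)` where the functions above use `@unionInit()` and `field.alignment`, and it assumes the Zig 0.15-era `@Type` API.

```zig
const std = @import("std");

// Hypothetical three-node AST standing in for the real MyST node set.
const NodeType = enum { text, paragraph, emphasis };
const Node = union(NodeType) {
    text: struct { value: []const u8 },
    paragraph: struct { children: []*Node },
    emphasis: struct { children: []*Node },
};

const HasChildren = enum {
    yes,
    no,

    fn fromNodeType(node_type: NodeType) HasChildren {
        return switch (node_type) {
            .paragraph, .emphasis => .yes,
            .text => .no,
        };
    }
};

// Enum subset: only the NodeType members classified as `choice`.
fn HasChildrenNodeType(comptime choice: HasChildren) type {
    const info = @typeInfo(NodeType).@"enum";
    var fields: [info.fields.len]std.builtin.Type.EnumField = undefined;
    var i: usize = 0;
    for (info.fields) |field| {
        const node_type: NodeType = @enumFromInt(field.value);
        if (HasChildren.fromNodeType(node_type) == choice) {
            fields[i] = field;
            i += 1;
        }
    }
    return @Type(.{ .@"enum" = .{
        .tag_type = info.tag_type,
        .fields = fields[0..i],
        .decls = &.{},
        .is_exhaustive = true,
    } });
}

// Union subset: payloads become pointers into the original Node.
fn HasChildrenNode(comptime choice: HasChildren) type {
    const all_fields = @typeInfo(Node).@"union".fields;
    var fields: [all_fields.len]std.builtin.Type.UnionField = undefined;
    var i: usize = 0;
    for (all_fields) |field| {
        if (HasChildren.fromNodeType(@field(NodeType, field.name)) == choice) {
            fields[i] = .{
                .name = field.name,
                .type = *field.type,
                .alignment = @alignOf(*field.type),
            };
            i += 1;
        }
    }
    return @Type(.{ .@"union" = .{
        .layout = .auto,
        .tag_type = HasChildrenNodeType(choice),
        .fields = fields[0..i],
        .decls = &.{},
    } });
}

// Hypothetical function that accepts only node types that can have children.
fn childCount(node: HasChildrenNode(.yes)) usize {
    return switch (node) {
        inline else => |payload| payload.children.len,
    };
}

test "subset type only admits nodes with children" {
    var kids = [_]*Node{};
    var para = Node{ .paragraph = .{ .children = &kids } };
    const narrowed = HasChildrenNode(.yes){ .paragraph = &para.paragraph };
    try std.testing.expectEqual(@as(usize, 0), childCount(narrowed));
    try std.testing.expect(!@hasField(HasChildrenNode(.yes), "text"));
}
```

Passing a leaf node to `childCount()` is impossible to express here, since `HasChildrenNode(.yes)` has no `text` field to initialize.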
A Union of Unions
I'm going to throw yet another union at you now. Don't worry, this one isn't generated by some scary comptime function. It's a regular tagged union except that it uses our scary comptime functions from above for the types of its fields.
// Bisects nodes into those that have children and those that don't.
const HasChildrenBisection = union(HasChildren) {
    yes: HasChildrenNode(.yes),
    no: HasChildrenNode(.no),
};
This union puts our two subset union types back together as a union of unions.
Why would we want to do that? Don't we already have the Node union type we
started with? Well, yes, but this union allows us to represent a value of any
possible node type in such a way that we can discriminate between nodes with
children and nodes without.
We're close to seeing this all come together. The last thing we need is a
function on our Node tagged union. This function fills a similar role to the
scoped() function that Hashimoto defines on his Action tagged union—it
narrows a Node to one of our union subset types. Except instead of just
returning one of the union subset types, we're going to return
HasChildrenBisection from above:
/// Returns a union bisecting nodes into those that have children and those
/// that don't.
pub fn hasChildren(self: *Node) HasChildrenBisection {
    return switch (HasChildren.fromNodeType(self.*)) {
        .yes => .{
            .yes = switch (self.*) {
                inline else => |*n, tag| blk: {
                    if (comptime HasChildren.fromNodeType(tag) != .yes) {
                        unreachable;
                    }

                    break :blk @unionInit(
                        HasChildrenNode(.yes),
                        @tagName(tag),
                        n,
                    );
                },
            },
        },
        .no => .{
            .no = switch (self.*) {
                inline else => |*n, tag| blk: {
                    if (comptime HasChildren.fromNodeType(tag) != .no) {
                        unreachable;
                    }

                    break :blk @unionInit(
                        HasChildrenNode(.no),
                        @tagName(tag),
                        n,
                    );
                },
            },
        },
    };
}
Here we have an outer switch that uses the HasChildren.fromNodeType()
function we defined earlier to check whether or not self has children
depending on its tag. In either case, we use @unionInit() to create the
appropriate union subset type, then set it as the active field (either .yes
or .no) of the HasChildrenBisection union that we return.
The HasChildren.fromNodeType() check in the outer switch happens at runtime.
In the body of the inline else prongs, tag is a comptime-known value
and the HasChildren.fromNodeType() check there happens during comptime. You
can think of inline else as a way to tell the compiler to generate a bunch of
switch prongs automatically; here we use unreachable essentially to say, "for
the node types that don't make sense given the result of the outer switch,
don't bother generating an inner switch prong for this node type." Hashimoto
does something similar but uses return null as the early exit; we use
unreachable because we can be certain that we'll never get a node of the
wrong type (since we checked in the outer switch).
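The interplay between the runtime outer switch and the comptime check inside inline else can be seen in isolation with a smaller example. The `Value` union, the `Kind` enum, and the `kindOf()` and `isNegative()` functions below are all hypothetical and unrelated to the MyST AST. The point is only that the comparison after the comptime check would not compile for a `[]const u8` payload, yet the function as a whole does.

```zig
const std = @import("std");

// Hypothetical three-variant union, unrelated to the MyST AST.
const Value = union(enum) {
    int: i64,
    float: f64,
    text: []const u8,
};

const Kind = enum { number, other };

fn kindOf(tag: std.meta.Tag(Value)) Kind {
    return switch (tag) {
        .int, .float => .number,
        .text => .other,
    };
}

fn isNegative(value: Value) bool {
    // Runtime check on the active tag.
    return switch (kindOf(std.meta.activeTag(value))) {
        .number => switch (value) {
            inline else => |payload, tag| blk: {
                // Comptime check, evaluated once per generated prong. For
                // `.text` the prong body ends at `unreachable`, so the
                // comparison below is never analyzed for a string payload.
                if (comptime kindOf(tag) != .number) unreachable;
                break :blk payload < 0;
            },
        },
        .other => false,
    };
}

test "comptime pruning inside inline else" {
    try std.testing.expect(isNegative(.{ .int = -3 }));
    try std.testing.expect(!isNegative(.{ .float = 2.5 }));
    try std.testing.expect(!isNegative(.{ .text = "hi" }));
}
```

Removing the comptime check should turn this into a compile error, since the `.text` prong would then try to compare a slice with an integer.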
Okay, this all seems kinda crazy. But I promise you it's worth it! Because now
we can take our earlier implementation of transformDirectives() and rewrite
it like this:
fn transformDirectives(alloc: Allocator, node: *ast.Node) !*ast.Node {
    switch (node.hasChildren()) {
        .yes => |branch_node| switch (branch_node) {
            // The nodes we're looking for. We pass the node to a function that
            // will implement the directive by adding children to the node.
            .myst_directive => {
                return try transformBuiltinDirective(alloc, node);
            },
            // Nodes with children. We need to recurse to look for more
            // directives.
            inline else => |node_payload| {
                for (0..node_payload.children.len) |i| {
                    node_payload.children[i] = try transformDirectives(
                        alloc,
                        node_payload.children[i],
                    );
                }
                return node;
            },
        },
        .no => return node, // Nothing to do for leaf nodes.
    }
}
We've turned our single switch statement into a nested switch statement. We
first switch on the return value of hasChildren(), which tells us which
subset of nodes we are dealing with. Then, within each prong of this outer
switch, we can deal just with the nodes in that subset. The branch_node value
above is of type HasChildrenNode(.yes), so the inner switch on branch_node
is exhaustive over just node types with children. This lets us have an inline else block that covers only the unspecified node types that have children.
Since we treat all leaf nodes the same in this particular AST transform, we
don't have an inner switch for the .no prong. But there's no reason we
couldn't write a transform like this:
fn transformDirectivesButWeHateImagesWithLongURLs(
    alloc: Allocator,
    node: *ast.Node,
) !*ast.Node {
    switch (node.hasChildren()) {
        .yes => |branch_node| switch (branch_node) {
            // The nodes we're looking for. We pass the node to a function that
            // will implement the directive by adding children to the node.
            .myst_directive => {
                return try transformBuiltinDirective(alloc, node);
            },
            // Nodes with children. We need to recurse to look for more
            // directives (and for images with long URLs).
            inline else => |node_payload| {
                for (0..node_payload.children.len) |i| {
                    node_payload.children[i] =
                        try transformDirectivesButWeHateImagesWithLongURLs(
                            alloc,
                            node_payload.children[i],
                        );
                }
                return node;
            },
        },
        .no => |leaf_node| switch (leaf_node) {
            // Special case for images
            .image => |node_payload| {
                if (node_payload.url.len > 200) {
                    @panic("found image with URL longer than 200 characters");
                }

                return node;
            },
            else => return node, // Nothing to do for remaining leaf nodes
        },
    }
}
This is a nonsense version of transformDirectives() that panics if it
encounters an image node with a long URL while traversing the tree. It's not
very useful, but it demonstrates how you could have an inner switch for each
case of the outer switch if necessary. In the latter inner switch, leaf_node
is of type HasChildrenNode(.no) and the switch over leaf_node is exhaustive
over only the node types that can't have children.
This nested switch construct gives us exactly what we wanted! We have a way to
specify just the node types we really care about while having, in effect, two
else cases covering the remaining node types. We now get to make this
distinction between nodes with children and nodes without children across our
codebase while keeping the verbose classification of node types in a single
place.
Conclusion
There are a lot of hoops to jump through to make this all work, but the end
result is an ergonomic idiom that I've found myself reusing throughout my
parser. It's also possible to save on subsequent hoop-jumping by making
those scary comptime functions generic over any partitioning of the Node
union into subsets. I'm leaving how to do that as an exercise to the reader.
But the upshot is that if I wanted to, say, switch on which parser stage is
responsible for producing a given node, or whether a given node can be
referenced using a MyST cross-reference, I could do that just by adding an enum
similar to HasChildren and a function on Node like hasChildren(). I
wouldn't have to write new versions of HasChildrenNode() or
HasChildrenNodeType().
This pattern is a powerful tool to have in your back pocket as a Zig programmer. It's basically a way to give switch statements type-narrowing superpowers. It's a minor embellishment on top of Hashimoto's original example and is not all that different from what this Ghostty code is doing, though it does save having to unwrap an optional type. In any case, it makes me excited to see what other comptime tricks are out there yet to be discovered.