Sinclair Target

Even More Tagged Union Subsets with Comptime

May 18, 2026

One of the cooler things you can do with Zig's comptime is create what Mitchell Hashimoto calls "tagged union subsets." Hashimoto shows how he can take an existing tagged union, one representing all possible keyboard shortcut actions in Ghostty, and use comptime to derive more specific unions representing just those actions affecting a particular terminal window or just those actions affecting the entire application.

Having these more specific union types available means that he can write functions accepting only terminal-specific actions or only application-wide actions. These functions look something like this, where ScopedAction(.app) is a call to a function that runs at comptime and returns the more specific union type:

// This function handles application-wide keyboard shortcuts.
pub fn performAppAction(action: ScopedAction(.app)) void {
    switch (action) {
        .quit => ...,
        .close_all_windows => ...,
        .open_config => ...,
        .reload_config => ...,
    }
}

// This function handles shortcuts scoped to a terminal window.
pub fn performTerminalAction(action: ScopedAction(.terminal)) void {
    switch (action) {
        .new_window => ...,
        .close_window => ...,
        .scroll_lines => ...,
    }
}

This pattern is useful mainly because Zig has exhaustive switching. Given a function like performAppAction() above, Zig will yell at you (helpfully) if you add a new application-wide action but forget to add a matching case to the switch. If performAppAction() instead took an action of type Action, then Zig's yelling would lose specificity: Zig would also yell at you whenever you added a terminal-specific action without a matching case. That's much less helpful, since in the context of performAppAction() we don't care about terminal-specific actions.
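To make the exhaustiveness point concrete, here is a minimal sketch. AppAction and describe() are hypothetical stand-ins invented for this illustration; they are not Ghostty code, and the real union returned by ScopedAction(.app) has more fields.

```zig
const std = @import("std");

// Hypothetical stand-in for the union that ScopedAction(.app) would return.
const AppAction = union(enum) {
    quit,
    open_config,
    reload_config,
};

fn describe(action: AppAction) []const u8 {
    // The switch must cover every tag of AppAction. Deleting a prong, or
    // adding a field to AppAction without adding a prong here, is a
    // compile error rather than a silent fall-through.
    return switch (action) {
        .quit => "quit",
        .open_config => "open config",
        .reload_config => "reload config",
    };
}

test "describe covers every app action" {
    try std.testing.expectEqualStrings("quit", describe(.quit));
    try std.testing.expectEqualStrings("reload config", describe(.reload_config));
}
```

Because describe() only knows about AppAction, adding a terminal-only action elsewhere would never force a change here.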

A Similar Problem with Trees

I recently ran into a situation that seemed to call for using tagged union subsets. I ended up with a solution heavily inspired by Hashimoto's but different in a few ways that I think are interesting.

I am working on a parser for MyST, a new Markdown format. Markdown parsers typically read an input document and produce HTML directly, but MyST has a spec that defines an AST for parsed MyST documents. Much of the code in my parser is concerned with constructing and traversing this AST.

The MyST AST has many different types of nodes. To represent this, I use a large tagged union of node types.

Some nodes (headings, inline emphasis, MyST directives) are allowed to have children while others (inline code, images) are not. While working on the parser, I have often found myself writing functions that transform the AST using a recursive operation on nodes in the tree. Functions in this vein want to do one thing with nodes of a particular type, but, for all other nodes, provided they have children, want to just recurse through to all descendants.
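For readers who want something concrete to hold on to, here is a cut-down sketch of what such a union might look like. The payload structs Container and Leaf and their fields are assumptions made up for this example; the real AST has many more node types and richer payloads, and it uses a named NodeType enum as its tag rather than an inferred one.

```zig
const std = @import("std");

// Hypothetical payload shapes, invented for illustration.
const Container = struct { children: []*Node };
const Leaf = struct { value: []const u8 };

// A cut-down Node union with two branch types and two leaf types.
const Node = union(enum) {
    heading: Container,
    paragraph: Container,
    text: Leaf,
    inline_code: Leaf,
};

test "a paragraph with one text child" {
    var text: Node = .{ .text = .{ .value = "hello" } };
    var children = [_]*Node{&text};
    const paragraph: Node = .{ .paragraph = .{ .children = &children } };

    try std.testing.expectEqual(@as(usize, 1), paragraph.paragraph.children.len);
    try std.testing.expectEqualStrings("hello", paragraph.paragraph.children[0].text.value);
}
```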

My first attempt at implementing these functions looked something like the following. In this example, I'm traversing the tree looking for myst_directive nodes to hand off to the code that implements MyST directives.

fn transformDirectives(alloc: Allocator, node: *ast.Node) !*ast.Node {
    switch (node.*) {
        // The nodes we're looking for. We pass the node to a function that
        // will implement the directive by adding children to the node.
        .myst_directive => return try transformBuiltinDirective(alloc, node),
        // Nodes with children. We need to recurse to look for more directives.
        // `inline` is a fancy Zig keyword that allows the below switch
        // prong to be polymorphic across all our node types with children.
        inline .block,
        .heading,
        .paragraph,
        .emphasis,
        .strong,
        ...,
        => |node_payload| {
            for (0..node_payload.children.len) |i| {
                node_payload.children[i] = try transformDirectives(
                    alloc,
                    node_payload.children[i],
                );
            }
            return node;
        },
        // Leaf nodes, nothing to do.
        .text,
        .image,
        .code,
        .inline_code,
        ...,
        => return node,
    }
}

This switch has three cases. In order to correctly partition all the various node types into the three cases, I need to list them all out. This is verbose (especially because in real life there are many more types than shown above). Worse, since I have many functions like this one, every time I add a new node type to the AST, I need to update dozens of switch statements across my codebase to make sure every node type is correctly identified as either a node with children or a leaf node.

We might be tempted to make this better by eliding the leaf node types with an else case:

fn transformDirectives(alloc: Allocator, node: *ast.Node) !*ast.Node {
    switch (node.*) {
        // The nodes we're looking for. We pass the node to a function that
        // will implement the directive by adding children to the node.
        .myst_directive => return try transformBuiltinDirective(alloc, node),
        // Nodes with children. We need to recurse to look for more directives.
        // `inline` is a fancy Zig keyword that allows the below switch
        // prong to be polymorphic across all our node types with children.
        inline .block,
        .heading,
        .paragraph,
        .emphasis,
        .strong,
        ...,
        => |node_payload| {
            for (0..node_payload.children.len) |i| {
                node_payload.children[i] = try transformDirectives(
                    alloc,
                    node_payload.children[i],
                );
            }
            return node;
        },
        else => return node, // Remaining nodes are leaf nodes.
    }
}

This makes things less verbose. But, even if we no longer have to worry about the leaf nodes, we still have to list out the nodes with children. More importantly, if we later add a node type that has children but forget to update all our switch statements, we will unknowingly be skipping some subtrees of the AST in our recursion because we will be treating a node with children like a leaf node. A disaster!

Reversing things so that we list out the leaf nodes and put the nodes with children in an inline else case is better. If we were to add a new leaf node type and forget to update the switch, we would get a compile error (from trying to access the children of a node that doesn't have them in our inline else block). But we're still stuck maintaining lists of leaf node types across every similar switch in our codebase.
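A minimal sketch of that reversed arrangement, using a made-up four-type Node union and a hypothetical countNodes() traversal rather than the real parser code:

```zig
const std = @import("std");

// Hypothetical payload shapes and node types, invented for illustration.
const Container = struct { children: []*Node };
const Leaf = struct { value: []const u8 };

const Node = union(enum) {
    paragraph: Container,
    emphasis: Container,
    text: Leaf,
    inline_code: Leaf,
};

/// Counts the nodes in the tree rooted at `node`.
fn countNodes(node: *const Node) usize {
    switch (node.*) {
        // Leaf nodes are listed explicitly.
        .text, .inline_code => return 1,
        // Everything else is assumed to have children. A new leaf type that
        // is missing from the list above would land in this prong and fail
        // to compile, because its payload has no `children` field.
        inline else => |payload| {
            var total: usize = 1;
            for (payload.children) |child| total += countNodes(child);
            return total;
        },
    }
}

test "countNodes walks the whole tree" {
    var text: Node = .{ .text = .{ .value = "hi" } };
    var code: Node = .{ .inline_code = .{ .value = "x" } };
    var em_children = [_]*Node{&text};
    var em: Node = .{ .emphasis = .{ .children = &em_children } };
    var root_children = [_]*Node{ &em, &code };
    const root: Node = .{ .paragraph = .{ .children = &root_children } };

    try std.testing.expectEqual(@as(usize, 4), countNodes(&root));
}
```

The safety comes from the compiler, but the leaf list still has to be repeated in every function shaped like this one.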

What we really want is some way to have both an else covering all the leaf nodes and an inline else covering all the nodes with children, leaving just the .myst_directive case as the one that cares about a specific node type.

Using Comptime to Create Subsets

We can use tagged union subsets to solve this problem. What makes this case different from Hashimoto's is that we don't want to limit the types of nodes transformDirectives() can accept. Instead, we want some kind of smarter switch statement that knows how to handle both nodes with children and leaf nodes without us having to tend these enormous brittle lists of node types everywhere.

We start by defining a function that determines whether a node can have children or is a leaf node:

pub const HasChildren = enum {
    yes,
    no,

    /// Maps node types onto a value in the HasChildren enum.
    ///
    /// In other words, answers whether a type of node has children.
    ///
    /// `NodeType` is the backing enum for the `Node` tagged union.
    pub fn fromNodeType(node_type: NodeType) HasChildren {
        return switch (node_type) {
            .block,
            .heading,
            .paragraph,
            .emphasis,
            .strong,
            ...,
            => .yes,
            .text,
            .image,
            .code,
            .inline_code,
            ...,
            => .no,
        };
    }
};

This is similar to Hashimoto's scope() function, except I've chosen to declare the function on the enum itself. It's verbose because we need to list out all our node types, but this is the one and only place we'll have to do this from now on.

Next, we need a function like Hashimoto's ScopedAction() that creates a tagged union type for each subset we want of our main Node union.

 1/// Returns a tagged union with only those fields matching node types that
 2/// either have or don't have children (depending on `choice`).
 3fn HasChildrenNode(comptime choice: HasChildren) type {
 4    const all_fields = @typeInfo(Node).@"union".fields;
 5
 6    var i: usize = 0;
 7    var fields: [all_fields.len]std.builtin.Type.UnionField = undefined;
 8    for (all_fields) |field| {
 9        const node = @unionInit(Node, field.name, undefined);
10        if (HasChildren.fromNodeType(node) == choice) {
11            fields[i] = .{
12                .name = field.name,
13                .type = *field.type, // Use pointer type so we can modify node
14                .alignment = field.alignment,
15            };
16            i += 1;
17        }
18    }
19
20    return @Type(.{
21        .@"union" = .{
22            .layout = .auto,
23            // Can't just use `null` here, at least not in Zig 0.15.2
24            .tag_type = HasChildrenNodeType(choice),
25            .fields = fields[0..i],
26            .decls = &.{},
27        },
28    });
29}

This function is more or less identical to Hashimoto's. We are copying over the fields from the Node union to our new union subset type, skipping those fields that don't match the choice we've specified for whether the node should have children or not. On line 13, we modify the field type to make it a pointer so that we can still mutate the node union payload; this isn't necessary in Hashimoto's example. On line 24, we also have a call to HasChildrenNodeType() where Hashimoto just had null. I discovered that Zig would complain if I just set the tag_type of the returned union to null; I suspect this has been disallowed in more recent versions of Zig.

So we also need a tag type for our union. The definition of HasChildrenNodeType() is below. This function employs a similar method to create an enum subset that will be the backing enum / tag type for our tagged union subset.

/// Returns an enum containing only members from the NodeType enum that either
/// have or don't have children (depending on `choice`).
fn HasChildrenNodeType(comptime choice: HasChildren) type {
    const e_info = @typeInfo(NodeType);
    const all_fields = e_info.@"enum".fields;

    var i: usize = 0;
    var fields: [all_fields.len]std.builtin.Type.EnumField = undefined;
    for (all_fields) |field| {
        // Here we can figure out the node type from the field directly using
        // @enumFromInt() instead of initializing a Node just to pass to
        // `fromNodeType()`.
        const node_type: NodeType = @enumFromInt(field.value);
        if (HasChildren.fromNodeType(node_type) == choice) {
            fields[i] = field;
            i += 1;
        }
    }

    return @Type(.{ .@"enum" = .{
        .tag_type = e_info.@"enum".tag_type,
        .fields = fields[0..i],
        .decls = &.{},
        .is_exhaustive = true,
    } });
}

With all these pieces in place, we now have access to two new types, HasChildrenNode(.yes) and HasChildrenNode(.no). We can define functions that accept only nodes with children by using a parameter of type HasChildrenNode(.yes), or only leaf nodes by using a parameter of type HasChildrenNode(.no). This isn't where we want to stop though.

A Union of Unions

I'm going to throw yet another union at you now. Don't worry, this one isn't generated by some scary comptime function. It's a regular tagged union except that it uses our scary comptime functions from above for the types of its fields.

// Bisects nodes into those that have children and those that don't.
const HasChildrenBisection = union(HasChildren) {
    yes: HasChildrenNode(.yes),
    no: HasChildrenNode(.no),
};

This union puts our two subset union types back together as a union of unions. Why would we want to do that? Don't we already have the Node union type we started with? Well, yes, but this union allows us to represent a value of any possible node type in such a way that we can discriminate between nodes with children and nodes without.
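If the generated types feel abstract, it may help to see a hand-written equivalent for a tiny AST. Everything below (the four-type Node, BranchNode, LeafNode, Bisection, and bisect()) is written out by hand as an illustration under assumed names; in the real parser the subset unions come from the comptime functions instead.

```zig
const std = @import("std");

// Hypothetical payload shapes and node types, invented for illustration.
const Container = struct { children: []*Node };
const Leaf = struct { value: []const u8 };

const Node = union(enum) {
    paragraph: Container,
    emphasis: Container,
    text: Leaf,
    inline_code: Leaf,
};

// Hand-written analogues of HasChildrenNode(.yes) and HasChildrenNode(.no).
// Payloads are pointers so that callers can still mutate the original node.
const BranchNode = union(enum) { paragraph: *Container, emphasis: *Container };
const LeafNode = union(enum) { text: *Leaf, inline_code: *Leaf };

// The union of unions: any node, tagged by which subset it belongs to.
const Bisection = union(enum) { yes: BranchNode, no: LeafNode };

// Hand-written narrowing; one prong per node type.
fn bisect(node: *Node) Bisection {
    return switch (node.*) {
        .paragraph => |*p| .{ .yes = .{ .paragraph = p } },
        .emphasis => |*p| .{ .yes = .{ .emphasis = p } },
        .text => |*p| .{ .no = .{ .text = p } },
        .inline_code => |*p| .{ .no = .{ .inline_code = p } },
    };
}

test "bisect separates branches from leaves" {
    var text: Node = .{ .text = .{ .value = "hi" } };
    var children = [_]*Node{&text};
    var para: Node = .{ .paragraph = .{ .children = &children } };

    try std.testing.expect(bisect(&para) == .yes);
    try std.testing.expect(bisect(&text) == .no);
}
```

Writing bisect() by hand means one prong per node type, which is exactly the list-keeping the comptime version is meant to avoid.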

We're close to seeing this all come together. The last thing we need is a function on our Node tagged union. This function fills a similar role to the scoped() function that Hashimoto defines on his Action tagged union—it narrows a Node to one of our union subset types. Except instead of just returning one of the union subset types, we're going to return HasChildrenBisection from above:

/// Returns a union bisecting nodes into those that have children and those
/// that don't.
pub fn hasChildren(self: *Node) HasChildrenBisection {
    return switch (HasChildren.fromNodeType(self.*)) {
        .yes => .{
            .yes = switch (self.*) {
                inline else => |*n, tag| blk: {
                    if (comptime HasChildren.fromNodeType(tag) != .yes) {
                        unreachable;
                    }

                    break :blk @unionInit(
                        HasChildrenNode(.yes),
                        @tagName(tag),
                        n,
                    );
                },
            },
        },
        .no => .{
            .no = switch (self.*) {
                inline else => |*n, tag| blk: {
                    if (comptime HasChildren.fromNodeType(tag) != .no) {
                        unreachable;
                    }

                    break :blk @unionInit(
                        HasChildrenNode(.no),
                        @tagName(tag),
                        n,
                    );
                },
            },
        },
    };
}

Here we have an outer switch that uses the HasChildren.fromNodeType() function we defined earlier to check whether or not self has children depending on its tag. In either case, we use @unionInit() to create the appropriate union subset type, then set it as the active field (either .yes or .no) of the HasChildrenBisection union that we return.

The HasChildren.fromNodeType() check in the outer switch happens at runtime. In the body of the inline else prongs, tag is a comptime-known value and the HasChildren.fromNodeType() check there happens during comptime. You can think of inline else as a way to tell the compiler to generate a bunch of switch prongs automatically; here we use unreachable essentially to say, "for the node types that don't make sense given the result of the outer switch, don't bother generating an inner switch prong for this node type." Hashimoto does something similar but uses return null as the early exit; we use unreachable because we can be certain that we'll never get a node of the wrong type (since we checked in the outer switch).
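The same trick can be seen in isolation in a much smaller setting. In this sketch, Shape, isNumeric(), and dimension() are all invented for illustration. The point is only that tag is comptime-known inside an inline else prong, so a comptime check followed by unreachable cuts a generated prong short before the code that would not type-check for that payload.

```zig
const std = @import("std");

// Hypothetical union: two fields share a payload type, one does not.
const Shape = union(enum) {
    circle: f32,
    square: f32,
    label: []const u8,
};
const ShapeTag = std.meta.Tag(Shape);

// Classifier over tags, playing the role of HasChildren.fromNodeType().
fn isNumeric(tag: ShapeTag) bool {
    return switch (tag) {
        .circle, .square => true,
        .label => false,
    };
}

/// Returns the numeric payload. The caller is expected to have checked
/// isNumeric() at runtime first, like the outer switch in hasChildren().
fn dimension(shape: Shape) f32 {
    return switch (shape) {
        inline else => |payload, tag| blk: {
            // One prong is generated per tag. For .label this condition is
            // comptime-true, so the prong ends at `unreachable` and the
            // break below is never analyzed with a []const u8 payload.
            if (comptime !isNumeric(tag)) unreachable;
            break :blk payload;
        },
    };
}

test "dimension returns the payload of numeric shapes" {
    const s: Shape = .{ .circle = 2.5 };
    try std.testing.expect(isNumeric(.circle));
    try std.testing.expectEqual(@as(f32, 2.5), dimension(s));
}
```

Without the comptime check, the .label prong would try to return a []const u8 as an f32 and the function would not compile.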

Okay, this all seems kinda crazy. But I promise you it's worth it! Because now we can take our earlier implementation of transformDirectives() and rewrite it like this:

fn transformDirectives(alloc: Allocator, node: *ast.Node) !*ast.Node {
    switch (node.hasChildren()) {
        .yes => |branch_node| switch (branch_node) {
            // The nodes we're looking for. We pass the node to a function that
            // will implement the directive by adding children to the node.
            .myst_directive => {
                return try transformBuiltinDirective(alloc, node);
            },
            // Nodes with children. We need to recurse to look for more
            // directives.
            inline else => |node_payload| {
                for (0..node_payload.children.len) |i| {
                    node_payload.children[i] = try transformDirectives(
                        alloc,
                        node_payload.children[i],
                    );
                }
                return node;
            },
        },
        .no => return node, // Nothing to do for leaf nodes.
    }
}

We've turned our single switch statement into a nested switch statement. We first switch on the return value of hasChildren(), which tells us which subset of nodes we are dealing with. Then, within each prong of this outer switch, we can deal just with the nodes in that subset. The branch_node value above is of type HasChildrenNode(.yes), so the inner switch on branch_node is exhaustive over just node types with children. This lets us have an inline else block that covers only the unspecified node types that have children.

Since we treat all leaf nodes the same in this particular AST transform, we don't have an inner switch for the .no prong. But there's no reason we couldn't write a transform like this:

fn transformDirectivesButWeHateImagesWithLongURLs(
    alloc: Allocator,
    node: *ast.Node,
) !*ast.Node {
    switch (node.hasChildren()) {
        .yes => |branch_node| switch (branch_node) {
            // The nodes we're looking for. We pass the node to a function that
            // will implement the directive by adding children to the node.
            .myst_directive => {
                return try transformBuiltinDirective(alloc, node);
            },
            // Nodes with children. We need to recurse to look for more
            // directives (and more images).
            inline else => |node_payload| {
                for (0..node_payload.children.len) |i| {
                    node_payload.children[i] = try transformDirectivesButWeHateImagesWithLongURLs(
                        alloc,
                        node_payload.children[i],
                    );
                }
                return node;
            },
        },
        .no => |leaf_node| switch (leaf_node) {
            // Special case for images
            .image => |node_payload| {
                if (node_payload.url.len > 200) {
                    @panic("found image with URL longer than 200 characters");
                }

                return node;
            },
            else => return node, // Nothing to do for remaining leaf nodes
        },
    }
}

This is a nonsense version of transformDirectives() that panics if it encounters an image node with a long URL while traversing the tree. It's not very useful, but it demonstrates how you could have an inner switch for each case of the outer switch if necessary. In the latter inner switch, leaf_node is of type HasChildrenNode(.no) and the switch over leaf_node is exhaustive over only the node types that can't have children.

This nested switch construct gives us exactly what we wanted! We have a way to specify just the node types we really care about while having, in effect, two else cases covering the remaining node types. We now get to make this distinction between nodes with children and nodes without children across our codebase while keeping the verbose classification of node types in a single place.

Conclusion

There are a lot of hoops to jump through to make this all work, but the end result is an ergonomic idiom that I've found myself reusing throughout my parser. It's also possible to save on subsequent hoop-jumping by making those scary comptime functions generic over any partitioning of the Node union into subsets. I'm leaving how to do that as an exercise to the reader. But the upshot is that if I wanted to, say, switch on which parser stage is responsible for producing a given node, or whether a given node can be referenced using a MyST cross-reference, I could do that just by adding an enum similar to HasChildren and a function on Node like hasChildren(). I wouldn't have to write new versions of HasChildrenNode() or HasChildrenNodeType().

This pattern is a powerful tool to have in your back pocket as a Zig programmer. It's basically a way to give switch statements type-narrowing superpowers. It's a minor embellishment on top of Hashimoto's original example and is not all that different from what this Ghostty code is doing, though it does save having to unwrap an optional type. In any case, it makes me excited to see what other comptime tricks are out there yet to be discovered.