Markdown Features I Didn't Know About
Feb 23, 2026
To get some experience using Zig, I thought it’d be fun to build a Markdown parser. I was feeling ambitious and decided I’d try my hand at making my parser spec-compliant. I’m now five months deep into this “fun” side project and shorn of my naïveté.
It turns out Markdown is complicated. And just… big. In all my years of using it, I never thought to look up the official word on what Markdown can and cannot do. But I wish I had. In my hours and hours of staring at the CommonMark specification, I’ve discovered that there are many features I’d have found helpful had I just known about them.
Markdown parsers, even the CommonMark-compliant ones, can start to diverge in behavior as you combine increasingly esoteric syntax in complicated ways. I’ve tested the examples below across a few different parsers and these particular features are broadly supported. (Though not necessarily by syntax highlighters, you’ll notice.)
Headings
In Markdown, you can create headings using leading # symbols:
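As a minimal sketch (the heading text is a placeholder), the number of # symbols sets the heading level:

```markdown
# Heading 1

## Heading 2

### Heading 3
```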
What I didn’t know is that this is just one of two different “styles” of
headings that Markdown supports. Headings using # symbols are known as
“ATX-style” headings, after the markup
format proposed by Aaron Swartz. In
Markdown, you can also use “Setext-style” headings derived from Structure
Enhanced Text.
Setext-style headings look like this:
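A minimal sketch with placeholder heading text: a line underlined with = becomes an h1, and a line underlined with - becomes an h2.

```markdown
Heading 1
=========

Heading 2
---------
```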
One thing I like about Setext-style headings is that the underline can be as long as you want it to be. You can use this freedom to clearly demarcate the various sections in your document:
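For illustration, with made-up section names and an arbitrary 80-character underline:

```markdown
Introduction
================================================================================

Background
--------------------------------------------------------------------------------
```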
A drawback of the Setext style is that it can only be used to define
h1 and h2 headings. If you need deeper levels of subheadings, you have to
use the ATX style and give up on the longer lines. Or so I thought! Because it
turns out you can follow an ATX-style heading with an arbitrary number of
# characters, so visually demarcating your sections is still possible:
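A sketch of how that might look, assuming placeholder heading text. The closing run of # characters has to be separated from the heading text by a space, and its length has no effect on the heading level:

```markdown
### Methods ####################################################################

#### Data Collection ###########################################################
```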
The following is also valid if you don’t like the long lines but want an indicator of depth as you scan down the right-hand side:
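One way this could look, assuming the closing run is chosen to match the opening one (the parser doesn't require that, and the heading text is a placeholder):

```markdown
# Heading 1 #

## Heading 2 ##

### Heading 3 ###
```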
Blockquotes
Did you know that blockquotes can be nested? I’ve never had occasion to blockquote somebody blockquoting somebody else, but I’m happy to know that Markdown would be there for me if I needed to:
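Here is a sketch of the source, reconstructed from the rendered version that follows, so the exact line wrapping is an assumption. Each extra > adds a level of nesting:

```markdown
In her latest newsletter, Sarah explains the virtues of Zig:

> Zig is great and I love it. Like my friend, Bob, says, it's just a joyful
> language to program in. As he describes it:
>
> > How could you not like a language called "Zig"? It just makes me want to
> > zig-ah-zig ah.
>
> How can you argue with that?

I think Sarah has a great point.
```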
To show you what this would look like, here’s the above rendered by my own blog:
In her latest newsletter, Sarah explains the virtues of Zig:
Zig is great and I love it. Like my friend, Bob, says, it’s just a joyful language to program in. As he describes it:
How could you not like a language called “Zig”? It just makes me want to zig-ah-zig ah.
How can you argue with that?
I think Sarah has a great point.
The leading > character doesn’t have to precede every line of the
blockquote, provided that the blockquote is just one paragraph long:
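A minimal sketch with placeholder text. All three lines end up in the same blockquote paragraph:

```markdown
> This blockquote is a single paragraph,
but only its first line starts with
the blockquote marker.
```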
In the Commonmark spec, the
trailing lines without the > are called “lazy continuation lines.” If you
want to blockquote multiple paragraphs, you would typically just precede every
line with >. But if you’re really lazy, this is technically valid:
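A sketch with placeholder text. The first line of each paragraph and the blank separator line still need the >, since a lazy continuation line can only continue a paragraph, not start one:

```markdown
> This is the first paragraph,
which continues lazily.
>
> This is the second paragraph,
which is just as lazy.
```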
Finally, any Markdown element can be contained within blockquotes. If you want to blockquote somebody else’s code block, you can:
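A sketch of what the source might look like, reconstructed from the rendered version further down (the outer four-backtick fence is only there so the inner fence can be shown):

````markdown
Sarah shows how to write Python on her blog:

> ## Functions that Do Nothing
>
> In Python, this is an example of a function `foo()` that does nothing:
>
> ```python
> def foo():
>     pass
> ```
>
> Wasn't that exciting?
````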
Hugo (really Goldmark) can parse this, but I expected my blog to trip up trying to style the resulting HTML. Surprisingly:
Sarah shows how to write Python on her blog:
Functions that Do Nothing
In Python, this is an example of a function foo() that does nothing:
def foo():
    pass
Wasn’t that exciting?
Code Fences
Like headings, code fences come in two styles. There’s the familiar backtick
fence (```), which I’ve already used above to delimit code blocks.
Delightfully, code blocks can also be delimited with squigglies:
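A minimal sketch, reusing the do-nothing Python function as placeholder content:

```markdown
~~~python
def foo():
    pass
~~~
```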
The CommonMark specification uses the less whimsical name, “tilde fence.” As far as I can see, the tilde fence only exists to make things easier if you want to use lots of backticks in your code block. But you could also avoid problems with nested backticks by using a longer line of backticks in the fence (the block won’t close until a fence at least as long as the opening fence is reached):
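As a sketch with placeholder content, a four-backtick fence can hold a three-backtick fence without closing early (the five-backtick fence around it is only there so the example itself can be displayed):

`````markdown
````markdown
Here is a code block inside a code block:

```python
def foo():
    pass
```
````
`````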
One small thing that clearly distinguishes tilde fences from backtick fences is that they allow you to use backticks in what’s known as the “info string.” The info string follows the opening code fence and is canonically used to specify the name of the programming language used in the code block. But anything after that first word is typically ignored by renderers (the spec doesn’t mandate any particular treatment of it), so with a tilde fence you could do this:
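A sketch of the idea, where everything after the word python is an illustrative description of my own invention. The same info string after a backtick fence would not be allowed, because it contains backticks:

```markdown
~~~python a function `foo()` that does nothing
def foo():
    pass
~~~
```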
Adding info strings like the above to your code blocks could help you keep track of what each code block is meant to demonstrate.
Autolinks
As long as I’ve been using Markdown, I’ve been writing links like this when I want the link URL to appear as text:
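That is, with the URL repeated as both the link text and the destination (example.com is a placeholder):

```markdown
[https://example.com](https://example.com)
```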
This works, but there’s an easier way: the autolink. The following Markdown is equivalent to the above:
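An autolink is just the URL wrapped in angle brackets (again with example.com as a placeholder):

```markdown
<https://example.com>
```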
There’s even special support for parsing email addresses as mailto: links:
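For example, with a made-up address, this renders as a link whose destination is mailto:someone@example.com:

```markdown
<someone@example.com>
```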
Hard Line Breaks and HTML Character Entities
If you need to control exactly where line breaks and spacing appear in your text, Markdown allows you to do that too.
Hard line breaks are inserted when you end a line using a backslash. They render
as a <br/> element in HTML.
HTML numeric and character entity references, like &amp; or &#1234;, are
also valid in Markdown. You can use &nbsp; to insert non-breaking spaces.
As a last example, putting these tools together, here’s how you can use Markdown to write poetry:
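A sketch of the first stanza only. The trailing backslashes force the line breaks, and the four-entity indent on the last line is an assumption about the layout rather than anything the poem requires:

```markdown
In Xanadu did Kubla Khan\
A stately pleasure-dome decree:\
Where Alph, the sacred river, ran\
Through caverns measureless to man\
&nbsp;&nbsp;&nbsp;&nbsp;Down to a sunless sea.
```

The last line of a paragraph doesn't need a backslash, and a blank line starts the next stanza as a new paragraph.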
In Xanadu did Kubla Khan
A stately pleasure-dome decree:
Where Alph, the sacred river, ran
Through caverns measureless to man
Down to a sunless sea.
So twice five miles of fertile ground
With walls and towers were girdled round;
And there were gardens bright with sinuous rills,
Where blossomed many an incense-bearing tree;
And here were forests ancient as the hills,
Enfolding sunny spots of greenery.
The Future of Markdown
The parser I’m working on is meant to be a CommonMark-compliant parser, but only because CommonMark’s syntax is a subset of the syntax I’m actually trying to support. If everything goes well, sometime in 2029 my parser will support all of the MyST Markdown spec.
MyST, or Markedly Structured Text, adds additional elements to Markdown that make it easier to extend. MyST also defines an abstract syntax tree for Markdown documents that makes it possible to parse a document once and potentially output it as HTML, LaTeX, or anything else later.
I’m excited about MyST. It’s intended primarily for scientific publishing, but I see no reason it couldn’t be useful for blog publishing as well. I want to use my parser to build a static site generator based on MyST. First, though, I’ll need to get it successfully parsing the Markdown document for this blog post, which, believe you me, is a real doozy.