Architecting a Conversion Engine in Swift

▲ 33 points 7 comments by arthurofbabylon 1w ago HN discussion ↗

Minimal now supports importing and exporting across Markdown, Rich Text, HTML, PDF, plain text, and our proprietary format, MNML. This essay describes how we built that system in the Swift programming language. To read about the human-centered design and how we fit this new technology into our iOS and macOS app, read the design-centric essay right here.

It’s not easy to coerce one file format into another, and each additional file format makes the problem more complex. To manage this complexity, we built a cohesive system that relies on an Intermediate Representation to serve as a middleman across file formats. The Intermediate Representation (or “IR”) lets us convert each file format to and from the IR, instead of across every possible pair of file formats.

Intermediate Representation

If we were to convert notes directly from one format to another, we’d be caught in a web of conversions that only gets more tangled as we add more file formats. To avoid this mess, we first built a system called Intermediate Representation.

Mess vs elegance. Without the IR, six file formats would produce 30 relationships (N^2 - N). With the IR, six file formats produce 12 distinct relationships (N * 2).

The IR sits in the middle of all file formats. With this architecture, adding support for a new format simply requires building that format’s own bespoke converter, without any concern for other file formats. Complexity grows linearly as we expand support to new types of data.

Nature does this. Biologists call this the bow-tie or hourglass architecture. In short, a simplified intermediary stage allows the two sides of the interaction to be independently complex. For example, a cell consumes an incredible array of molecules that is then digested into a smaller set of shared intermediaries (catabolism). On the other side, the cell then assembles these intermediaries back into the complex array of molecules that the cell puts to use (anabolism). If the cell had to map every input molecule directly to every required output molecule, it would need many times more internal processes to fully metabolize. The intermediary makes the process simpler and makes either side of the operation more evolvable.

However, nature provides a powerful lesson for programmers and designers: systems that depend on a shared intermediary often get frozen.
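In Swift, one place this kind of freezing tends to show up is in exhaustive `switch` statements. The sketch below is a minimal illustration, not the real IR: `MiniInline`, `renderHTML`, and `renderMarkdown` are hypothetical names invented here. Because neither `switch` has a `default` branch, adding a case to the shared enum stops every consumer from compiling until it handles the new shape.

```swift
// Hypothetical miniature IR with two cases (an assumption for illustration only).
enum MiniInline {
    case text(String)
    case strong(String)
    // case highlight(String)
    // Uncommenting the case above makes both functions below fail to compile
    // ("switch must be exhaustive") until each one handles it.
}

// One consumer of the shared enum.
func renderHTML(_ inline: MiniInline) -> String {
    switch inline {
    case .text(let string): return string
    case .strong(let string): return "<strong>\(string)</strong>"
    }
}

// A second consumer. It knows nothing about the first, yet it is coupled to the same enum.
func renderMarkdown(_ inline: MiniInline) -> String {
    switch inline {
    case .text(let string): return string
    case .strong(let string): return "**\(string)**"
    }
}
```

The compiler error is helpful in practice, since it lists every place that needs updating, but the cost of a change to the shared type still scales with the number of parsers and renderers built on it.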

While the IR makes it easier for either side of the equation to evolve on its own, it makes it harder for the broader system to evolve: changing the definition of the IR requires everything that interacts with it to update to the new shape.

An excellent example of architectural lock-in is the genetic code, often described as a "frozen accident." Nucleic-acid information (stored in DNA and copied as RNA) is read in 3-unit sequences called codons, and each codon is an instruction for the assembly of amino acid chains. The meaning of these symbols is so deeply embedded in the machinery of life that it cannot simply be changed or re-interpreted; if the meaning of a codon were changed, proteins would be misconstructed by the cell.

The genetic code, the shared mapping of codons to amino acids, is like a biological Intermediate Representation. It sits between the nucleic-acid sequences and the construction of proteins, determining how complex life is expressed via a standardized code. This standard is nearly universal across life for a reason: once it became central to the expression of cellular machinery, it became increasingly improbable that the shared logic and rules might change.

Code

Below is our IR's document structure, written in Swift. We placed it in its own "IR" namespace to prevent naming collisions without creating a dedicated package (this code lives alongside the rest of our code).

    /// Namespace for the Intermediate-Representation.
    enum IR {
        // Intentionally has no cases. Exists purely to scope the types below.
    }

    // MARK: - Document
    extension IR {
        /// A parsed note in dialect-neutral form.
        struct Document: Equatable {
            /// A sequence of `Block` (stacks vertically), each containing a sequence of `Inline` (stacks horizontally).
            var blocks: [Block]
            var resources: [String: Resource]

            init(blocks: [Block] = [], resources: [String: Resource] = [:]) { ... }
        }
    }

    // MARK: - Blocks
    extension IR {
        /// A unit of textual content that stacks vertically.
        indirect enum Block: Equatable {
            case blankLine
            case paragraph([Inline])
            case codeBlock(language: String?, content: String)
            case heading(level: Int, inlines: [Inline])
            case bulletList([ListItem])
            case orderedList(items: [ListItem], start: Int)
            case todoList([TodoItem])
            case blockquote([Block])
            case pullquote([Inline])
            case horizontalRule
            case embed(resourceId: String)
        }

        struct ListItem: Equatable {
            var blocks: [Block]

            init(blocks: [Block]) { self.blocks = blocks }
        }

        struct TodoItem: Equatable {
            var checked: Bool
            var blocks: [Block]

            init(checked: Bool, blocks: [Block]) { ... }
        }
    }

    // MARK: - Inlines
    extension IR {
        /// Content that flows horizontally inside a block.
        indirect enum Inline: Equatable {
            case text(String)
            case strong([Inline])
            case emphasis([Inline])
            case underline([Inline])
            case link(url: String, inlines: [Inline])
            case inlineCode(String)
            case folder(name: String)
            case embed(resourceId: String)
            case lineBreak
        }
    }

    // MARK: - Resources
    extension IR {
        /// An embed payload. Held off the tree and referenced by id.
        struct Resource: Equatable {
            var kind: String
            var mimeType: String?
            var data: Data?
            var url: String?
            var attributes: [String: String]

            init(kind: String, mimeType: String? = nil, data: Data? = nil, url: String? = nil, attributes: [String: String] = [:]) { ... }
        }
    }

Real code describing the IR (Intermediate Representation) structure.

Not all file formats support the same conventions (e.g., Markdown doesn’t represent colored text, and MNML doesn’t support tables), so during conversion we’ll often emit concessions.

    /// A record of something the engine simplified, downgraded, or set aside during conversion.
    struct Concession: Equatable {
        var category: Category
        var description: String
        var count: Int?

        init(category: Category, description: String, count: Int? = nil) { ... }

        enum Category: Equatable {
            case unsupportedFormatting
            case downgraded
            case dropped
            case truncated
        }
    }

    extension Array where Element == Concession {
        mutating func appendOrIncrement(_ concession: Concession) { ... }
    }

Real code describing the structure of concession aggregation and reporting.

Parsing and Rendering

Conversion begins by parsing the source file into the IR, and ends by rendering the IR into the target file format. Below are the DocumentParser and DocumentRenderer protocols. Each file format implements its parsing and rendering logic through its own bespoke implementations of these protocols.

    /// Protocol for producing an `IR.Document` from source text of a specific `Format`.
    /// Each parser handles exactly one input format: `DocumentParserMinimal` reads MNML, `DocumentParserHTML` reads HTML, and so on. The orchestrator picks the right one based on the caller's `Format`.
    /// Parsers never render. Their only output is the Intermediate-Representation and any concessions.
    protocol DocumentParser {
        /// The input format this parser handles.
        var format: Format { get }

        /// Parse `source` into an IR document.
        /// Any content the IR can't represent is recorded in the result's concessions.
        func parse(_ source: String) throws -> ParseResult
    }

    /// The outcome of a parse: the IR document, plus a record of what was simplified.
    struct ParseResult: Equatable {
        var document: IR.Document
        var concessions: [Concession]

        init(document: IR.Document, concessions: [Concession] = []) {
            self.document = document
            self.concessions = concessions
        }
    }

DocumentParser protocol. Real code.

    /// Protocol for producing output of a specific `Format` from an `IR.Document`.
    /// Each renderer handles exactly one output format: `HTMLRenderer` emits HTML, `MNMLRenderer` emits MNML, and so on. The orchestrator picks the right one based on the caller's `Format`.
    /// Renderers never parse. Their only input is the Intermediate-Representation; their only output is the rendered form and the concessions incurred producing it (see `RenderResult`).
    protocol DocumentRenderer {
        /// The output format this renderer produces.
        var format: Format { get }

        /// Render `document` into this renderer's format.
        /// Any IR content the target format can't express is recorded in the result's concessions.
        func render(_ document: IR.Document) throws -> RenderResult
    }

    /// The outcome of a render: the output payload, plus a record of what was simplified.
    /// Output is carried as `Data` so this struct can serve text and binary formats alike.
    /// Text formats (HTML, Markdown, MNML, plain text, RTF) populate `data` with UTF-8 bytes; binary formats (PDF) populate it directly.
    struct RenderResult: Equatable {
        var data: Data
        var concessions: [Concession]

        init(data: Data, concessions: [Concession] = []) {
            self.data = data
            self.concessions = concessions
        }
    }

DocumentRenderer protocol. Real code.

For example, to support HTML we implemented DocumentParserHTML and DocumentRendererHTML, and to support Markdown we implemented DocumentParserMarkdown and DocumentRendererMarkdown, allowing us to convert between HTML and Markdown files.
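To make the parse-then-render flow concrete, here is a minimal end-to-end sketch. It is not Minimal's implementation. Every type in it (`Format`, `Inline`, `Document`, `Concession`, `Parser`, `Renderer`, `ToyMarkdownParser`, `ToyPlainTextRenderer`) is a heavily reduced, hypothetical stand-in for the listings above, and `convert` is an invented orchestration function that takes the parser and renderer directly instead of choosing them from a `Format`. The body of `appendOrIncrement` is also a guess: it assumes concessions are merged when their descriptions match.

```swift
import Foundation

// Hypothetical, heavily reduced stand-ins for the article's types.
enum Format { case markdown, plainText }
enum Inline: Equatable { case text(String), strong(String) }
struct Document: Equatable { var paragraphs: [[Inline]] }
struct Concession: Equatable { var description: String; var count: Int }

extension Array where Element == Concession {
    // Assumed behavior: merge concessions that share a description.
    mutating func appendOrIncrement(_ concession: Concession) {
        if let index = firstIndex(where: { $0.description == concession.description }) {
            self[index].count += concession.count
        } else {
            append(concession)
        }
    }
}

protocol Parser {
    var format: Format { get }
    func parse(_ source: String) throws -> (document: Document, concessions: [Concession])
}

protocol Renderer {
    var format: Format { get }
    func render(_ document: Document) throws -> (data: Data, concessions: [Concession])
}

// Toy parser: blank lines separate paragraphs; a paragraph wrapped in ** is bold.
struct ToyMarkdownParser: Parser {
    let format = Format.markdown
    func parse(_ source: String) throws -> (document: Document, concessions: [Concession]) {
        var paragraphs: [[Inline]] = []
        for chunk in source.components(separatedBy: "\n\n") {
            if chunk.count > 4, chunk.hasPrefix("**"), chunk.hasSuffix("**") {
                paragraphs.append([.strong(String(chunk.dropFirst(2).dropLast(2)))])
            } else {
                paragraphs.append([.text(chunk)])
            }
        }
        return (Document(paragraphs: paragraphs), [])
    }
}

// Plain text cannot express bold, so this renderer downgrades it and records a concession.
struct ToyPlainTextRenderer: Renderer {
    let format = Format.plainText
    func render(_ document: Document) throws -> (data: Data, concessions: [Concession]) {
        var concessions: [Concession] = []
        var lines: [String] = []
        for paragraph in document.paragraphs {
            var line = ""
            for inline in paragraph {
                switch inline {
                case .text(let string):
                    line += string
                case .strong(let string):
                    line += string
                    concessions.appendOrIncrement(
                        Concession(description: "bold downgraded to plain text", count: 1))
                }
            }
            lines.append(line)
        }
        return (Data(lines.joined(separator: "\n\n").utf8), concessions)
    }
}

// Parse into the IR, render out of it, and merge the concessions from both halves.
func convert(_ source: String, from parser: Parser, to renderer: Renderer) throws
    -> (data: Data, concessions: [Concession]) {
    let parsed = try parser.parse(source)
    let rendered = try renderer.render(parsed.document)
    var concessions = parsed.concessions
    for concession in rendered.concessions {
        concessions.appendOrIncrement(concession)
    }
    return (rendered.data, concessions)
}

do {
    let result = try convert("**Hello**\n\nworld\n\n**again**",
                             from: ToyMarkdownParser(), to: ToyPlainTextRenderer())
    print(String(decoding: result.data, as: UTF8.self))
    print(result.concessions)
} catch {
    print("Conversion failed: \(error)")
}
```

In this sketch the parser and renderer never reference each other. They meet only at `Document`, which is the property that keeps the number of converters linear in the number of formats.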