html-to-markdown-swift

0.9.0

A robust, fully featured Swift port of the popular html-to-markdown Go library
jaredhowland/html-to-markdown-swift

A robust, fully featured Swift port of html-to-markdown — convert HTML (even entire websites) into clean, readable Markdown.

Features

  • ✅ Handles deeply nested and malformed HTML
  • ✅ Full CommonMark support
  • ✅ GitHub Flavored Markdown (GFM) — tables, task lists, strikethrough
  • ✅ Extensible plugin system — add custom renderers, pre/post processors, and text transformers
  • ✅ Domain resolution — relative links become absolute URLs
  • ✅ CSS selector–based include/exclude filtering
  • ✅ Smart escaping (only escapes when necessary)
  • ✅ Thread-safe converter instances

Usage

Swift Package Manager

Add to your Package.swift:

dependencies: [
    .package(url: "https://github.com/jaredhowland/html-to-markdown-swift.git", from: "0.9.0")
]

Add to your target:

.product(name: "HTMLToMarkdown", package: "html-to-markdown-swift")

Basic Conversion

import HTMLToMarkdown

let html = "<strong>Bold</strong> and <em>italic</em>"
let markdown = try HTMLToMarkdown.convert(html)
// **Bold** and _italic_

With Domain

Convert relative links to absolute URLs:

let html = "<a href=\"/about\">About</a>"
let markdown = try HTMLToMarkdown.convert(html, options: [.domain("https://example.com")])
// [About](https://example.com/about)
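
To make the idea of domain resolution concrete, here is a minimal sketch that uses only Foundation's URL type, not the library itself. The helper name `resolve` is hypothetical, and the converter's own implementation may handle edge cases differently.

```swift
import Foundation

// Hypothetical helper (not part of the library): resolve an href against
// a base domain, the way a relative link is expected to become absolute.
func resolve(_ href: String, against domain: String) -> String {
    guard let base = URL(string: domain),
          let url = URL(string: href, relativeTo: base) else {
        return href // leave unparseable links untouched
    }
    return url.absoluteString
}

print(resolve("/about", against: "https://example.com"))
print(resolve("https://other.org/x", against: "https://example.com"))
```

Links that are already absolute pass through unchanged, so only relative hrefs pick up the domain.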

With Plugins

let markdown = try HTMLToMarkdown.convert(html, plugins: [
    BasePlugin(),
    CommonmarkPlugin(),
    GFMPlugin()
])

Collapse & Tag Types

Each HTML element has a tag type: block, inline, or remove. The tag type controls how whitespace and newlines are handled around the element. You can override the type for any tag:

// Treat <div> as inline instead of block
conv.Register.tagType("div", .inline, priority: PriorityEarly)

// Remove an element from output
conv.Register.tagType("nav", .remove)
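
As a rough mental model of what the three tag types mean for whitespace, the sketch below uses a stand-in enum and a hypothetical `emit` function. It is an illustration only and does not reflect the library's internal types or its exact newline rules.

```swift
import Foundation

// Stand-in for the three tag types described above (illustrative only).
enum TagType { case block, inline, remove }

// Hypothetical emitter: block content is set off by blank lines,
// inline content is written as-is, removed content is dropped.
func emit(_ content: String, as type: TagType) -> String {
    switch type {
    case .block:  return "\n\n" + content + "\n\n"
    case .inline: return content
    case .remove: return ""
    }
}

let pieces = [
    emit("Menu", as: .remove),
    emit("Hello", as: .inline),
    emit(" world", as: .inline),
    emit("Next paragraph", as: .block),
]
let output = pieces.joined().trimmingCharacters(in: .whitespacesAndNewlines)
print(output)
```

Switching a tag from block to inline in this model simply stops it from forcing a paragraph break, which is the effect the override for `<div>` is after.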

Plugins

| Name | Description |
| --- | --- |
| BasePlugin | Core functionality: default tag types, removes <script>, <style>, <input> |
| CommonmarkPlugin | CommonMark spec: headings, bold, italic, links, images, code, lists, blockquotes, etc. |
| GFMPlugin | GitHub Flavored Markdown: bundles Strikethrough, Table, TaskListItems + definition lists, details/summary, sub/sup, abbreviations |
| TaskListItemsPlugin | Converts <input type="checkbox"> in list items to - [x] / - [ ] |
| StrikethroughPlugin | Converts <strike>, <s>, <del> to ~~text~~ |
| TablePlugin | Converts HTML tables to GFM-style pipe tables |
| VimeoEmbedPlugin | Converts Vimeo <iframe> embeds to [Title](https://vimeo.com/ID) links |
| YouTubeEmbedPlugin | Converts YouTube <iframe> embeds to clickable thumbnail images |
| AtlassianPlugin | Atlassian/Confluence: autolinks, image sizing, Confluence code macros, attachment links |
| MultiMarkdownPlugin | MultiMarkdown 4: sub/sup, definition lists, image attributes, figure/figcaption, footnotes |
| MarkdownExtraPlugin | PHP Markdown Extra: definition lists, footnotes, header IDs {#id}, abbreviation reference list |
| PandocPlugin | Pandoc Markdown: LaTeX math ($...$, $$...$$), definition lists, footnotes, sub/sup ^x^/~x~, header IDs |
| RMarkdownPlugin | R Markdown (extends Pandoc): tabsets → ## sections, figure captions from <figcaption> |
| FrontmatterPlugin | Extracts page metadata (<title>, <meta>) and prepends YAML frontmatter |
| TypographyPlugin | Bundles SmartQuotesPlugin, ReplacementsPlugin, LinkifyPlugin; configure with smartQuotes/replacements/linkify flags and quoteStyle (.english, .german, .french, .swedish) |
| SmartQuotesPlugin | Converts straight " and ' to typographic quotes; locale-aware styles; skips code regions; handles <q> elements |
| ReplacementsPlugin | Typographic replacements: (c) → ©, (r) → ®, (tm) → ™, +- → ±, ... → …, --- → em dash, -- → en dash; skips code regions |
| LinkifyPlugin | Converts bare https:// and http:// URLs to [url](url) links; handles parentheses in URLs; skips code regions and existing Markdown links |
| ReferenceLinkPlugin | Numbered reference-style links at document bottom (deduplication, titles); inlineLinks: true to revert to inline |
| EmojiPlugin | GitHub emoji :shortcode: output from <img class="emoji"> and Unicode emoji conversion; bundled 1900+ entry table |
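
For readers unfamiliar with the target format of TablePlugin, the following sketch builds a GFM-style pipe table from plain arrays. The function name `pipeTable` is hypothetical and the code does not call the library; it only shows the shape of the output one would expect from an HTML table.

```swift
import Foundation

// Hypothetical helper: build a GFM pipe table from a header and rows.
func pipeTable(header: [String], rows: [[String]]) -> String {
    // Format one row, escaping literal pipes so they do not split a cell.
    func line(_ cells: [String]) -> String {
        let escaped = cells.map { $0.replacingOccurrences(of: "|", with: "\\|") }
        return "| " + escaped.joined(separator: " | ") + " |"
    }
    // GFM requires a delimiter row of dashes under the header.
    let separator = line(header.map { _ in "---" })
    return ([line(header), separator] + rows.map(line)).joined(separator: "\n")
}

let table = pipeTable(header: ["Name", "Role"],
                      rows: [["Ada", "Engineer"], ["Lin", "A|B"]])
print(table)
```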

Writing a Plugin

Implement the Plugin protocol:

import HTMLToMarkdown

public class MyPlugin: Plugin {
    public var name: String { return "my-plugin" }
    public init() {}

    public func initialize(conv: Converter) throws {
        // Render <aside> as a blockquote
        conv.Register.rendererFor("aside", .block, { ctx, w, node in
            w.writeString("> ")
            ctx.renderChildNodes(w, node)
            return .success
        })

        // Pre-process the DOM before rendering
        conv.Register.preRenderer({ ctx, doc in
            // Modify the SwiftSoup document
        })

        // Post-process the final markdown string
        conv.Register.postRenderer({ ctx, result in
            return result.trimmingCharacters(in: .whitespacesAndNewlines)
        })

        // Bundle another plugin as a dependency
        try conv.Register.plugin(CommonmarkPlugin())
    }
}
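
One thing to watch in a blockquote renderer like the one for `<aside>`: writing "> " once quotes only the first paragraph if the children render to several paragraphs, because a blank line ends a Markdown blockquote. A minimal sketch of prefixing every line, using plain Foundation; the helper name `blockquote` is hypothetical, and how to capture the children's output as a string inside the library is not shown here.

```swift
import Foundation

// Hypothetical helper: prefix each line so the whole text stays quoted.
// Empty lines get a bare ">" to keep paragraphs inside the same quote.
func blockquote(_ markdown: String) -> String {
    markdown
        .components(separatedBy: "\n")
        .map { $0.isEmpty ? ">" : "> " + $0 }
        .joined(separator: "\n")
}

let quoted = blockquote("First paragraph.\n\nSecond paragraph.")
print(quoted)
```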

Available registration methods:

| Method | Purpose |
| --- | --- |
| rendererFor(tag, type, handler) | Render a specific HTML tag |
| renderer(handler) | Catch-all renderer for all tags |
| preRenderer(handler, priority) | Transform DOM before rendering |
| postRenderer(handler, priority) | Transform final markdown string |
| textTransformer(handler) | Transform text node content |
| escapedChar(char) | Mark a character as needing escaping |
| unEscaper(handler) | Control when a character is unescaped |
| tagType(tag, type, priority) | Override block/inline/remove classification |
| plugin(plugin) | Register a sub-plugin dependency |
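
Several of these methods take a priority. The sketch below shows one plausible way a priority-ordered chain of post-renderers could behave. The type `PostRendererChain` is hypothetical, and it assumes that lower priority values run first, which the source does not state.

```swift
import Foundation

// Hypothetical model of a priority-ordered post-processing chain.
// Assumption: handlers with lower priority values run earlier.
struct PostRendererChain {
    private var entries: [(priority: Int, handler: (String) -> String)] = []

    mutating func register(priority: Int, _ handler: @escaping (String) -> String) {
        entries.append((priority: priority, handler: handler))
    }

    // Apply every handler in ascending priority order.
    func run(_ input: String) -> String {
        entries
            .sorted { $0.priority < $1.priority }
            .reduce(input) { result, entry in entry.handler(result) }
    }
}

var chain = PostRendererChain()
chain.register(priority: 200) { $0 + "\n" }  // registered first, runs last
chain.register(priority: 100) { $0.trimmingCharacters(in: .whitespacesAndNewlines) }
let result = chain.run("  # Title  \n\n")
print(result)
```

Ordering by priority rather than by registration order lets independent plugins cooperate without knowing the order in which they were registered.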

Examples

These examples were generated using html-to-markdown-swift 0.9.0. See the folders under Examples/ for runnable sample code and their output.


FAQ

Can I extend the converter with custom rules?
Yes — implement the Plugin protocol and register renderers, pre/post processors, or text transformers in initialize(conv:).

Is the output safe to display in a browser?
This library converts HTML to Markdown — it does not sanitize HTML. If you need XSS protection, sanitize the input HTML before conversion or the output Markdown before rendering.

Is it thread-safe?
Yes. Each Converter instance is protected by an internal lock and safe for concurrent use from multiple threads.
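
To illustrate what "protected by an internal lock" means in general, here is a small stand-alone sketch using NSLock. The class `LockedCounter` is hypothetical and unrelated to the library's actual Converter internals; it only shows the pattern of guarding shared state so that concurrent calls are safe.

```swift
import Foundation

// Hypothetical example of lock-protected shared state.
final class LockedCounter {
    private let lock = NSLock()
    private var count = 0

    func increment() {
        lock.lock()
        defer { lock.unlock() }  // always release, even on early return
        count += 1
    }

    var value: Int {
        lock.lock()
        defer { lock.unlock() }
        return count
    }
}

// Hammer the counter from many threads; no increments are lost.
func runDemo() -> Int {
    let counter = LockedCounter()
    DispatchQueue.concurrentPerform(iterations: 1000) { _ in
        counter.increment()
    }
    return counter.value
}

print(runDemo())
```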

Why does my [ get escaped as \[?
The converter automatically escapes characters that would otherwise trigger unintended Markdown formatting. If you're writing a custom renderer and want to emit literal Markdown, call w.writeString(...) directly, which bypasses text transformation, instead of writing to a child context.

How do I run the tests?

swift test

Many tests use golden files in Tests/data/: an input HTML file paired with an expected Markdown output file. After an intentional change to the output, edit the matching .out.md files so they contain the new expected Markdown.

How do I contribute?
Issues and pull requests are welcome. Please ensure all tests pass (swift test) and add tests for new behaviour.

License

MIT License. This Swift port is based on html-to-markdown by Johannes Kaufmann. HTML parsing uses SwiftSoup.

Description

  • Swift Tools 5.5.0

Last updated: Thu Mar 05 2026 12:34:44 GMT-1000 (Hawaii-Aleutian Standard Time)