A robust, fully featured Swift port of html-to-markdown — convert HTML (even entire websites) into clean, readable Markdown.
- ✅ Handles deeply nested and malformed HTML
- ✅ Full CommonMark support
- ✅ GitHub Flavored Markdown (GFM) — tables, task lists, strikethrough
- ✅ Extensible plugin system — add custom renderers, pre/post processors, and text transformers
- ✅ Domain resolution — relative links become absolute URLs
- ✅ CSS selector–based include/exclude filtering
- ✅ Smart escaping (only escapes when necessary)
- ✅ Thread-safe converter instances
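
The domain resolution listed above amounts to standard relative-URL resolution. The sketch below uses only Foundation to show the idea; `absoluteLink` is a hypothetical helper written for illustration and is not part of this library's API or a description of its internals.

```swift
import Foundation

/// Hypothetical helper: resolves `href` against `domain` with Foundation's
/// relative-URL resolution. Returns `href` unchanged if either string
/// cannot be parsed as a URL.
func absoluteLink(_ href: String, domain: String) -> String {
    guard let base = URL(string: domain),
          let resolved = URL(string: href, relativeTo: base) else {
        return href
    }
    return resolved.absoluteString
}

print(absoluteLink("/about", domain: "https://example.com"))
// https://example.com/about
print(absoluteLink("https://other.org/x", domain: "https://example.com"))
// https://other.org/x (already absolute, so the base is ignored)
```
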
Add to your Package.swift:

    dependencies: [
        .package(url: "https://github.com/jaredhowland/html-to-markdown-swift.git", from: "0.9.0")
    ]

Add to your target:

    .product(name: "HTMLToMarkdown", package: "html-to-markdown-swift")

Basic usage:

    import HTMLToMarkdown

    let html = "<strong>Bold</strong> and <em>italic</em>"
    let markdown = try HTMLToMarkdown.convert(html)
    // **Bold** and _italic_

Convert relative links to absolute URLs:

    let html = "<a href=\"/about\">About</a>"
    let markdown = try HTMLToMarkdown.convert(html, options: [.domain("https://example.com")])
    // [About](https://example.com/about)

Select plugins explicitly:

    let markdown = try HTMLToMarkdown.convert(html, plugins: [
        BasePlugin(),
        CommonmarkPlugin(),
        GFMPlugin()
    ])

Each HTML element has a tag type: block, inline, or remove. This controls how whitespace and newlines are handled around elements. You can override the type for any tag:

    // Treat <div> as inline instead of block
    conv.Register.tagType("div", .inline, priority: PriorityEarly)
    // Remove an element from output
    conv.Register.tagType("nav", .remove)

Available plugins:

| Name | Description |
|---|---|
| `BasePlugin` | Core functionality: default tag types, removes `<script>`, `<style>`, `<input>` |
| `CommonmarkPlugin` | CommonMark spec: headings, bold, italic, links, images, code, lists, blockquotes, etc. |
| `GFMPlugin` | GitHub Flavored Markdown: bundles Strikethrough, Table, TaskListItems + definition lists, details/summary, sub/sup, abbreviations |
| `TaskListItemsPlugin` | Converts `<input type="checkbox">` in list items to `- [x]` / `- [ ]` |
| `StrikethroughPlugin` | Converts `<strike>`, `<s>`, `<del>` to `~~text~~` |
| `TablePlugin` | Converts HTML tables to GFM-style pipe tables |
| `VimeoEmbedPlugin` | Converts Vimeo `<iframe>` embeds to `[Title](https://vimeo.com/ID)` links |
| `YouTubeEmbedPlugin` | Converts YouTube `<iframe>` embeds to clickable thumbnail images |
| `AtlassianPlugin` | Atlassian/Confluence: autolinks, image sizing, Confluence code macros, attachment links |
| `MultiMarkdownPlugin` | MultiMarkdown 4: sub/sup, definition lists, image attributes, figure/figcaption, footnotes |
| `MarkdownExtraPlugin` | PHP Markdown Extra: definition lists, footnotes, header IDs `{#id}`, abbreviation reference list |
| `PandocPlugin` | Pandoc Markdown: LaTeX math (`$...$`, `$$...$$`), definition lists, footnotes, sub/sup `^x^`/`~x~`, header IDs |
| `RMarkdownPlugin` | R Markdown (extends Pandoc): tabsets → `##` sections, figure captions from `<figcaption>` |
| `FrontmatterPlugin` | Extracts page metadata (`<title>`, `<meta>`) and prepends YAML frontmatter |
| `TypographyPlugin` | Bundles `SmartQuotesPlugin`, `ReplacementsPlugin`, `LinkifyPlugin`; configure with `smartQuotes`/`replacements`/`linkify` flags and `quoteStyle` (`.english`, `.german`, `.french`, `.swedish`) |
| `SmartQuotesPlugin` | Converts straight `"` and `'` to typographic quotes; locale-aware styles; skips code regions; handles `<q>` elements |
| `ReplacementsPlugin` | `(c)`→©, `(r)`→®, `(tm)`→™, `+-`→±, `...`→…, `---`→—, `--`→–; skips code regions |
| `LinkifyPlugin` | Converts bare `https://` / `http://` URLs to `[url](url)` links; handles parentheses in URLs; skips code regions and existing Markdown links |
| `ReferenceLinkPlugin` | Numbered reference-style links at document bottom (deduplication, titles); `inlineLinks: true` to revert to inline |
| `EmojiPlugin` | GitHub emoji `:shortcode:` output from `<img class="emoji">` and Unicode emoji conversion; bundled 1900+ entry table |
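
To make the "skips code regions" behaviour of the replacement-style plugins concrete, here is a minimal standalone sketch. It is not the library's implementation: `applyReplacements` is a hypothetical function, and it assumes inline code is delimited by single, balanced backticks.

```swift
import Foundation

// Longer patterns come first so that "---" is not consumed as "--" plus "-".
let replacements: [(String, String)] = [
    ("(tm)", "™"), ("(c)", "©"), ("(r)", "®"),
    ("+-", "±"), ("...", "…"), ("---", "—"), ("--", "–"),
]

/// Hypothetical helper: applies the replacements outside inline code spans.
func applyReplacements(_ text: String) -> String {
    // After splitting on backticks, even-indexed pieces are outside inline
    // code and odd-indexed pieces are inside it (assumes balanced backticks).
    let pieces = text.components(separatedBy: "`")
    var output: [String] = []
    for (index, piece) in pieces.enumerated() {
        if index % 2 == 1 {
            output.append(piece) // inside code: leave untouched
        } else {
            output.append(replacements.reduce(piece) { current, pair in
                current.replacingOccurrences(of: pair.0, with: pair.1)
            })
        }
    }
    return output.joined(separator: "`")
}

print(applyReplacements("Wait... (c) 2024 -- see `a -- b`"))
// Wait… © 2024 – see `a -- b`
```
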
Implement the Plugin protocol:

    import HTMLToMarkdown

    public class MyPlugin: Plugin {
        public var name: String { return "my-plugin" }

        public init() {}

        public func initialize(conv: Converter) throws {
            // Render <aside> as a blockquote
            conv.Register.rendererFor("aside", .block, { ctx, w, node in
                w.writeString("> ")
                ctx.renderChildNodes(w, node)
                return .success
            })

            // Pre-process the DOM before rendering
            conv.Register.preRenderer({ ctx, doc in
                // Modify the SwiftSoup document
            })

            // Post-process the final markdown string
            conv.Register.postRenderer({ ctx, result in
                return result.trimmingCharacters(in: .whitespacesAndNewlines)
            })

            // Bundle another plugin as a dependency
            try conv.Register.plugin(CommonmarkPlugin())
        }
    }

Available registration methods:
| Method | Purpose |
|---|---|
| `rendererFor(tag, type, handler)` | Render a specific HTML tag |
| `renderer(handler)` | Catch-all renderer for all tags |
| `preRenderer(handler, priority)` | Transform DOM before rendering |
| `postRenderer(handler, priority)` | Transform final markdown string |
| `textTransformer(handler)` | Transform text node content |
| `escapedChar(char)` | Mark a character as needing escaping |
| `unEscaper(handler)` | Control when a character is unescaped |
| `tagType(tag, type, priority)` | Override block/inline/remove classification |
| `plugin(plugin)` | Register a sub-plugin dependency |
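
Several of the registration methods above take a priority. The sketch below illustrates one way priority-ordered handlers can be chained. It is a conceptual model only: `PostProcessorChain`, the numeric priority values, and the rule that lower values run first are all assumptions made for this example, not the library's documented behaviour.

```swift
import Foundation

/// Hypothetical model of a priority-ordered chain of post-processors.
struct PostProcessorChain {
    private struct Entry {
        let priority: Int
        let handler: (String) -> String
    }
    private var entries: [Entry] = []

    /// Assumption: lower priority values run earlier; ties keep registration order.
    mutating func register(priority: Int = 500, _ handler: @escaping (String) -> String) {
        let index = entries.firstIndex { $0.priority > priority } ?? entries.endIndex
        entries.insert(Entry(priority: priority, handler: handler), at: index)
    }

    /// Feeds the output of each handler into the next one.
    func run(_ markdown: String) -> String {
        entries.reduce(markdown) { text, entry in entry.handler(text) }
    }
}

var chain = PostProcessorChain()
chain.register(priority: 900) { $0 + "\n" } // late: ensure one trailing newline
chain.register(priority: 100) { $0.trimmingCharacters(in: .whitespacesAndNewlines) } // early: trim
let cleaned = chain.run("  # Title  \n\n")
print(cleaned == "# Title\n") // true
```

Inserting each handler at its sorted position keeps the order deterministic without relying on the stability of a later sort.
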
These examples were generated using html-to-markdown-swift 0.9.0. See the Examples/ directory for complete runnable sample code and its output:
- 01 - Basic Conversion
- 02 - Vita with Frontmatter
- 03 - Wikipedia Article
- 04 - Exclude Navigation
- 05 - Custom Plugin
- 06 - GFM Features
- 07 - Atlassian Markdown
- 08 - MultiMarkdown
- 09 - YouTube & Vimeo Embeds
- 10 - Atlassian Confluence
- 11 - Markdown Extra
- 12 - Pandoc
- 13 - R Markdown
- 14 - Typography
- 15 - Reference Links
- 16 - Emoji
Can I extend the converter with custom rules?
Yes — implement the Plugin protocol and register renderers, pre/post processors, or text transformers in initialize(conv:).
Is the output safe to display in a browser?
This library converts HTML to Markdown — it does not sanitize HTML. If you need XSS protection, sanitize the input HTML before conversion or the output Markdown before rendering.
Is it thread-safe?
Yes. Each Converter instance is protected by an internal lock and safe for concurrent use from multiple threads.
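
As an illustration of what lock-protected access looks like in general, the sketch below guards mutable state with an `NSLock` and calls it from many threads at once. `LockedConverter` is a hypothetical stand-in with a fake conversion step; it does not show the library's actual internals.

```swift
import Foundation
import Dispatch

/// Hypothetical stand-in: every access to mutable state goes through one lock.
final class LockedConverter: @unchecked Sendable {
    private let lock = NSLock()
    private var conversionCount = 0

    func convert(_ html: String) -> String {
        lock.lock()
        defer { lock.unlock() }
        conversionCount += 1
        // Placeholder "conversion", only here to give the method some work.
        return html.replacingOccurrences(of: "<br>", with: "\n")
    }

    var count: Int {
        lock.lock()
        defer { lock.unlock() }
        return conversionCount
    }
}

/// Calls one shared instance from many threads and returns the final count.
func runConcurrentDemo() -> Int {
    let shared = LockedConverter()
    DispatchQueue.concurrentPerform(iterations: 100) { _ in
        _ = shared.convert("a<br>b")
    }
    return shared.count
}

print(runConcurrentDemo()) // 100
```
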
Why does my [ get escaped as \[?
The converter automatically escapes characters that would trigger unintended Markdown formatting. If you're writing a custom renderer, use w.writeString(...) directly (bypasses text transformation) instead of writing to a child context.
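
The idea of escaping only when a character would otherwise trigger formatting can be shown with a simpler case than `[`: a leading `#`. The sketch below is not the library's escaping logic. `escapeLeadingHash` is a hypothetical function that follows the CommonMark rule that an ATX heading is one to six `#` characters followed by a space or the end of the line, and it ignores leading indentation for brevity.

```swift
/// Hypothetical helper: escapes a leading "#" only when the line would
/// otherwise be parsed as an ATX heading.
func escapeLeadingHash(_ line: String) -> String {
    let hashes = line.prefix(while: { $0 == "#" })
    guard (1...6).contains(hashes.count) else { return line }
    let rest = line.dropFirst(hashes.count)
    if rest.isEmpty || rest.first == " " {
        return "\\" + line // would be a heading, so escape it
    }
    return line // e.g. "#hashtag" is plain text already
}

print(escapeLeadingHash("# not a heading")) // \# not a heading
print(escapeLeadingHash("#hashtag"))        // #hashtag
print(escapeLeadingHash("C# is nice"))      // C# is nice
```
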
How do I run the tests?

    swift test

Many tests use golden files in Tests/data/: an input HTML file paired with an expected Markdown output file. After an intentional change to the output, edit the corresponding .out.md files to match the new output.
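
For readers unfamiliar with golden-file testing, here is a minimal sketch of the comparison step using only Foundation. `matchesGolden` and the stand-in converter closure are hypothetical, and the `sample.in.html` input name is an assumption; only the `.out.md` suffix comes from this README.

```swift
import Foundation

/// Hypothetical helper: converts the input file and compares the result
/// with the expected ("golden") output file.
func matchesGolden(inputURL: URL, expectedURL: URL,
                   convert: (String) throws -> String) throws -> Bool {
    let html = try String(contentsOf: inputURL, encoding: .utf8)
    let expected = try String(contentsOf: expectedURL, encoding: .utf8)
    return try convert(html) == expected
}

// Demo with temporary files and a stand-in converter.
let dir = FileManager.default.temporaryDirectory
    .appendingPathComponent(UUID().uuidString)
try FileManager.default.createDirectory(at: dir, withIntermediateDirectories: true)
let input = dir.appendingPathComponent("sample.in.html") // assumed naming
let golden = dir.appendingPathComponent("sample.out.md")
try "<b>hi</b>".write(to: input, atomically: true, encoding: .utf8)
try "**hi**".write(to: golden, atomically: true, encoding: .utf8)

let goldenMatches = try matchesGolden(inputURL: input, expectedURL: golden) { html in
    html.replacingOccurrences(of: "<b>", with: "**")
        .replacingOccurrences(of: "</b>", with: "**")
}
print(goldenMatches) // true
```
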
How do I contribute?
Issues and pull requests are welcome. Please ensure all tests pass (swift test) and add tests for new behaviour.
MIT License. This Swift port is based on html-to-markdown by Johannes Kaufmann. HTML parsing uses SwiftSoup.