Overview
The HTML Decoder parses HTML strings into AppFlowy Editor Document objects. It handles both simple and complex HTML structures, including nested elements, inline styles, and various formatting options.
DocumentHTMLDecoder
The main class for decoding HTML to document format.
Constructor
Parameters
- customDecoders (Map<String, ElementParser>): Map of HTML tag names to custom parser functions. Default is an empty map.
Properties
enableColorParse
When true, the decoder extracts text color and background color from inline CSS.
Methods
convert
Converts an HTML string into a Document object.
Parameters:
- input (String): The HTML string to parse
Returns:
Document - The parsed document
Example:
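A minimal sketch of calling convert, using only the constructor and method described above. The import path and the `root.children` / `type` accessors used for printing are assumptions about the package, not something this page states.

```dart
import 'package:appflowy_editor/appflowy_editor.dart';

void main() {
  const html = '<h1>Title</h1><p>Hello <strong>world</strong></p>';

  // Default constructor: no custom decoders are registered.
  final decoder = DocumentHTMLDecoder();
  final document = decoder.convert(html);

  // Print the type of each top-level node
  // (assumes Document.root.children and Node.type).
  for (final node in document.root.children) {
    print(node.type);
  }
}
```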
Helper Function
htmlToDocument
A convenient helper function that provides a simpler API.
Parameters:
- html (String): The HTML string to parse
- customDecoders (Map<String, ElementParser>): Custom decoders for specific HTML tags
Returns:
Document - The parsed document
Example:
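A short sketch of the helper in use. It passes only the HTML string, since this page does not say whether customDecoders is a named or positional argument; the import path and `root.children` are assumptions.

```dart
import 'package:appflowy_editor/appflowy_editor.dart';

void main() {
  // One call replaces constructing a DocumentHTMLDecoder by hand.
  final document = htmlToDocument('<p>Hello <em>world</em></p>');

  // Number of top-level nodes (assumes Document.root.children).
  print(document.root.children.length);
}
```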
Supported HTML Tags
Headings
- <h1> through <h6> → Heading nodes with levels 1-6
Text Formatting
- <p> → Paragraph nodes
- <strong> or <b> → Bold text
- <em> or <i> → Italic text
- <u> → Underlined text
- <del> or <s> → Strikethrough text
- <code> → Inline code
- <a href="..."> → Links with href attribute
- <span> → Styled text with inline CSS
- <mark> → Highlighted text
Lists
- <ul> → Bulleted list
- <ol> → Numbered list
- <li> → List items
Block Elements
- <blockquote> → Quote blocks
- <img> → Images (network URLs only)
- <hr> → Dividers
- <br> → Line breaks
- <div> → Container elements
Tables
- <table> → Table structure
- <tr> → Table rows
- <th> → Table headers
- <td> → Table data cells
Inline Style Parsing
The decoder extracts formatting from inline CSS styles:
Font Weight
Text Decoration
Colors
Font Style
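The four categories above (font weight, text decoration, colors, font style) can be pictured with a small standalone sketch. This is not the decoder's actual code: the function names, the attribute keys, and the rule that numeric weights of 700 or more count as bold are all assumptions made for illustration.

```dart
/// Splits an inline `style` value such as "color: red; font-weight: bold"
/// into a property-to-value map. Hypothetical helper, not library API.
Map<String, String> parseInlineStyle(String style) {
  final result = <String, String>{};
  for (final declaration in style.split(';')) {
    final index = declaration.indexOf(':');
    if (index <= 0) continue; // skip empty or malformed declarations
    final property = declaration.substring(0, index).trim().toLowerCase();
    final value = declaration.substring(index + 1).trim();
    if (property.isNotEmpty && value.isNotEmpty) {
      result[property] = value;
    }
  }
  return result;
}

/// Maps CSS declarations to illustrative formatting attributes.
/// The attribute keys here are hypothetical.
Map<String, Object> styleToAttributes(
  String style, {
  bool enableColorParse = true,
}) {
  final css = parseInlineStyle(style);
  final attributes = <String, Object>{};

  // Font weight: "bold" or a numeric weight >= 700 (assumed threshold).
  final weight = css['font-weight']?.toLowerCase();
  if (weight != null) {
    final numeric = int.tryParse(weight);
    if (weight == 'bold' || (numeric != null && numeric >= 700)) {
      attributes['bold'] = true;
    }
  }

  // Text decoration: one value may carry several keywords.
  final decoration = css['text-decoration']?.toLowerCase() ?? '';
  if (decoration.contains('underline')) attributes['underline'] = true;
  if (decoration.contains('line-through')) attributes['strikethrough'] = true;

  // Font style.
  if (css['font-style']?.toLowerCase() == 'italic') {
    attributes['italic'] = true;
  }

  // Colors are only read when color parsing is enabled.
  if (enableColorParse) {
    final color = css['color'];
    if (color != null) attributes['color'] = color;
    final background = css['background-color'];
    if (background != null) attributes['backgroundColor'] = background;
  }
  return attributes;
}
```

Calling `styleToAttributes('font-weight: bold; color: #ff0000')` yields a map with a bold flag and the color string; passing `enableColorParse: false` drops the color entry, mirroring the property described earlier.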
Example Usage
Basic HTML Parsing
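A minimal sketch of parsing a heading and a paragraph with the htmlToDocument helper. The import path, and the use of `root.children`, `type`, and `delta?.toPlainText()` to inspect the result, are assumptions about the package rather than facts stated on this page.

```dart
import 'package:appflowy_editor/appflowy_editor.dart';

void main() {
  const html = '<h1>Welcome</h1><p>This is a paragraph.</p>';
  final document = htmlToDocument(html);

  // Show each top-level node's type next to its plain text
  // (assumes Node.delta and Delta.toPlainText()).
  for (final node in document.root.children) {
    print('${node.type}: ${node.delta?.toPlainText()}');
  }
}
```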
Complex Formatting
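A sketch that mixes several of the supported inline tags in one paragraph. The HTML input uses only tags listed above; the import path and the `delta?.toJson()` call used to inspect the text runs are assumptions.

```dart
import 'package:appflowy_editor/appflowy_editor.dart';

void main() {
  const html = '<p>'
      '<strong>Bold</strong>, <em>italic</em>, <u>underlined</u>, '
      '<del>struck</del>, <code>inline code</code>, '
      '<a href="https://example.com">a link</a>, and '
      '<span style="color: #ff0000; font-weight: bold">styled text</span>.'
      '</p>';

  final document = htmlToDocument(html);

  // Dump each node's text runs with their attributes
  // (assumes Node.delta and Delta.toJson()).
  for (final node in document.root.children) {
    print(node.delta?.toJson());
  }
}
```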
Nested Lists
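A sketch of decoding a nested list and printing the resulting tree with indentation. `printTree` is a hypothetical helper written for this example, and `Node`, `type`, `children`, and the import path are assumptions about the package.

```dart
import 'package:appflowy_editor/appflowy_editor.dart';

/// Hypothetical helper: prints a node's type, then recurses into children.
void printTree(Node node, [int depth = 0]) {
  print('${'  ' * depth}${node.type}');
  for (final child in node.children) {
    printTree(child, depth + 1);
  }
}

void main() {
  const html = '<ul>'
      '<li>Fruit'
      '<ul><li>Apple</li><li>Pear</li></ul>'
      '</li>'
      '<li>Vegetables</li>'
      '</ul>';

  final document = htmlToDocument(html);
  printTree(document.root);
}
```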
Tables
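A sketch of decoding a small table built from the four table tags listed above. How the decoded table is laid out in the Document is not described on this page, so the example only prints top-level node types; the import path and accessors are assumptions.

```dart
import 'package:appflowy_editor/appflowy_editor.dart';

void main() {
  const html = '<table>'
      '<tr><th>Name</th><th>Role</th></tr>'
      '<tr><td>Ada</td><td>Engineer</td></tr>'
      '</table>';

  final document = htmlToDocument(html);

  // Inspect what the table became (assumes Document.root.children).
  for (final node in document.root.children) {
    print(node.type);
  }
}
```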
Special Handling
Google Docs Compatibility
The decoder includes special handling for documents copied from Google Docs:
- Handles a single all-encompassing tag under the body
- Filters out meaningless tags like <b style="font-weight:normal;">
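The second rule can be illustrated with a standalone check. This is a sketch of the idea only: `isMeaninglessBold` is a hypothetical name, and the decoder's real test may differ.

```dart
/// Returns true for a <b> tag whose inline style resets the weight to
/// normal, which makes the tag carry no formatting at all.
/// Hypothetical helper, not part of the library.
bool isMeaninglessBold(String tagName, String? style) {
  if (tagName.toLowerCase() != 'b' || style == null) return false;

  // Strip whitespace so "font-weight: normal" and "font-weight:normal" match.
  final normalized = style.replaceAll(RegExp(r'\s+'), '').toLowerCase();
  return normalized.split(';').contains('font-weight:normal');
}
```

With this check, `isMeaninglessBold('b', 'font-weight:normal;')` is true, while a plain `<b>` without a style attribute is still treated as real bold.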
Image Requirements
- Only network images (HTTP/HTTPS URLs) are supported
- Invalid or local image URLs are converted to empty paragraphs
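The network-only rule can be sketched with Dart's core `Uri` class. `isNetworkImage` is a hypothetical helper for illustration; requiring a non-empty host is an extra assumption beyond the HTTP/HTTPS rule stated above.

```dart
/// Returns true only for http/https URLs that include a host.
/// Hypothetical helper, not part of the library.
bool isNetworkImage(String? src) {
  if (src == null) return false;
  final uri = Uri.tryParse(src.trim());
  if (uri == null) return false;
  final isHttp = uri.scheme == 'http' || uri.scheme == 'https';
  return isHttp && uri.host.isNotEmpty;
}
```

Under this check a `file://` path, a `data:` URI, or an empty `src` would all be rejected, which corresponds to the cases the decoder turns into empty paragraphs.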
Empty Content
Empty text nodes with only whitespace are automatically skipped.
Custom Decoders
Create custom decoders for specific HTML tags by passing them in the customDecoders map.
HTMLTags Constants
The decoder provides tag name constants through HTMLTags.
Tag Categories
See Also
- HTML Encoder - Convert documents to HTML
- Markdown Decoder - Parse Markdown content