formatferry-markdown

Local-first document-to-Markdown converter supporting 8 file types (HTML, DOCX, PDF, XLSX, CSV, JSON, XML, PPTX) and 8 output flavours (GitHub, CommonMark, Slack, Discord, Reddit, Confluence, R Markdown, custom). Conversion runs entirely in-process — file content never leaves the machine. No server-side processing for local files.

Install

openclaw skills install formatferry-markdown

FormatFerry Markdown Converter

Local-first document-to-Markdown converter. Local files are processed entirely in-process, so their content never leaves your machine. Supports 8 input formats and 8 output flavours. Two optional features need credentials: URL extraction (API key) and batch mode (license key, premium).

Key differentiator vs alternatives: Output flavours tailor Markdown to specific platforms (Slack *bold* vs GitHub **bold**, Confluence wiki markup, R Markdown, etc.). No other converter offers this.

Prerequisites

  • Node.js 18+ and npm must be installed
  • Install the CLI globally:
npm install -g formatferry
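
The prerequisites above can be checked before the first conversion. The following is a minimal sketch, not part of FormatFerry itself: the function names `node_version_ok` and `check_prereqs` are hypothetical, and the only external behaviour it relies on is that `node --version` prints a string such as `v18.17.0`.

```shell
# Succeed when a Node.js version string (e.g. "v18.17.0") is 18 or newer.
node_version_ok() {
  local major="${1#v}"   # strip the leading "v"
  major="${major%%.*}"   # keep only the major component
  [ "$major" -ge 18 ] 2>/dev/null
}

# Report whether Node.js 18+ and the formatferry CLI are on PATH.
check_prereqs() {
  if ! command -v node >/dev/null 2>&1; then
    echo "node: not found (install Node.js 18+ first)"
    return 1
  fi
  if ! node_version_ok "$(node --version)"; then
    echo "node: $(node --version) is older than 18"
    return 1
  fi
  if ! command -v formatferry >/dev/null 2>&1; then
    echo "formatferry: not found (run: npm install -g formatferry)"
    return 1
  fi
  echo "prerequisites OK"
}
```

Calling `check_prereqs` prints the first missing prerequisite and returns a non-zero status, which makes it usable as a guard at the top of a script.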

Supported File Types

| Format | Extension | Notes |
|---|---|---|
| HTML | .html, .htm | Web pages, snippets |
| Word | .docx | Microsoft Word documents |
| PDF | .pdf | Including OCR for scanned documents |
| Excel | .xlsx | Spreadsheets with tables |
| CSV | .csv | Comma-separated data |
| JSON | .json | Structured data |
| XML | .xml | Markup and data feeds |
| PowerPoint | .pptx | Slide content |

Supported Markdown Flavours

Use the -f / --flavour flag to select output format:

  • github (default) — GitHub Flavored Markdown
  • commonmark — Standard CommonMark
  • slack — Slack-compatible markdown
  • discord — Discord-compatible markdown
  • reddit — Reddit-compatible markdown
  • confluence — Confluence wiki markup
  • rmarkdown — R Markdown
  • custom — Custom format
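
Because the flavour is chosen per invocation, one source file can be rendered for several platforms in a loop. This is a sketch under assumptions: `convert_flavours` is a hypothetical helper, and the `<name>.<flavour>.md` output naming in the current directory is only an illustration. It uses no flags beyond `-i`, `-f`, and `-o`.

```shell
# Convert one input file into several flavours.
# Usage: convert_flavours <input-file> <flavour>...
# Writes <name>.<flavour>.md into the current directory (naming is an assumption).
convert_flavours() {
  local input="$1"; shift
  local base="${input##*/}"   # strip any directory part
  local stem="${base%.*}"     # strip the extension
  local flavour
  for flavour in "$@"; do
    formatferry -i "$input" -f "$flavour" -o "${stem}.${flavour}.md" || return 1
  done
}
```

For example, `convert_flavours notes.html github slack confluence` would run three conversions and stop at the first one that fails.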

Usage Examples

# Convert a file
formatferry -i document.docx -o output.md

# Pipe HTML from stdin
echo '<h1>Hello</h1>' | formatferry

# Choose a flavour
formatferry -i notes.html -f slack -o notes.md

# Convert a PDF (includes OCR for scanned documents)
formatferry -i paper.pdf -o paper.md

# URL extraction (requires FORMATFERRY_API_KEY)
formatferry --url https://example.com/article -o article.md

# Batch convert (requires FORMATFERRY_LICENSE_KEY)
formatferry --batch "docs/**/*.docx" --output-dir ./markdown/

Environment Variables

Both environment variables are optional. The CLI works for local file conversion with zero credentials.

| Variable | Required | Purpose |
|---|---|---|
| FORMATFERRY_API_KEY | No | Needed only for the --url flag (URL extraction). Not needed for local file conversion. |
| FORMATFERRY_LICENSE_KEY | No | Needed only for --batch mode (premium feature). |

Set them via your shell profile or pass inline:

FORMATFERRY_API_KEY=ff_xxxxx formatferry --url https://example.com/article
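
A script that mixes local and credentialed modes may want to check for the right variable before calling the CLI. The sketch below assumes nothing about how FormatFerry itself reports a missing key; `have_key_for` and the mode labels `url`, `batch`, and `local` are hypothetical names used only for illustration.

```shell
# Succeed when the credential for the given mode is set, or when none is needed.
# Modes (hypothetical labels): url, batch, anything else = local conversion.
have_key_for() {
  case "$1" in
    url)   [ -n "${FORMATFERRY_API_KEY:-}" ] ;;      # --url needs the API key
    batch) [ -n "${FORMATFERRY_LICENSE_KEY:-}" ] ;;  # --batch needs the license key
    *)     return 0 ;;                               # local files need no credentials
  esac
}
```

A typical guard would be `have_key_for url || echo "set FORMATFERRY_API_KEY first" >&2`.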

Privacy

  • Local file conversion is fully in-process — file content is never uploaded or sent to any server
  • Optional license validation ping — if a license key is stored, the CLI may ping formatferry.vibingfun.com to check entitlement (cached for 24h, skippable with --offline)
  • URL extraction (--url) is the only feature that sends content to a server — it fetches and processes the URL server-side
  • --offline flag disables all network calls, falling back to cached or free-tier entitlements
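
Where content must stay on the machine, a thin wrapper can enforce the rules above. This is only a sketch: `convert_local_only` is a hypothetical helper, and combining `--offline` with `-i` in a single call is an assumption based on the flag descriptions in this section.

```shell
# Convert a local file with network calls disabled; refuse URL input,
# since --url is the one feature that sends content to a server.
convert_local_only() {
  case "$1" in
    http://*|https://*)
      echo "refusing URL input: --url sends content to a server" >&2
      return 1 ;;
  esac
  formatferry --offline -i "$1"
}
```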

Procedure

  1. Determine input type:

    • Text/pasted content → pipe to stdin or save to temp file
    • File path → use -i <path>
    • URL → use --url <url> (requires FORMATFERRY_API_KEY)
    • Multiple files → use --batch (requires FORMATFERRY_LICENSE_KEY)
  2. Execute conversion:

    # Stdin (most common for agent use)
    echo "$INPUT" | formatferry
    
    # File
    formatferry -i "$FILE_PATH" -o "$OUTPUT_PATH"
    
    # URL
    formatferry --url "$URL" -o output.md
    
  3. Capture output:

    • stdout is Markdown by default
    • Use -o <file> to write directly to file
  4. Return clean Markdown to user

  5. Clean up temp files if created
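
Steps 1 to 3 of the procedure can be sketched as a single dispatch function. `to_markdown` is a hypothetical name, batch mode is left out, and the rule "anything that is neither a URL nor an existing file is pasted content" is an assumption made for this illustration.

```shell
# Convert a URL, a file path, or raw text to Markdown on stdout.
to_markdown() {
  local input="$1"
  case "$input" in
    http://*|https://*)
      formatferry --url "$input" ;;            # requires FORMATFERRY_API_KEY
    *)
      if [ -f "$input" ]; then
        formatferry -i "$input"                # existing file path
      else
        printf '%s\n' "$input" | formatferry   # treat the rest as pasted content
      fi ;;
  esac
}
```

Piping pasted content through stdin avoids creating a temp file, so there is nothing to clean up in step 5.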

Pitfalls & Recovery

| Issue | Solution |
|---|---|
| formatferry: command not found | Install via npm install -g formatferry |
| node: command not found | Install Node.js 18+ first |
| API rate limit hit | Wait 60s or use local file input instead of URL |
| Large file (>20MB PDF) | Consider splitting before conversion |
| Invalid URL | Verify URL starts with http:// or https:// |
| Empty output | Verify input has content; check for HTML entity encoding issues |
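
Two of these pitfalls can be caught before the CLI is invoked. The checks below are a sketch: `valid_url` and `is_large_file` are hypothetical helpers, and reading "20MB" as 20 × 1024 × 1024 bytes is an assumption.

```shell
# Succeed when the URL starts with http:// or https://.
valid_url() {
  case "$1" in
    http://?*|https://?*) return 0 ;;
    *) return 1 ;;
  esac
}

# Succeed when the file exists and is larger than 20 MB (assumed 20 * 1024 * 1024 bytes).
is_large_file() {
  [ -f "$1" ] || return 1
  local bytes
  bytes=$(( $(wc -c < "$1") ))   # arithmetic expansion drops any padding from wc
  [ "$bytes" -gt 20971520 ]
}
```

For instance, `is_large_file paper.pdf && echo "consider splitting before conversion"` warns without blocking the run.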

Verification

# Test basic conversion
echo '<h1>Test</h1><p>Content</p>' | formatferry

# Verify no HTML tags remain (grep -q exits 0 only when a tag is found)
if echo '<div>test</div>' | formatferry | grep -q '<[^>]*>'; then echo "HTML tags remain"; else echo "Clean: 0 HTML tags"; fi

# Test file conversion, then remove the temp file
echo '<p>File test</p>' > /tmp/test.html
formatferry -i /tmp/test.html
rm -f /tmp/test.html