Does anyone have a good setup/configuration for converting documents to Obsidian-flavored markdown with Pandoc? I’ve been fiddling with it for a few hours but can’t seem to get everything right:

  • Obsidian markdown doesn’t support ^superscript^. I can get Pandoc to use <sup> instead by allowing raw_html, but then…
  • Image embeds don’t work. Pandoc wants to use <img> for some reason, and no matter what relative src I use, the image just won’t show up.

I could fix all of this by running the files through a linter of some sort, but I feel like I’m missing something. Surely someone must have had these issues before me, right?

  • DrakeRichards@lemmy.world (OP)
    10 months ago

    I got this mostly working, but it was not easy. Not only does Obsidian have a few peculiarities that make it less compatible with standard Markdown, but Word also does a few funny things.

    Here’s the config.yaml I used for Pandoc:

    from: docx
    to: markdown-smart-simple_tables-multiline_tables-grid_tables+pipe_tables+yaml_metadata_block-superscript-subscript-bracketed_spans-native_spans-link_attributes-raw_html+rebase_relative_paths+four_space_rule
    extract-media: "./"
    wrap: preserve
    markdown-headings: atx
    tab-stop: 2
    shift-heading-level-by: 1
    standalone: true
    template: obsidian.md
    filters:
      - compact-list.lua
      - remove-single-characters.py
      - remove-extra-linebreaks.py
    metadata:
      tags: "tags/go/here"
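
For anyone trying this: a defaults file like the one above is passed to Pandoc with --defaults. Here is a minimal sketch of a wrapper, assuming the file is saved as config.yaml next to the script. The function names and the choice to write the .md file next to the source are my own assumptions for illustration, not part of the original setup.

```python
import subprocess
from pathlib import Path


def build_pandoc_command(source, defaults="config.yaml"):
    """Build the pandoc argument list for one .docx file.

    The output path is an assumption: same name as the source, with .md.
    """
    source = Path(source)
    return [
        "pandoc",
        "--defaults", str(defaults),   # reads from/to/filters/etc. from the YAML file
        str(source),
        "--output", str(source.with_suffix(".md")),
    ]


def convert(source, defaults="config.yaml"):
    """Run pandoc and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_pandoc_command(source, defaults), check=True)
```

Calling convert("report.docx") would then run pandoc with the settings from the defaults file, provided pandoc and the listed filters are available.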
    

    The three filters, in order:

    • compact-list.lua removes the extra line breaks added between bulleted list items, which makes the lists more compact.
    • remove-single-characters.py removes lines that contain only a single character. That character is usually an invisible one like nbsp, which kept Pandoc from stripping those lines automatically.
    • remove-extra-linebreaks.py removes line breaks enclosed in Strong tags. This is an artifact from Word, where a line is bolded but has no content: technically the line break is bolded.
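
To give an idea of what the third filter does, here is a rough sketch of a JSON filter that drops Strong elements containing nothing but breaks or spaces. This is not the actual remove-extra-linebreaks.py, just one way it could be written using only the standard library. The function names are made up, and treating Space as "empty" is my assumption.

```python
import json
import sys

# Inline types that count as "no real content" inside a Strong element.
# Including Space here is an assumption.
BREAKS = {"LineBreak", "SoftBreak", "Space"}


def is_empty_strong(node):
    """True for a Strong element whose children are all breaks or spaces."""
    return (
        isinstance(node, dict)
        and node.get("t") == "Strong"
        and all(child.get("t") in BREAKS for child in node.get("c", []))
    )


def strip_empty_strong(node):
    """Walk the pandoc JSON AST and drop every empty Strong element."""
    if isinstance(node, list):
        return [strip_empty_strong(item) for item in node if not is_empty_strong(item)]
    if isinstance(node, dict):
        return {key: strip_empty_strong(value) for key, value in node.items()}
    return node


def main():
    """Filter entry point: pandoc sends the AST on stdin and reads it back from stdout."""
    doc = json.load(sys.stdin)
    json.dump(strip_empty_strong(doc), sys.stdout)
```

To use it as a real filter, save it as a .py file, call main() at the bottom, and list the file under filters. Note that a paragraph consisting only of an empty Strong would be left behind as an empty paragraph, so a fuller version might want to remove those too.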

    I then ran the resulting file through a regex replacement to change the superscript carets into HTML <sup> tags.
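
A replacement along these lines could look like the sketch below. It assumes the superscripts show up as ^text^ with no unescaped spaces inside (Pandoc's usual superscript syntax); if your output looks different, the pattern needs adjusting. The function name is made up, and the sketch does not skip code blocks, so carets inside code would be rewritten too.

```python
import re

# ^text^ where text has no whitespace, except backslash-escaped spaces.
# The lookbehind skips carets that are themselves escaped (\^).
SUPERSCRIPT = re.compile(r"(?<!\\)\^((?:\\ |[^\s^\\])+)\^")


def carets_to_sup(text):
    """Replace ^text^ with <sup>text</sup>, unescaping any '\\ ' inside."""
    def replace(match):
        inner = match.group(1).replace("\\ ", " ")
        return f"<sup>{inner}</sup>"

    return SUPERSCRIPT.sub(replace, text)


print(carets_to_sup("E = mc^2^"))  # E = mc<sup>2</sup>
```

Carets surrounded by spaces, as in "2 ^ 3 ^ 4", are left alone because the pattern does not allow plain whitespace between the two carets.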

    Even after all this, I still have to run an Obsidian plugin to convert the standard Markdown links and embeds into [[Wikilink]] style, since Obsidian will only do one or the other throughout your whole vault.
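
If you would rather script that last step than use a plugin, something like the following might work as a starting point. To be clear, this is not what the plugin does internally; it is a plain regex sketch with made-up function names, and it assumes simple [label](target) links with no nested brackets, titles, or spaces in the target. External URLs and in-page anchors are left untouched.

```python
import re
from urllib.parse import unquote

EMBED = re.compile(r"!\[([^\]]*)\]\(([^)\s]+)\)")        # ![alt](target)
LINK = re.compile(r"(?<!!)\[([^\]]+)\]\(([^)\s]+)\)")     # [label](target)
SCHEME = re.compile(r"[a-zA-Z][a-zA-Z0-9+.-]*:")          # https:, mailto:, ...


def is_external(target):
    """True if the target starts with a URL scheme."""
    return SCHEME.match(target) is not None


def to_wikilinks(text):
    """Convert local Markdown links and embeds to [[wikilink]] style."""
    def embed(match):
        target = unquote(match.group(2))          # %20 -> space
        if is_external(target):
            return match.group(0)
        return f"![[{target}]]"

    def link(match):
        label, target = match.group(1), unquote(match.group(2))
        if is_external(target) or target.startswith("#"):
            return match.group(0)
        note = target[:-3] if target.endswith(".md") else target
        return f"[[{note}]]" if label == note else f"[[{note}|{label}]]"

    # Embeds first, so the link pattern never sees them.
    return LINK.sub(link, EMBED.sub(embed, text))
```

For example, to_wikilinks("![diagram](media/image1.png)") gives "![[media/image1.png]]", while a link to https://pandoc.org stays as it is.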