Pandoc Lua Filter: Is there an 'identity function' example?


Is there an equivalent of the identity transform for Pandoc Lua filters available anywhere?

This would be a useful starting point for experimentation, assuming such an identity transform explicitly defined all possible function 'slots'.

For context: my use case here is to extract specific parts of Markdown documents, mostly code blocks, and I want to ignore everything else.


Solution

  • Identity transform for the Strong element:

    return {
      {
        Strong = function (elem)
          return elem
        end,
      }
    }
    

    Or, if you don't need multiple filter runs, you can use the shorthand:

    function Strong(elem)
      return elem
    end
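
    Since the question asks about covering many element types at once, one way to avoid writing each handler by hand is to register the same identity function under several names. This is only a sketch: the name list is a hand-picked, incomplete subset of pandoc's element names, and it assumes the shorthand form, where pandoc collects top-level global functions when the script returns nothing.

    ```lua
    -- Assumption: a hand-picked, incomplete subset of pandoc element names.
    -- Extend it with whatever element types you want to experiment with.
    local element_names = {
      "Str", "Emph", "Strong", "Code", "Link",
      "Para", "Header", "CodeBlock", "BulletList",
    }

    -- Define one global identity function per name, equivalent to writing
    -- `function Strong(elem) return elem end` for each entry by hand.
    for _, name in ipairs(element_names) do
      _G[name] = function (elem)
        return elem
      end
    end
    ```

    Each generated function can then be replaced, one at a time, with real logic while the rest keep passing elements through untouched.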
    

    Returning nil (or nothing) leaves an element unchanged, so it does not filter anything out. To delete an element, return an empty list instead. For example, removing all Strong elements would be:

    function Strong(elem)
      return {}
    end
    

    But there are quite a lot of different elements, so an identity filter that spells out every one of them gets long.

    So perhaps what you should do is run pandoc -t json, pipe the output to a program in your favourite programming language, and extract all code blocks there. See pandoc filter tools for some packages, but if you're only interested in code blocks at the top level (and you don't need to recursively walk the tree), then no package should be needed.

    That said, you can achieve the same thing with a Lua filter:

    local codeBlocks = {}
    
    return {
      {
        CodeBlock = function (el)
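          -- on each CodeBlock match, append the element to codeBlocks;
          -- returning nothing leaves the element itself unchanged
          table.insert(codeBlocks, el)
        end,
        -- within one filter, the Pandoc function runs after the
        -- element functions, so codeBlocks is complete by now;
        -- return a new document containing only the code blocks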
        Pandoc = function()
          return pandoc.Pandoc(codeBlocks)
        end
      }
    }
    

    Example usage (assuming the filter code is in 'filter_codeblock.lua' and the Markdown document is 'main.md'):

    pandoc --lua-filter filter_codeblock.lua main.md -t markdown
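
    If only some code blocks are wanted, the same idea can be narrowed down by class. The sketch below keeps only code blocks carrying one class; the class name "lua" and the helper has_class are hypothetical choices made for illustration, and the filter is written in the shorthand form with top-level functions rather than a returned table.

    ```lua
    -- Assumption: the class to keep; change it to the language you need.
    local wanted_class = "lua"
    local kept = {}

    -- Hypothetical helper: true if the element carries the given class.
    local function has_class(el, class)
      for _, c in ipairs(el.classes) do
        if c == class then
          return true
        end
      end
      return false
    end

    function CodeBlock(el)
      if has_class(el, wanted_class) then
        table.insert(kept, el)
      end
      -- Returning nothing leaves the element unchanged in this pass.
    end

    -- Called after the element functions; builds a document from the
    -- collected blocks and carries over the original metadata.
    function Pandoc(doc)
      return pandoc.Pandoc(kept, doc.meta)
    end
    ```

    It is invoked the same way as the filter above, with this script passed to --lua-filter instead.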