  1. pdfconverter
  2. PDFCON-4

Use pdf2htmlEx to convert PDF to HTML

Details

    • Improvement
    • Resolution: Fixed
    • pdf2htmlex
    • Medium

    Description

      The library we use is available at https://github.com/coolwanglu/pdf2htmlEX

      Integrate the conversion into the upload process, so that each uploaded PDF is converted and the converted files (including images) are moved to the location from which the iframe loads them.

      Use the call

      pdf2htmlEX --embed-image 0 --hdpi 72 --no-drm 1 --decompose-ligature 1 <filename>.pdf
      

      These parameters are the correct ones to use.
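
      The conversion step could be wrapped roughly as follows. This is only a sketch: the function name convert_pdf and the target directory are hypothetical, since the issue does not name the path the iframe loads from. Instead of moving the files after the conversion, the sketch uses pdf2htmlEX's --dest-dir option to write the HTML and the extracted images directly into the target directory.

```shell
# Hypothetical helper (not part of the issue): convert one PDF and write
# the result into the directory the iframe is assumed to load from.
#
# Usage: convert_pdf <pdf-file> <target-dir>
convert_pdf() {
    pdf_file=$1
    target_dir=$2

    # Make sure the target directory exists before converting.
    mkdir -p "$target_dir" || return 1

    # Same options as in the issue. --dest-dir makes pdf2htmlEX place the
    # HTML file and the non-embedded images in the target directory, so no
    # separate move step is needed.
    pdf2htmlEX --embed-image 0 --hdpi 72 --no-drm 1 --decompose-ligature 1 \
        --dest-dir "$target_dir" "$pdf_file"
}

# Example call (both paths are placeholders):
#   convert_pdf uploads/task.pdf /path/to/iframe/content
```

      Because --embed-image 0 keeps the images as separate files, writing everything into one directory keeps the relative image references in the generated HTML intact.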

      If multiple PDFs are uploaded for a task, they should be merged into one PDF before being converted to an HTML file. An easy way to do this is to use Ghostscript. The following example shows how to merge two PDFs (org_1.pdf and org_2.pdf) into one new file, merged.pdf:

      gs -dBATCH -dNOPAUSE -q -sDEVICE=pdfwrite -dPDFSETTINGS=/prepress -sOutputFile=merged.pdf org_1.pdf org_2.pdf
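
      For an arbitrary number of uploads, the merge could be sketched like this. The function name merge_pdfs and its handling of a single upload (a plain copy instead of a Ghostscript run) are assumptions, not something the issue specifies; the Ghostscript options are the ones from the example above.

```shell
# Hypothetical helper (not part of the issue): merge any number of PDFs
# into one file, in the order given.
#
# Usage: merge_pdfs <output.pdf> <input.pdf>...
merge_pdfs() {
    output=$1
    shift

    if [ "$#" -eq 0 ]; then
        echo "merge_pdfs: no input files given" >&2
        return 1
    fi

    if [ "$#" -eq 1 ]; then
        # Assumption: a single upload needs no merge, so it is only copied.
        cp -- "$1" "$output"
        return
    fi

    # Same Ghostscript options as in the issue; "$@" expands to all
    # remaining input files.
    gs -dBATCH -dNOPAUSE -q -sDEVICE=pdfwrite -dPDFSETTINGS=/prepress \
        "-sOutputFile=$output" "$@"
}

# Example: merge two uploads, then convert the result as described above.
#   merge_pdfs merged.pdf org_1.pdf org_2.pdf &&
#   pdf2htmlEX --embed-image 0 --hdpi 72 --no-drm 1 --decompose-ligature 1 merged.pdf
```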
      

          People

            Stephan Stephan Bergmann
            marcmittag Marc Mittag [Administrator]