How to convert pdf to html on Ubuntu 9.04

This week, on our Linux Page (in Spanish), we tried to find a free way to convert PDF files to HTML. Unfortunately, we did not find a satisfying solution.

(1) First, we uploaded our complex PDF file (text, color drawings and pictures) to Gmail and sent it to ourselves. When we opened the email we clicked the “view as html” option and were able to read the text (unfortunately far too small), but without the drawings and pictures. It was not bad, but it was far from what we were looking for.

(2) As a second experiment we tried Kword. The HTML output had both text and pictures, but the text often did not line up with the borders and some phrases were missing. In short, it looked better but the result was worse.

(3) Then we installed pdftohtml using Synaptic Package Manager, but we were not satisfied with the HTML file it produced either.

At this point we focused our search on free online solutions and tried, in order: “Online conversion tools for Adobe PDF documents”, “convertpdftohtml”, “pdftextonline” and “pdf-search-engine”, but the results were not good.

In the end we gave up, and I confess we could not find a solution to the task. Still, we think that the “Kword solution”, if improved, is not far from being a good way to convert a file from PDF to HTML.

Please, if you have suggestions about this topic, feel free to add a comment. Thank you.
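
For readers who want to repeat experiment (3) from a script instead of by hand, here is a minimal sketch in Python. It is not what we ran ourselves. It only assumes that the pdftohtml command is installed and on the PATH, and it uses the `-c` (complex output) and `-noframes` options, which may or may not improve the result for your file. The function names and the file names in the usage note are our own, chosen for illustration.

```python
import shutil
import subprocess


def build_pdftohtml_command(pdf_path, out_path, complex_output=True, no_frames=True):
    """Build the argument list for a pdftohtml call (hypothetical helper)."""
    cmd = ["pdftohtml"]
    if complex_output:
        cmd.append("-c")         # ask pdftohtml for "complex" output
    if no_frames:
        cmd.append("-noframes")  # ask pdftohtml not to generate frames
    # pdftohtml takes the input PDF first, then the output HTML name
    cmd += [str(pdf_path), str(out_path)]
    return cmd


def convert(pdf_path, out_path):
    """Run pdftohtml, failing early if the tool is not installed."""
    if shutil.which("pdftohtml") is None:
        raise RuntimeError("pdftohtml was not found on the PATH")
    # check=True raises CalledProcessError if pdftohtml exits with an error
    subprocess.run(build_pdftohtml_command(pdf_path, out_path), check=True)
```

Calling `convert("input.pdf", "output.html")` would then run the conversion. Passing `complex_output=False` or `no_frames=False` makes it easy to compare the different outputs, which is roughly what one would do by hand when looking for the least bad result.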