This is a response to an article in New Zealand Computerworld, 14 February 2006.
Yes, but HTML is unfit for print. A more constructive approach would be to recognise where PDF has a place and encourage government web sites to use it wisely.
In his article, “PDF: Unfit for Human Consumption”, Jakob Nielsen argues that PDF documents are a poor fit for reading online. He notes that the place for PDF is as an electronic delivery mechanism for paper documents, and nothing more.
The reason PDF won’t go away is that, as Norm Walsh says, “Printing from web browsers still sucks”. To focus on PDF is to address a symptom, while ignoring the root cause. Much web content is initially written to be printed. The problem will not be solved until we create format-neutral, multi-purpose content. Then we can serve up content dynamically as HTML for online browsing and as PDF for offline printing.
Creating and storing PDF files are not website functions. A better approach is to run a “Typesetting Service” that takes content from a site and composes it into print form, returning a PDF file to the requestor. This gives you true print-on-demand. It also enables content mash-ups: combining selected content from several web sites into a single print document.
The Foundation for Research, Science and Technology part-funded a project, under its GPSRD scheme, to build a prototype of such a typesetting system. The result is http://www.wikipublisher.org/, where every web page carries a PDF icon; click the icon and the server generates a PDF. You can typeset page collections, such as a user manual, and control various aspects of the PDF produced, such as requesting a “large print” version. The software is publicly available and open source.
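To make the idea of format-neutral content concrete, here is a minimal sketch in Python. It is not how Wikipublisher works internally; the `Block` type, the two block kinds and the function names are all hypothetical, and the print path is assumed to go through LaTeX source (using the `extarticle` class for the larger font size). Turning that source into an actual PDF needs a typesetting engine, which is not shown.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Block:
    kind: str  # "heading" or "para" (hypothetical block kinds)
    text: str


def to_html(blocks):
    """Render one page of format-neutral blocks for online browsing."""
    tags = {"heading": "h2", "para": "p"}
    return "\n".join(
        f"<{tags[b.kind]}>{escape(b.text)}</{tags[b.kind]}>" for b in blocks
    )


# Characters that LaTeX treats specially, mapped to safe replacements.
_LATEX_SPECIALS = {
    "\\": r"\textbackslash{}", "&": r"\&", "%": r"\%", "$": r"\$",
    "#": r"\#", "_": r"\_", "{": r"\{", "}": r"\}",
    "~": r"\textasciitilde{}", "^": r"\textasciicircum{}",
}


def latex_escape(text):
    """Escape one character at a time so nothing is escaped twice."""
    return "".join(_LATEX_SPECIALS.get(ch, ch) for ch in text)


def to_print_source(pages, large_print=False):
    """Compose several pages into one print document (LaTeX source).

    Assumption: a separate typesetting step turns this source into a PDF.
    """
    size = "17pt" if large_print else "11pt"
    lines = [rf"\documentclass[{size}]{{extarticle}}", r"\begin{document}"]
    for blocks in pages:
        for b in blocks:
            text = latex_escape(b.text)
            if b.kind == "heading":
                lines.append(rf"\section{{{text}}}")
            else:
                lines.append(text)
                lines.append("")  # blank line ends the paragraph
    lines.append(r"\end{document}")
    return "\n".join(lines)


# Two pages, possibly from different sites, combined into one print job.
page_a = [Block("heading", "Fees & charges"),
          Block("para", "Costs rose 5% this year.")]
page_b = [Block("heading", "Opening hours"),
          Block("para", "Open weekdays from 9 to 5.")]

html_view = to_html(page_a)
print_source = to_print_source([page_a, page_b], large_print=True)
```

The same blocks feed both renderers, so the web page and the printed document cannot drift apart, and a mash-up is nothing more than a longer list of pages.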
Sites with large collections of PDF documents need an interim solution. Providing an HTML paragraph describing the document’s key messages is an inadequate response. A better option is to hold all the PDF documents in a suitable web repository. Each document has a full metadata record, not just a textual description. This type of software (http://www.eprints.org/ is one open source example) uses open standards, enabling one search to cover multiple repositories.
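As a rough illustration of why structured metadata matters, the sketch below runs one search across records from two repositories. Everything in it is hypothetical: the field names are loosely modelled on common bibliographic metadata, the repository names and URLs are invented, and it does not use the EPrints software or any real harvesting protocol.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """One metadata record per PDF document (hypothetical fields)."""
    title: str
    creator: str
    date: str  # ISO date, so string order is also date order
    subjects: list = field(default_factory=list)
    url: str = ""


def matches(record, term):
    """Case-insensitive match against title, creator and subjects."""
    term = term.lower()
    haystack = [record.title, record.creator, *record.subjects]
    return any(term in value.lower() for value in haystack)


def search(repositories, term):
    """Search every repository at once; newest documents first."""
    hits = [
        (name, record)
        for name, records in repositories.items()
        for record in records
        if matches(record, term)
    ]
    return sorted(hits, key=lambda hit: hit[1].date, reverse=True)


# Invented sample data standing in for two separate repositories.
repositories = {
    "Ministry A": [
        Record("Transport Strategy 2005", "Policy Unit", "2005-06-01",
               ["transport"], "https://example.org/a/1"),
        Record("Annual Report", "Finance", "2005-10-01",
               ["accounts"], "https://example.org/a/2"),
    ],
    "Agency B": [
        Record("Rural transport survey", "Research Team", "2006-01-15",
               ["roads", "survey"], "https://example.org/b/1"),
    ],
}

hits = search(repositories, "transport")
```

Because every repository exposes the same fields, the search code does not need to know where a record came from; a one-paragraph HTML summary offers nothing comparable to query.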
This is not a technology issue; we need to rethink how we create, distribute and display information for web audiences: