=<{$Description}
(:description Web and print exist as two solitudes: printed web pages often disappoint and converting print documents into good web pages is hard. A wiki makes it easy for authors to create rich web content, but is little help if readers wish to print the results. Wikipublisher lets readers turn wiki pages or page collections into print, with a quality better than most word processing documents. This lowers the time and cost of creating online and print versions of the same content, with no loss of quality in either medium. :)
(:title The Wikipublisher Project :)
(:typeset-page title="{$Title}" subtitle="John Rankin(nl)Affinity Limited(nl)Wellington, New Zealand(nl)Email: john.rankin@affinity.co.nz(and)Craig Anslow, James Noble,(nl)Brenda Chawner, Donald Gordon(nl)Victoria University of Wellington(nl)Wellington, New Zealand" urlstyle=on colorlinks=on autonumber=2 fontsize=2col ucsection=on :)
(:bib fmt=num page=WikipublisherProject title=References :)
(:bibend:)
!![[#introduction]] Introduction
Using a wiki makes it easy for authors to collaborate wherever they may be, as long as they have access to a web browser. This is fine if the result you want is web pages, but what if readers wish to print them? Most people reading more than a page of text will print it; a study of scholarly reading behaviour reports that 80% of researchers read scholarly articles on paper and only 20% read them online cite(Nicholas:etal:2008).
Reading a printed web page is usually a disappointing experience -- we have been conditioned to have low expectations of printing from the Web. Even if the printed result is "good enough" we can still only print one web page at a time, unless the site has deliberately created multi-page articles with a combined "printable view" of the content.
[[Wikipublisher -> http://www.wikipublisher.org/]] changes this, by allowing users to turn individual pages or page collections into a document suitable for printing. Wiki content is first transformed into `XML and then into Latex cite(Lamport:1994), to produce printed output of the highest quality -- superior to anything that can be achieved over the Web using {`CSS|cascading style sheets} or with most word processors.
!![[#design]] The Wikipublisher Project
The Wikipublisher project was conceived in 2004, and the first beta version of the software was released in late 2005. All the software is free and open source. We adopted a number of design principles for the project cite(Rankin:2008):
!Online First! Most of our authoring tools are "print first" and converting print documents into `HTML for the Web is hard to do well. Creating content online first makes it instantly and widely accessible without print to web conversion issues.
!Print Still Matters! The longer and richer the content, the more likely the reader is to print it. Therefore, a web page worth reading is worth printing.
!One Authoritative Online Source! Most publishing systems require three or more versions: a word processing source, a `PDF snapshot of that source, and a collection of static web pages generated from the source. The more frequently the content changes and the more authors involved in creating it, the more important it is to have one authoritative source.
Fig(fig.architecture) shows the architecture of Wikipublisher. The core architectural decision was to treat generating web pages and generating print pages as separate services. This means one print server can potentially support many web page servers -- printing is in most cases a low volume activity compared to browsing, so it is inappropriate to burden the web page server with print duties. We define a print `API that lets a web server expose its content in a way that the print server can process. As a result, the print server can work with any web content management system able to support the print `API. This design also promotes a more rigorous separation of the underlying content from its presentation in different media, making a wiki an ideal lightweight content server.
%id=fig.architecture center%Attach:architecture.png"Architecture" | Wikipublisher Architecture
Authors interact with the wiki server through a web browser (1 and 2). To create a print document, a reader submits a form (3) to the print server which says, "If you issue this http request (4), you will receive a stream of Wikibook `XML (5); convert it to Latex and `PDF, then give me back the result (8)." The wiki administrator has configured the wiki server with the rule, "If you receive an http request in this format (4), convert wiki markup to `XML instead of `HTML (5)." The wiki server thus needs to give the reader a form (2) that says, "Tell the print server (at this address) to issue this http request" (3). Finally, the print server retrieves supplementary materials, such as image files, referenced in the `XML (6 and 7), and returns a print document (8).
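The exchange in steps 3 to 5 can be sketched in a few lines of Python. This is only an illustration of the idea that the reader's form hands the print server the request it should issue; the host names and the @@action@@ and @@source@@ parameter names are assumptions made for this sketch, not the actual Wikipublisher print `API.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical addresses and parameter names, for illustration only.
WIKI_PAGE = "http://wiki.example.org/Main/HomePage"
PRINT_SERVER = "http://print.example.org/typeset"


def xml_request_url(page_url: str) -> str:
    """Steps 4-5: the request that asks the wiki for XML instead of HTML."""
    return page_url + "?" + urlencode({"action": "xml"})


def print_form_target(print_server: str, page_url: str) -> str:
    """Step 3: what the reader's form submits to the print server."""
    return print_server + "?" + urlencode({"source": xml_request_url(page_url)})


def source_from_form(target: str) -> str:
    """Print server side: recover the request it should issue (step 4)."""
    return parse_qs(urlparse(target).query)["source"][0]


if __name__ == "__main__":
    target = print_form_target(PRINT_SERVER, WIKI_PAGE)
    print(target)
    print(source_from_form(target))
```

Because the print server only ever sees a URL to fetch, it needs no knowledge of the wiki engine behind that URL, which is what lets one print server support many wiki servers.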
Fig(fig.implementation) shows the pipeline tool suite approach adopted for Wikipublisher. Wiki markup is translated into an intermediate print-oriented `XML form, and then transformed into Latex. The reasons for this design were largely pragmatic -- we built on top of things that already worked. The '''t'''book system cite(Bronger:2003) is a free software project for converting `XML documents into Latex using `XSLT, so if we could convert wiki markup into `XML, we could use '''t'''book to typeset it. The `PmWiki project cite(Michaud:2002) is an (almost) ''markup agnostic'' wiki engine, which lets a site administrator redefine or augment the markup translation rules.
%id=fig.implementation center%Attach:implementation.png"Implementation" | Wikipublisher Implementation Pipeline Tool Suite
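As a rough illustration of the two-stage pipeline, the following Python sketch turns a tiny subset of wiki markup into `XML and then into Latex source. It is a toy under stated assumptions: the element names (@@article@@, @@section@@, @@p@@) are invented for the example and are not the Wikibook `DTD, and a plain Python function stands in for the `XSLT step that the real system delegates to '''t'''book.

```python
import xml.etree.ElementTree as ET


def wiki_to_xml(markup: str) -> ET.Element:
    """Stage 1: translate a tiny subset of wiki markup into XML.

    The element names (article, section, p) are hypothetical and are
    not taken from the Wikibook DTD.
    """
    root = ET.Element("article")
    for line in markup.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("!!"):
            # "!! Heading" becomes a section element.
            ET.SubElement(root, "section").text = line[2:].strip()
        else:
            # Any other non-empty line becomes a paragraph.
            ET.SubElement(root, "p").text = line
    return root


def xml_to_latex(root: ET.Element) -> str:
    """Stage 2: stand-in for the XSLT step, emitting Latex source.

    Special Latex characters are not escaped in this sketch.
    """
    out = []
    for node in root:
        if node.tag == "section":
            out.append("\\section{" + node.text + "}")
        elif node.tag == "p":
            out.append(node.text + "\n")
    return "\n".join(out)


if __name__ == "__main__":
    xml_tree = wiki_to_xml("!! Introduction\nWikis are convenient.")
    print(ET.tostring(xml_tree, encoding="unicode"))
    print(xml_to_latex(xml_tree))
```

Keeping the intermediate `XML explicit means either stage can be replaced without touching the other.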
We wrote a plug-in for PmWiki cite(Rankin:2009) (written in `PHP) that replaces all the wiki to `HTML translation rules with wiki to `XML rules. We found that the wiki markup had rules for which there were no equivalents in the '''t'''book {`DTD|document type definition} and hence no `XML to Latex translations. We therefore added a range of extensions to the '''t'''book `DTD, style files and `XSLT, and called the resulting `XML to Latex conversion service Wikibook and the extended `DTD the [[Wikibook `DTD -> http://www.wikipublisher.org/dtd/wikibook.dtd]]. The plug-in also provides a "print metadata manager" which lets authors and readers customise the way the print output is presented, by passing configuration parameters to the Wikibook `PDF server.
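The idea of swapping one set of translation rules for another can be sketched as follows. This is a minimal Python illustration, not the plug-in's code (which is written in `PHP): both rule tables and the `XML element names are assumptions chosen for the example.

```python
import re

# Each rule is (pattern, replacement). Both tables are illustrative only;
# the XML element names are hypothetical, not the Wikibook DTD.
HTML_RULES = [
    (r"'''(.+?)'''", r"<strong>\1</strong>"),
    (r"''(.+?)''", r"<em>\1</em>"),
]
XML_RULES = [
    (r"'''(.+?)'''", r'<emph role="strong">\1</emph>'),
    (r"''(.+?)''", r"<emph>\1</emph>"),
]


def translate(text: str, rules) -> str:
    """Apply every rule in order; the engine is indifferent to the target."""
    for pattern, replacement in rules:
        text = re.sub(pattern, replacement, text)
    return text


if __name__ == "__main__":
    sample = "a '''bold''' and ''soft'' word"
    print(translate(sample, HTML_RULES))
    print(translate(sample, XML_RULES))
```

The same source text and the same engine produce either output; only the rule table changes, which is the property of a markup agnostic engine that the plug-in relies on.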
We made `XML generation and Wikibook transformation as robust as possible. Consistent presentation of printed outputs is completely automatic -- not just within a document type (all reports have the same look) but also across document types, which are all recognisably part of the same family. Businesses that typically produce a large number of documents of a small number of document types can get a consistent look (a house style) at minimal cost and in particular with less quality control effort. There is a huge quality advantage when we shift typesetting from the desk-top to the server, because we eliminate local stylistic variations. Of course, limiting local customisation can also be a disadvantage in many situations.
We run a free public Wikibook `PDF server for those wishing to try out the software. In the past 5 years we have had 340 wiki sites registered to use the Wikipublisher system via the public server. This has been a fruitful source of feedback for the system's evolution, in response to others' experiences. The web site has an issues register for people to log bugs or change requests, a tip of the week where we publish short "how to" stories, a discussion group, software release notes, and a cookbook for user-contributed local customisations (plug-ins) to extend Wikipublisher's capabilities.
!![[#conclusions]] Conclusions
The better Wikipublisher does its job, the less people notice it; good typography is invisible, letting the reader focus on reading. In producing print documents, most people are accustomed to making a trade-off between the convenience of a word processor and the quality of a desk-top publishing system. Most choose convenience, with the unfortunate result that typographic mediocrity has become entrenched in our culture. A big reason for the popularity of wikis is their convenience. Wikipublisher lets us combine the convenience of a wiki with the typesetting quality of the finest desk-top publishing software. Because the system embeds good typesetting practices in the software, the quality comes free.
In the future we plan to deploy the Wikibook `PDF server in a Microsoft Windows environment; currently it runs only in GNU/Linux and Mac OS X environments. To encourage further adoption, we would like to write Wikipublisher plug-ins for other content engines such as MediaWiki and TWiki.
!!!User-specified Latex classes
In an ideal world, an author could instruct the Wikibook `PDF server to typeset their content using any valid Latex class file (as long as it is reachable with an http request). The current Wikibook `DTD defines four distinct document types: letter, article, report and book. The wiki plug-in makes sure the wiki produces Wikibook `XML that complies with the requested `DTD. To support user-defined classes, Wikipublisher would have to make sure that the document type used is compatible with the specified class.
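One way such a compatibility check might look is sketched below in Python. It is purely hypothetical: the registry mapping Latex classes to document types is an assumption for illustration, and only the four document type names come from the Wikibook `DTD as described above.

```python
# The four document types named in the text.
DOCUMENT_TYPES = {"letter", "article", "report", "book"}

# Hypothetical registry: the document type each Latex class is assumed
# to typeset. A real system would need a maintained or declared mapping.
CLASS_DOCUMENT_TYPE = {
    "letter": "letter",
    "article": "article",
    "report": "report",
    "book": "book",
    "scrartcl": "article",  # assumed mapping for a third-party class
    "scrbook": "book",      # assumed mapping for a third-party class
}


def is_compatible(document_type: str, latex_class: str) -> bool:
    """Return True if the class is registered for the given document type."""
    if document_type not in DOCUMENT_TYPES:
        raise ValueError("unknown document type: " + document_type)
    # Unregistered classes are rejected rather than guessed at.
    return CLASS_DOCUMENT_TYPE.get(latex_class) == document_type


if __name__ == "__main__":
    print(is_compatible("article", "scrartcl"))
    print(is_compatible("letter", "scrbook"))
```

Rejecting unregistered classes is the conservative choice here; it keeps the guarantee that the generated `XML always matches a document type the class can handle.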
It would have been really useful to load the correct `ACM template for this paper! As it was, the authors exported the raw Latex as an article and manually converted this to use a different class.
!!!Use of Wikipublisher
To inform further development of the system, we would like to conduct an empirical study of how people are using Wikipublisher. We would like to explore the following research question with the current user base: "What has been your experience using Wikipublisher?" We envisage setting up an online survey form (on Wikipublisher) and gathering qualitative data from a self-selecting sample of users. The survey would explore the kind of content users publish, their motivations for adopting Wikipublisher, the benefits they have gained, the issues they have encountered, and their plans for the future.