|
|
Xpdf
|
|
|
====
|
|
|
|
|
|
version 2.00
|
|
|
2002-nov-04
|
|
|
|
|
|
The Xpdf software and documentation are
|
|
|
copyright 1996-2002 Glyph & Cog, LLC.
|
|
|
|
|
|
Email: derekn@foolabs.com
|
|
|
WWW: http://www.foolabs.com/xpdf/
|
|
|
|
|
|
The PDF data structures, operators, and specification are
|
|
|
copyright 1985-2001 Adobe Systems Inc.
|
|
|
|
|
|
|
|
|
What is Xpdf?
|
|
|
-------------
|
|
|
|
|
|
Xpdf is an open source viewer for Portable Document Format (PDF)
|
|
|
files. (These are also sometimes also called 'Acrobat' files, from
|
|
|
the name of Adobe's PDF software.) The Xpdf project also includes a
|
|
|
PDF text extractor, PDF-to-PostScript converter, and various other
|
|
|
utilities.
|
|
|
|
|
|
Xpdf runs under the X Window System on UNIX, VMS, and OS/2. The non-X
|
|
|
components (pdftops, pdftotext, etc.) also run on Win32 systems and
|
|
|
should run on pretty much any system with a decent C++ compiler.
|
|
|
|
|
|
Xpdf is designed to be small and efficient. It can use Type 1,
|
|
|
TrueType, or standard X fonts.
|
|
|
|
|
|
|
|
|
Distribution
|
|
|
------------
|
|
|
|
|
|
Xpdf is licensed under the GNU General Public License (GPL), version
|
|
|
2. In my opinion, the GPL is a convoluted, confusing, ambiguous mess.
|
|
|
But it's also pervasive, and I'm sick of arguing. And even if it is
|
|
|
confusing, the basic idea is good.
|
|
|
|
|
|
In order to cut down on the confusion a little bit, here are some
|
|
|
informal clarifications:
|
|
|
|
|
|
- I don't mind if you redistribute Xpdf in source and/or binary form,
|
|
|
as long as you include all of the documentation: README, man pages
|
|
|
(or help files), and COPYING. (Note that the README file contains a
|
|
|
pointer to a web page with the source code.)
|
|
|
|
|
|
- Selling a CD-ROM that contains Xpdf is fine with me, as long as it
|
|
|
includes the documentation. I wouldn't mind receiving a sample
|
|
|
copy, but it's not necessary.
|
|
|
|
|
|
- If you make useful changes to Xpdf, please make the source code
|
|
|
available -- post it on a web site, email it to me, whatever.
|
|
|
|
|
|
If you're interested in commercial licensing, please see the Glyph &
|
|
|
Cog web site:
|
|
|
|
|
|
http://www.glyphandcog.com/
|
|
|
|
|
|
|
|
|
Compatibility
|
|
|
-------------
|
|
|
|
|
|
Xpdf is developed and tested on a Linux 2.2 x86 system.
|
|
|
|
|
|
In addition, it has been compiled by others on Solaris, AIX, HP-UX,
|
|
|
SCO UnixWare, Digital Unix, Irix, and numerous other Unix
|
|
|
implementations, as well as VMS and OS/2. It should work on pretty
|
|
|
much any system which runs X11 and has Unix-like libraries. You'll
|
|
|
need ANSI C++ and C compilers to compile it.
|
|
|
|
|
|
The non-X components of Xpdf (pdftops, pdftotext, pdfinfo, pdffonts,
|
|
|
pdfimages) can also be compiled on Win32 systems. See the Xpdf web
|
|
|
page for details.
|
|
|
|
|
|
If you compile Xpdf for a system not listed on the web page, please
|
|
|
let me know. If you're willing to make your binary available by ftp
|
|
|
or on the web, I'll be happy to add a link from the Xpdf web page. I
|
|
|
have decided not to host any binaries I didn't compile myself (for
|
|
|
disk space and support reasons).
|
|
|
|
|
|
If you can't get Xpdf to compile on your system, send me email and
|
|
|
I'll try to help.
|
|
|
|
|
|
Xpdf has been ported to the Acorn, Amiga, BeOS, and EPOC. See the
|
|
|
Xpdf web page for links.
|
|
|
|
|
|
|
|
|
Getting Xpdf
|
|
|
------------
|
|
|
|
|
|
The latest version is available from:
|
|
|
|
|
|
http://www.foolabs.com/xpdf/
|
|
|
|
|
|
or:
|
|
|
|
|
|
ftp://ftp.foolabs.com/pub/xpdf/
|
|
|
|
|
|
Source code and several precompiled executables are available.
|
|
|
|
|
|
Announcements of new versions are posted to several newsgroups
|
|
|
(comp.text.pdf, comp.os.linux.announce, and others) and emailed to a
|
|
|
list of people. If you'd like to receive email notification of new
|
|
|
versions, just let me know.
|
|
|
|
|
|
|
|
|
Running Xpdf
|
|
|
------------
|
|
|
|
|
|
To run xpdf, simply type:
|
|
|
|
|
|
xpdf file.pdf
|
|
|
|
|
|
To generate a PostScript file, hit the "print" button in xpdf, or run
|
|
|
pdftops:
|
|
|
|
|
|
pdftops file.pdf
|
|
|
|
|
|
To generate a plain text file, run pdftotext:
|
|
|
|
|
|
pdftotext file.pdf
|
|
|
|
|
|
There are four additional utilities (which are fully described in
|
|
|
their man pages):
|
|
|
|
|
|
pdfinfo -- dumps a PDF file's Info dictionary (plus some other
|
|
|
useful information)
|
|
|
pdffonts -- lists the fonts used in a PDF file along with various
|
|
|
information for each font
|
|
|
pdftopbm -- converts a PDF file to a series of PBM-format bitmaps
|
|
|
pdfimages -- extracts the images from a PDF file
|
|
|
|
|
|
Command line options and many other details are described in the man
|
|
|
pages (xpdf.1, etc.) and the VMS help files (xpdf.hlp, etc.).
|
|
|
|
|
|
|
|
|
Upgrading from Xpdf 0.9x
|
|
|
------------------------
|
|
|
|
|
|
WARNING: Xpdf 1.x switched to a completely different config file setup
|
|
|
than Xpdf 0.9x.
|
|
|
|
|
|
Many of the configuration options that used to be X resources have
|
|
|
been moved into the Xpdf config file. There are also lots of new and
|
|
|
improved options. If you're upgrading from Xpdf 0.9x, please read
|
|
|
through the sample config file (doc/sample-xpdfrc) and the xpdfrc(5)
|
|
|
man page.
|
|
|
|
|
|
The Asian language support has been pulled out into separate packages,
|
|
|
loaded at run time. This is much cleaner than the 0.9x approach -- it
|
|
|
makes the base distribution smaller, allows the language support
|
|
|
packages to be upgraded separately, and lets users customize Xpdf for
|
|
|
other text encodings without modifying the source. See the web site
|
|
|
(http://www.foolabs.com/xpdf) for info on downloading the language
|
|
|
support packages.
|
|
|
|
|
|
All of the Xpdf tools, including the X viewer and the command line
|
|
|
programs, read the same config file. They first attempt to read the
|
|
|
user's personal config file:
|
|
|
|
|
|
$HOME/.xpdfrc
|
|
|
|
|
|
If it is not found, the Xpdf tools read a system-wide config file,
|
|
|
typically something like:
|
|
|
|
|
|
/usr/local/etc/xpdfrc
|
|
|
|
|
|
(this location can be customized when building Xpdf).
|
|
|
|
|
|
The Win32 command line tools look in the directory containing the
|
|
|
executable, e.g.:
|
|
|
|
|
|
C:/Program Files/Xpdf/xpdfrc
|
|
|
|
|
|
Xpdf comes with a "sample-xpdfrc" file in the doc directory. This is
|
|
|
a good starting point for constructing your own config file.
|
|
|
|
|
|
For full details, please see the xpdfrc(5) man page.
|
|
|
|
|
|
|
|
|
Fonts
|
|
|
-----
|
|
|
|
|
|
By default, Xpdf will use X server fonts. It requires the following
|
|
|
fonts:
|
|
|
|
|
|
* Courier: medium-r, bold-r, medium-o, and bold-o
|
|
|
* Helvetica: medium-r, bold-r, medium-o, and bold-o
|
|
|
* Times: medium-r, bold-r, medium-i, and bold-i
|
|
|
* Symbol: medium-r
|
|
|
* Zapf Dingbats: medium-r
|
|
|
|
|
|
Most X installations should already have all of these fonts (except
|
|
|
maybe Zapf Dingbats).
|
|
|
|
|
|
If Xpdf is built with support for t1lib (or FreeType 2), you can get
|
|
|
much higher quality text by using Type 1 fonts instead of X server
|
|
|
fonts. For example, you can use the Type 1 fonts that come with
|
|
|
ghostscript. To do this, add the following lines to the xpdfrc file
|
|
|
(e.g., /usr/local/etc/xpdfrc or $HOME/.xpdfrc):
|
|
|
|
|
|
displayFontT1 Times-Roman /usr/local/share/ghostscript/fonts/n021003l.pfb
|
|
|
displayFontT1 Times-Italic /usr/local/share/ghostscript/fonts/n021023l.pfb
|
|
|
displayFontT1 Times-Bold /usr/local/share/ghostscript/fonts/n021004l.pfb
|
|
|
displayFontT1 Times-BoldItalic /usr/local/share/ghostscript/fonts/n021024l.pfb
|
|
|
displayFontT1 Helvetica /usr/local/share/ghostscript/fonts/n019003l.pfb
|
|
|
displayFontT1 Helvetica-Oblique /usr/local/share/ghostscript/fonts/n019023l.pfb
|
|
|
displayFontT1 Helvetica-Bold /usr/local/share/ghostscript/fonts/n019004l.pfb
|
|
|
displayFontT1 Helvetica-BoldOblique /usr/local/share/ghostscript/fonts/n019024l.pfb
|
|
|
displayFontT1 Courier /usr/local/share/ghostscript/fonts/n022003l.pfb
|
|
|
displayFontT1 Courier-Oblique /usr/local/share/ghostscript/fonts/n022023l.pfb
|
|
|
displayFontT1 Courier-Bold /usr/local/share/ghostscript/fonts/n022004l.pfb
|
|
|
displayFontT1 Courier-BoldOblique /usr/local/share/ghostscript/fonts/n022024l.pfb
|
|
|
displayFontT1 Symbol /usr/local/share/ghostscript/fonts/s050000l.pfb
|
|
|
displayFontT1 ZapfDingbats /usr/local/share/ghostscript/fonts/d050000l.pfb
|
|
|
|
|
|
You will need to replace '/usr/local/share/ghostscript/fonts' with the
|
|
|
appropriate path on your system.
|
|
|
|
|
|
|
|
|
Compiling Xpdf
|
|
|
--------------
|
|
|
|
|
|
See the separate file, INSTALL.
|
|
|
|
|
|
|
|
|
Bugs
|
|
|
----
|
|
|
|
|
|
If you find a bug in Xpdf, i.e., if it prints an error message,
|
|
|
crashes, or incorrectly displays a document, and you don't see that
|
|
|
bug listed here, please send me email, with a pointer (URL, ftp site,
|
|
|
etc.) to the PDF file.
|
|
|
|
|
|
|
|
|
Acknowledgments
|
|
|
---------------
|
|
|
|
|
|
Thanks to:
|
|
|
|
|
|
* Patrick Voigt for help with the remote server code.
|
|
|
* Patrick Moreau, Martin P.J. Zinser, and David Mathog for the VMS
|
|
|
port.
|
|
|
* David Boldt and Rick Rodgers for sample man pages.
|
|
|
* Brendan Miller for the icon idea.
|
|
|
* Olly Betts for help testing pdftotext.
|
|
|
* Peter Ganten for the OS/2 port.
|
|
|
* Michael Richmond for the Win32 port of pdftops and pdftotext.
|
|
|
* Frank M. Siegert for improvements in the PostScript code.
|
|
|
* Leo Smiers for the decryption patches.
|
|
|
* Rainer Menzner for creating t1lib, and for helping me adapt it to
|
|
|
xpdf.
|
|
|
* Pine Tree Systems A/S for funding the OPI and EPS support in
|
|
|
pdftops.
|
|
|
* Easy Software Products for funding the "sh" operator support.
|
|
|
* Tom Kacvinsky for help with FreeType and for being my interface to
|
|
|
the FreeType team.
|
|
|
* Theppitak Karoonboonyanan for help with Thai support.
|
|
|
* Leonard Rosenthol for help and contributions on a bunch of things.
|
|
|
* Alexandros Diamantidis and Maria Adaloglou for help with Greek
|
|
|
support.
|
|
|
* Lawrence Lai for help with the CJK Unicode maps.
|
|
|
|
|
|
Various people have contributed modifications made for use by the
|
|
|
pdftex project:
|
|
|
|
|
|
* Han The Thanh
|
|
|
* Martin Schr<68>der of ArtCom GmbH
|
|
|
|
|
|
|
|
|
References
|
|
|
----------
|
|
|
|
|
|
Adobe Systems Inc., _PDF Reference: Adobe Portable Document Format
|
|
|
version 1.4_, 3nd ed.
|
|
|
Addison-Wesley, 2001, ISBN 0-201-75839-3.
|
|
|
http://partners.adobe.com/asn/developer/acrosdk/docs/filefmtspecs/PDFReference.pdf
|
|
|
[The printed manual for PDF version 1.4.]
|
|
|
|
|
|
Adobe Systems Inc., _Portable Document Format: Changes from Version
|
|
|
1.3 to 1.4_, Adobe Developer Support Technical Note #5409.
|
|
|
June 11, 2001.
|
|
|
http://partners.adobe.com/asn/developer/acrosdk/docs/filefmtspecs/PDF14Deltas.pdf
|
|
|
[Updates for PDF 1.4.]
|
|
|
|
|
|
Adobe Systems Inc., _PostScript Language Reference_, 3rd ed.
|
|
|
Addison-Wesley, 1999, ISBN 0-201-37922-8.
|
|
|
[The official PostScript manual.]
|
|
|
|
|
|
Adobe Systems, Inc., _The Type 42 Font Format Specification_,
|
|
|
Adobe Developer Support Technical Specification #5012. 1998.
|
|
|
http://partners.adobe.com/asn/developer/pdfs/tn/5012.Type42_Spec.pdf
|
|
|
[Type 42 is the format used to embed TrueType fonts in PostScript
|
|
|
files.]
|
|
|
|
|
|
Adobe Systems, Inc., _Adobe CMap and CIDFont Files Specification_,
|
|
|
Adobe Developer Support Technical Specification #5014. 1995.
|
|
|
http://www.adobe.com/supportservice/devrelations/PDFS/TN/5014.CIDFont_Spec.pdf
|
|
|
[CMap file format needed for Japanese and Chinese font support.]
|
|
|
|
|
|
Adobe Systems, Inc., _Adobe-Japan1-4 Character Collection for
|
|
|
CID-Keyed Fonts_, Adobe Developer Support Technical Note #5078.
|
|
|
2000.
|
|
|
http://partners.adobe.com/asn/developer/PDFS/TN/5078.CID_Glyph.pdf
|
|
|
[The Adobe Japanese character set.]
|
|
|
|
|
|
Adobe Systems, Inc., _Adobe-GB1-4 Character Collection for
|
|
|
CID-Keyed Fonts_, Adobe Developer Support Technical Note #5079.
|
|
|
2000.
|
|
|
http://partners.adobe.com/asn/developer/pdfs/tn/5079.Adobe-GB1-4.pdf
|
|
|
[The Adobe Chinese GB (simplified) character set.]
|
|
|
|
|
|
Adobe Systems, Inc., _Adobe-CNS1-3 Character Collection for
|
|
|
CID-Keyed Fonts_, Adobe Developer Support Technical Note #5080.
|
|
|
2000.
|
|
|
http://partners.adobe.com/asn/developer/PDFS/TN/5080.CNS_CharColl.pdf
|
|
|
[The Adobe Chinese CNS (traditional) character set.]
|
|
|
|
|
|
Adobe Systems Inc., _Supporting the DCT Filters in PostScript Level
|
|
|
2_, Adobe Developer Support Technical Note #5116. 1992.
|
|
|
http://www.adobe.com/supportservice/devrelations/PDFS/TN/5116.PS2_DCT.PDF
|
|
|
[Description of the DCTDecode filter parameters.]
|
|
|
|
|
|
Adobe Systems Inc., _Open Prepress Interface (OPI) Specification -
|
|
|
Version 2.0_, Adobe Developer Support Technical Note #5660. 2000.
|
|
|
http://partners.adobe.com/asn/developer/PDFS/TN/5660.OPI_2.0.pdf
|
|
|
|
|
|
Adobe Systems Inc., CMap files.
|
|
|
ftp://ftp.oreilly.com/pub/examples/nutshell/cjkv/adobe/
|
|
|
[The actual CMap files for the 16-bit CJK encodings.]
|
|
|
|
|
|
Adobe Systems Inc., Unicode glyph lists.
|
|
|
http://partners.adobe.com/asn/developer/type/unicodegn.html
|
|
|
http://partners.adobe.com/asn/developer/type/glyphlist.txt
|
|
|
http://partners.adobe.com/asn/developer/type/corporateuse.txt
|
|
|
http://partners.adobe.com/asn/developer/type/zapfdingbats.txt
|
|
|
[Mappings between character names to Unicode.]
|
|
|
|
|
|
Aldus Corp., _OPI: Open Prepress Interface Specification 1.3_. 1993.
|
|
|
http://partners.adobe.com/asn/developer/PDFS/TN/OPI_13.pdf
|
|
|
|
|
|
Anonymous, RC4 source code.
|
|
|
ftp://ftp.ox.ac.uk/pub/crypto/misc/rc4.tar.gz
|
|
|
ftp://idea.sec.dsi.unimi.it/pub/crypt/code/rc4.tar.gz
|
|
|
[This is the algorithm used to encrypt PDF files.]
|
|
|
|
|
|
T. Boutell, et al., "PNG (Portable Network Graphics) Specification,
|
|
|
Version 1.0. RFC 2083.
|
|
|
[PDF uses the PNG filter algorithms.]
|
|
|
|
|
|
CCITT, "Information Technology - Digital Compression and Coding of
|
|
|
Continuous-tone Still Images - Requirements and Guidelines", CCITT
|
|
|
Recommendation T.81.
|
|
|
http://www.w3.org/Graphics/JPEG/
|
|
|
[The official JPEG spec.]
|
|
|
|
|
|
A. Chernov, "Registration of a Cyrillic Character Set". RFC 1489.
|
|
|
[Documentation for the KOI8-R Cyrillic encoding.]
|
|
|
|
|
|
Roman Czyborra, "The ISO 8859 Alphabet Soup".
|
|
|
http://czyborra.com/charsets/iso8859.html
|
|
|
[Documentation on the various ISO 859 encodings.]
|
|
|
|
|
|
L. Peter Deutsch, "ZLIB Compressed Data Format Specification version
|
|
|
3.3". RFC 1950.
|
|
|
[Information on the general format used in FlateDecode streams.]
|
|
|
|
|
|
L. Peter Deutsch, "DEFLATE Compressed Data Format Specification
|
|
|
version 1.3". RFC 1951.
|
|
|
[The definition of the compression algorithm used in FlateDecode
|
|
|
streams.]
|
|
|
|
|
|
Jim Flowers, "X Logical Font Description Conventions", Version 1.5, X
|
|
|
Consortium Standard, X Version 11, Release 6.1.
|
|
|
ftp://ftp.x.org/pub/R6.1/xc/doc/hardcopy/XLFD/xlfd.PS.Z
|
|
|
[The official specification of X font descriptors, including font
|
|
|
transformation matrices.]
|
|
|
|
|
|
Foley, van Dam, Feiner, and Hughes, _Computer Graphics: Principles and
|
|
|
Practice_, 2nd ed. Addison-Wesley, 1990, ISBN 0-201-12110-7.
|
|
|
[Colorspace conversion functions, Bezier spline math.]
|
|
|
|
|
|
Robert L. Hummel, _Programmer's Technical Reference: Data and Fax
|
|
|
Communications_. Ziff-Davis Press, 1993, ISBN 1-56276-077-7.
|
|
|
[CCITT Group 3 and 4 fax decoding.]
|
|
|
|
|
|
ISO, _Coding of Still Pictures_, IS 14492 Final Committee Draft.
|
|
|
http://www.jpeg.org/fcd14492.pdf
|
|
|
[The final draft of the JBIG2 spec.]
|
|
|
|
|
|
ITU, "Standardization of Group 3 facsimile terminals for document
|
|
|
transmission", ITU-T Recommendation T.4, 1999.
|
|
|
ITU, "Facsimile coding schemes and coding control functions for Group 4
|
|
|
facsimile apparatus", ITU-T Recommendation T.6, 1993.
|
|
|
http://www.itu.int/
|
|
|
[The official Group 3 and 4 fax standards - used by the CCITTFaxDecode
|
|
|
stream, as well as the JBIG2Decode stream.]
|
|
|
|
|
|
Christoph Loeffler, Adriaan Ligtenberg, George S. Moschytz, "Practical
|
|
|
Fast 1-D DCT Algorithms with 11 Multiplications". IEEE Intl. Conf. on
|
|
|
Acoustics, Speech & Signal Processing, 1989, 988-991.
|
|
|
[The fast IDCT algorithm used in the DCTDecode filter.]
|
|
|
|
|
|
Microsoft, _TrueType 1.0 Font Files_, rev. 1.66. 1995.
|
|
|
http://www.microsoft.com/typography/tt/tt.htm
|
|
|
[The TrueType font spec (in MS Word format, naturally).]
|
|
|
|
|
|
Thai Industrial Standard, "Standard for Thai Character Codes for
|
|
|
Computers", TIS-620-2533 (1990).
|
|
|
http://www.nectec.or.th/it-standards/std620/std620.htm
|
|
|
[The TIS-620 Thai encoding.]
|
|
|
|
|
|
P. Peterlin, "ISO 8859-2 (Latin 2) Resources".
|
|
|
http://sizif.mf.uni-lj.si/linux/cee/iso8859-2.html
|
|
|
[This is a web page with all sorts of useful Latin-2 character set and
|
|
|
font information.]
|
|
|
|
|
|
Charles Poynton, "Color FAQ".
|
|
|
http://www.inforamp.net/~poynton/ColorFAQ.html
|
|
|
[The mapping from the CIE 1931 (XYZ) color space to RGB.]
|
|
|
|
|
|
R. Rivest, "The MD5 Message-Digest Algorithm". RFC 1321.
|
|
|
[MD5 is used in PDF document encryption.]
|
|
|
|
|
|
Unicode Consortium, "Unicode Home Page".
|
|
|
http://www.unicode.org/
|
|
|
[Online copy of the Unicode spec.]
|
|
|
|
|
|
W3C Recommendation, "PNG (Portable Network Graphics) Specification
|
|
|
Version 1.0".
|
|
|
http://www.w3.org/Graphics/PNG/
|
|
|
[Defines the PNG image predictor.]
|
|
|
|
|
|
Gregory K. Wallace, "The JPEG Still Picture Compression Standard".
|
|
|
ftp://ftp.uu.net/graphics/jpeg/wallace.ps.gz
|
|
|
[Good description of the JPEG standard. Also published in CACM, April
|
|
|
1991, and submitted to IEEE Transactions on Consumer Electronics.]
|
|
|
|
|
|
F. Yergeau, "UTF-8, a transformation format of ISO 10646". RFC 2279.
|
|
|
[A commonly used Unicode encoding.]
|