новые иконки в OpenBoard
You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
 
 
 
 
 
 
OpenBoard/thirdparty/xpdf/xpdf-3.02/doc/pdftotext.cat

120 lines
3.9 KiB

pdftotext(1) pdftotext(1)
NAME
pdftotext - Portable Document Format (PDF) to text con-
verter (version 3.02)
SYNOPSIS
pdftotext [options] [PDF-file [text-file]]
DESCRIPTION
Pdftotext converts Portable Document Format (PDF) files to
plain text.
Pdftotext reads the PDF file, PDF-file, and writes a text
file, text-file. If text-file is not specified, pdftotext
converts file.pdf to file.txt. If text-file is '-', the
text is sent to stdout.
CONFIGURATION FILE
Pdftotext reads a configuration file at startup. It first
tries to find the user's private config file, ~/.xpdfrc.
If that doesn't exist, it looks for a system-wide config
file, typically /usr/local/etc/xpdfrc (but this location
can be changed when pdftotext is built). See the
xpdfrc(5) man page for details.
OPTIONS
Many of the following options can be set with configura-
tion file commands. These are listed in square brackets
with the description of the corresponding command line
option.
-f number
Specifies the first page to convert.
-l number
Specifies the last page to convert.
-layout
Maintain (as best as possible) the original physi-
cal layout of the text. The default is to 'undo'
physical layout (columns, hyphenation, etc.) and
output the text in reading order.
-raw Keep the text in content stream order. This is a
hack which often "undoes" column formatting, etc.
Use of raw mode is no longer recommended.
-htmlmeta
Generate a simple HTML file, including the meta
information. This simply wraps the text in <pre>
and </pre> and prepends the meta headers.
-enc encoding-name
Sets the encoding to use for text output. The
encoding-name must be defined with the unicodeMap
command (see xpdfrc(5)). The encoding name is
case-sensitive. This defaults to "Latin1" (which
is a built-in encoding). [config file: textEncod-
ing]
-eol unix | dos | mac
Sets the end-of-line convention to use for text
output. [config file: textEOL]
-nopgbrk
Don't insert page breaks (form feed characters)
between pages. [config file: textPageBreaks]
-opw password
Specify the owner password for the PDF file. Pro-
viding this will bypass all security restrictions.
-upw password
Specify the user password for the PDF file.
-q Don't print any messages or errors. [config file:
errQuiet]
-cfg config-file
Read config-file in place of ~/.xpdfrc or the sys-
tem-wide config file.
-v Print copyright and version information.
-h Print usage information. (-help and --help are
equivalent.)
BUGS
Some PDF files contain fonts whose encodings have been
mangled beyond recognition. There is no way (short of
OCR) to extract text from these files.
EXIT CODES
The Xpdf tools use the following exit codes:
0 No error.
1 Error opening a PDF file.
2 Error opening an output file.
3 Error related to PDF permissions.
99 Other error.
AUTHOR
The pdftotext software and documentation are copyright
1996-2007 Glyph & Cog, LLC.
SEE ALSO
xpdf(1), pdftops(1), pdfinfo(1), pdffonts(1), pdftoppm(1),
pdfimages(1), xpdfrc(5)
http://www.foolabs.com/xpdf/
27 Febuary 2007 pdftotext(1)