A simple question on copy/paste

James Wilkinson fedora at westexe.demon.co.uk
Thu Jun 1 12:07:03 UTC 2006


Stephen Liu wrote:
> All documents on the Internet printed as .ps files and/or later
> converted to .pdf have a problem.  Although they can be retrieved and
> read, their text can't be highlighted and copied/pasted.  I don't know
> what mistake was made.  Any suggestions?

Text can be stored in a PDF either as a series of character codes (e.g.
65 is A) plus font information, or as a picture of the text. Which one
you get depends on how the PDF was created (and in this case, how the
PostScript version was created in the first place -- often there'll be
an option somewhere in whatever created them to create bitmaps or
include font information). Only the first kind can be highlighted and
copied.
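
To make the difference concrete, here is a minimal Python sketch. It is
only an illustration, not how a PDF is actually laid out internally; the
names decode and BITMAP_A are made up for this example.

```python
# Text kept as character codes: each letter is a number, and a font
# later turns that number into a shape on the page.
text = "Copy me"
codes = [ord(ch) for ch in text]  # "C" becomes 67, "A" would be 65


def decode(codes):
    """Turn a list of character codes back into a string."""
    return "".join(chr(c) for c in codes)


# The same idea as a picture: a tiny hand-drawn bitmap of the letter A.
# It holds only pixels ("#" is ink, "." is paper). Nothing in it says
# "this is character 65", so there is nothing for copy/paste to pick up.
BITMAP_A = [
    "..#..",
    ".#.#.",
    "#####",
    "#...#",
    "#...#",
]

if __name__ == "__main__":
    print(codes)
    print(decode(codes))
    print("\n".join(BITMAP_A))
```

Going from codes back to text is trivial; going from BITMAP_A back to
"A" means recognising the shape.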

Once it's turned into a picture, then there's no easy way to go back to
the text it was created from. This isn't a limitation of your program,
but of existing technology. There are "OCR" programs that can "read" the
text in the same way as you or I would -- they look at the shapes, and
try to recognise letters. But they aren't foolproof (or particularly
fast).
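
As a rough idea of what "looking at the shapes" means, here is a toy
Python sketch that compares a small bitmap against known letter shapes
and picks the closest one. Real OCR programs are far more sophisticated;
the 5x5 templates and the names TEMPLATES, distance and recognise are
invented for this example.

```python
# Known letter shapes, drawn by hand ("#" is ink, "." is paper).
TEMPLATES = {
    "A": ["..#..", ".#.#.", "#####", "#...#", "#...#"],
    "H": ["#...#", "#...#", "#####", "#...#", "#...#"],
}


def distance(a, b):
    """Count the pixels that differ between two equally sized bitmaps."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def recognise(glyph):
    """Return the template letter whose shape differs least from glyph."""
    return min(TEMPLATES, key=lambda letter: distance(glyph, TEMPLATES[letter]))


# An "A" with one stray pixel of ink, as you might get from a scan.
smudged_a = ["..#.#", ".#.#.", "#####", "#...#", "#...#"]

if __name__ == "__main__":
    print(recognise(smudged_a))
```

The result is always a best guess: with enough stray pixels the closest
template is the wrong letter, which is why OCR output needs checking.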

Hope this helps,

James.

-- 
E-mail address: james | Helpful Advice from Thames Water:
@westexe.demon.co.uk  | "If you have difficulty reading this leaflet,
                      | please ask someone to help you."
                      |     -- Read on "The News Quiz", BBC Radio 4



