
Re: Shell script for extracting text from MS Word docs



I've never used the wvNNNN tools because I only want the text. I use the
strings command instead, e.g.:

strings -n 4 word.DOC > word.txt

The output will contain a bunch of unneeded text before the actual body of
the document, but after a few cuts/deletes you should have a pretty nice
representation of it. This gets you the text only. For images, I usually
just visit my friendly NT workstation and convert them there using Word,
which the company requires us to have.
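
If strings isn't available, roughly the same filtering can be sketched in
Python. This is only an illustration of the idea, not how strings itself is
implemented: the function name printable_runs is made up for this example,
and it assumes the text is stored as plain ASCII bytes, so text stored in a
16-bit encoding would be missed.

```python
import re

MIN_LEN = 4  # same threshold as `strings -n 4`


def printable_runs(data: bytes, min_len: int = MIN_LEN) -> list[str]:
    """Return every run of at least min_len printable ASCII bytes.

    "Printable" here means tab plus the range from space to tilde,
    which is an assumption that approximates what strings reports.
    """
    # Build a byte pattern such as [\t\x20-\x7e]{4,}
    pattern = re.compile(rb"[\t\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]
```

To use it, read the document in binary mode (open("word.DOC", "rb").read()),
pass the bytes to printable_runs, and write the returned runs out one per
line. The same cuts/deletes are still needed afterwards.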

Karl L. Pearson
Senior Consulting Systems Analyst
Senior Consulting Database Analyst
karlp ourldsfamily com

On Wed, 28 Mar 2001, Venkatesh, PC wrote:

Sorry, this is a bit off-topic for this list.  Does anyone know of a shell
script/command line method for extracting text from MS Word docs (i.e.,
without opening the doc in AbiWord, etc.)?
And what mailing lists might be useful for questions of this sort?  Thanks.

++++++++++++++++++++++++++++++++++++++++
P.C. Venkatesh
Risk Analysis Division, OCC
Mail Stop 2-1, 250 E Street, S.W.,
Washington D.C. 20219, USA.

vox: 202 874 8698
fax: 202 874 5394
email: pc venkatesh occ treas gov



_______________________________________________
Redhat-install-list mailing list
Redhat-install-list redhat com
https://listman.redhat.com/mailman/listinfo/redhat-install-list




