Grep MS Word document


 
# 1  
Old 11-04-2008

Hi,

I have to search an MS Word document for some strings (expressions). The search should be done paragraph by paragraph: if a paragraph contains any of the strings or expressions, I have to show the entire paragraph.

Please help me out.

Thanks
Regards
Kris
# 2  
Old 11-04-2008
Is the file in unicode?
# 3  
Old 11-04-2008
Yes, it's in Unicode.
# 4  
Old 11-05-2008
Try this on your file:
Code:
iconv -f UTF-16 -t UTF-8 myfile > temporary_file
grep 'pattern to match' temporary_file

If this works the way you want, try this command on the converted file to get the whole paragraph:
Code:
sed -e '/./{H;$!d;}' -e 'x;/pattern to match/!d;' temporary_file

This assumes a blank line exists between paragraphs.
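
The same blank-line assumption also lets awk do the job in its paragraph mode. This is only a sketch: the function name para_grep is made up for illustration, and it assumes the file has already been converted to plain ASCII/UTF-8 text.

```shell
# Hypothetical helper (not from the thread): print every paragraph that
# matches a pattern. Paragraphs are assumed to be separated by blank lines.
para_grep() {
    # $1 = extended regular expression, $2 = text file
    # RS='' switches awk to paragraph mode, so each record is one paragraph.
    # ORS='\n\n' puts a blank line after every paragraph that is printed.
    awk -v RS='' -v ORS='\n\n' -v pat="$1" '$0 ~ pat' "$2"
}
```

Usage would be something like: para_grep 'pattern to match' temporary_file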
# 5  
Old 11-06-2008
Thank you for your time, but it didn't work. I am getting an invalid codeset error when I issue the following command:

iconv -f UTF-16 -t UTF-8 filename > tempfilename

Error
iconv: Invalid codeset: UTF-8: The system cannot find the file specified.
iconv: Invalid codeset: UTF-16: The system cannot find the file specified.

When I issue iconv -l, I get the following codesets:

Character sets: ISO8859-1:1987 8859 ISO8859-1 ISO8859-2 ISO8859-3 ISO8859-4 ISO8859-5 ISO8859-6 ISO8859-7 ISO8859-8 ISO8859-9 CP037 EBCDIC CP273 CP277 CP278 CP280 CP284 CP285 CP297 CP437 CP500 CP850 CP852 CP857 CP860 CP863 CP865 CP866 CP870 CP871 CP905 ISO646 646 C


Thanks in advance
Regards
kris
# 6  
Old 11-06-2008
You will have to do some reading:
A Quick Primer On Unicode and Software Internationalization Under Linux and UNIX

It explains how Unicode is used on Unix.
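
Since the iconv -l output in post #5 lists neither UTF-16 nor UTF-8, one possible workaround is to skip iconv altogether. The sketch below is an assumption, not something tested on that system: it only works if the text is UTF-16 containing nothing but ASCII-range characters, and the function name utf16_to_ascii is made up for illustration. Any non-ASCII character would come out damaged.

```shell
# Hypothetical helper (not from the thread): crude UTF-16 to ASCII conversion.
# ASSUMPTION: the input holds only ASCII-range characters, so every second
# byte is NUL and the only 0xFF/0xFE bytes are the byte order mark.
utf16_to_ascii() {
    # $1 = UTF-16 encoded file
    # LC_ALL=C makes tr work on single bytes regardless of the locale.
    # \000 is NUL; \376 and \377 are the two byte order mark bytes.
    LC_ALL=C tr -d '\000\376\377' < "$1"
}
```

Usage would be something like: utf16_to_ascii myfile > temporary_file, followed by the grep or sed command from post #4 on temporary_file.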