Dear friends,
I'm a novice Unix user and I'm trying to learn the ropes. I have a big task to accomplish and I'm convinced Unix can get the job done, but I haven't figured out how. I recently posted on the topic of cutting text between unique text patterns, and somebody helped me a great deal. It worked great.
There are other tasks, however, that I want to accomplish.
I'm doing a content analysis of newspaper articles that I've exported from a ProQuest database as .rtf files. The files look like this when I cat them in Unix:
Quote:
\
Adoptive parents often face tactless questions: Curiosity lies behind rude\
inquiries:[Final Edition]\
Karen Miles. Edmonton Journal. Edmonton, Alta.:Mar 15, 2000. p. F7 \
\
Author(s): Karen Miles\
\
Document types: News\
\
Section: Living\
\
Publication title: Edmonton Journal. Edmonton, Alta.: Mar 15, 2000. pg. F.7\
\
Source type: Newspaper\
\
ProQuest document 220675241\
ID:\
\
Text Word Count 314\
\
Document URL: {\field{\*\fldinst{HYPERLINK "http://proquest.umi.com"}}{\fldrslt http://proquest.umi.com}}/\
pqdweb?did=220675241&Fmt=3&clientId=14119&RQT=309&VName=PQD\
\
Abstract (Document Summary)\
\
"Is she adopted?" the school registrar whispered to Kathryn Creedy, of\
Alexandria, Va., when Creedy signed Alexis, her then five- year-old Romanian\
daughter, up for school. Since Alexis could clearly hear the question, Creedy\
curtly replied: "Yes, my daughter is adopted -- and she knows it!"\
\
"The media loves to sensationalize stories." Then point out that those stories\
are rare, and that there are many successful people who were adopted -\
- including playwright Edward Albee, Olympic skater Scott Hamilton, actor\
Melissa Gilbert and Apple Computer co- founder Steven Jobs.\
\
\
\
Full Text (314 words)\
\
Copyright Southam Publications Inc. Mar 15, 2000\
\
"Is she adopted?" the school registrar whispered to Kathryn Creedy, of\
Alexandria, Va., when Creedy signed Alexis, her then five- year-old Romanian\
daughter, up for school. Since Alexis could clearly hear the question, Creedy\
curtly replied: "Yes, my daughter is adopted -- and she knows it!"\
\
But Creedy resented the probe, especially when she later learned that the\
registrar hadn't needed the information. "The woman violated our privacy just\
to satisfy her own curiosity," she says.\
The information contained in the 'fields' such as author, date, document type, and text word count would be immensely valuable to me. I would like to extract that information, preferably only the text after the field titles (author, date, text word count), although this is not necessary: I imagine Excel's find-and-replace function could strip the titles easily. I would like to get this information into some kind of Excel database, probably via a .csv file.
Ultimately, I will have hundreds of these news stories to extract the information from.
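To make the goal concrete, here is a rough sketch of the extraction in Python rather than a shell one-liner. It is only an illustration under several assumptions: each file holds a single article, the field labels appear exactly as in the sample above, every line ends with the RTF backslash, and the date can be pulled out of the "Publication title" line. The names `FIELDS`, `parse_record`, and `write_csv` are made up for this example.

```python
import csv
import re
import sys

# Field labels copied from the sample record; extend this list as needed.
FIELDS = ["Author(s)", "Document types", "Section",
          "Publication title", "Source type"]

# "Text Word Count 314" has no colon, so it gets its own pattern.
COUNT_RE = re.compile(r"^Text Word Count\s+(\d+)")
# Assumed date shape, e.g. "Mar 15, 2000", taken from the sample.
DATE_RE = re.compile(r"[A-Z][a-z]+\.? \d{1,2}, \d{4}")


def parse_record(text):
    """Return a dict of field values found in one article's text."""
    row = {}
    for raw in text.splitlines():
        # Drop the trailing RTF line-break backslash and stray spaces.
        line = raw.rstrip().rstrip("\\").strip()
        for name in FIELDS:
            prefix = name + ":"
            # Keep only the first occurrence, so body text that happens
            # to start with a label does not overwrite the header value.
            if line.startswith(prefix) and name not in row:
                row[name] = line[len(prefix):].strip()
        match = COUNT_RE.match(line)
        if match and "Text Word Count" not in row:
            row["Text Word Count"] = match.group(1)
    match = DATE_RE.search(row.get("Publication title", ""))
    if match:
        row["Date"] = match.group(0)
    return row


def write_csv(paths, out):
    """Write one CSV row per input file (assumes one article per file)."""
    columns = FIELDS + ["Text Word Count", "Date"]
    writer = csv.DictWriter(out, fieldnames=columns, restval="")
    writer.writeheader()
    for path in paths:
        with open(path, encoding="utf-8", errors="replace") as handle:
            writer.writerow(parse_record(handle.read()))


if __name__ == "__main__" and len(sys.argv) > 1:
    write_csv(sys.argv[1:], sys.stdout)
```

Saved under a hypothetical name such as extract_fields.py, it could be run as `python3 extract_fields.py *.rtf > articles.csv`, and the resulting file opened directly in Excel. Fields that are missing from a file are simply left blank in that row.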
Does anybody have any suggestions?
Simon