Extracting information from text fields.


 
# 15  
Old 06-08-2007
Hi: OK, that works great! I'm 90% there...
The only thing is that I need to adapt the sed expression to some other fields I need to extract. Could you walk me through that sed expression and explain what each part does?
# 16  
Old 06-08-2007
Spindoctor,
The "sed" command takes the "Author" line:
Code:
Author(s): Karen Miles\

and removes anything that is not part of the author's name, namely:
Code:
1) "Author(s): "
2) "\"

# 17  
Old 06-08-2007
Well, this is weird: it's working to the extent that it's giving me material I can work with; but it's not removing the phrase "Author(s)" from the output line. It's giving me:
Author(s): ALLEN PANZERI Journal Staff Writer
# 18  
Old 06-08-2007
Spindoctor,
See if this solves the issue:
Code:
for mFile in *
do
  echo "Now working with file = <$mFile>"
  # First -e strips everything up to and including ": "; second -e drops the first backslash.
  egrep 'Author' "$mFile" | sed -e 's/.*: \(.*\)/\1/' -e 's;\\;;'
done
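
As a rough walk-through of the two expressions in the command above, here is a minimal sketch that runs them on a single sample line. The variable names `line` and `result` are only for illustration, and the trailing backslash on the sample is assumed from the "Karen Miles\" example earlier in the thread.

```shell
# Sample input: the line from post #17, with an assumed trailing backslash.
line='Author(s): ALLEN PANZERI Journal Staff Writer\'

# 's/.*: \(.*\)/\1/'  -> ".*: " matches everything up to and including the
#                        last ": "; \(.*\) captures the rest; \1 keeps only
#                        the captured part.
# 's;\\;;'            -> same "s" command with ";" as the delimiter; it
#                        replaces the first backslash with nothing.
result=$(printf '%s\n' "$line" | sed -e 's/.*: \(.*\)/\1/' -e 's;\\;;')

printf '%s\n' "$result"
```

Because `.*` is greedy, a field whose value itself contains ": " would lose everything before the last occurrence, so the pattern may need tightening for other fields.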

# 19  
Old 06-08-2007
Shell_life, thanks for this.
I just noticed that in some of my students' files, the authors' names are on the line below the field name "Author". Is there a way to use the search string you just provided to capture the next line as well? In some files the text is all on one line; in others it is split between two lines. There must be some Unix/DOS conversion problem.
# 20  
Old 06-08-2007
Spindoctor,
A picture is worth a thousand words.
Ideally, you would post a sample of what you described.
In any event, here is one way to solve what I understood
you to be saying:
Code:
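mREa='/Author/,/\\/p'       # print from the "Author" line through the next later line containing a backslash
mREb='s/.*:\(.*\)/\1/'      # drop everything up to and including the last colon
mREc='s;\\;;'               # remove the first backslash
mREd='s/^ *\(.*\)/\1/'      # strip leading spaces
mREe='/^$/d'                # delete lines left empty
for mFile in *
do
  echo "Now working with file = <$mFile>"
  # sed reads the file directly: a preceding egrep would discard the line that follows "Author"
  sed -n "$mREa" "$mFile" | sed -e "$mREb" -e "$mREc" -e "$mREd" -e "$mREe"
done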

# 21  
Old 06-08-2007
Fair enough, Shell_life.
Some of the files are formatted as follows (this is the nice, unproblematic way):

Quote:
Peace River Tory resigns key post:[Final Edition]
NORM OVENDEN Journal Ottawa Bureau. Edmonton Journal. Edmonton, Alta.:May 6,
1992. p. A10

Author(s): NORM OVENDEN Journal Ottawa Bureau

Document types: NEWS

Dateline: Ottawa

Publication title: Edmonton Journal. Edmonton, Alta.: May 6, 1992. pg. A.10

Source type: Newspaper

ProQuest document 193438631
ID:

Text Word Count 294

Document URL: http://proquest.umi.com/
pqdweb?did=193438631&Fmt=3&clientId=14119&RQT=309&VName=PQD
The more problematic way is:

Quote:
Daishowa mill won't meet environmental standards:[Final Edition]
SCOTT McKEEN Journal Staff Writer. Edmonton Journal. Edmonton, Alta.:May 3, 1989.
p. B2

Author(s):
SCOTT McKEEN Journal Staff Writer
Document types:
NEWS
Publication title:
Edmonton Journal. Edmonton, Alta.: May 3, 1989. pg. B.2
Source type:
Newspaper
ProQuest document ID:
191197901
Text Word Count
503
Document URL:
http://proquest.umi.com/pqdweb?did=1...lientId=14119&
RQT=309&VName=PQD
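
One possible way to cope with both layouts shown above is sketched below. It is not the approach posted earlier in the thread: it swaps the sed pipeline for awk, and the function name `extract_author` is hypothetical. It assumes the field label always starts a line as "Author(s):", that the value is either on the same line or on the next non-blank line, and that stray carriage returns or a trailing backslash may be present.

```shell
# Hypothetical helper: print the author field of one file, whichever
# of the two layouts the file uses.
extract_author() {
  # Remove DOS carriage returns first, then let awk find the field.
  tr -d '\r' < "$1" | awk '
    # Second layout: the label line was empty, so take the next non-blank line.
    found && NF {
      sub(/^[ \t]+/, "")      # strip leading whitespace
      sub(/\\$/, "")          # strip a trailing backslash, if any
      print
      exit
    }
    /^Author\(s\):/ {
      sub(/^Author\(s\):[ \t]*/, "")   # drop the field label
      sub(/\\$/, "")                   # strip a trailing backslash, if any
      if (length($0)) { print; exit }  # first layout: value on the same line
      found = 1                        # otherwise wait for the next line
    }
  '
}
```

Inside the existing `for mFile in *` loop this would be called as `extract_author "$mFile"`. The same shape should carry over to the other fields by changing the label in the two patterns.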
 