AWK, extract data from multiple files


 
# 1  
Old 09-28-2010

Hi,
I'm using AWK to try to extract data from multiple files (*.txt). The script should look for a flag that occurs at a specific position in each file and it should return the data to the right of that flag.

I should end up with one line for each file, each containing 3 columns:
Filename1.txt Data1 MoreData1
Filename2.txt Data2 MoreData2
Filename3.txt Data3 MoreData3

The trouble is, my output seems to be as follows:
Filename1.txt
Filename2.txt Data1 MoreData1
Filename3.txt Data2 MoreData2

I'm not sure what I've done wrong because I'm still struggling with the syntax and structure of AWK. Any idea what is wrong?

I'm very new to UNIX, AWK, etc., so simple language would be appreciated (if possible).

Thanks.

Here's the code I have; it's in a file called "extract_data_from_files.awk":
Code:
 
#!/usr/bin/awk -f
BEGIN {
  print "Starting script...";
}
{
  if ( substr($0,2,3) == "UWI") # substr(s, a, b) returns b characters
                                # from string s, starting at position a.
  {
    well = substr($0,11,20);
    gsub(/^[ \t]+|[ \t]+$/, "", well); #trim spaces
  }
  if ( substr($0,2,3) == "LSR")
  {
    kb = substr($0,11,20);
    gsub(/^[ \t]+|[ \t]+$/, "", kb); #trim spaces
  }
  if ( FNR == 1 || EOF ) #FNR=total lines in each file. EOF=End Of File
  {
      concat = FILENAME "\t\t\t" well "\t\t\t" kb; #no commas needed to separate variables
      print concat;
  }
} 
END {
  print "Ending script...";
}

I run it from a Konsole window using:
Code:
 
awk -f extract_data_from_files.awk *.txt
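
A likely explanation for the shifted output, offered as a sketch rather than a definitive fix: FNR is the line number within the current file (not its total line count), and EOF is not a built-in awk variable, so it is empty and that half of the test is always false. The print therefore fires on line 1 of each file, before that file's UWI and LSR lines have been read, and shows whatever was collected from the previous file. One way around this is to print the previous file's values when a new file starts, and the last file's values in END. The sample files, the scratch directory, the script name and the trim() helper below are made up for illustration; the column positions (11 and 20) are taken from the script above.

```shell
# Build two small sample files (hypothetical data) in a scratch directory.
tmpdir=$(mktemp -d)
printf 'line 1\n UWI .%13s%s\n LSR .%13s%s\n' '' 'Data1' '' 'MoreData1' > "$tmpdir/Filename1.txt"
printf 'line 1\n UWI .%13s%s\n LSR .%13s%s\n' '' 'Data2' '' 'MoreData2' > "$tmpdir/Filename2.txt"

cat > "$tmpdir/extract.awk" <<'AWK'
# trim() is a helper defined here, not an awk built-in.
function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }

# First line of every file except the first: the previous file is
# complete, so print what was collected from it and reset.
FNR == 1 && NR > 1 { print prev "\t" well "\t" kb; well = kb = "" }
FNR == 1           { prev = FILENAME }

substr($0, 2, 3) == "UWI" { well = trim(substr($0, 11, 20)) }
substr($0, 2, 3) == "LSR" { kb   = trim(substr($0, 11, 20)) }

# The last file has no following file to trigger the print.
END { if (NR > 0) print prev "\t" well "\t" kb }
AWK

# Run from inside the directory so FILENAME is the bare file name.
result=$(cd "$tmpdir" && awk -f extract.awk Filename1.txt Filename2.txt)
printf '%s\n' "$result"
rm -rf "$tmpdir"
```

With these two sample files, each output line should carry the file name followed by that same file's UWI and LSR values, separated by tabs.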

# 2  
Old 09-28-2010
Post contents of sample input files and desired output.
# 3  
Old 09-28-2010
Quote:
Originally Posted by bartus11
Post contents of sample input files and desired output.
I can't be specific with the data, but this should give you an idea:
Code:
 
line 1 blah blah
line 2 blah blah
line 3 blah blah
line 4 blah blah
line 5 blah blah
line 6 blah blah
 UWI .             Data1
line 8 blah blah
line 9 blah blah
 LSR .             MoreData1
line 11 blah blah

If the above was Filename1.txt, my script should find each flag and return each value; in this case it would return "Data1" and "MoreData1".

My script finds these values but prints them on the wrong lines.

I want my output data to look like this:
Filename1.txt Data1 MoreData1
Filename2.txt Data2 MoreData2
Filename3.txt Data3 MoreData3

but it ends up looking like this:
Filename1.txt
Filename2.txt Data1 MoreData1
Filename3.txt Data2 MoreData2

I hope that makes sense. Sorry I can't use the actual files.
# 4  
Old 09-28-2010
Try:
Code:
awk 'FNR==1 && NR==1{printf "%s ", FILENAME}FNR==1 && NR!=1{printf "\n%s ", FILENAME}$1~"UWI" || $1~"LSR"{printf "%s ", $3}END{printf "\n"}' *.txt

# 5  
Old 09-29-2010
Quote:
Originally Posted by bartus11
Try:
Code:
awk 'FNR==1 && NR==1{printf "%s ", FILENAME}FNR==1 && NR!=1{printf "\n%s ", FILENAME}$1~"UWI" || $1~"LSR"{printf "%s ", $3}END{printf "\n"}' *.txt

That looks much better, the data now looks more like this:

Filename1.txt Data1 MoreData1
Filename2.txt Data2 MoreData2
Filename3.txt Data3 MoreData3

However, there's still a problem. I'm assuming that $3 means the 3rd field in that line. Unfortunately, the data I want to extract can contain a space, so this script returns only the first part of it, not the full value. For example, Data1 above can be "Da ta1", in which case only "Da" is returned. The same applies to MoreData1.

I will try to understand your code and put it into my original code.

I'm assuming the code written before the {}s is the same as an IF statement.

Thanks for the reply, I'll have a closer look now to see if I can get things working.
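
On the two guesses above, a small sketch may help. In awk, a `pattern { action }` rule runs its action for every input line where the pattern is true, which behaves like an `if` wrapped around the action. By default, fields are split on runs of spaces and tabs, so a value such as "Da ta1" ends up in both $3 and $4. The sample line and the shell variable names are invented for illustration.

```shell
# One hypothetical input line whose value contains a space.
line=' UWI .             Da ta1'

# Pattern-action form: the action runs only on lines where the pattern is true.
as_rule=$(printf '%s\n' "$line" | awk '$1 == "UWI" { print $3 }')

# Equivalent explicit if inside an unconditional action.
as_if=$(printf '%s\n' "$line" | awk '{ if ($1 == "UWI") print $3 }')

# Default whitespace splitting puts the value into two separate fields.
fields=$(printf '%s\n' "$line" | awk '{ print NF ": [" $3 "] [" $4 "]" }')

printf '%s\n%s\n%s\n' "$as_rule" "$as_if" "$fields"
```

Both forms print only the first half of the value, and the last command shows the value split across the third and fourth fields.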
# 6  
Old 09-29-2010
Code:
awk 'FNR==1 && NR==1{printf "%s ", FILENAME}FNR==1 && NR!=1{printf "\n%s ", FILENAME}$1~"UWI" || $1~"LSR"{for (i=3;i<=NF;i++)printf "%s ", $i}END{printf "\n"}' *.txt

# 7  
Old 09-29-2010
Quote:
Originally Posted by bartus11
Code:
awk 'FNR==1 && NR==1{printf "%s ", FILENAME}FNR==1 && NR!=1{printf "\n%s ", FILENAME}$1~"UWI" || $1~"LSR"{for (i=3;i<=NF;i++)printf "%s ", $i}END{printf "\n"}' *.txt

It's going to take me a while to understand your code before I can fine-tune it (I'm a beginner). It's close to being correct, but has a few problems.
- When the script finds the flag "UWI", it returns too much information (i.e. Data1 plus unnecessary text). Note: The start and end points of the data I want to extract are at fixed positions in each file.
- When the script finds the flag "LSR", it returns information to the right of what I want. The only difference between the UWI and LSR flags is that they are on different lines; the text I want to extract is in the same position on each line.

Let me have a look at it today and I'll reply tomorrow.
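
Because the wanted text sits at fixed character positions, one option is to take it with substr() rather than by looping over fields. The sketch below compares the two on a single line. It assumes the same window as the original script (20 characters starting at column 11); the sample line, its trailing text and the shell variable names are invented for illustration.

```shell
# Hypothetical line: " UWI ." in columns 1-6, padding up to column 19,
# the value "Da ta1" padded out to column 30, then text that is not wanted.
line=$(printf ' UWI .%13s%-11s%s' '' 'Da ta1' 'unwanted trailing text')

# Looping from field 3 to NF returns everything up to the end of the line.
by_fields=$(printf '%s\n' "$line" |
  awk '{ for (i = 3; i <= NF; i++) printf "%s ", $i; printf "\n" }')

# substr() takes a fixed window, and gsub() strips the padding around it.
by_position=$(printf '%s\n' "$line" |
  awk '{ v = substr($0, 11, 20); gsub(/^[ \t]+|[ \t]+$/, "", v); print v }')

printf '[%s]\n[%s]\n' "$by_fields" "$by_position"
```

The field loop picks up the trailing text as well, while the substr() window keeps the embedded space in the value and stops at column 30. If the real columns differ, only the two numbers passed to substr() should need to change.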
 