

single output of awk script processing multiple files


 
# 1  
Old 08-08-2009

Hello UNIX Forum,

Since I am posting on this board, yes, I am new to UNIX!
I read a copy of "UNIX made easy" from 1990, which felt like making a "computer-science time jump" backwards.
So, basically, I have some understanding of the basic concepts.

Problem Description:
I am currently trying to write an awk script.
The script should repeat the same task on multiple input text files (extracting numerical values from specified columns) and write the output to one single output file.

The output should be formatted in 2 columns:
1st: index, 2nd: extracted value

I have got as far as extracting the information from multiple files and writing it into 1 output file. My problem is that the index starts over every time a new input file is read. I would like it to keep increasing, regardless of whether a new file has started or not.

Approach:
My code looks like this:

Code:
for i 
do
awk '{if ($1=="string") 
          print i++ " " $2  >> output_file   # index blank value
}' $i                                        # reads in the i-th input file
done

I guess that each time the loop completes one cycle the awk script exits, effectively resetting the index-value i.

Own thoughts:
1a) Is there some sort of "save-attribute" so the awk-script doesn't "forget" the index-value?

1b) Alternatively could the index i of the awk-script get saved as a "global" variable in the shell-script and locally in the awk-script?

2) The other solution I considered was to use wc -l or some awk command to see how many lines output_file already consists of. I think that would create a problem when output_file does not exist yet (which would happen in the very first run, I presume). That could probably be fixed by creating an empty output_file before any output is appended to it. Then again, if no output is written at all (because no line matches the specified value), I would need a control structure that checks output_file for content and removes the file if there is none.

3) Another idea is to create one temporary file containing all the input files and give it to awk. After the output is written, the temporary file gets deleted again.

I doubt that solutions 1a) or 1b) are possible (are they?) and I don't really like solutions 2) (too complicated) and 3) (use of a temporary file).
My actual goal was to use the awk script to keep the code as smooth and simple as possible...

Question
Which solution would you try (if any of the above), or do you have any hints for solving the problem?


Thanks in advance,
Kasimir

Last edited by DukeNuke2; 08-08-2009 at 01:02 PM.. Reason: added code tags
# 2  
Old 08-08-2009
Hi.

Instead of
Code:
for i 
do
awk '{if ($1=="string") 
          print i++ " " $2  >> output_file             # index blank value
}
' $i
done

you can pass multiple files to awk directly:
Code:
awk '$1 == "string" { print i++, $2  > "output_file" }
' file1 file2 file3 file[456] file7*

Then the counter will not reset, because a single awk process reads all the files. output_file must be quoted if it's not an Awk variable.
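
To see the effect on a small case, here is a minimal sketch. The sample data and the temporary directory are made up for illustration; only the awk line itself is taken from the post above.

```shell
# Hypothetical sample inputs in a scratch directory (not from the thread).
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
printf 'string 10\nother 99\nstring 20\n' > file1
printf 'string 30\nstring 40\n' > file2

# One awk process reads both files, so the counter i is never reset.
# An unset awk variable counts as 0, so i++ yields 0 for the first match.
awk '$1 == "string" { print i++, $2 > "output_file" }
' file1 file2

cat output_file
```

This should print the indices 0 to 3 next to the values 10, 20, 30 and 40. If the index should start at 1 instead, ++i in place of i++ would do that. Note that > inside awk truncates the file only once, when it is first opened, and later prints in the same run are added to it.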

Last edited by Scott; 08-08-2009 at 03:21 PM.. Reason: Quoted output_file variable
# 3  
Old 08-09-2009
Thank you very much for your help. I think this solved my problem.
What do the quotation marks around output_file actually mean, though - why were they necessary? What happened when not using them?
# 4  
Old 08-09-2009
It changes output_file from an awk variable into a string.

If you don't use the quotes, depending on your awk version either no output file will be written, or you'll get an error. If you declare an awk variable called output_file, then it would be fine:

Code:
/root/tmp # echo x | awk '{print > output_file}'
awk: (FILENAME=- FNR=1) fatal: expression for `>' redirection has null string value
/root/tmp # cat output_file
cat: output_file: cannot open [No such file or directory]

Code:
/root/tmp # echo x | awk '{print > "output_file"}'
/root/tmp # cat output_file 
x

Code:
/root/tmp # echo y | awk '{output_file="output_file"; print > output_file}'  
/root/tmp # cat output_file 
y
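
If the file name should come from the shell rather than being hard-coded, one option is to set the awk variable with -v. This is only a sketch: the path result.txt and the variable names outfile and out are made up for the example.

```shell
# Hypothetical output path chosen by the shell script.
tmpdir=$(mktemp -d)
outfile="$tmpdir/result.txt"

# -v assigns the awk variable "out" before any input is read,
# so the unquoted name in the redirection now has a value.
echo x | awk -v out="$outfile" '{ print > out }'

cat "$outfile"
```

The same mechanism can carry any other shell value (for example a start index) into the awk script.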

# 5  
Old 08-09-2009
Thanks for your reply!

I tested it with a couple of test files and it worked just fine (under the latest Cygwin). To finally close the case, I am looking forward to testing the code at work tomorrow.
# 6  
Old 08-10-2009
There is just one more problem:

Each input file should not be read completely, only up to the first appearance of a certain string. Once this string is read, awk should stop processing the current file and start on the next one, until all input files are processed.

My idea was something like this:

if ($1=="exit_string")
"go to the next input_file"

If all input files are read, exit.

My problem is that I don't know how to implement the "go to the next input_file" command. If I just used the exit command instead of "go to the next input_file", awk would quit after reading the first input file.

Does anybody have an idea?
# 7  
Old 08-10-2009
Hi.

If you're using LINUX with GNU awk (gawk), you can use the nextfile statement (I don't know which other UNIX Awks support this). Note that it is a statement, written without parentheses.

Code:
awk '$1 == "exit_string" { nextfile }
     $1 == "string" { print i++, $2  > "output_file" }
     
' file1 file2 file3 file[456] file7*
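
For an awk without nextfile, a flag that is reset on the first line of each file gives the same result. This is a sketch under assumptions: the variable name skip and the sample data are invented here, and the output goes to standard output instead of "output_file" to keep the example short.

```shell
# Hypothetical sample inputs (not from the thread).
tmpdir=$(mktemp -d)
printf 'string 1\nexit_string\nstring 2\n' > "$tmpdir/file1"
printf 'string 3\nexit_string\nstring 4\n' > "$tmpdir/file2"

result=$(awk '
    FNR == 1            { skip = 0 }        # new file: start reading again
    skip                { next }            # ignore the rest of this file
    $1 == "exit_string" { skip = 1; next }  # stop using this file from here on
    $1 == "string"      { print i++, $2 }   # counter keeps running across files
' "$tmpdir/file1" "$tmpdir/file2")

printf '%s\n' "$result"
```

Unlike nextfile, this still reads the remaining lines of each file and merely ignores them, so it is slower on large inputs.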


Last edited by Scott; 08-10-2009 at 10:45 AM.. Reason: conditions were wrong way round
 
