Applying the same awk over a directory of files with individual file output


 
# 1  
Old 11-09-2015

I am trying to apply an awk action over multiple files in a directory. It is a simple action: I want to print the first two columns (i.e. $1 and $2) of each tab-separated document and write the result to a new file, *.pp.

This is the awk that I have come up with so far, which is not giving me a result. Can someone help me identify the error?

Code:
awk FNR == 1 {if (o)close(o) o = FILENAME sub(/\*/, ".pp", o)} NR % $2,$1 {print > }

# 2  
Old 11-09-2015
Hello owwow14,

Could you please try the following and let me know if it helps?
Code:
for i in *.pp
do
   awk '{print $1 OFS $2 >> "new_input_file"}' OFS="\t" "$i"
done
  
OR
 
awk '{print $1 OFS $2 >> "new_output_file.txt";close(FILENAME)}' OFS="\t" *.pp

I haven't tested these, though; let me know if you have any questions.


Thanks,
R. Singh
# 3  
Old 11-09-2015
Dear R. Singh,

I have tried it and unfortunately it does not help: it seems to output all of the files into one file called *.pp, rather than into individual files with the suffix .pp.
Let me explain the data a bit more. The files are tab-separated, but each column can contain several words separated by spaces.
For instance, the information in the files would look something like this:

File1:
Code:
I love you man    THIS IS GREAT NEWS    5    www.url.com

File2:
Code:
I love you girl    THIS IS AWESOME NEWS    6    www.url.org

File3:
Code:
I love you son    THIS IS BAD NEWS    7    www.url.co.uk

I need to print just the first two columns to individual output files, so the output would be:

File1.pp
Code:
I love you man    THIS IS GREAT NEWS

File2.pp
Code:
I love you girl    THIS IS AWESOME NEWS

File3.pp
Code:
I love you son    THIS IS BAD NEWS

When I need to quickly extract information from a column, I usually define the separator explicitly:


Code:
awk -F'\t' '{print $1,$2}' input > output
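
One caveat with that habit: -F'\t' only sets the input separator, so print $1,$2 joins the two columns with a single space (the default OFS). A minimal sketch of keeping the tab on output, assuming a POSIX awk; the sample line and the variable names are only for illustration:

```shell
# Hypothetical one-line sample with tab-separated columns
line=$(printf 'I love you man\tTHIS IS GREAT NEWS\t5\twww.url.com')

# -F sets only the input separator; the comma in print inserts OFS (a space by default)
with_space=$(printf '%s\n' "$line" | awk -F'\t' '{print $1,$2}')

# Setting OFS as well keeps the output tab-separated
with_tab=$(printf '%s\n' "$line" | awk 'BEGIN{FS=OFS="\t"} {print $1,$2}')

printf '%s\n' "$with_space" "$with_tab"
```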

# 4  
Old 11-09-2015
Hello owwow14,

You could try the following, though I haven't tested these either. Let me know if you have any questions.
Code:
for i in file*
do
   awk '{print $1 OFS $2 >> "file"++i".pp"}' FS="\t" OFS="\t" $i
done
OR 
awk '{print $1 OFS $2 >> "file"++i".pp";close(FILENAME)}' FS="\t" OFS="\t" file*

Thanks,
R. Singh
# 5  
Old 11-09-2015
Hi again, R. Singh,

It seems to be working better, except I keep getting the error
Code:
awk: cannot open "file1021.pp" for output (Too many open files)

I tried to modify the one-liner as follows to make sure that the files were closed:

Code:
awk '{print $1 OFS $2 >> "file"++i".pp";close("file"++i".pp")}' FS="\t" OFS="\t" *

However, I am still getting the same error, just at a higher file number:
Code:
awk: cannot open "file2041.pp" for output (Too many open files)

Any ideas where the leak is coming from?
# 6  
Old 11-09-2015
Hello owwow14,

Could you please give this a try? I haven't tested this one either.
Code:
awk 'FNR==1{if(out)close(out); out="file" (++n) ".pp"} {print $1 OFS $2 > out}' FS="\t" OFS="\t" file*

Also, on reflection, the for loop in my previous post uses the same per-line ++i counter, so it would not have produced one output file per input file either.
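
As for where the leak comes from, as far as I can tell: in the earlier one-liners ++i is evaluated once per input line rather than once per input file, so every line opens a new output file. In the version with close("file"++i".pp") the counter is bumped a second time, so close() is given a name that was never opened and the real output stays open. Only odd numbers are used for output, which is why the 1021st file to be opened is now file2041.pp. A small sketch to see the effect, using three made-up one-line inputs in a scratch directory (the directory and the names in1..in3 are only for illustration, and the parentheses are added for portability):

```shell
# Scratch directory with three hypothetical one-line, tab-separated inputs
workdir=$(mktemp -d) && cd "$workdir" || exit 1
for n in 1 2 3; do
  printf 'left %s\tright %s\textra\n' "$n" "$n" > "in$n"
done

# Same pattern as the one-liner with close("file"++i".pp"):
# i is incremented twice per line, so close() never matches the file just opened
awk '{print $1 OFS $2 >> ("file" (++i) ".pp"); close("file" (++i) ".pp")}' \
    FS="\t" OFS="\t" in1 in2 in3

# List the outputs that were created; the even numbers are skipped
created=$(echo file*.pp)
printf '%s\n' "$created"
```

With one-line inputs this writes file1.pp, file3.pp and file5.pp, and none of them is closed before awk exits.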

Thanks,
R. Singh
# 7  
Old 11-09-2015
Try:
Code:
awk 'FNR==1{close(f); f=FILENAME ".pp"} {print $1,$2>f}' FS='\t' OFS='\t' File*

or

Code:
for f in File*
do
  cut -f1,2 "$f" > "$f.pp"
done
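
To see why the awk version works: FNR resets to 1 at the first line of every input file, so that is the point to close the previous output and build the next output name from FILENAME, and only one output file is open at a time. A minimal sketch on throwaway data, assuming inputs named File1..File3 in a scratch directory (the contents mirror the samples posted above). The if (f) guard is an addition here, so that close() is not called with an empty name on the very first file:

```shell
# Scratch directory with three tab-separated sample files (illustrative data)
workdir=$(mktemp -d) && cd "$workdir" || exit 1
printf 'I love you man\tTHIS IS GREAT NEWS\t5\twww.url.com\n'    > File1
printf 'I love you girl\tTHIS IS AWESOME NEWS\t6\twww.url.org\n' > File2
printf 'I love you son\tTHIS IS BAD NEWS\t7\twww.url.co.uk\n'    > File3

# FNR==1 marks the start of each input file: close the previous output,
# then derive the new output name from the current input name
awk 'FNR==1 { if (f) close(f); f = FILENAME ".pp" }
     { print $1, $2 > f }' FS='\t' OFS='\t' File1 File2 File3

cat File1.pp File2.pp File3.pp
```

If you rerun it in the same directory, note that File* would also match the File*.pp outputs of the previous run, so either list the inputs explicitly as above or write the outputs somewhere else.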


Last edited by Scrutinizer; 11-09-2015 at 03:48 PM..