perform 3 awk commands to multiple files in multiple directories


 
# 1  
Old 10-26-2011

Hi,

I have a directory /home/datasets/ which contains 720 subdirectories called hour_1/, hour_2/, etc. Each of them holds a single text file named after its directory (hour_1.txt in hour_1/, hour_2.txt in hour_2/, etc.), and I would like to do some text processing on these files.

Each of these text files contains records (every record is unique; there are no duplicates). First I want to write each record to its own file, named after the record's second field ($2 is an identifier of the form cust_xxx_yyy). I'm currently doing this (example for the file hour_1/hour_1.txt):

(1)
Code:
awk '{print $0 > $2".txt"}' hour_1.txt

which results in multiple .txt files whose names start with cust_

Then I want each of these files as a single-column file, so I do this:

(2)
Code:
awk '{print >  "n_"FILENAME}' RS=" " cust_*

Finally I want to remove the first 3 records of the newly created files, so I do the following:

(3)
Code:
awk 'FNR>3 {print > "fin_"FILENAME}' n_cust*


I know that there might be an easier way of doing this even for a single directory, but is there a way to write a universal script and perform these 3 commands in all the directories?
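On the "easier way" for a single directory: a minimal sketch, assuming the fields in each record are separated by single spaces, so that the three records dropped in step (3) are exactly fields 1 to 3. Under that assumption one awk call can write fields 4 onward, one per line, straight into the final file. The sample records and the scratch directory are made up for illustration; the variable name outdir is hypothetical.

```shell
# Scratch directory with made-up sample data standing in for hour_1/hour_1.txt
workdir=$(mktemp -d)
printf '%s\n' \
    '10 cust_001_aaa 7 1.5 2.5 3.5' \
    '11 cust_002_bbb 8 4.5 5.5' > "$workdir/hour_1.txt"

awk -v outdir="$workdir" '{
    # same file name the three-step version ends up with
    out = outdir "/fin_n_" $2 ".txt"
    # fields 1-3 correspond to the three lines that step (3) removes
    for (i = 4; i <= NF; i++)
        print $i > out
    # close each file when done; assumes every $2 value occurs only once
    close(out)
}' "$workdir/hour_1.txt"

cat "$workdir/fin_n_cust_001_aaa.txt"
```

Unlike the three-step version, this does not leave the intermediate cust_*.txt and n_cust_*.txt files behind, and it does not produce the trailing blank line that RS=" " adds at the end of each n_ file.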

Thanks in advance!

Last edited by Franklin52; 10-27-2011 at 04:11 AM.. Reason: Please use code tags, thank you
# 2  
Old 10-26-2011
You can try to use the script I sent you in this post as a base script:
Code:
find <Path> -name "hour_*.txt" -type f | \
while read fname
do
	fileBaseName=`basename "${fname}"`
	fileDirName=`dirname "${fname}"`

	echo "fileBaseName: [${fileDirName}][${fileBaseName}] - fname[${fname}]"
done

This may help you as a starting point. =o)
# 3  
Old 10-26-2011
thank you again felipe.vinturin for your quick response

I had (and still have) your code firmly in mind, but if I use the find <Path> part, wouldn't it need to be run inside a loop in order to reach each specific directory out of the 720 in the main directory? (Please be aware that I'm a newbie in shell scripting.)

So the main directory is: /home/datasets/

and in there are 720 directories. If I use find <Path> to reach the single .txt file (and then put these 3 awk commands in the do-done block), wouldn't I have to call find again each time to get the next path (e.g. /home/datasets/hour_1/, then /home/datasets/hour_2/, etc.)?
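On this question: find descends into every subdirectory of the path it is given, so a single call lists all matching files and the while loop body runs once per file. A small sketch with a scratch tree (three made-up hour_N directories standing in for the 720 real ones; the variable names root and listing are hypothetical):

```shell
# Build a tiny stand-in for /home/datasets with three subdirectories
root=$(mktemp -d)
for n in 1 2 3; do
    mkdir "$root/hour_$n"
    echo "record" > "$root/hour_$n/hour_$n.txt"
done

# One find call; the loop sees each file's full path in turn
listing=$(
    find "$root" -name 'hour_*.txt' -type f |
    while IFS= read -r fname; do
        echo "dir=$(dirname "$fname") file=$(basename "$fname")"
    done
)
echo "$listing"
```

The output has one line per hour_N.txt file; the order in which find reports them is not guaranteed.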

thanks again

---------- Post updated at 02:42 PM ---------- Previous update was at 02:14 PM ----------

hi felipe again, i actually have tried your code by doing this:
Code:
!usr/bin/sh

find /home/tester/datasets/ -name "hour_*.txt" -type f | \

while read fname
do
	fileBaseName = `basename "${fname}" `
	fileDirName = `dirname "${fname}" `

#	echo "fileBaseName: [${fileDirName}][${fileBaseName}] - fname[${fname}]"

	awk '{print $0 > $2".txt"}' fileBaseName
	awk '{print > "n_"FILENAME}' RS= " " "cust_*.txt"
	awk 'FNR>3 {print > "fin_"FILENAME}' "n_cust*.txt"

done


I get errors on fileBaseName and fileDirName and, as expected, some errors in the awk commands. Does the fileBaseName and fileDirName failure have to do with the fact that I'm under cygwin?

cheers and thank you again!

Last edited by Franklin52; 10-27-2011 at 04:12 AM.. Reason: Please use code tags, thank you
# 4  
Old 10-27-2011
Hi,

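You were getting errors because you were using only the filename, not the filename with its path. Also, a variable must be referenced as ${name} for the shell to expand it (the first awk command was given the literal word fileBaseName instead of the variable's value):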

Code:
find /home/tester/datasets/ -name "hour_*.txt" -type f | \
while read fname
do
	fileBaseName = `basename "${fname}" `
	fileDirName = `dirname "${fname}" `

#	echo "fileBaseName: [${fileDirName}][${fileBaseName}] - fname[${fname}]"

	awk -v outputPath="${fileDirName}" '{print $0 > outputPath "/" $2 ".txt"}' "${fname}"
	awk -v outputPath="${fileDirName}" '{print > outputPath "/" "n_" FILENAME}' RS= " " "${fileDirName}/cust_*.txt"
	awk -v outputPath="${fileDirName}" 'FNR>3 {print > outputPath "/" "fin_" FILENAME}' "${fileDirName}/n_cust*.txt"
done

This version uses the paths and filenames.

One more comment, I have not tested it!

I hope it helps.
# 5  
Old 10-27-2011
Hi,

It seems that the problem is caused by fileBaseName and fileDirName: they don't actually hold any values, and I keep getting these errors:

Quote:
execAWK.sh line 7 : fileBaseName : command not found
execAWK.sh line 8 : fileDirName : command not found
Are these variables built into the shell and globally available, or just your own?

thank you again
# 6  
Old 10-27-2011
The error occurs because of the spaces around the equal sign in the variable assignments: with them, the shell treats fileBaseName as the name of a command to run. Change it to:
Code:
	fileBaseName=`basename "${fname}" `
	fileDirName=`dirname "${fname}" `

# 7  
Old 10-27-2011
When I copied your script, I did not notice the spaces between the variable name, the equal sign and the command. Here it is without them:

Code:
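find /home/tester/datasets/ -name "hour_*.txt" -type f | \
while IFS= read -r fname
do
	fileBaseName=`basename "${fname}"`
	fileDirName=`dirname "${fname}"`

#	echo "fileBaseName: [${fileDirName}][${fileBaseName}] - fname[${fname}]"

	# (1) one file per record, named after field 2, written next to the input file
	awk -v outputPath="${fileDirName}" '{print $0 > (outputPath "/" $2 ".txt")}' "${fname}"
	# (2) one field per line; RS=" " must be a single argument and the glob must stay outside the quotes.
	#     FILENAME already includes the directory, so strip it before adding the n_ prefix.
	awk -v outputPath="${fileDirName}" '{f = FILENAME; sub(/.*\//, "", f); print > (outputPath "/n_" f)}' RS=" " "${fileDirName}"/cust_*.txt
	# (3) drop the first 3 lines of each n_ file
	awk -v outputPath="${fileDirName}" 'FNR>3 {f = FILENAME; sub(/.*\//, "", f); print > (outputPath "/fin_" f)}' "${fileDirName}"/n_cust*.txt
done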

Code:
fileBaseName = `basename "${fname}"` # Wrong
fileBaseName=`basename "${fname}"`   # Correct
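An alternative that avoids the path handling inside awk altogether (not from the posts above, just a sketch): loop over the hour_N directories with a shell glob, cd into each one in a subshell, and run the three original commands there. It assumes every hour_N/ directory contains a file called hour_N.txt. The scratch tree and sample records are made up; in real use root would be /home/datasets.

```shell
# Scratch tree standing in for /home/datasets (sample data is made up)
root=$(mktemp -d)
mkdir "$root/hour_1" "$root/hour_2"
echo '10 cust_001_aaa 7 1.5 2.5' > "$root/hour_1/hour_1.txt"
echo '11 cust_002_bbb 8 3.5 4.5' > "$root/hour_2/hour_2.txt"

for dir in "$root"/hour_*/; do
    (
        # The subshell keeps the cd local to this iteration
        cd "$dir" || exit 1
        input=$(basename "$dir").txt
        awk '{print $0 > ($2 ".txt")}' "$input"             # step (1)
        awk '{print > ("n_" FILENAME)}' RS=" " cust_*.txt    # step (2)
        awk 'FNR>3 {print > ("fin_" FILENAME)}' n_cust_*.txt # step (3)
    )
done

ls "$root/hour_1"
```

Because the commands run inside each directory, FILENAME is a bare file name and the n_ and fin_ prefixes end up where they were intended.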
