Reading in all files from parent directory (GAWK)


 
# 1  
Old 07-18-2011
Reading in all files from parent directory (GAWK)

Hi all,

I'm very, very new to scripting (let alone SHELL) and was wondering if anyone could help me out as I seem to be in a spot of bother.

I collect data (.dat files) which are automatically separated into several subdirectories, so the file paths I'm reading in at the moment look something like

'/u/Picarro/DataLog/2011/june/18/CF*nc.dat'

where I use "CF*nc.dat" to read in all files from that folder (18th June in this case). This is a problem because I need to read in all files from a whole year and combine them, i.e. merge all the files into one .csv file for processing.

Is there any way to read in ALL *.dat files in a parent directory e.g. read in all the files for june

'/u/Picarro/DataLog/2011/june'

without having to refer to the date?

Thanks!
# 2  
Old 07-18-2011
Welcome to the forum and the wonderful world of scripting.

You can use 'find' for that purpose:

Code:
find /u/Picarro/DataLog/2011/june -type f -name "*.dat"

This lists all regular files (-type f, so no directories) under /u/Picarro/DataLog/2011/june, at any depth, whose names end with '.dat'.

You can redirect that list of files into a file for further processing:
Code:
find /u/Picarro/DataLog/2011/june -type f -name "*.dat" > june2011.lst
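
To see the recursion in action without touching real data, here is a minimal sketch on a throwaway tree. The directory and file names under the temporary directory are invented for illustration; only the find options come from the commands above.

```shell
#!/bin/bash
# Build a scratch tree that mimics the month/day layout (hypothetical names).
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
mkdir -p "$tmp/june/17" "$tmp/june/18"
touch "$tmp/june/17/CF_a_nc.dat" "$tmp/june/18/CF_b_nc.dat" "$tmp/june/18/notes.txt"

# find descends into every sub-directory, so the day never has to be named.
# The list is stored outside the searched directory.
find "$tmp/june" -type f -name "*.dat" | sort > "$tmp/june.lst"

# Show the saved list: the two .dat files, but not notes.txt.
cat "$tmp/june.lst"
```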

This User Gave Thanks to mirni For This Post:
# 3  
Old 07-18-2011
If you want to join all those files into single file, then this should work:
Code:
find /u/Picarro/DataLog/2011/june -type f -name "*dat" -exec cat {} >> june.dat \;

This User Gave Thanks to bartus11 For This Post:
# 4  
Old 07-18-2011
you guys are heroes. Much appreciated!

---------- Post updated at 12:21 PM ---------- Previous update was at 11:46 AM ----------

Slight problem guys. I'll fill you in on the whole task so you can see what I'm trying to do.
My script at the moment is this:

Code:
#!/bin/bash

# input files from each day of data in june and combine into one big file
find /u/Picarro/DataLog/2011/june -type f -name "*dat" -exec cat {} >> june.dat \;

#use new combined data as input file
IN_ALL='/u/Picarro/DataLog/2011/june/june.dat' 
	
# the csv file to create for all data called 'june.csv' in the respective directory
OUT_all='/u/Picarro/DataLog/2011/Awk/june.csv'		

# gawk files to create csv file
GAWK='/u/Picarro/DataLog/2011/Awk/Format_trial.csv.awk'

#produce the OUT file from the IN file(s)
$GAWK $IN_all >> $OUT_all

So I'm trying to use the new combined data you guys just provided me with to create a .csv file, edited with another GAWK script I wrote (I wrote the GAWK script to edit the actual text, e.g. combine some columns etc.).

Unfortunately it won't create the .csv file with the new data. Is my code wrong, or is there a way of making it create and read in the combined .dat file before the .csv is created?

Not sure if any of that made sense to you, it barely does in my head lol. Let me know if you need any clarification

Thanks again
# 5  
Old 07-18-2011
Try using this:
Code:
find /u/Picarro/DataLog/2011/june -type f -name "CF*nc.dat" -exec cat {} >> /u/Picarro/DataLog/2011/june/june.dat \;
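
Two details of this version are worth spelling out. The redirection is handled by the shell that starts find, so a relative `june.dat` is created in whatever directory the script is run from, while an absolute path puts it where IN_ALL expects it. And once june.dat sits inside the searched directory, a loose pattern such as "*dat" would match it too, whereas "CF*nc.dat" does not. A small sketch on a scratch tree (file names and contents are invented):

```shell
#!/bin/bash
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
mkdir -p "$tmp/june/17" "$tmp/june/18"
printf 'day17\n' > "$tmp/june/17/CF_a_nc.dat"
printf 'day18\n' > "$tmp/june/18/CF_b_nc.dat"

combined="$tmp/june/june.dat"   # absolute path, inside the searched tree

# Same command shape as the post, pointed at the scratch tree.
find "$tmp/june" -type f -name "CF*nc.dat" -exec cat {} >> "$combined" \;

# With june.dat now present, compare what each pattern would pick up.
loose=$(find "$tmp/june" -type f -name "*dat" | wc -l)
narrow=$(find "$tmp/june" -type f -name "CF*nc.dat" | wc -l)
echo "loose pattern matches: $loose, narrow pattern matches: $narrow"
```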


Last edited by bartus11; 07-18-2011 at 08:33 AM.. Reason: changed the -name operand
# 6  
Old 07-18-2011
What exactly do you have there in your awk script /u/Picarro/DataLog/2011/Awk/Format_trial.csv.awk?
Does your awk script work on one .dat file ?
Are you getting any error or does it run ok, just the output is empty?

What does the actual data file look like and what is your desired output?
# 7  
Old 07-18-2011
The Format_trial.csv.awk script;

Code:
#!/bin/gawk -f


# This file is to restructure the picarro data into the correct .csv columns for R


# create a header with same headings as variable in table
# also set other variables before parsing data
BEGIN   {
        OFS=","         # tells awk that the output separator is a comma
        ORS=""          # tells awk to not print newline after each print command so all records are
                        # on the same line until we want a new line "\n"
        getline         # removes 1st line of input file ie header so we can replace it with correct one
}

# rearrange the yyyy-mm-dd | hh:mm:ss date and time into a single date column of dd/mm/yyyy hh:mm needed for openair
{print substr($1,9,2) "/" substr($1,6,2) "/" substr($1,1,4) " " substr($2,1,5)}		

# print the rest of variables as columns 
{print (" ", $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, $19, $20)
}


($1 $2) != prev {
        print "\n"      # newline after each 5 seconds of data has been parsed
        prev = $1 $2
}

Yes, this awk script works on one file; it creates a .csv file with the correct format. The problem is trying to input all the files at once (with one command), as opposed to having to call in every day's file.

Yes, the output (for the 'new' combined script) is just an empty 'june.csv' file. Maybe it's not recognising june.dat as the input?

The desired output is just a data frame containing 20 variables. The awk file is essentially used to combine the date and time columns into one column, in the format I need for input into an R package.

Cheers guys!
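
Since the script already works on a single file, one alternative to concatenating first is to hand all the daily files to awk in a single command. The sketch below assumes every daily file starts with its own header line (the `getline` in BEGIN only drops the first line of the whole input, so headers from later files would otherwise end up in the data). The sample rows and file names are invented, and the one-line awk program is only a stand-in for Format_trial.csv.awk.

```shell
#!/bin/bash
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
mkdir -p "$tmp/june/17" "$tmp/june/18"
printf 'DATE TIME CO2\n2011-06-17 10:00:05 390\n' > "$tmp/june/17/CF_a_nc.dat"
printf 'DATE TIME CO2\n2011-06-18 10:00:05 391\n' > "$tmp/june/18/CF_b_nc.dat"

# FNR restarts at 1 for every input file, so 'FNR == 1 { next }' drops each
# file's header. '{} +' passes as many file names as possible to one awk run.
find "$tmp/june" -type f -name "CF*nc.dat" -exec \
    awk 'BEGIN { OFS = "," } FNR == 1 { next } { print $1, $2, $3 }' {} + \
    > "$tmp/june.csv"

cat "$tmp/june.csv"
```

The order in which find hands over the files is not guaranteed to be chronological, so a sort step may be needed if row order matters.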

---------- Post updated at 01:59 PM ---------- Previous update was at 01:52 PM ----------

Ah, I see where I went wrong. I was using 'IN_ALL' to define the input file and then 'IN_all' to read it in! My, how I'm learning how precise programming has to be!
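
Shell variable names are case-sensitive, and by default a name that was never set simply expands to an empty string, so the gawk script was started without an input file argument. As a rough sketch, bash's `set -u` option turns this kind of slip into an error. The variable names mirror the script above; nothing is read from the path.

```shell
#!/bin/bash
IN_ALL='/u/Picarro/DataLog/2011/june/june.dat'

# Default behaviour ('set +u'): the misspelled name silently expands to nothing.
( set +u; echo "without -u: [$IN_all]" )

# With 'set -u' the same typo aborts with an "unbound variable" error.
# It runs in a subshell so the failure can be observed instead of ending this script.
if ( set -u; : "$IN_all" ) 2>/dev/null; then
    status="typo went unnoticed"
else
    status="typo caught"
fi
echo "with -u: $status"
```

Putting `set -u` near the top of a script is a common way to catch misspelled variable names early.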

Another quick thing guys, this programme is going to be run every day to create the .csv file. However, whenever I run it, the data is just added to the old data, so I essentially have the same results twice if I run the programme again, three times if I run it again, and so on. Is there any way to tell awk to write over the old file?
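
The repeated rows come from the shell's `>>` redirections rather than from awk itself: `>>` appends to an existing file, while `>` empties it first. Both june.dat and june.csv are written with `>>` in the script above. A minimal sketch of the difference, where `make_report` is a hypothetical stand-in for the real find and gawk steps:

```shell
#!/bin/bash
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

# Hypothetical stand-in for the real find + gawk pipeline: emits two rows.
make_report() { printf 'row1\nrow2\n'; }

for run in 1 2 3; do
    make_report >> "$tmp/appended.csv"     # grows on every run
    make_report >  "$tmp/overwritten.csv"  # rebuilt from scratch on every run
done

appended=$(grep -c . "$tmp/appended.csv")
overwritten=$(grep -c . "$tmp/overwritten.csv")
echo "appended: $appended lines, overwritten: $overwritten lines"
# prints "appended: 6 lines, overwritten: 2 lines"
```

Applied to the script above, that would mean using `>` for both the combined june.dat and june.csv, assuming each run is meant to rebuild them completely.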