Dealing with Empty files, AWK and Loops


 
# 1  
Old 06-06-2012

I wrote this bit of code to calculate the mean and variance for every file in one directory and write the result to a separate folder under the same file name.
Code:
FILES="data/*"
for X in $FILES
do
	name=$(basename "$X")
	awk '{x[NR]=$0; s+=$0; n++}
	END{mean=s/n; for (i in x){ss += (x[i]-mean)^2} var=(ss/n); print mean, "," var}' "$X" > "count/${name}"
done

If a file contains no data, awk fails with the following error:
Quote:
awk: cmd. line:1: fatal: division by zero attempted
awk: cmd. line:1: fatal: division by zero attempted
The output file is still created, but it is empty.
Then I need to combine all the results into one file. The way I have managed to do it so far is
Code:
awk '{printf "%s,%s\n",FILENAME,$0}' count/* > all-results.txt

but I have some problems here and I would be grateful if you could help.

When I combine the files into columns, the files that are empty don't appear in my list.
I need those file names to be there, but with "-,-" or "NA, NA" instead of the mean and variance.

I was thinking of something like this (I'm not sure whether my logic is right, and I don't know the syntax):
Code:
FILES="data/*"
for X in $FILES
do
	name=$(basename "$X")
	if file empty                           # pseudocode: the test I don't know how to write
	then 
		print NA, NA > count/${name}    # pseudocode
	else 
		awk '{x[NR]=$0; s+=$0; n++} 
		END{mean=s/n; for (i in x){ss += (x[i]-mean)^2} var=(ss/n); print mean, "," var}' "$X" > "count/${name}"
	fi
done

awk '{printf "%s,%s\n",FILENAME,$0}' count/* > all-results.txt

but I am not sure whether I have to change my loop or the awk bit that combines the files. Any ideas?

I am also wondering whether I can optimise the code by combining those two steps, so that instead of creating a whole set of files in another directory and then merging them into a single file, the results are saved directly to a single file.

Thank you again in advance for your help
A-V

Last edited by vbe; 06-06-2012 at 11:32 AM.. Reason: typos
# 2  
Old 06-06-2012
My 2 cents:

Quote:
FILES="data/*"
Are you sure you have no sub-directories?...


There are file test operators, so use them, e.g. -b, -d, -f, -s:
Code:
if [ -s "$X" ]
then
    awk '{x[NR]=$0; s+=$0; n++}
    ..
else
    echo  "NA, NA" ....
fi
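
A minimal sketch of how the -s test above might slot into the original loop, assuming one number per line in each input file. The function name summarise is hypothetical, and printing the file name with the result (so that everything can go straight into all-results.txt without the count/ directory) is an assumption about what was wanted, not something confirmed in the thread.

```shell
#!/bin/sh
# summarise DIR: print "name,mean,variance" for each regular file in DIR,
# or "name,NA,NA" when the file is empty. (Hypothetical helper name.)
summarise() {
    for X in "$1"/*
    do
        [ -f "$X" ] || continue        # skip sub-directories and other non-files
        name=$(basename "$X")
        if [ -s "$X" ]                 # true when the file exists and its size is > 0
        then
            # Population variance in two passes over the stored values.
            awk -v name="$name" '
                { x[NR] = $0; s += $0 }
                END {
                    mean = s / NR
                    for (i in x) ss += (x[i] - mean) ^ 2
                    print name "," mean "," ss / NR
                }' "$X"
        else
            echo "$name,NA,NA"
        fi
    done
}

# Possible usage: summarise data > all-results.txt
```

Because the non-empty branch only runs when -s succeeds, awk always sees at least one record, so the division by NR cannot hit zero.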

# 3  
Old 06-06-2012
Thank you so much for your help

I was using -n instead of -s, which was not working, and I was trying to use print instead of echo.

Do you think it would be possible to fold the next awk into this code?
Code:
awk '{printf "%s,%s\n",FILENAME,$0}' count/* > all-results.txt

so that I can get the result directly, instead of making a set of files in a folder and then putting them into a single file?
# 4  
Old 06-06-2012
Look here:
File test operators
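
As a rough illustration of those operators (a sketch only; the function name describe is made up for this example): -e tests that a path exists, -d that it is a directory, -f that it is a regular file, and -s that it exists with a size greater than zero. By contrast, -n tests whether a string is non-empty, so `[ -n "$X" ]` is true for any non-empty file name regardless of what the file contains, which would explain why it did not work in the loop.

```shell
#!/bin/sh
# describe PATH: report what the common file tests say about PATH.
# (Hypothetical helper, for illustration only.)
describe() {
    if   [ ! -e "$1" ];                then echo "missing"
    elif [ -d "$1" ];                  then echo "directory"
    elif [ -f "$1" ] && [ -s "$1" ];   then echo "non-empty file"
    elif [ -f "$1" ];                  then echo "empty file"
    else                                    echo "other"
    fi
}
```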
# 5  
Old 06-06-2012
Code:
$ cat meanvar.awk

BEGIN { OFS="," }

F != FILENAME {
        if(n>0)
        {
                mean=s/n;
                for(i in x)
                {
                        ss += (x[i]-mean)^2;
                        delete x[i];
                }
                var=(ss/n);

                print F, mean, var;
        }

        ss=0;   s=0;    n=0;    F=FILENAME
}

{
        x[++n]=$0;
        s+=$0;
        next
}


END {
        if(n>0)
        {
                mean=s/n;
                for(i in x)
                {
                        ss += (x[i]-mean)^2;
                        delete x[i];
                }
                var=(ss/n);

                print F, mean, var;
        }
}

$ tail -n 100 file*
==> file1 <==
1
2
3
4
5
6
7
8
9
10

==> file2 <==
1
1
1
2
3
3
3
4

==> file3 <==

==> file4 <==
5
5
5
5
5
5
5
5
5
5

$ awk -f meanvar.awk file*

file1,5.5,8.25
file2,2.25,1.1875
file4,5,0

$
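
Note that file3 is missing from the output above: awk never runs a rule for a file with no records, so an empty file cannot announce itself. One possible way to get the "NA" rows the original question asked for is to walk ARGV in the END block and report any argument that never produced a record. This is a sketch under assumptions, not the method used above: the wrapper name meanvar_na is hypothetical, every argument is assumed to be a readable file (not a var=value assignment), and all values are held in memory until the end.

```shell
#!/bin/sh
# meanvar_na FILE...: print "file,mean,variance" per file, in argument order,
# with "file,NA,NA" for files that contained no records.
# (Hypothetical wrapper name.)
meanvar_na() {
    awk '
    BEGIN { OFS = "," }

    # Every line: count it, add it to the sum, and store it, all keyed by file.
    { n[FILENAME]++; s[FILENAME] += $0; x[FILENAME, FNR] = $0 }

    END {
        # Walk the command-line arguments so empty files are not forgotten.
        for (a = 1; a < ARGC; a++) {
            f = ARGV[a]
            if (!(f in n)) { print f, "NA", "NA"; continue }
            mean = s[f] / n[f]
            ss = 0
            for (i = 1; i <= n[f]; i++) ss += (x[f, i] - mean) ^ 2
            print f, mean, ss / n[f]
        }
    }' "$@"
}
```

Run as `meanvar_na file*` on the four sample files above, this should print the same three lines as before plus `file3,NA,NA` between file2 and file4.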

# 6  
Old 06-06-2012
This is the best source... thanks a lot.

I was looking for them but didn't know what they were called.

---------- Post updated at 11:14 AM ---------- Previous update was at 10:52 AM ----------

Corona688: that is a great way to present it... I didn't think it would be possible to do it like that, and it's very readable.

One step before I calculate the mean and variance, I have another awk command which gets the number of entries per line, which I managed to add to the other for/if loop:

Code:
awk '{print NF}' $X

Basically, the results from this line are fed to the code, but with your organized awk code I don't know where they should go.
Do they have to go before the

Quote:
ss += (x[i]-mean)^2;
Can I have an awk call inside awk code?

Thanks again for your help
# 7  
Old 06-06-2012
Code:
# Output separator is ,
BEGIN { OFS="," }

# Whenever the filename changes, print results and start over
F != FILENAME {
        if(n>0)
        {
                mean=s/n;
                for(i in x)
                {
                        ss += (x[i]-mean)^2;
                        delete x[i];
                }
                var=(ss/n);

                print F, COLS, mean, var;
        }

        ss=0;   s=0;    n=0;    F=FILENAME
}

# First line of every file, grab the number of columns
FNR==1 { COLS=NF }

# Every single line, store result and add to sum
{
        x[++n]=$0;
        s+=$0;
}

# Also a special case for when the last file ends.
END {
        if(n>0)
        {
                mean=s/n;
                for(i in x)
                {
                        ss += (x[i]-mean)^2;
                        delete x[i];
                }
                var=(ss/n);

                print F, COLS, mean, var;
        }
}
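
The same calculation appears twice in the script above, once for a change of file name and once in END. A possible tidy-up, offered only as a sketch, is to move it into an awk function. The names report and meanvar_cols are hypothetical; the extra names in the function's parameter list are awk's usual idiom for local variables.

```shell
#!/bin/sh
# meanvar_cols FILE...: print "file,columns,mean,variance" for each non-empty file.
# (Hypothetical wrapper name; same logic as the script above, deduplicated.)
meanvar_cols() {
    awk '
    # Print the results for the file that just ended, if it had any lines.
    # i, mean and ss are locals, so ss starts out empty on every call.
    function report(   i, mean, ss) {
        if (n == 0) return
        mean = s / n
        for (i in x) { ss += (x[i] - mean) ^ 2; delete x[i] }
        print F, COLS, mean, ss / n
    }

    BEGIN { OFS = "," }

    # Whenever the file name changes, print results and start over.
    F != FILENAME { report(); s = 0; n = 0; F = FILENAME }

    # First line of every file: grab the number of columns.
    FNR == 1 { COLS = NF }

    # Every line: store the value and add it to the sum.
    { x[++n] = $0; s += $0 }

    # The last file has no following file to trigger the report.
    END { report() }
    ' "$@"
}
```

If the numbers to be averaged are the per-line field counts themselves, as the `awk '{print NF}'` step in post #6 may suggest, replacing `$0` with `NF` in the "every line" rule would be the change to try; no awk inside awk is needed for that.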


Last edited by Corona688; 06-06-2012 at 01:34 PM..
 