Dealing with Empty files, AWK and Loops


 
# 8  
Old 06-06-2012
It gives 0 for all my means and variances.

My data is all in one column, either words like

Quote:
hello this is me
hello this is me again
or a set of numbers like

Quote:
1 2 4 5 6 7
8 9 0 1 2 4
That's why I was using awk's NF.
# 9  
Old 06-06-2012
Something just occurred to me. You realize by adding $0, you're really only getting the first column, yes? I don't think your original program was doing quite what you thought it did.

How am I supposed to sum up a set of words like "hello this is me"?

What calculations do you expect to be done for those two lines?
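
To illustrate the point about adding $0, here is a minimal sketch using the sample lines quoted earlier in the thread. The variable names are made up for the demo. When awk treats a whole line as a number, it keeps only the leading numeric prefix, so a line of words counts as 0, while NF reports how many fields the line has.

```shell
# Sample lines copied from the posts above; variable names are hypothetical.
numbers='1 2 4 5 6 7
8 9 0 1 2 4'
words='hello this is me
hello this is me again'

# s += $0 converts the whole line to a number: only the leading numeric prefix counts.
sum_numbers=$(printf '%s\n' "$numbers" | awk '{ s += $0 } END { print s + 0 }')
sum_words=$(printf '%s\n' "$words" | awk '{ s += $0 } END { print s + 0 }')

# NF is the number of whitespace-separated fields on each line.
fields_words=$(printf '%s\n' "$words" |
    awk '{ printf "%s%s", (NR > 1 ? " " : ""), NF } END { print "" }')

echo "sum of numeric lines: $sum_numbers"    # 1 + 8
echo "sum of word lines:    $sum_words"      # words convert to 0
echo "fields per word line: $fields_words"
```

That would explain means and variances of 0 for the word files: every line contributes 0 to the sum.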
# 10  
Old 06-06-2012
You might be right. I am experimenting with it, so it might not be correct.

I get a set of files that have data per line. The lines are either words or numbers separated by spaces.

What I think I was doing is getting the count of words or numbers per line using awk's NF,
then calculating the mean and variance of these counts per file,
and finally combining all of them into one single file of means and variances.

Quote:
step 1
hello this is me
hello this is me again

step 2 using awk nf
4
5

step 3
calculate mean and variance for all files in the directory

step 4
put them all in one file
# 11  
Old 06-06-2012
Yes...

So, what numbers would you expect for a file containing these lines:

Code:
hello this is me
hello this is me again

and, what numbers would you expect for a file containing these lines:

Code:
1 2 4 5 6 7
8 9 0 1 2 4

And how would you calculate them? Don't say it, show it. Otherwise, I'm just guessing here...
# 12  
Old 06-06-2012
For the text one it will be

first
Quote:
4
5
using awk NF, then computing the mean and variance of those counts.

For the numbers it is the same:
Quote:
6
6
using awk NF and then the mean and variance.
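
The calculation described here can be sketched as follows. This assumes the population variance (divide by n), which is what the awk code elsewhere in this thread uses; field_stats is a hypothetical helper name, not something from the original posts.

```shell
# field_stats: read lines on stdin, print "mean,variance" of the per-line field counts.
# (Hypothetical helper; prints NA,NA when there is no input at all.)
field_stats() {
    awk '
        { x[NR] = NF; s += NF }            # remember each count and the running total
        END {
            if (NR == 0) { print "NA,NA"; exit }
            mean = s / NR
            for (i = 1; i <= NR; i++) ss += (x[i] - mean) ^ 2
            print mean "," ss / NR         # population variance: divide by n
        }'
}

text_stats=$(printf 'hello this is me\nhello this is me again\n' | field_stats)
num_stats=$(printf '1 2 4 5 6 7\n8 9 0 1 2 4\n' | field_stats)

echo "text:    $text_stats"    # counts 4 and 5
echo "numbers: $num_stats"     # counts 6 and 6
```

For the text file the counts are 4 and 5, so the mean is 4.5 and the variance is ((4-4.5)^2 + (5-4.5)^2)/2 = 0.25. For the numbers file both counts are 6, so the mean is 6 and the variance is 0.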
# 13  
Old 06-06-2012
I see, so you don't care about their values at all, just how many columns there are! That's an important difference. Let's try this again...

Code:
# Output separator is ,
BEGIN { OFS="," }

# Whenever the filename changes, print results and start over
F != FILENAME {
        if(n>0)
        {
                mean=s/n;
                for(i in x)
                {
                        ss += (x[i]-mean)^2;
                        delete x[i];
                }
                var=(ss/n);

                print F, COLS, mean, var;
        }

        ss=0;   s=0;    n=0;    F=FILENAME
}

# First line of every file, grab the number of columns
FNR==1 { COLS=NF }

{
        x[++n]=NF;
        s+=NF;
        next
}

# Also a special case for when the last file ends.
END {
        if(n>0)
        {
                mean=s/n;
                for(i in x)
                {
                        ss += (x[i]-mean)^2;
                        delete x[i];
                }
                var=(ss/n);

                print F, COLS, mean, var;
        }
}
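
As a possible variation on the program above, the reporting code that appears twice (once when the filename changes, once in END) could be moved into a function. This is only a sketch: report() is a made-up name, the COLS column is left out, and the sample files are written to a throwaway directory (mktemp -d is assumed to be available).

```shell
# Sample files from the thread, written to a temporary directory.
dir=$(mktemp -d)
printf 'hello this is me\nhello this is me again\n' > "$dir/words.txt"
printf '1 2 4 5 6 7\n8 9 0 1 2 4\n' > "$dir/numbers.txt"

results=$(cd "$dir" && awk '
    # report() is a hypothetical helper: print stats for the file just finished, then reset.
    function report(   i, mean, ss) {
        if (n > 0) {
            mean = s / n
            for (i = 1; i <= n; i++) ss += (x[i] - mean) ^ 2
            print F, mean, ss / n
        }
        n = 0; s = 0
    }
    BEGIN { OFS = "," }
    F != FILENAME { report(); F = FILENAME }   # a new file has started
    { x[++n] = NF; s += NF }                   # collect the field count of every line
    END { report() }                           # the last file has no successor
' numbers.txt words.txt)

echo "$results"
rm -rf "$dir"
```

Note that an empty file never produces a record, so neither version prints a line for it; empty files have to be handled outside awk if they need an NA entry.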

# 14  
Old 06-06-2012
They are not columns, and I still get 0. The number that shows as COLS is not accurate either, because I might have a different number of words or digits on each line...
They are mainly text files broken into lines, plus a few that have numbers but still need to be treated as text, and they are.
I get a file and break it into lines whenever I reach a ".".
Code:
sed -n -e ":a" -e "$ s/\n/ /gp;N;b a" data.txt | tr '\. ' '\n '  > afile

I have a set of these files:
Quote:
file 1
this is me
this is me again

file 2
hello hello hello
hello

file 3
2 3 5
73 543 567 56788
which should give the following numbers:
Quote:
file 1
3
4

file 2
3
1

file 3
3
4
Then these should be fed into the mean and variance calculation.

I have this, which works, but I want to see how to make it look like the one that you wrote nicely:

Code:
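FILES="data/*"
for X in $FILES
do
	name=$(basename "$X")
	# Test the file itself; the bare name only resolves inside data/
	if [ -s "$X" ]
	then
		# Count fields per line, then take the mean and population variance of the counts
		awk '{print NF}' "$X" | awk '{x[NR]=$0; s+=$0; n++}
		END{mean=s/n; for (i in x){ss += (x[i]-mean)^2} var=(ss/n); print mean, "," var}' > "count/${name}"
	else
		echo "NA, NA" > "count/${name}"
	fi
done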

And then I put them all in one file:
Code:
awk '{printf "%s,%s\n",FILENAME,$0}' count/* > all-results.txt
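
One way the loop and the combining step might be folded together is sketched below: a single awk call per non-empty file, NA for empty ones, and the file name written directly into the combined output. The directory layout, the sample files and the variable names are assumptions made for the demo, not the real data.

```shell
# Throwaway input: two sample files from this thread plus one empty file.
work=$(mktemp -d)
mkdir "$work/data"
printf 'this is me\nthis is me again\n' > "$work/data/file1"
printf 'hello hello hello\nhello\n' > "$work/data/file2"
: > "$work/data/empty"

for f in "$work"/data/*
do
    name=$(basename "$f")
    if [ -s "$f" ]            # non-empty: compute stats on the per-line field counts
    then
        stats=$(awk '{ x[NR] = NF; s += NF }
                     END { mean = s / NR
                           for (i = 1; i <= NR; i++) ss += (x[i] - mean) ^ 2
                           print mean "," ss / NR }' "$f")
    else                      # empty: awk would print nothing, so fill in NA
        stats="NA,NA"
    fi
    printf '%s,%s\n' "$name" "$stats"
done > "$work/all-results.txt"

all_results=$(cat "$work/all-results.txt")
echo "$all_results"
rm -rf "$work"
```

Quoting "$f" keeps the loop working for file names with spaces, and the intermediate count/ directory is no longer needed.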


Last edited by A-V; 06-06-2012 at 06:45 PM..
 