I've run into another problem that I've been unable to solve. With everyone's help last time, the script worked perfectly! This problem takes a little more finesse, and the bash script I thought up didn't work, so I've canned it. I'd like to try awk if possible. Here's my problem:
I have a multitude of sequential files like:
That continues to a certain number (in this case 47 of these .dat files, so the last one is _r47.dat). Inside of each file, there are four columns:
Here's the tricky part. The first column in each of the .dat files is the same, and I don't really care about the second or third column. What I would like is a script that looks at a_r02.dat and a_r01.dat, computes the difference in the fourth column between the two files, and prints that (along with the value of the first column) into a different file, and then continues by computing the difference of the fourth column between a_r03.dat and a_r02.dat and prints that out. I'm not sure if I've explained this well, so I'll try for an example. Suppose two files are:
a_r01.dat a_r02.dat
The script should compute the difference between the fourth column of each row and print an output.dat file that looks like:
After it is done, it should continue by computing the same thing for a_r03 and a_r02, all the way down the line (until it terminates after running out of files), and each time, should put the difference in a new column in the output.dat file. So after a time, the output.dat should look like (using only column headers divided by a | symbol):
If my math is right, with 10 .dat files, output.dat should have the first column followed by 9 columns of fourth-column differences (one per pair of consecutive input files).
I hope I've explained this appropriately, and please let me know if anyone has any questions. I'm hoping that awk can do this, but if it is easier using perl or bash (or any other program), please let me know and I can easily get access to it. Thank you so much for your help!
If you want the output sorted you can pipe the output to sort:
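As a minimal sketch of that idea (the three input rows are made-up sample values, and a numeric first column is assumed): awk's `for (k in array)` loop visits keys in an unspecified order, so the lines are passed through `sort` on the first field afterwards.

```shell
#!/bin/sh
# Hypothetical rows: column 1 is the x value, column 2 a difference.
dir=$(mktemp -d)

printf '30 1\n10 2\n20 5\n' |
awk '
    { d[$1] = $2 }                        # collect rows keyed by column 1
    END { for (k in d) print k, d[k] }    # for-in order is unspecified
' |
sort -k1,1n > "$dir/sorted.dat"           # numeric sort on the first field

cat "$dir/sorted.dat"
```

If the first column is not numeric, the `n` modifier would have to go, and the result would be in collation order rather than the original file order.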
With gawk you can avoid the sort command if you set the undocumented WHINY_USERS variable:
I've tested out the script, and it seems I've not explained the problem quite right. I'm sorry that I haven't described the problem well enough. Let me try it again.
Firstly, I think I confused people with the last "code" bit in my initial post. I don't want the values separated by a " | " line, just spaces will do. I suppose I got carried away in my explanation; they were just meant as dividers so people knew that I wanted the values separated. So, that being said, the first column of the output.dat file should be exactly like the first column of all the input files.
Ultimately, what I would like to do is put the output.dat file in gnuplot and tell it to "plot 'output.dat' u 1:2 w l" and then "replot 'output.dat' u 1:3 w l", and so on (just to give you an idea of what I want to do with the data).
So I would like the first column of output.dat to be an exact copy of the first column of any of my input files (the first column is always the same). The second column of output.dat is the difference between the 4th columns of a_r02.dat and a_r01.dat, the third column is a_r03 - a_r02, the fourth is a_r04 - a_r03, and so on until I run out of .dat files.
I hope I'm not coming off as too whiny, that's not my intent at all. I really do appreciate everyone's help around here, most of those that frequent these boards have coding skills I could only dream of!
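The layout described above can be sketched roughly as follows. This is only an illustration, not the script posted in the thread: the three sample files and their values are invented, the array names (`key`, `prev`, `row`) are hypothetical, and rows are matched by line number on the assumption that every file has the same first column in the same order.

```shell
#!/bin/sh
# Hypothetical sample data: three small files in a scratch directory.
dir=$(mktemp -d)
printf '1 0 0 10\n2 0 0 20\n3 0 0 30\n' > "$dir/a_r01.dat"
printf '1 0 0 12\n2 0 0 25\n3 0 0 31\n' > "$dir/a_r02.dat"
printf '1 0 0 15\n2 0 0 24\n3 0 0 40\n' > "$dir/a_r03.dat"

# The shell expands the glob in sorted order, so zero-padded names
# (a_r01, a_r02, ...) reach awk in sequence.
awk '
FNR == 1 { nfile++ }                 # count input files as each one starts
nfile == 1 {                         # first file: keep column 1 and column 4
    key[FNR]  = $1
    prev[FNR] = $4
    nrows = FNR
    next
}
{                                    # later files: append (this file - previous file)
    row[FNR]  = row[FNR] " " ($4 - prev[FNR])
    prev[FNR] = $4                   # this file becomes the new "previous"
}
END {
    for (i = 1; i <= nrows; i++)     # print rows in the original line order
        print key[i] row[i]
}' "$dir"/a_r*.dat > "$dir/output.dat"

cat "$dir/output.dat"
```

Because the rows are indexed by line number rather than looped over with `for (k in ...)`, the output keeps the input order and needs no sorting. With N input files it has the first column plus N-1 space-separated difference columns, which is the shape the gnuplot `u 1:2`, `u 1:3` commands expect.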
Sorry I don't get it. I've changed the field separator and this is my output with 3 files:
If that's not what you desire, post the desired output from the given 3 sample files.
That's it! It works perfectly. I was seeing some kind of funky input for the first few lines, and I think it has to do with a bug in the code. It became significantly easier to read once the "|" was gone and I could see the cause of the bug. Thank you very much for your help.
Edit: One more quick question: Would the script change significantly if I just had it do the difference between a1.txt and all the others? Like a2 - a1, a3 - a1, a4 - a1, etc.? How would that look? Thanks again!
Not really, remove this command a[$1]=$4 from the code:
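A rough sketch of what that change amounts to, assuming a script in which `a[$1]` holds the fourth-column value being subtracted (the sample files, their values, and the `order` and `out` arrays are hypothetical): once the per-line update `a[$1]=$4` is gone, `a[]` is filled only from the first file, so every later file is compared against a1.txt.

```shell
#!/bin/sh
# Hypothetical sample data, named after the a1.txt, a2.txt, ... in the question.
dir=$(mktemp -d)
printf '1 0 0 10\n2 0 0 20\n3 0 0 30\n' > "$dir/a1.txt"
printf '1 0 0 12\n2 0 0 25\n3 0 0 31\n' > "$dir/a2.txt"
printf '1 0 0 15\n2 0 0 24\n3 0 0 40\n' > "$dir/a3.txt"

awk '
NR == FNR {                          # first file only: store the baseline
    a[$1] = $4
    order[FNR] = $1                  # remember row order for printing
    n = FNR
    next
}
{
    out[$1] = out[$1] " " ($4 - a[$1])
    # no a[$1] = $4 here, so a[] keeps the values from a1.txt
}
END {
    for (i = 1; i <= n; i++)
        print order[i] out[order[i]]
}' "$dir/a1.txt" "$dir/a2.txt" "$dir/a3.txt" > "$dir/baseline.dat"

cat "$dir/baseline.dat"
```

The baseline file is listed first on the command line on purpose: with unpadded names such as a1.txt ... a10.txt, a glob would not return the files in numeric order, but for this variant only the position of a1.txt matters.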