What output file is (or output files are) supposed to be created?
It looks like you want to run your awk script 250 times each with a pair of input files, but the output from each of those 250 runs goes to a single output file and that single output file is overwritten (not appended to) each time your awk script is run.
Is the intent to create 250 different output files with a name corresponding to the names of the input files, or do you want one output file containing the concatenated contents of the 250 individual file comparisons? If it is one output file that you want, does there need to be some header to each section of the output specifying the input files processed to produce the following section of data in that output file? And, if so, what is the format of that header?
Yes, the awk runs on pairs of files, and the intent is to create 250 different output files, each with a name corresponding to the names of its input files. Thank you.
Last edited by cmccabe; 10-01-2016 at 11:15 AM..
Reason: added details
Quote:
The awk does run on pairs and after running the awk is to create 250 different output files with a name corresponding to the names of the input files. Thank you .
And what are the names of those output files supposed to be?
For the files used in the awk: comparing F13_ref_FP_10bp.txt to F13_epilepsy.vcf should produce the output F13_epilepsy_comparison.txt, and comparing H19_ref_FP_10bp.txt to H19_marfan.vcf should produce the output H19_marfan_comparison.txt. Both the input to the awk and its output are tab-delimited. Thank you.
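The naming rule described here (take the .vcf name and replace the .vcf suffix with _comparison.txt) can be sketched with plain POSIX parameter expansion. The function name outname is hypothetical and only for illustration; it is not part of any script posted in this thread.

```shell
# Hypothetical helper: derive the comparison file name from a VAL file name.
# Assumes the VAL file name ends in .vcf, as stated in the thread.
outname() {
    vcf=${1##*/}                                  # drop any leading directory
    printf '%s_comparison.txt\n' "${vcf%.vcf}"    # swap the .vcf suffix
}

outname F13_epilepsy.vcf
outname /some/dir/H19_marfan.vcf
```

With the two example names from this post, the calls print F13_epilepsy_comparison.txt and H19_marfan_comparison.txt.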
If we go back to post #3 in this thread, you said that you have a file named out containing pairs of input filenames:
and from post #1 we have pairs of filenames:
From these examples am I correct in assuming that we can skip creation of the out file and just look at /home/cmccabe/Desktop/comparison/validation/files/*_*.txt files and know that there will be a corresponding file in /home/cmccabe/Desktop/comparison/validation/files/ with a name that has the same unique string before the first underscore character and ending in the string .vcf? Note that the name from post #3 quoted above marked in red does not end in .vcf. Was that a typo, or do some names in that directory not end in .vcf?
Is the assumption that there is only one file with the string before the first underscore in both of those directories correct? Or, do you want a script that depends on you creating a file named /home/cmccabe/Desktop/comparison/ref_val/out that contains lines containing three values (file1 filename, file2 filename, and output file filename)?
Is it OK to use an awk script that processes all 250 sets of input files in one invocation instead of invoking awk 250 times?
All of the REF files end in .txt and are located in a folder at /home/cmccabe/Desktop/comparison/reference/10bp.
All of the VAL files end in .vcf and are located in a folder at /home/cmccabe/Desktop/comparison/validation/files. I did have a typo in post #3.
Quote:
Is the assumption that there is only one file with the string before the first underscore in both of those directories correct?
Yes, there will only be one file in each separate directory with the string before the first _. So there is no need for an out file other than to know which samples were processed.
Quote:
Is it OK to use an awk script that processes all 250 sets of input files in one invocation instead of invoking awk 250 times?
There may not always be 250 sets of input; that number is variable. But yes, all of them can be processed at once rather than each set individually. Is that what you mean? Thank you.
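Given the answers above (REF files end in .txt, VAL files end in .vcf, and each prefix before the first underscore is unique within its directory), the pairing could be done without an out file. The following is a minimal sketch under those assumptions; the function name pair_files and its tab-separated output layout are hypothetical, and the comparison awk itself is deliberately left out.

```shell
# Sketch: pair each REF .txt file with the VAL .vcf file sharing its prefix.
# pair_files REF_DIR VAL_DIR prints "ref<TAB>val<TAB>output-name" per pair.
pair_files() {
    ref_dir=$1
    val_dir=$2
    for ref in "$ref_dir"/*_*.txt; do
        [ -e "$ref" ] || continue            # glob matched nothing
        name=${ref##*/}
        id=${name%%_*}                       # text before the first underscore
        set -- "$val_dir/$id"_*.vcf          # expected to match exactly one file
        if [ "$#" -ne 1 ] || [ ! -e "$1" ]; then
            echo "skipping $name: no unique .vcf for prefix $id" >&2
            continue
        fi
        val=${1##*/}
        printf '%s\t%s\t%s\n' "$ref" "$1" "${val%.vcf}_comparison.txt"
    done
}
```

It might be called as pair_files /home/cmccabe/Desktop/comparison/reference/10bp /home/cmccabe/Desktop/comparison/validation/files, with each output line then supplying the two input files and the output name for one run of the awk. Because the number of pairs is taken from the directory contents, it does not matter whether there are 250 sets or some other number.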
Last edited by cmccabe; 10-01-2016 at 02:16 PM..
Reason: added details
Quote:
... it looks like the script reads all the vcf files from REF and puts them in a variable FN.
No, unless the statement about REF's contents in post #1 was not true. Given that it IS true, FN is assigned three file names ending in .txt.
Quote:
How do the txt files from VAL get used by the awk.
They are not. The proposal assumes that for every .txt file name's prefix ID a corresponding .vcf file exists, possibly in another path (as mentioned in my post).
Quote:
The awk looks at each REF file and compares it to each VAL file looking for what’s common and what’s different. If a difference is found it identifies which file the missing data came from.
...
The operation of the awk script is not the topic of this thread, nor is the desired output.
Please check whether any of the proposals so far provide the needed input file pairs and might be adapted to also provide the needed output file name, and comment on their suitability.
And, please please please, get some structure into your future requests and relieve us from guessing!
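The assumption stated in this post, that every .txt prefix ID has a corresponding .vcf file, can be verified before running any comparison. Below is a small sketch of such a check; the function name check_pairs is hypothetical, and the two directories are passed as arguments rather than hard-coded.

```shell
# check_pairs REF_DIR VAL_DIR: report every .txt prefix lacking a .vcf partner.
# Returns 0 if all prefixes are matched, 1 otherwise.
check_pairs() {
    status=0
    for ref in "$1"/*_*.txt; do
        [ -e "$ref" ] || continue            # glob matched nothing
        name=${ref##*/}
        id=${name%%_*}                       # text before the first underscore
        found=no
        for val in "$2/$id"_*.vcf; do
            [ -e "$val" ] && found=yes
        done
        if [ "$found" = no ]; then
            echo "no .vcf found for prefix $id" >&2
            status=1
        fi
    done
    return "$status"
}
```

A non-zero return value signals that at least one REF file has no partner, so a calling script could stop before producing incomplete output.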