awk to match and apply conditions to matching files in directories


 
# 8  
Old 10-15-2016
Please show us the output from the commands:
Code:
cd /home/cmccabe/Desktop/comparison; ls -l missing/*.txt test_tvc/*.bed

# 9  
Old 10-15-2016
Hi,

Try changing
Code:
if [[ -f file1 ]]

to

Code:
if [[ -f $file1 ]]


Last edited by greet_sed; 10-15-2016 at 05:16 PM.. Reason: Add color
# 10  
Old 10-15-2016
Here is the code as well as the output of the ls

Code:
#!/bin/bash

for file in /home/cmccabe/Desktop/comparison/missing/*.txt
do
    file1="/home/cmccabe/Desktop/comparison/test_tvc/${file%%.txt}.bed"
    if [[ -f $file1 ]]
    then
         awk 'FNR==NR{A[$2]=$0;Q=FILENAME;next} ($2 in A){if($10>30 && $11>49){print A[$2] >> "/home/cmccabe/Desktop/comparison/Match_in_both_files_and_meet_criteria";print "Match found in both the files named " Q " and " FILENAME " is: " A[$2];delete A[$2]}} END{print "NON-matched lines between file named "Q " and " FILENAME " are: ";for(i in A){print A[i] >> "/home/cmccabe/Desktop/comparison/out_no_match_found_values";print A[i]}}'  $file $file1 > /home/cmccabe/Desktop/comparison/final/Output_final_file.txt
    fi
done

Code:
cd /home/cmccabe/Desktop/comparison; ls -l missing/*.txt test_tvc/*.bed
-rw-rw-r-- 1 cmccabe cmccabe   756 Oct 11 16:43 missing/F113.txt
-rw-rw-r-- 1 cmccabe cmccabe  1214 Oct 11 16:43 missing/H123.txt
-rw-rw-r-- 1 cmccabe cmccabe   352 Oct 11 16:44 missing/S111.txt
-rw-rw-r-- 1 cmccabe cmccabe 12692 Oct 15 10:36 test_tvc/F113_tvc.bed
-rw-rw-r-- 1 cmccabe cmccabe 12183 Oct 11 16:33 test_tvc/H123_tvc.bed
-rw-rw-r-- 1 cmccabe cmccabe 11845 Oct 11 16:37 test_tvc/S111_tvc.bed

# 11  
Old 10-15-2016
Quote:
Originally Posted by cmccabe
Here is the code as well as the output of the ls

Code:
#!/bin/bash

for file in /home/cmccabe/Desktop/comparison/missing/*.txt
do
    file1="/home/cmccabe/Desktop/comparison/test_tvc/${file%%.txt}.bed"
    if [[ -f $file1 ]]
    then
         awk ... ... > /home/cmccabe/Desktop/comparison/final/Output_final_file.txt
    fi
done

Code:
cd /home/cmccabe/Desktop/comparison; ls -l missing/*.txt test_tvc/*.bed
-rw-rw-r-- 1 cmccabe cmccabe   756 Oct 11 16:43 missing/F113.txt
-rw-rw-r-- 1 cmccabe cmccabe  1214 Oct 11 16:43 missing/H123.txt
-rw-rw-r-- 1 cmccabe cmccabe   352 Oct 11 16:44 missing/S111.txt
-rw-rw-r-- 1 cmccabe cmccabe 12692 Oct 15 10:36 test_tvc/F113_tvc.bed
-rw-rw-r-- 1 cmccabe cmccabe 12183 Oct 11 16:33 test_tvc/H123_tvc.bed
-rw-rw-r-- 1 cmccabe cmccabe 11845 Oct 11 16:37 test_tvc/S111_tvc.bed

Taking your first .txt file as an example, let us see what your code is doing (remember that set -xv is your friend when trying to debug a shell script).

The for loop sets file to:
Code:
/home/cmccabe/Desktop/comparison/missing/F113.txt

Then you use the assignment:
Code:
    file1="/home/cmccabe/Desktop/comparison/test_tvc/${file%%.txt}.bed"

which sets file1 to:
Code:
/home/cmccabe/Desktop/comparison/test_tvc//home/cmccabe/Desktop/comparison/missing/F113.bed

and then your if statement correctly determines that there is no file with that name and skips the awk statement.

So maybe you would have more luck finding files to process (and therefore producing output), if you would change:
Code:
    file1="/home/cmccabe/Desktop/comparison/test_tvc/${file%%.txt}.bed"

to:
Code:
    file1=${file##*/}	# Strip off directory.
    file1="/home/cmccabe/Desktop/comparison/test_tvc/${file1%.txt}_tvc.bed"
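To see the two assignments side by side, here is a small sketch that applies both to the first path from the listing above. The variable names old, base, and new are only for illustration and do not appear in the original script.

```shell
#!/bin/bash
# Example path taken from the first iteration of the for loop.
file=/home/cmccabe/Desktop/comparison/missing/F113.txt

# Original assignment: ${file%%.txt} only strips the suffix, so the
# whole source directory is still embedded in the result.
old="/home/cmccabe/Desktop/comparison/test_tvc/${file%%.txt}.bed"

# Suggested fix: drop everything up to the last "/", then swap the suffix.
base=${file##*/}                 # F113.txt
new="/home/cmccabe/Desktop/comparison/test_tvc/${base%.txt}_tvc.bed"

printf 'old: %s\nnew: %s\n' "$old" "$new"
```

Only the second form produces a name that matches an entry in the test_tvc listing.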

I haven't even tried to figure out what your one-line awk script does, but I do note that with your sample directory listings you will be running this awk code three times and each time you run it, the output produced by the previous run will be destroyed. (Did you perhaps want >> instead of > as the redirection at the end of that script? Or maybe you want to redirect the output from the for loop to that file instead of repeatedly redirecting the output from the awk script. Which you want depends on whether you want to add to output from previous runs of your script or have each run of your script save only the results from that run.)
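A minimal sketch of the difference between the two redirection placements, assuming a scratch directory from mktemp and using printf as a stand-in for the awk command (both are placeholders, not part of the original script):

```shell
#!/bin/bash
# Hypothetical scratch directory and output file, for illustration only.
tmp=$(mktemp -d)
out="$tmp/Output_final_file.txt"

# Redirecting inside the loop with ">" truncates the file on every pass,
# so only the last iteration's output survives.
for name in F113 H123 S111
do
    printf 'result for %s\n' "$name" > "$out"
done
inside=$(wc -l < "$out")

# Redirecting the whole loop opens the file once and keeps every pass.
for name in F113 H123 S111
do
    printf 'result for %s\n' "$name"
done > "$out"
whole=$(wc -l < "$out")

echo "lines with > inside the loop: $inside"
echo "lines with > after done:      $whole"
rm -r "$tmp"
```

Using >> inside the loop would also keep all three results, but it would keep results from earlier runs of the script as well, which may or may not be what is wanted.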

And greet_sed was correct in saying that you need to use:
Code:
    if [[ -f $file1 ]]

instead of:
Code:
    if [[ -f file1 ]]

(An earlier version of this post claimed that these two forms, with or without the $, have exactly the same effect inside double square bracket conditional expressions. That claim was wrong: the $ is required.)

If you had been using one of the test commands:
Code:
    if [ -f "$file1" ]
    if test -f "$file1"

instead of conditional expressions, then the $ would still be required, and double-quotes should also be added to protect against filenames containing field separation characters.
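The quoting point can be checked with a short sketch. It assumes a hypothetical file name containing a space, created in a scratch directory; the variables dbl, unq, and quo only record the outcome of each test.

```shell
#!/bin/bash
# Hypothetical file name containing a space, created in a scratch directory.
tmp=$(mktemp -d)
file1="$tmp/my file.bed"
: > "$file1"

# [[ ]] does not word-split the expansion, so the unquoted form works.
if [[ -f $file1 ]]; then dbl=found; else dbl=missing; fi

# [ ] is an ordinary command: unquoted, $file1 splits into two words and
# the test fails with an error instead of finding the file.
if [ -f $file1 ] 2>/dev/null; then unq=found; else unq=missing; fi

# Quoting keeps the name as a single argument.
if [ -f "$file1" ]; then quo=found; else quo=missing; fi

echo "[[ -f \$file1 ]]  -> $dbl"
echo "[ -f \$file1 ]    -> $unq"
echo "[ -f \"\$file1\" ] -> $quo"
rm -r "$tmp"
```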
# 12  
Old 10-16-2016
Please try this:

Code:
ls /home/cmccabe/Desktop/comparison/missing/*.txt >/mydir/file1
cut -d '.' -f1 /mydir/file1 > /mydir/file2
ls /home/cmccabe/Desktop/comparison/test_tvc/*.bed > /mydir/file3

for i in `cat /mydir/file2`
do
   for j in `cat /mydir/file3`
    do
       echo "$j" | grep "^$i"
           if [ "$?" == "0" ]
            then
               if[ "$10" > "30" && "$11" > "49" ]
               then
               echo -e "$i\n"
               fi
           else
              echo -e "no match is found \n"
           fi
   done
done
 rm /mydir/file1 /mydir/file2 /mydir/file3

Basically, this first writes the lists of *.txt and *.bed files to two separate files (/mydir/file1 and /mydir/file3), and writes the names with the .txt part removed to a third file (/mydir/file2).
It then treats /mydir/file2 as the primary file and, using nested for loops, compares each of its lines with each line of the *.bed list in /mydir/file3, checking whether the .bed line starts with the primary file's line.
When a line matches (the exit status of grep is 0), it checks the second condition, "$10>30 && $11>49", and if both are met it displays the line from the primary file; otherwise it reports "no match is found". Finally, it removes the temporary files it created.

Thanks,
Sanghamitra

Last edited by Sanghamitra C.; 10-16-2016 at 02:26 AM.. Reason: Added explanation of my code
# 13  
Old 10-16-2016
Quote:
Originally Posted by Sanghamitra C.
Please try this:
Code:
ls /home/cmccabe/Desktop/comparison/missing/*.txt >/mydir/file1
cut -d '.' -f1 /mydir/file1 > /mydir/file2
ls /home/cmccabe/Desktop/comparison/test_tvc/*.bed > /mydir/file3

for i in `cat /mydir/file2`
do
   for j in `cat /mydir/file3`
    do
       echo "$j" | grep "^$i"
           if [ "$?" == "0" ]
            then
               if[ "$10" > "30" && "$11" > "49" ]
               then
               echo -e "$i\n"
               fi
           else
              echo -e "no match is found \n"
           fi
   done
done
 rm /mydir/file1 /mydir/file2 /mydir/file3

Basically, this first writes the lists of *.txt and *.bed files to two separate files (/mydir/file1 and /mydir/file3), and writes the names with the .txt part removed to a third file (/mydir/file2).
It then treats /mydir/file2 as the primary file and, using nested for loops, compares each of its lines with each line of the *.bed list in /mydir/file3, checking whether the .bed line starts with the primary file's line.
When a line matches (the exit status of grep is 0), it checks the second condition, "$10>30 && $11>49", and if both are met it displays the line from the primary file; otherwise it reports "no match is found". Finally, it removes the temporary files it created.
Thanks,
Sanghamitra

Hello Sanghamitra C.,

Welcome to the forums; I hope you will enjoy learning and sharing knowledge here. I am not sure whether you have tested the above code. There are a few points that could make it better.
i- echo "$j" | grep "^$i" could be changed to if [[ "$i" == "$j" ]], because we need to check whether the file names are equal.
ii- if[ "$10" > "30" && "$11" > "49" ]: in the shell, $10 and $11 do not refer to fields of an input line; they work that way in awk. You could use cut to take the values of the 10th and 11th fields.
iii- The loops for i in `cat /mydir/file2` and for j in `cat /mydir/file3` could be written as while loops, for example:
Code:
while read i
do
    while read j
    do
    .............(all code here)
    done < "/mydir/file3"
done < "/mydir/file2"
.............(rest of the code)
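
As a rough, runnable version of that skeleton, here is a sketch that uses scratch files in place of /mydir/file2 and /mydir/file3. The sample names, the matches variable, and the choice of a prefix match inside [[ ]] (standing in for grep "^$i") are assumptions made for illustration, not something taken from the earlier posts.

```shell
#!/bin/bash
# Hypothetical scratch files standing in for /mydir/file2 and /mydir/file3.
tmp=$(mktemp -d)
printf '%s\n' F113 H123 X999 > "$tmp/file2"
printf '%s\n' F113_tvc.bed H123_tvc.bed S111_tvc.bed > "$tmp/file3"

matches=""
while IFS= read -r i
do
    # The inner file is re-read from the top for every outer line.
    while IFS= read -r j
    do
        # Pattern match inside [[ ]]: does $j start with $i?
        if [[ $j == "$i"* ]]
        then
            echo "$i matches $j"
            matches="$matches$i "
        fi
    done < "$tmp/file3"
done < "$tmp/file2"
rm -r "$tmp"
```

Because the loops read from redirections rather than from a pipe, they run in the current shell, so variables set inside them (such as matches here) are still available afterwards. Reading with IFS= read -r also keeps lines containing spaces or backslashes intact, which the for ... in `cat file` form does not.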

Thanks,
R. Singh
# 14  
Old 10-16-2016
Quote:
Originally Posted by Don Cragun
[..]and, despite what greet_sed said, the if statements:
Code:
    if [[ -f $file1 ]]
    if [[ -f file1 ]]

(with or without the $) should have exactly the same effect when using the double square bracket conditional expressions. If you had been using test commands:
Code:
    if [ -f "$file1" ]
    if test -f "$file1"

instead of conditional expressions, then the $ would be required and double-quotes should be added to protect against filenames containing field separation characters.
Hi Don, that does not seem to be an accurate statement.

The $ is still required for variable expansions within double bracket expressions, as well as within single brackets (test commands); the difference is the double-quote protection that is needed in the case of single brackets.

A situation where $-signs are not required for basic variable expansions is within arithmetic expressions, but that is not the case here.

So IMO greet_sed was right after all.
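
A quick way to check this is the sketch below. It assumes a scratch directory in which no file literally named "file1" exists; the variables literal, expanded, arith, and count are only there to record and illustrate the results.

```shell
#!/bin/bash
# Hypothetical scratch directory; no file literally named "file1" exists in it.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
file1="$tmp/F113_tvc.bed"
: > "$file1"

# Without the $, [[ -f file1 ]] tests for a file literally named "file1".
if [[ -f file1 ]]; then literal=found; else literal=missing; fi

# With the $, the variable is expanded and the real path is tested.
if [[ -f $file1 ]]; then expanded=found; else expanded=missing; fi

# Arithmetic context is the case where a bare name is expanded.
count=3
if (( count > 2 )); then arith=true; else arith=false; fi

echo "literal=$literal expanded=$expanded arith=$arith"
cd / && rm -r "$tmp"
```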

Last edited by Scrutinizer; 10-16-2016 at 05:44 AM..