Help with selecting files from "diff" output


 
# 1  
Old 05-12-2013

I have two directories, Dir_A and Dir_A_Arc, and need help with a shell script.

The script needs to take the paths to these two directories as parameters $1 and $2.

It needs to check whether any files exist in these directories and, if either directory is empty, exit normally.

If files are present in both directories, it should then run a diff, something like:

Code:
diff -r -q /path/to/Dir_A /path/to/Dir_A_Arc

From the comparison result:

a. Files that are identical in both directories need to be deleted from Dir_A.

b. Files that are new in Dir_A and files that differ between the two directories need to be kept in Dir_A.

I have the script below so far.


Code:
#!/usr/bin/bash
echo "$1 is the root directory "

echo "$2 is the archive directory"

if [ -e $1/*.sql ] && [ -e $2/*.sql ]

then 

echo "files exists in $1 and $2"

exit 0

else 

echo "no files exists"
fi

diff -r -q $1 $2 ;

rm -r `diff -sq  $1 $2 | awk '/are identical$/{print $2}'` ;

---------- Post updated at 04:56 PM ---------- Previous update was at 04:47 PM ----------

The output for the above script is:
Code:
dir/Dir_A is the root directory
dir/Dir_A_arc is the archive directory
Only in dir/Dir_A/sp: dev.txt 
Only in dir/Dir_A/sp: dev_22.txt
Only in dir/Dir_A/sp: dev_33.txt
Files dir/Dir_A/spp/Document.txt and dir/Dir_A_arc/spp/Document.txt differ
Files dir/Dir_A/spp/text.txt and dir/Dir_A_arc/spp/text.txt differ
Only in dir/Dir_A_arc/sp: dev_444.txt
Only in dir/Dir_A_arc/sp: dev_555.txt
rm: missing operand
Try `rm --help' for more information

The highlighted files (the ones reported as "Only in dir/Dir_A/..." and the ones reported as differing) are the ones I have to retain in Dir_A.

Last edited by Scott; 05-12-2013 at 07:12 PM.. Reason: Please use code tags
# 2  
Old 05-13-2013
I would use comm, sort, and find to compare the file lists of each subtree, sed to separate the files present in both from the a-only and b-only lists for further processing or reporting, and cmp to decide whether the files present in both differ:
Code:
comm <(
    cd xx
    export LC_ALL=C
    find * -type f|sort -u
  ) <(
    cd xx_Arc
    export LC_ALL=C
    find * -type f|sort -u
  )| sed '
    s/^\t\t//
    t
    s/^\t//
    t b
    w xx_only
    d
    :b
    w xx_Arc_only
    d
   ' | while read f
do
  if [ "" = "$(cmp xx/$f xx_Arc/$f 2>&1)" ]
  then
    ....
  fi
done

LC_ALL=C ensures a byte-order sort, which comm needs so that both lists are ordered the same way. diff is for humans, mostly. You need bash, or ksh on systems with /dev/fd/#, to get the <() process substitution.
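For the original request (delete from Dir_A whatever is identical in Dir_A_Arc), the loop body could be filled in along the following lines. This is a minimal sketch, not the pipeline above: it assumes bash, swaps the comm/sed stage for a plain find over the first tree, and the function names has_files and prune_identical are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: has_files and prune_identical are hypothetical helper names.

# Succeeds when the tree rooted at $1 contains at least one regular file.
has_files() {
  [ -n "$(find "$1" -type f 2>/dev/null | head -n 1)" ]
}

# Removes from tree $1 every regular file that has a byte-identical file
# at the same relative path in tree $2. New and differing files are kept.
prune_identical() {
  local src=$1 arc=$2 rel
  local files=()

  # Nothing to do (and a normal exit) when either tree has no files.
  has_files "$src" && has_files "$arc" || return 0

  # Collect the list first so nothing is deleted while find is still running.
  # -print0 with read -d '' keeps names containing spaces or newlines intact.
  while IFS= read -r -d '' rel; do
    files+=("${rel#./}")
  done < <(cd "$src" && find . -type f -print0)

  for rel in "${files[@]}"; do
    # cmp -s prints nothing; exit status 0 means the contents are identical.
    if [ -f "$arc/$rel" ] && cmp -s "$src/$rel" "$arc/$rel"; then
      rm -- "$src/$rel"
    fi
  done
}
```

It would be called as prune_identical /path/to/Dir_A /path/to/Dir_A_Arc. Replacing rm with echo for a first run shows what would be deleted without touching anything.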
# 3  
Old 05-21-2013
@DGPickett ...Thanks a lot for helping me out..!! :-)
# 4  
Old 06-17-2013
I am having issues with filenames that have spaces in them. Can you help me out?
# 5  
Old 06-18-2013
Sorry, that needed a more robust effort; put double quotes around the paths that use $f, like
Code:
if [ "" = "$(cmp "xx/$f" "xx_Arc/$f" 2>&1)" ]

Although the whole substitution is already in double quotes, $(...) starts a fresh quoting context, so the inner double quotes nest safely and do not end the outer ones. Single quotes would not work here: inside $(...) they are live quotes, not literals, and they would stop $f from expanding. With the expanded $f in double quotes, spaces are OK unless the value gets passed through some shell again without proper quoting. For instance, a shell script should refer to its parameters as "$1", not barefoot, in case of meta-characters.
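How quotes behave inside a command substitution can be checked directly. A minimal sketch, assuming bash; the variable names and the dir/ prefix are illustrative only:

```shell
#!/usr/bin/env bash
f='my file.txt'

# Double quotes inside $(...) nest safely within the outer double quotes,
# and $f is expanded as a single word.
nested="$(printf '%s' "dir/$f")"

# Single quotes inside $(...) are live quotes: $f is NOT expanded.
literal="$(printf '%s' 'dir/$f')"

printf '%s\n' "$nested" "$literal"
# prints:
#   dir/my file.txt
#   dir/$f
```

When only the yes/no answer matters, testing the exit status avoids the string comparison altogether, as in: if cmp -s "xx/$f" "xx_Arc/$f"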

Now, if you have a file with a quote character in the name, another flavor of dealing with it is to convert any meta-characters, like space and quote, into '?', which is a wild card for a single character but is not itself a space or quote. In this case, just stick a "| tr ' ' '?'" before the "| while read" and all the spaces become '?'. Just make sure that the last use of it is barefoot, so the shell can expand the wild card and pass the matching name as a single argument, $1 or whatever, to the C program. In actuality, by that time it has been converted to a null-terminated string pointed to by some member of the argv[] array of character pointers; quoting becomes "start here and null-terminate there" in machine language. The C open() call and such can deal with embedded spaces just fine; it is the shell that divides the name into two or more arguments when it finds a $IFS character.
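The splitting described here can be observed by counting the arguments a command receives. A minimal sketch, assuming bash with the default $IFS; count_args is a hypothetical helper and the file name is made up:

```shell
#!/usr/bin/env bash
# count_args is a hypothetical helper: it prints how many arguments it got.
count_args() { printf '%s\n' "$#"; }

f='my file.txt'
tmp=$(mktemp -d)
: > "$tmp/$f"                     # empty file whose name contains a space

unquoted=$(count_args $f)         # barefoot: the shell splits on the space
quoted=$(count_args "$f")         # quoted: the name stays one argument

# The tr trick: the space becomes '?', which the shell later expands
# as a wild card against the files that actually exist.
pattern=$(printf '%s\n' "$f" | tr ' ' '?')
globbed=$(cd "$tmp" && count_args $pattern)

printf 'unquoted=%s quoted=%s pattern=%s globbed=%s\n' \
  "$unquoted" "$quoted" "$pattern" "$globbed"
# prints: unquoted=2 quoted=1 pattern=my?file.txt globbed=1

rm -rf "$tmp"
```

Note that '?' matches any single character, so a pattern like my?file.txt would also match a file named my_file.txt if one existed in the same directory.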
# 6  
Old 06-18-2013
Hi.

There are standard (Linux) utilities to compare directories. Whether they will fulfill all your desires is up to you to find out.

See the thread "find same size file" for a demonstration of fdupes and rdfind.

Best wishes ... cheers, drl

Last edited by drl; 06-18-2013 at 08:31 PM..