Extracting non multiple files via script


 
# 1  
Old 10-28-2012

Hi,
Can somebody help me?
I am testing a demo with the following function:
Code:
PATH525[1]="/uscms/home/emily/READme/extra/data/"
TEMP=temp
FileName=DataFileName

CopyFiles() {
#    PATHNAME="$paths"

#    if [ "$2" = "525" ]; then
#       PATHNAME="$PATH525[*]"
#   elif [ "$2" = "533" ]; then
#         PATHNAME="$PATH533[*]"
#    elif [ "$2" = "ZY" ]; then
#       PATHNAME="$PATHZY[*]"
#   fi
#  echo "pathname is $PATHNAME"

    echo 'Being Called'
    for FileNameIndx in "${PATH525[@]}"
      do
      if [[ ! -e "dest_path/$FileNameIndx" ]]; then
          ls -ltr "$FileNameIndx" | grep root | awk '{print string path $9}' string="$CONSTANT" path="$FileNameIndx"  >> "$File0"
          echo 'file0 :'$File0
          sort -nrk5 < $File0 | awk -F_ '!x[$3]++' > $FileName
          echo 'FileName :' $FileName
          echo "$FileNameIndx is copied"
      else
          echo "Check the FileName in ${PATHNAME[@]}"
      fi
      echo "---------------------------------------------------------"
      echo ">>> DataFiles are from :" ${PATH533[@]}
      echo "---------------------------------------------------------"
    done
}

Now, the directory in "PATH525" has the following content:
Code:
 
-rw-r--r-- 1 emily us_cms  9 Oct 27 10:28 vgtee_1_ujh.root
-rw-r--r-- 1 emily us_cms 100 Oct 27 10:28 vgtee_1_ujf.root
-rw-r--r-- 1 emily us_cms 12 Oct 27 10:28 vgtee_2_ujf.root
-rw-r--r-- 1 emily us_cms 10 Oct 27 10:28 vgtee_3_ujf.root
-rw-r--r-- 1 emily us_cms  6 Oct 27 10:28 vgtee_3_ujh.root
-rw-r--r-- 1 emily us_cms  7 Oct 27 10:28 vgtee_4_ujh.root
-rw-r--r-- 1 emily us_cms  9 Oct 27 10:28 vgtee_5_ujh.root

I was expecting the final result to be all files except the duplicates
(by duplicate, I mean the second occurrence of *3*).
But it is not working.

I get the content of the "DataFileName" as :
Code:
 /uscms/home/emily/READme/extra/data/vgtee_5_ujh.root
/uscms/home/emily/READme/extra/data/vgtee_3_ujf.root

Whereas what I want is the following:
Code:
 
/uscms/home/emily/READme/extra/data/vgtee_1_ujf.root
/uscms/home/emily/READme/extra/data/vgtee_2_ujf.root
/uscms/home/emily/READme/extra/data/vgtee_3_ujf.root
/uscms/home/emily/READme/extra/data/vgtee_4_ujh.root
/uscms/home/emily/READme/extra/data/vgtee_5_ujh.root

Also, please note that when a file is duplicated, I would like the larger file to be the one listed in the DataFileName file.


Thanks in advance,
emily
# 2  
Old 10-28-2012
So you want to consider only characters 1 - 7 of the filename in order to find "duplicates" (no integers of two or more digits possible?), and, if duplicates are found, use the name of the larger file?
I don't see any attempt to use either criterion in your code snippet. BTW, vgtee_1 would be a duplicate as well, wouldn't it?
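
A possible explanation for the two-line result in post #1, sketched under the assumption that `$CONSTANT` contains no underscores: `awk -F_ '!x[$3]++'` keeps the first line for each distinct third underscore-separated field. On a full path that field is the suffix (`ujf.root` or `ujh.root`), not the index, so only one line per suffix survives. The `sort -nrk5` before it has no fifth whitespace-separated field to sort on when each line is a single path. The helper name `third_field` is hypothetical and only there to show what awk sees:
Code:
```shell
#!/bin/bash
# third_field LINE: print what awk -F_ treats as $3 for the given line.
# Hypothetical helper for illustration; not part of the original script.
third_field() {
    printf '%s\n' "$1" | awk -F_ '{print $3}'
}

# Full path, as written to $File0 in post #1: $3 is the suffix.
third_field '/uscms/home/emily/READme/extra/data/vgtee_3_ujf.root'   # ujf.root

# Raw "ls -l" line: the underscore in "us_cms" shifts the fields, so $3 is the index.
third_field '-rw-r--r-- 1 emily us_cms 10 Oct 27 10:28 vgtee_3_ujf.root'   # 3
```

This also shows why counting on `_` is fragile: the result depends on whether the owner, group, or directory names happen to contain underscores.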
# 3  
Old 10-28-2012
Hi,
Yeah, you are right.
Now I have managed to do what I wanted. The following is the code snippet for that:
Code:
 
  for FileNameIndx in "${PATH535[@]}"
      do
      if [[ ! -e "dest_path/$FileNameIndx" ]]; then
          ls -ltr "$FileNameIndx" | grep root | awk -F_ '{print $3,$0}' OFS='\t' | sort -n | cut -f2- >> $File0"_0"
          #ls -ltr "$FileNameIndx" | grep root | awk '{print string path $9}' string="$CONSTANT" path="$FileNameIndx"  >> "$File0"                    
          sort -nrk5 < $File0"_0" | awk -F_ '!x[$3]++' >> $File0"_1"
          grep -in "vg" $File0"_1" | awk '{print path string $9}' string="/" path="$FileNameIndx" >> $FileName
          echo "$FileNameIndx is copied"
      else
          echo "Check the FileName in ${PATHNAME[@]}"
      fi
      echo "---------------------------------------------------------"
      echo ">>> DataFiles are from :" ${PATH535[@]}
      echo "---------------------------------------------------------"
    done

Well, it is a bit lengthy, as I am a beginner with scripting, but it works fine for me.

And about "vgtee_1", I would prefer the file
Code:
 100 Oct 27 10:28 vgtee_1_ujf.root

with the larger file size.

thanks
emily,

---------- Post updated at 08:40 AM ---------- Previous update was at 06:04 AM ----------

Hi RudiC,
yes, you are right, the occurrence of *1* is also a duplication.
The problem I am facing now is the following: the code snippet I showed above works fine, but it leaves a blank row at the top of the output. I wonder if that can be removed somehow,
because I would get wrong results otherwise.

Can you help me?

Thanks,
Emily
# 4  
Old 10-28-2012
I'm not sure I understand your code snippet.

The following will do the job if run in the target directory; it works on my linux/bash/mawk system:
Code:
ls -l|sort -k8 -k5,5rn|awk '/root/ && !Exist[substr($9,1,7)]++'
-rw-r--r-- 1 emily us_cms 100 Oct 27 10:28 vgtee_1_ujf.root
-rw-r--r-- 1 emily us_cms 12 Oct 27 10:28 vgtee_2_ujf.root
-rw-r--r-- 1 emily us_cms 10 Oct 27 10:28 vgtee_3_ujf.root
-rw-r--r-- 1 emily us_cms  7 Oct 27 10:28 vgtee_4_ujh.root
-rw-r--r-- 1 emily us_cms  9 Oct 27 10:28 vgtee_5_ujh.root

You may want to add the full path to the output as you did in your own example.
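
One way to add the full path is sketched below. It is an illustration under stated assumptions, not the method used in the thread: file names are assumed to follow `vgtee_<index>_<suffix>.root`, paths are assumed to contain no tabs or newlines, and the function name `pick_largest` is hypothetical. It avoids parsing `ls` and sorts on size alone. As far as I can tell, the `-k8` key in the one-liner above runs to the end of the line and so includes the file name, which means the size key `-k5,5rn` only breaks ties between identical names. Splitting the name on `_` rather than taking the first 7 characters also copes with indices of two or more digits.
Code:
```shell
#!/bin/bash
# pick_largest DIR: for every index N among DIR/vgtee_N_*.root, print the
# full path of the largest file with that index, one per line.
# Hypothetical helper; the name and the naming pattern are assumptions.
pick_largest() {
    local dir=${1%/} f
    for f in "$dir"/*.root; do
        [ -f "$f" ] || continue                     # glob matched nothing
        printf '%d\t%s\n' "$(wc -c < "$f")" "$f"    # size in bytes, TAB, path
    done |
    sort -t $'\t' -k1,1nr |                         # largest files first
    awk -F'\t' '
        {
            n = split($2, parts, "/")               # parts[n] is the bare file name
            split(parts[n], name, "_")              # name[2] is the index
            if (!seen[name[2]]++) print $2          # first hit per index is the largest
        }' |
    sort                                            # back to name order
}

# Usage, with the directory from post #1:
#   pick_largest /uscms/home/emily/READme/extra/data > DataFileName
```

Redirecting with `>` instead of `>>` overwrites the list on each run, so stale lines from earlier runs cannot accumulate.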
# 5  
Old 10-28-2012
Hi,
yes, that's what I want, with the full path. But the issue is that $FileName has a blank first row, which I do not want. Is there any way to avoid such a blank row in $FileName?

Thanks,
# 6  
Old 10-28-2012
Not sure. $FileName is appended to in your loop. Did you remove the file before the loop writes to it? You may also want to analyse the output of
Code:
grep -in "vg" $File0"_1"

before awk'ing it to $FileName. And how do you compose $FileName? It may point to an unexpected target...?

You're welcome!

P.S.: you can use head and tail to remove lines from files, but this should be considered a last resort.
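
A minimal sketch of both ideas, assuming the blank row is an empty or whitespace-only line (the thread never establishes where it comes from). The function name `drop_blank_lines` is hypothetical. Filtering with `awk 'NF'` removes blank lines wherever they occur, which is less position-dependent than `head` or `tail`:
Code:
```shell
#!/bin/bash
# drop_blank_lines FILE: rewrite FILE in place without empty or
# whitespace-only lines. Hypothetical helper, not from the original script.
drop_blank_lines() {
    local file=$1 tmp status
    tmp=$(mktemp) || return 1
    # NF is the number of fields; it is 0 on blank lines, so they are skipped.
    awk 'NF' "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Usage:
#   : > "$FileName"               # empty the list once, before the loop appends to it
#   ... loop that appends with >> "$FileName" ...
#   drop_blank_lines "$FileName"  # then strip any blank rows
```

Finding the command that emits the empty line is still the better fix; this only cleans up afterwards.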

Last edited by RudiC; 10-28-2012 at 11:54 AM.. Reason: ultimate idea
# 7  
Old 10-28-2012
Hi,
Yes, I do remove the file before it is written.
I define the variable in the main script as
FileName=DataFileName

If the content of File0_1 is blank, my script won't run any further, so I will get a WARNING message anyhow. But yes, I should make it a bit more robust.
I guess I answered the last question.

emily,
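
One small step towards that robustness could look like the sketch below. It assumes the intermediate file is the `$File0"_1"` from post #3; the function name `require_nonempty` and the message text are hypothetical. The `-s` test is true only for a file that exists and has a size greater than zero:
Code:
```shell
#!/bin/bash
# require_nonempty FILE: succeed only if FILE exists and is not empty;
# otherwise print a warning to stderr. Hypothetical helper.
require_nonempty() {
    if [ ! -s "$1" ]; then
        echo "WARNING: $1 is missing or empty" >&2
        return 1
    fi
}

# Usage inside the loop, skipping a directory that produced no candidates:
#   require_nonempty "${File0}_1" || continue
```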