A simpler way to do this (save a list of files based on part of their name)
Post 302840527 by Don Cragun on Monday 5th of August 2013 11:03:39 PM
The goal of The UNIX and Linux Forums is to help you learn how to do "stuff" on your own; not to write programs for you. I gave you a sample script to get you started, and from your message #9 in this thread it sounded like you were well on your way to getting a working solution. (And posting a 384 KB zipped archive that expands to over 1 MB, without a clear indication of the desired outcome of processing it, takes more space and time than most volunteers are willing to donate.)

From what you have shown here in message #10, you are learning quickly. I will make a few more comments that may help you speed this up a little bit. First, in the pipeline:
Code:
df -h | ls *.out.txt | sort -t_ -k$KEY_FIELD,$KEY_FIELD'n' | awk -F_ -v f=$KEY_FIELD -v c=$FILE_COUNT 'NR > c {exit} {printf("%s", $0)}'

what would happen if you removed the leading "df -h |" from the front of it? The ls utility doesn't read from standard input, so the df command should make no difference to the output of this pipeline. (It will just make the pipeline run slower.)
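If you want to convince yourself of this, here is a minimal sketch you can run in a scratch directory. The directory and the two file names are made up for the demo, and echo stands in for df -h as the upstream command; any command feeding ls through a pipe behaves the same way.

Code:
```shell
#!/bin/sh
# Hypothetical scratch directory and file names, used only for this demo.
demo_dir=$(mktemp -d)
touch "$demo_dir/a.out.txt" "$demo_dir/b.out.txt"

# ls lists its operands (here the scratch directory) and never reads
# standard input, so whatever the upstream command writes is discarded.
with_pipe=$(echo "text that ls never reads" | ls "$demo_dir")
without_pipe=$(ls "$demo_dir")

if [ "$with_pipe" = "$without_pipe" ]; then
    echo "same output with or without the upstream command"
fi

rm -rf "$demo_dir"
```

The only thing the extra command adds is the time it takes to run.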

Second, you seem to go to a lot of effort to store the output of this pipeline in an array and then spend a lot of time trying to extract individual file names from the array. It looks like the array will only have one element because the printf in your awk command doesn't put a space between the names of the files it prints. If you would change the printf statement from:
Code:
printf("%s", $0)

to:
Code:
printf(" %s", $0)

you could reference filenames in the array more simply by using ${FILE_LIST[0]} through ${FILE_LIST[$((FILE_COUNT-1))]}.
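A small bash sketch of the difference, assuming three made-up file names in place of the real sorted ls output (the names function and both array names are only for illustration):

Code:
```shell
#!/bin/bash
# Hypothetical file names standing in for the sorted ls output.
names() { printf '%s\n' run_1.out.txt run_2.out.txt run_3.out.txt; }

# No separator: awk emits one long word, so the array gets one element.
JOINED=( $(names | awk '{printf("%s", $0)}') )

# Leading space before each name: word splitting yields one element per file.
SPLIT=( $(names | awk '{printf(" %s", $0)}') )

echo "without space: ${#JOINED[@]} element(s)"
echo "with space:    ${#SPLIT[@]} element(s), first is ${SPLIT[0]}"
```

Keep in mind that this relies on the shell's word splitting, so it only behaves well when the file names contain no whitespace or wildcard characters.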

But why have an array at all? Why not just process the files one at a time as they come out of awk? As an example, what would happen if you replaced:
Code:
   # sort the list of filenames and output the top number "n" as specified in argument $3
   FILE_LIST=( $(df -h | ls *.out.txt | sort -t_ -k$KEY_FIELD,$KEY_FIELD'n' | awk -F_ -v f=$KEY_FIELD -v c=$FILE_COUNT 'NR > c {exit} {printf("%s", $0)}') )

   # loop up to file count to parse output and copy files that were found by sort
   for (( LOOP_CT=1; LOOP_CT<=$FILE_COUNT; LOOP_CT++ ))
   do

      # parse output string on .out.txt to locate individual files
      FILE_TEMP=`echo $FILE_LIST | awk -v N=$LOOP_CT 'BEGIN {FS=".out.txt"} {print $N}'`
      # restore file extension
      FILE_NAME=$FILE_TEMP'.out.txt'

      echo $FILE_NAME

      # copy file and corresponding ini weight set to continue
      # copy file to continue
      cp -p './'$FILE_NAME './'$FOLD'_continue/'$SET_TYPE'/'$FILE_NAME

      #  find random ini set number
      RAND_SET=`echo $FILE_NAME | awk 'BEGIN {FS="_"} {print $5}'`
      # copy random ini weight file to continue
      cp -p '../rnd_ini/'$FOLD'/ri_'$RAND_SET'_'*'.wts'  './'$FOLD'_continue/'$SET_TYPE'/'

   done

with the much simpler:
Code:
      ls *.out.txt | sort -t_ -k$KEY_FIELD,${KEY_FIELD}n |
      awk -F_ -v c="$FILE_COUNT" '
        NR > c {exit}
        {print $0, $5}' |
      while read FILE_NAME RAND_SET
      do
        # copy files that were found by sort
        echo "file_name: $FILE_NAME rand_set: $RAND_SET"

        # copy file and corresponding ini weight set to continue
        # copy file to continue
        cp -p './'$FILE_NAME './'$FOLD'_continue/'$SET_TYPE'/'$FILE_NAME

        # copy random ini weight file to continue
        cp -p '../rnd_ini/'$FOLD'/ri_'$RAND_SET'_'*'.wts'  './'$FOLD'_continue/'$SET_TYPE'/'
      done

Note that there is no array here, there is only one invocation of awk (instead of n+1 invocations to process n files), and RAND_SET is pulled from the file name at a time when awk already has the fields in the file name split out (so we only have to split the name once). You can also get rid of some unneeded temporary variables since OUTPUT was not (and still is not) referenced after being set, and FILE_TEMP is no longer used.
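If you would like to try the approach without touching your real data, the following is a rough, self-contained sketch. The file names, the choice of field 4 as the sort key, and the count of 2 are all assumptions made up for the demo; the cp commands are replaced by an echo so nothing is copied.

Code:
```shell
#!/bin/sh
# Hypothetical layout: underscore-separated names whose 4th field is the
# sort key and whose 5th field is the random-set number.
work=$(mktemp -d)
touch "$work/net_a_b_30_12_x.out.txt" \
      "$work/net_a_b_10_7_x.out.txt" \
      "$work/net_a_b_20_3_x.out.txt"

KEY_FIELD=4
FILE_COUNT=2

# The cd happens inside the command substitution, so the caller's
# working directory is left alone.
selected=$(
    cd "$work" || exit 1
    ls *.out.txt | sort -t_ -k"$KEY_FIELD,${KEY_FIELD}n" |
    awk -F_ -v c="$FILE_COUNT" 'NR > c {exit} {print $0, $5}' |
    while read -r FILE_NAME RAND_SET
    do
        # A real script would run its cp commands here.
        echo "file_name: $FILE_NAME rand_set: $RAND_SET"
    done
)
echo "$selected"

rm -rf "$work"
```

One thing to watch for: in most shells the while loop at the end of a pipeline runs in a subshell, so variables set inside the loop are not visible after done. That is why this sketch captures the loop's output instead of setting variables in it.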