Process a specific number of files in a list


 
# 1  
Old 02-06-2014

Hello,

I have a list of files that was created with,

Code:
FILES="./$FOLD/${FOLD}_continue/$OPTIMIZE_ON/"*out.txt

I am doing a loop on this list

Code:
for INPUT in $FILES
do
...
done

but I may not want to process everything. Is there a simple way to process just the first 5, 10, or n files in this list?
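One simple way to stop after the first n files is to count inside the loop and break. This is only a minimal sketch: the demo directory, the file names, and the variables N, COUNT and PROCESSED are made up for illustration and are not part of the original script.

```shell
# Hypothetical demo directory and file names, created only for this example.
DEMO_DIR=$(mktemp -d)
touch "$DEMO_DIR/a_out.txt" "$DEMO_DIR/b_out.txt" \
      "$DEMO_DIR/c_out.txt" "$DEMO_DIR/d_out.txt"

N=2          # number of files to process (assumed variable name)
COUNT=0
PROCESSED=""

# the glob expands in lexical (not numeric) order
for INPUT in "$DEMO_DIR"/*out.txt
do
   # stop once N files have been handled
   [ "$COUNT" -ge "$N" ] && break
   echo "processing ${INPUT##*/}"
   PROCESSED="$PROCESSED ${INPUT##*/}"
   COUNT=$((COUNT + 1))
done

rm -rf "$DEMO_DIR"
```

This takes the first n names in whatever order the glob produces, so it only helps when that order is already the order you want.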

I process each file name with awk to extract two numbers from it:
Code:
# remove path from filename (the name is the 5th "/"-delimited field)
FILENAME=$(echo "$INPUT" | awk -F/ '{print $5}')
echo "$FILENAME"

# store random ini set
RAND_SET=$(echo "$FILENAME" | awk -F_ '{print $5}')
echo "RAND_SET" "$RAND_SET"

# find MAE min epoch values
MAE_EPOCH=$(echo "$FILENAME" | awk -F_ '{print $2}')

# remove leading E
MAE_EPOCH=${MAE_EPOCH#E}
# create value at -100 epochs (no offset is subtracted in this snippet)
MAE_EPOCH=$(( MAE_EPOCH ))
echo "MAE_EPOCH" "$MAE_EPOCH"

I end up with the numbers $MAE_EPOCH and $RAND_SET for each file. What I would really like to do is scan all the files, extract MAE_EPOCH from each, and then process only the best few, such as the top 5 (based on the lowest values of MAE_EPOCH). I need to know the value of RAND_SET associated with each MAE_EPOCH value.

File names look like,
Code:
108.72_E1300_101.62_E3000_39_ri_OA_f0_S4A_v2_42.41.1_ON_0.25lr.out.txt

I am capturing the second and fifth underscore-delimited fields (E1300 and 39 in this name). I would need both the MAE_EPOCH and the corresponding RAND_SET value for each file I am going to process. I guess I would loop on the file set and then store the data for the files I need to process, but I'm not so sure how to do that kind of thing in bash.

Help would be greatly appreciated,

LMHmedchem

Last edited by Scrutinizer; 02-06-2014 at 07:14 PM.. Reason: Additional code tags
# 2  
Old 02-06-2014
If you cd into the directory with the files, you can craft a single pipeline to do the work (at least, as I understood it): ls to generate a list, grep to filter by name, sort to numerically sort by MAE_EPOCH, head to limit the number of results, and awk to extract whatever portions of the underscore-delimited records are needed.

The output of that pipeline can then be fed to a while-read loop for processing.
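A minimal sketch of such a pipeline, under some assumptions: the demo directory and the shortened file names are invented for the example, the variable RESULT is hypothetical, and the sort key assumes MAE_EPOCH is the second underscore-delimited field with a leading "E".

```shell
# Hypothetical demo directory; the file names are shortened versions of the
# real pattern, made up for this example.
DEMO_DIR=$(mktemp -d)
touch "$DEMO_DIR/108.72_E1300_101.62_E3000_39_ri.out.txt" \
      "$DEMO_DIR/99.82_E100_85.20_E100_26_ri.out.txt" \
      "$DEMO_DIR/100.24_E200_87.52_E100_47_ri.out.txt" \
      "$DEMO_DIR/notes.txt"

# ls:   one file name per line (when the output goes to a pipe)
# grep: keep only names ending in .out.txt
# sort: split on "_" and sort numerically on field 2, skipping the leading "E"
# head: keep the first 2 results
# awk:  print MAE_EPOCH (field 2 without the "E") and RAND_SET (field 5)
RESULT=$(cd "$DEMO_DIR" &&
         ls | grep '\.out\.txt$' | sort -t_ -k2.2,2n | head -n 2 |
         awk -F_ '{ print substr($2, 2), $5 }')
echo "$RESULT"

rm -rf "$DEMO_DIR"
```

With these three sample names the output should be the two lines "100 26" and "200 47", one MAE_EPOCH and RAND_SET pair per line.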

Regards,
Alister
# 3  
Old 02-07-2014
At one point, I had this set up to process the first few files from

Code:
FILES="./$FOLD/${FOLD}_continue/$OPTIMIZE_ON/"*out.txt

but when the file names looked like,

Code:
99.82_E100_85.20_E100_26_ri_OA_f2_S2A_v8_47.46.1_ON_0.25lr.out.txt
99.88_E100_86.71_E100_17_ri_OA_f2_S2A_v8_47.46.1_ON_0.25lr.out.txt
100.22_E100_86.31_E100_39_ri_OA_f2_S2A_v8_47.46.1_ON_0.25lr.out.txt
100.24_E200_87.52_E100_47_ri_OA_f2_S2A_v8_47.46.1_ON_0.25lr.out.txt

the file starting with 100.22_E100 got processed first, presumably since 1 sorts before 9. As I think about it, I described it wrong. It is the first field (the real number) that I would want to sort on, and then retrieve the other numbers. For the files above, if I wanted the top two results, I would first look at the number in the first field,
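The difference between the two orderings can be reproduced with sort alone. This is a small sketch, not part of the original script: NAMES, LEXICAL and NUMERIC are hypothetical variable names, the names are cut down to their first two fields, and LC_ALL=C is set only to make the comparison byte-by-byte and the decimal point a ".".

```shell
# the first two fields of the four file names above
NAMES=$(printf '%s\n' 99.82_E100 99.88_E100 100.22_E100 100.24_E200)

# character-by-character comparison: "1" sorts before "9"
LEXICAL=$(printf '%s\n' "$NAMES" | LC_ALL=C sort | head -n 1)

# numeric comparison on the first "_"-delimited field
NUMERIC=$(printf '%s\n' "$NAMES" | LC_ALL=C sort -t_ -k1,1n | head -n 1)

echo "lexical first: $LEXICAL"
echo "numeric first: $NUMERIC"
```

The plain sort puts 100.22_E100 first, while the numeric sort on field 1 puts 99.82_E100 first.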

Code:
99.82
99.88
100.22
100.24

Based on the values, I would want to extract,

Code:
MAE_EPOCH=100, RAND_SET=26 for the top file
MAE_EPOCH=100, RAND_SET=17 for the second file

These are the numbers I need to pass to the next program.

Is there some reason to use grep to filter the name instead of doing ls *.out.txt?

I guess I would be using some combination of -t -k -n with sort, like

ls *.out.txt | sort -t_ -k 1 -n | head -n 2 | awk

Would this give me the top two files in the list above?

I'm sure I will have to play around with this, but thanks for the head start. I am a bit unclear on how to pass the result of the pipe into my loop. Do you have a link for an example of something like that?
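For reference, one common shape for feeding a pipeline into a loop is to pipe it straight into while read. The sketch below is only an illustration: process_top is a hypothetical helper name, and the demo directory and shortened file names are made up.

```shell
# process_top DIR N: hypothetical helper, used only for this sketch.
# Sorts the *.out.txt names in DIR on the first "_" field and loops on the top N.
process_top() {
   (
      cd "$1" || exit 1
      ls *.out.txt | sort -t_ -k1,1n | head -n "$2" |
      while IFS= read -r INPUT
      do
         # each pass of the loop sees one file name in $INPUT
         echo "processing $INPUT"
      done
   )
}

# Hypothetical demo directory; file names shortened from the ones above.
DEMO_DIR=$(mktemp -d)
touch "$DEMO_DIR/99.82_E100_85.20_E100_26.out.txt" \
      "$DEMO_DIR/99.88_E100_86.71_E100_17.out.txt" \
      "$DEMO_DIR/100.22_E100_86.31_E100_39.out.txt"

process_top "$DEMO_DIR" 2

rm -rf "$DEMO_DIR"
```

One thing to keep in mind with this form is that in bash the loop at the end of a pipeline normally runs in a subshell, so variables assigned inside it are not visible after done.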

LMHmedchem

---------- Post updated 02-07-14 at 02:00 PM ---------- Previous update was 02-06-14 at 10:13 PM ----------

Well I have it working with this,
Code:
#!/bin/bash

NUMBER_TO_PROCESS=2
BACKUP=100

# stop if the folder is missing, so nothing runs in the wrong directory
cd ./the_folder_with_the_files || exit 1

# find all files .out.txt and sort on the real number in the first position
# return the file names for the top $NUMBER_TO_PROCESS in sorted list
FILES=$(ls *.out.txt | sort -t_ -k1,1n | head -n "$NUMBER_TO_PROCESS")

# loop on all file names returned (relies on the names containing no whitespace)
for INPUT in $FILES
do

   # current file
   echo "$INPUT"

   # store random ini set
   RAND_SET=$(echo "$INPUT" | awk -F_ '{print $5}')
   echo "RAND_SET" "$RAND_SET"

   # find MAE min epoch value
   MAE_EPOCH=$(echo "$INPUT" | awk -F_ '{print $2}')
   # remove leading E
   MAE_EPOCH=${MAE_EPOCH#E}
   # backup if specified
   MAE_EPOCH=$(( MAE_EPOCH - BACKUP ))
   echo "MAE_EPOCH" "$MAE_EPOCH"

done

This gives the behavior that I am looking for.

I left the awk stuff in the loop since there are a couple of items in the file name string that need to be retrieved and assigned to variables. Does this make sense?
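As a possible alternative to calling awk twice per file, the shell can split the name itself with read and IFS. This is a sketch rather than a tested replacement for the loop body: EPOCH_FIELD, F1, F3, F4 and REST are throwaway names invented here, and INPUT is set by hand to one of the sample names.

```shell
INPUT=108.72_E1300_101.62_E3000_39_ri_OA_f0_S4A_v2_42.41.1_ON_0.25lr.out.txt
BACKUP=100

# split the name on "_": field 2 goes to EPOCH_FIELD, field 5 to RAND_SET;
# F1, F3, F4 and REST only soak up the fields that are not needed here
IFS=_ read -r F1 EPOCH_FIELD F3 F4 RAND_SET REST <<EOF
$INPUT
EOF

# remove leading E, then back up by $BACKUP epochs
MAE_EPOCH=${EPOCH_FIELD#E}
MAE_EPOCH=$((MAE_EPOCH - BACKUP))

echo "RAND_SET" "$RAND_SET"
echo "MAE_EPOCH" "$MAE_EPOCH"
```

For this name the sketch yields RAND_SET 39 and MAE_EPOCH 1200, and it avoids starting two extra processes per file.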

Is there some reason to filter the file list with grep instead of using the glob?

LMHmedchem

Last edited by Don Cragun; 02-07-2014 at 01:27 AM.. Reason: Add CODE and ICODE tags.