Finding the right file with multiple sort criteria


 
# 1  
Old 08-04-2014

Hello,

I have files in a directory with names like,
Code:
./f0/84.40_E1200_85.39_E1300_f0_r00_1300-ON-0.25_S7A_v4_47.19.1.out.txt
./f0/84.40_E1200_85.83_E1200_f0_r00_1200-ON-0.25_S7A_v4_47.19.1.out.txt
./f0/84.60_E1100_86.45_E1100_f0_r00_1100-ON-0.25_S7A_v4_47.19.1.out.txt
./f0/85.20_E1000_87.26_E1000_f0_r00_1000-ON-0.25_S7A_v4_47.19.1.out.txt
./f0/86.42_E900_88.14_E900_f0_r00_900-ON-0.25_S7A_v4_47.19.1.out.txt
./f0/88.88_E800_90.07_E800_f0_r00_800-ON-0.25_S7A_v4_47.19.1.out.txt

I need to find the smallest value for the first '_' delimited field (84.40 in this case). Where more than one file has this value, as above, I need the one with the smallest value of the int in 1300-ON-0.25, 1200-ON-0.25, etc. For this example, I want to find the following file and assign its name to a bash variable:

84.40_E1200_85.83_E1200_f0_r00_1200-ON-0.25_S7A_v4_47.19.1.out.txt

There will never be more than 2 files with the same value for the field I am looking at.

Another caveat is that at times I will need to look at the float in the third '_' delimited field instead of the first. If $STOP_ON is T, I would sort on the third number, and if $STOP_ON is V, I would sort on the first.

I guess what I would do here is something like,
Code:
if [ "$STOP_ON" == "T" ]; then
   # strip the directory from each filename, then sort numerically on field 3
   FILES=$(ls "$CURRENT_DIR"/*out.txt | \
           awk -F/ '{print $NF}' | \
           sort -t_ -k3,3n | \
           head -n 2)
fi
if [ "$STOP_ON" == "V" ]; then
   # strip the directory from each filename, then sort numerically on field 1
   FILES=$(ls "$CURRENT_DIR"/*out.txt | \
           awk -F/ '{print $NF}' | \
           sort -t_ -k1,1n | \
           head -n 2)
fi

This would strip off the path, sort on the float that I need, and then pick off the top two. I could then process the list to find the one with the lowest value for the second sorting criterion.

This seems rather involved, and there is another problem: there may not be two files with the same value for the first sort field, as there are in this example. That means I would first have to check the results of the above code for that as well.

It seems as if there should be a better way to do this. Please let me know if I have botched my explanation and want me to try again.

LMHmedchem
# 2  
Old 08-04-2014
If the positions and widths of the fields you want to sort on are fixed,

Code:
sort -t "_" -nk1.6,1.7 -nk1.9,1.10 -nk7.1,7.4 file

Even if they aren't, I think you can adapt the above command to match your criteria.
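
A minimal sketch of the same idea using whole-field keys instead of character offsets, which may be easier to adapt if a width varies. It assumes the directory has already been stripped from the names, and that numeric comparison (-n) reads only the leading number of a field such as 1300-ON-0.25. The variable names `names` and `BEST` are illustrative and not from the thread.

```shell
# Sample basenames from the first post (directory prefix already removed).
names='84.40_E1200_85.39_E1300_f0_r00_1300-ON-0.25_S7A_v4_47.19.1.out.txt
84.40_E1200_85.83_E1200_f0_r00_1200-ON-0.25_S7A_v4_47.19.1.out.txt
84.60_E1100_86.45_E1100_f0_r00_1100-ON-0.25_S7A_v4_47.19.1.out.txt
85.20_E1000_87.26_E1000_f0_r00_1000-ON-0.25_S7A_v4_47.19.1.out.txt'

# Primary key: field 1 (float). Tie-breaker: field 7, where -n parses the
# leading integer (1300, 1200, ...) and stops at the first "-".
# LC_ALL=C keeps "." as the decimal point regardless of the user's locale.
BEST=$(printf '%s\n' "$names" | LC_ALL=C sort -t_ -k1,1n -k7,7n | head -n 1)
echo "$BEST"
```

Because the tie-breaker is part of the sort itself, taking only the first line should work whether or not two files share the smallest value.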
This User Gave Thanks to clx For This Post:
# 3  
Old 08-04-2014
The position and width of the floats are fixed, but the second sorting criterion will not have a fixed width.

One thing I don't get at the moment is that my script seems to think that there is only one element in FILES.

If I print,

echo ${#FILES[@]}

I get 1.

If I print,

echo ${FILES[0]}

I get,

84.40_E1200_85.39_E1300_f0_r00_1300-ON-0.25_S7A_v4_47.19.1.out.txt 84.40_E1200_85.83_E1200_f0_r00_1200-ON-0.25_S7A_v4_47.19.1.out.txt

I thought that when you did something like FILES=$(ls ...) you would end up with an array if there was more than one file returned by ls. For some reason, it is treating the list of files as a single string in one element. I tried adding {OFS=" "} to the awk call, but that doesn't do anything.

For now, I can split up the single string in ${FILES[0]} on space, but it seems like I shouldn't have to do that.

LMHmedchem
# 4  
Old 08-04-2014
Assigning an array in bash requires enclosing the expression in ( ... ). See the differences:
Code:
FILES=($(ls))
echo ${#FILES[@]}
33
echo ${!FILES[@]}
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32

as opposed to
Code:
FILES=$(ls)
echo ${#FILES[@]}
1
echo ${!FILES[@]}
0
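
As a hedged alternative to FILES=($(ls)), which splits names on whitespace, the sketch below fills an array from a glob and from command output with mapfile. It assumes bash 4 or later; the scratch directory and its file names are made up for illustration.

```shell
# Hypothetical scratch directory with three sample files, for illustration only.
dir=$(mktemp -d)
touch "$dir/b.out.txt" "$dir/a.out.txt" "$dir/with space.out.txt"

# A glob expands directly into array elements; no ls and no word splitting.
FILES=( "$dir"/*out.txt )
echo "${#FILES[@]}"

# mapfile (bash 4+) stores one array element per line of command output,
# so a name containing spaces stays in a single element.
# ${FILES[@]##*/} strips the directory part from every element.
mapfile -t SORTED < <(printf '%s\n' "${FILES[@]##*/}" | sort)
echo "${SORTED[0]}"

rm -r "$dir"
```

If the glob matches nothing, the array holds the literal pattern unless nullglob is enabled with shopt -s nullglob.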

This User Gave Thanks to RudiC For This Post:
# 5  
Old 08-07-2014
I think I have this sorted out, thanks again for the assistance.

LMHmedchem
# 6  
Old 08-07-2014
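To make your sorting on field 1 or 3 a bit easier, you might want to consider the following, where the exit status of the test (0 if $STOP_ON is T, 1 otherwise) selects field 3 or field 1:
Code:
[ "$STOP_ON" == "T" ]; IX=$((3-$?*2)); sort -nt_ -k$IX,$IX file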
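
Putting the pieces of this thread together, here is one possible sketch rather than a definitive solution. The function name pick_file and its arguments are hypothetical; it assumes the file naming scheme from the first post, with the tie-breaking integer at the start of the seventh '_' delimited field.

```shell
# pick_file DIR STOP_ON  (hypothetical helper; names are illustrative)
# Prints the basename with the smallest value in the chosen float field,
# breaking ties on the leading integer of field 7.
pick_file() {
    local dir=$1 stop_on=$2 key
    case $stop_on in
        T) key=3 ;;   # sort on the third '_' delimited float
        V) key=1 ;;   # sort on the first '_' delimited float
        *) echo "pick_file: STOP_ON must be T or V" >&2; return 1 ;;
    esac
    # The cd runs in a subshell so it does not affect the caller;
    # globbing inside the directory yields bare file names.
    ( cd "$dir" && printf '%s\n' *out.txt ) |
        LC_ALL=C sort -t_ -k"$key,$key"n -k7,7n | head -n 1
}
```

It could be called as BEST=$(pick_file "$CURRENT_DIR" "$STOP_ON"). Note that if no file matches, the unexpanded pattern is printed, so the caller may want to check that the result exists.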
