sort script


 
# 8  
Old 10-11-2012
I appreciate and support what Don and Jim say, but I also understand why you are looking for a quick fix, so I put together the following, which might do the job. I'm not sure it will run correctly under Cygwin, as I have no way to test it there. It runs fine on my Linux system, at least with your sample data.
Put the sort order into a file
Code:
Cpyridne_N
Cphenol_O
fusering
nasC

, making sure the items in there match your heading items (no error checking is done here). The command opens infile twice, but awk reads only the header line; sort then reads the whole file.
Code:
$(awk   'NR==FNR{Ar[++n]=$1;next}
         FNR==1 {exit}   # header line of infile: stop reading, END still sees its fields
         END    {printf "sort";
                 for (i=1;i<=n;i++)
                   for (j=1;j<=NF;j++)
                     if (Ar[i]==$j) {printf " -k%d,%d",  j, j; break};
                 printf "\n"}                                     
        ' sortorder infile ) infile

Execute it as is, i.e. with the entire command inside $(...). The header line is sorted along with the data, so it will end up below everything else; if you can't live with that, use the head and tail commands to keep it on top.
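As a rough sketch of what the command substitution produces, and of the head/tail variant mentioned above: the sample files below are invented for illustration, and the awk part is restructured slightly so that it prints the command from the header rule instead of from END.

```shell
# Hypothetical sample files, made up for this illustration only.
tmp=$(mktemp -d)
printf '%s\n' Cpyridne_N Cphenol_O > "$tmp/sortorder"
printf '%s\n' 'id Cphenol_O Cpyridne_N' '1 5 2' '2 3 1' '3 4 1' > "$tmp/infile"

# Build the sort command from the header line of infile.
sortcmd=$(awk 'NR==FNR {Ar[++n]=$1; next}
               FNR==1  {for (i=1;i<=n;i++)
                          for (j=1;j<=NF;j++)
                            if (Ar[i]==$j) {s = s sprintf(" -k%d,%d", j, j); break}
                        print "sort" s; exit}' "$tmp/sortorder" "$tmp/infile")
echo "$sortcmd"

# Keep the header on top: emit line 1 unchanged, sort only lines 2 onward.
# $sortcmd is left unquoted on purpose so the shell splits it into words.
result=$( head -n 1 "$tmp/infile"; tail -n +2 "$tmp/infile" | $sortcmd )
echo "$result"
rm -rf "$tmp"
```

With these sample files the generated command is `sort -k3,3 -k2,2`, because Cpyridne_N is the third heading and Cphenol_O the second.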

Last edited by RudiC; 10-11-2012 at 06:01 AM..
# 9  
Old 10-11-2012
I had started working on a solution similar to RudiC's suggestion last night, but fell asleep before finishing it. RudiC left out one important element: the fields you're sorting on need a numeric sort.

The following works on OS X, but I don't have a Cygwin system to test. This keeps the header line at the start of the file. The way it is written, it produces debugging information in a file named debug.out that shows the fields selected by the given sort keys, shows the sort command that is used to perform the sort, and lists the fields from each record that will be sorted by that command. (Actually, the entire record will be used as a final sort key if all of the selected keys match in some records, but since the first field in each line in your input files is a sequence number, that field will always be enough to disambiguate any records that match up to that point.)
Code:
#!/bin/ksh
awk -v dbg=1 '
BEGIN{  FS = OFS = "\t"}
FNR==NR{# We are in the 1st file.  Each line is the name of a field to be used
        # as a sort key, with the 1st line being the primary sort key.
        key[++nk] = $1
        next
}
FNR==1{ # We are on the 1st line of the 2nd file.  Determine the sort command
        # to use to implement the desired sort order.  All keys are to be
        # treated as ascending order numeric fields.
        sortcmd = "sort -t \"" FS "\" -n"
        for(i = 1; i <= nk; i++) {
                # For each key...
                for(j = 1; j <= NF; j++) {
                        if($j == key[i]) {
                                # We have a match...
                                if(dbg)printf("key[%d](%s) is field %d\n",
                                        i, key[i], j) > "debug.out"
                                if(dbg)keyf[i] = j
                                sortcmd = sortcmd " -k" j "," j
                                break
                        }
                }
                if(j > NF) {
                        # This key does not have a matching field heading.
                        printf("sorter: No heading matches key[%d] (%s)\n",
                                i, key[i])
                        ec = 1
                }
        }
        if(ec) exit ec
        if(dbg)printf("sortcmd is \"%s\"\n", sortcmd) > "debug.out"
        print
        next
}
{       # We have a data line.  Feed it to sort.
        if(dbg) {
                printf("line %d key info: %s", FNR, $keyf[1]) > "debug.out"
                for(i = 2; i <= nk; i++) printf("\t%s", $keyf[i]) > "debug.out"
                printf("\t%s\n", $1) > "debug.out"
        }
        print | sortcmd
}
END{    close(sortcmd)
}' keys data

If you don't want the debugging information, you can disable it by changing:
Code:
awk -v dbg=1 '

early in the script to:
Code:
awk -v dbg=0 '

or
Code:
awk '

or by removing all of the statements that start with if(dbg).
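Stripped of the key lookup and the debugging, the core of the script (print the header directly, pipe the data lines through a numeric sort) might look like the sketch below. The sample data and the fixed key `-k2,2` are assumptions made for the illustration, and the `fflush()` call is an addition that is not in the script above; it is there so the header is written before sort produces any output even when stdout is not a terminal.

```shell
# Hypothetical tab-separated sample data, invented for this illustration.
tmp=$(mktemp -d)
printf 'id\tscore\n1\t10\n2\t9\n3\t100\n' > "$tmp/data"

sorted=$(awk '
BEGIN  { FS = OFS = "\t"
         # Fixed key for the example; the real script derives it from the header.
         sortcmd = "sort -t \"" FS "\" -n -k2,2" }
FNR==1 { print; fflush(); next }   # header goes straight to stdout
       { print | sortcmd }         # data lines are fed to sort
END    { close(sortcmd) }          # wait for sort to finish writing
' "$tmp/data")
printf '%s\n' "$sorted"
rm -rf "$tmp"
```

The sample also shows why `-n` matters: a plain lexical sort would order the scores as 10, 100, 9, while the numeric sort gives 9, 10, 100.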

Last edited by Don Cragun; 10-11-2012 at 12:23 PM.. Reason: Fix auto-spell checker induced typo.
# 10  
Old 10-11-2012
Wow, thanks a lot for working this out. This will really save me a lot of time. It looks like it would be reasonable to make simple changes, like switching to alphanumeric sorting or changing the sort order.

After a few changes to make this into a callable script run in bash, this is what I ended up with.
Code:
#!/usr/bin/bash

# call with $1 list of column headers to be sorted on, one header per line
# call with $2 name of file to be sorted

# will be prefixed to name of data file to create output file
OUTPUPREFIX="_makesdf"

# parse arguments
KEYFILE=$1
DATAFILE=$2

# make sure input has Unix EOL
dos2unix -q "$KEYFILE"
dos2unix -q "$DATAFILE"

# change to dbg=1 for debug output to logfile
#awk -v dbg=1 '
awk -v dbg=0 '
BEGIN{  FS = OFS = "\t"}
FNR==NR{# We are in the 1st file.  Each line is the name of a field to be used
        # as a sort key, with the 1st line being the primary sort key.
        key[++nk] = $1
        next
}
FNR==1{ # We are on the 1st line of the 2nd file.  Determine the sort command
        # to use to implement the desired sort order.  All keys are to be
        # treated as ascending order numeric fields.
        sortcmd = "sort -t \"" FS "\" -n"
        for(i = 1; i <= nk; i++) {
                # For each key...
                for(j = 1; j <= NF; j++) {
                        if($j == key[i]) {
                                # We have a match...
                                if(dbg)printf("key[%d](%s) is field %d\n",
                                        i, key[i], j) > "debug.out"
                                if(dbg)keyf[i] = j
                                sortcmd = sortcmd " -k" j "," j
                                break
                        }
                }
                if(j > NF) {
                        # This key does not have a matching field heading.
                        printf("sorter: No heading matches key[%d] (%s)\n",
                                i, key[i])
                        ec = 1
                }
        }
        if(ec) exit ec
        if(dbg)printf("sortcmd is \"%s\"\n", sortcmd) > "debug.out"
        print
        next
}
{       # We have a data line.  Feed it to sort.
        if(dbg) {
                printf("line %d key info: %s", FNR, $keyf[1]) > "debug.out"
                for(i = 2; i <= nk; i++) printf("\t%s", $keyf[i]) > "debug.out"
                printf("\t%s\n", $1) > "debug.out"
        }
        print | sortcmd
}
END{    close(sortcmd)
}' "$KEYFILE" "$DATAFILE" > "${OUTPUPREFIX}_${DATAFILE}"

I have a local sort file with the list of headers to sort on, and this script lives with the rest of my path tools (/usr/local/bin/), so I can call it from the shell or another script.

Thanks again,

LMHmedchem
# 11  
Old 10-12-2012
@Don Cragun: impressive suggestion, esp. the debug stuff. Seen it before, admired it before, inclined to adopt it.
I had thought about the numeric sort, but as there are non-numeric fields in the file as well, I disregarded it for the first attempt. With a slight enhancement we can make my suggestion accept per-field sort options, and this should be doable for Don's code as well:
Code:
$(awk   'NR==FNR{Ar[++n]=$1; SO[n]=$2; next}
         FNR==1 {exit}   # header line of infile: stop reading, END still sees its fields
         END    {printf "sort";
                 for (i=1;i<=n;i++)
                   for (j=1;j<=NF;j++)
                     if (Ar[i]==$j) {printf " -k%d,%d%s",  j, j, SO[i]; break};
                 printf "\n"}                                     
        ' sortorder infile ) infile

will evaluate the options as given in the sortorder file, e.g. numeric reverse:
Code:
Cpyridne_N
Cphenol_O  nr
fusering   nr
nasC
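To see how the option column ends up in the generated command, here is a small sketch. The header line and the two-entry sortorder file are made up for the example, and the command is printed from the header rule rather than from END.

```shell
# Hypothetical sample files, invented for this illustration.
tmp=$(mktemp -d)
printf '%s\n' 'Cpyridne_N' 'Cphenol_O nr' > "$tmp/sortorder"
printf '%s\n' 'id Cphenol_O Cpyridne_N' > "$tmp/infile"

# Append each key's options (second column of sortorder, possibly empty)
# directly to its -k specification.
cmd=$(awk 'NR==FNR {Ar[++n]=$1; SO[n]=$2; next}
           FNR==1  {printf "sort"
                    for (i=1;i<=n;i++)
                      for (j=1;j<=NF;j++)
                        if (Ar[i]==$j) {printf " -k%d,%d%s", j, j, SO[i]; break}
                    printf "\n"; exit}' "$tmp/sortorder" "$tmp/infile")
echo "$cmd"
rm -rf "$tmp"
```

Here the result is `sort -k3,3 -k2,2nr`: the first key gets no modifier, the second is sorted numerically in reverse.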
