I have a lot of data that need to be sorted alphanumerically. I began using sort -du and it solved almost all my problems. However, when I encountered files with data like this it began to fail:
I then tried this to see if I could get it working:
This worked just fine but then there is one other issue. I want the alphabetic to take precedence over the numeric. What if I had a file like this:
I would want zebra to be on the bottom. So if the files are identical in name but different in number I want them sorted numerically. If the files are different alphabetically, I want the alphabetic sort to take precedence. I have tried playing with sort and sed and can't figure this out.
Is this possible with bash, or do I need a Perl script to do it?
Last edited by Scott; 07-29-2013 at 11:07 AM..
Reason: Fixed code tags
When the first character of a numerically sorted key (in this case, the first character of each line, /) is not a valid numeric component, the key is treated as zero. Since every line begins with an invalid numeric character, they all evaluate to zero. All of these ties are then broken by a full-length alphabetic comparison of the whole line.
In short, for this data, your suggestion is no different from a simple sort without any options.
Further, consider the implications of using -t _ with -k 1,4n. Even if every line began with a valid numeric, how can a numeric key possibly span fields delimited by an underscore? The first underscore will always terminate the number. So, in this case, -k 1,4n is equivalent to -k 1n.
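A small sketch may make this concrete. The sample lines are hypothetical (the data from the original post is not reproduced here), and LC_ALL=C is set only to keep the comparison byte-wise and repeatable:

```shell
# Hypothetical sample: every line starts with "/", which is not numeric.
sample() { printf '%s\n' '/data/b_10' '/data/a_2' '/data/a_10'; }

# Numeric sort: every key evaluates to zero, so only the tie-breaker matters.
numeric=$(sample | LC_ALL=C sort -n)

# Numeric key spanning fields 1 to 4 with "_" as the delimiter: still zero.
keyed=$(sample | LC_ALL=C sort -t _ -k 1,4n)

# Plain sort with no options at all.
plain=$(sample | LC_ALL=C sort)

# All three orderings should be identical.
[ "$numeric" = "$plain" ] && [ "$keyed" = "$plain" ] && echo "same order"
```

Note that /data/a_10 still lands before /data/a_2 in all three cases, because the trailing digits are compared as text, not as numbers.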
I want the alphabetic to take precedence over the numeric. What if I had a file like this:
I would want zebra to be on the bottom. So if the files are identical in name but different in number I want them sorted numerically. If the files are different alphabetically, I want the alphabetic sort to take precedence. I have tried playing with sort and sed and can't figure this out.
Your solution is nearly complete. You're decorating each line with the trailing sequence of digits, sorting on that, and then cutting it out before output. The only thing you need is another field for the preceding text. Instead of ...
... one possibility is to use ...
This extra field consisting of the name without the trailing number is necessary because the trailing number is not part of the name for the purposes of alphabetical sorting.
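As a rough illustration of the two-field decoration (not the exact commands from this thread, which are not reproduced here), the pipeline below prepends the name without its trailing digits and then the digits themselves, sorts on those two fields, and cuts them off again. The pathnames are hypothetical, and the sketch assumes no pathname contains a space:

```shell
# Decorate: "<name without trailing digits> <trailing digits> <original line>"
# Sort:     field 1 as text, then field 2 numerically
# Undecorate: keep everything from field 3 onward
sorted=$(printf '%s\n' /tmp/zebra_1 /tmp/apple_10 /tmp/apple_2 |
  sed 's/^\(.*[^0-9]\)\([0-9][0-9]*\)$/\1 \2 &/' |
  LC_ALL=C sort -k 1,1 -k 2,2n |
  cut -d ' ' -f 3-)

printf '%s\n' "$sorted"
```

With this input, the two apple entries come first in numeric order (2 before 10), and zebra sorts last even though its number is the smallest. A line with no trailing digits is left undecorated by sed and passes through cut unchanged.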
I would accomplish this using the same tools you've used, but with a different sed approach:
This code (like yours) assumes that there are never any spaces in a pathname. If this is not true, it can be modified to use a different delimiter.
If all of your files are in the same directory (or you want the directory name to be part of the primary sort key), all of your filenames contain 3 underscore characters, and you want the first three underscore-separated fields sorted alphanumerically and the 4th field sorted numerically as the final sort key, the following simple sort command does what you want:
With the sample input you showed, this command produces the output:
Last edited by Don Cragun; 07-29-2013 at 12:46 PM..
Reason: Fix typo
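For readers without the original sample data, here is a sketch of a sort invocation matching the description above: underscore as the field delimiter, fields 1 through 3 compared as text, and field 4 compared numerically. The filenames are hypothetical, and the exact key syntax is an assumption rather than a quote from the post:

```shell
# -t _      use "_" as the field separator
# -k 1,3    primary key: fields 1 through 3, compared as text
# -k 4,4n   final key: field 4, compared numerically
sorted_names=$(printf '%s\n' zebra_a_b_1 apple_a_b_10 apple_a_b_2 |
  LC_ALL=C sort -t _ -k 1,3 -k 4,4n)

printf '%s\n' "$sorted_names"
```

Here apple_a_b_2 precedes apple_a_b_10 because the names tie on the first key and the numeric key decides, while zebra_a_b_1 goes last because the alphabetic key takes precedence.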