So, I solved it by adding a "group id" column to the sort key, and I no longer pipe individual arrays to sort (which I suspect was the bug).
Just in case, my code now looks like this:
and on the output I do
to clean out those sort-related columns
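The code from the post did not survive, so the following is only a minimal sketch of the general idea described above, not the poster's actual script. It assumes a hypothetical input layout (blank-line-separated groups whose first line is the sort key), and the name `sort_groups` and the sample data are made up for illustration. Every line is tagged with its key, a group id, and a line number; the whole stream goes through a single sort; the helper columns are then cut away.

```shell
#!/bin/sh
# Hypothetical sample input: groups separated by blank lines,
# first line of each group is the key to sort by.
input='pear
line p1
line p2

apple
line a1

mango
line m1
line m2'

TAB=$(printf '\t')

sort_groups() {
    # Decorate: prefix each line with <key> <group id> <line number in group>.
    awk 'BEGIN { RS = ""; FS = "\n"; OFS = "\t" }
         { for (i = 1; i <= NF; i++) print $1, NR, i, $i }' |
    # One sort over everything: by key, then group id, then line number,
    # so the lines of a group stay together and in their original order.
    LC_ALL=C sort -t "$TAB" -k1,1 -k2,2n -k3,3n |
    # Undecorate: drop the three sort-related columns.
    cut -f4-
}

result=$(printf '%s\n' "$input" | sort_groups)
printf '%s\n' "$result"
```

The group id is what keeps two groups with the same key from being interleaved. Note that this sketch drops the blank separator lines; they would have to be re-inserted if the output needs them.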
Hello,
I am currently trying to edit an ldif file. The ldif specification states that a newline followed by a space indicates the subsequent line is a continuation of the line. So, in order to search and replace properly and edit the file, I open the file in textwrangler, search for "\r " and... (14 Replies)
I know uniq exists, but am not sure how to remove repeating lines when they are groups of two different lines repeating themselves, without using sort. I need them to be sorted in the original order, just to remove repeats.
cd /media/AUDIO/WAVE/9780743518673/mp3
~/Desktop/mp3-to-m4b... (1 Reply)
Hi Guys,
First post! I've seen a few options but don't know which is the most efficient:
I have a directory with 150,000+ text files in it.
I want to merge them into files that each contain 10,000 of the originals, with a carriage return in between.
Thanks
P
The following is an example but doesn't limit the... (2 Replies)
Hello,
I'm working with a file that has three columns. The first one represents a certain channel and the third one a timestamp (second one is not important). Example input is as follows:
2513 12 10.771
2513 13 10.771
2513 14 10.771
2513 15 10.771
2644 8 10.771
... (6 Replies)
G'day all,
I have tons of image files I need to process, but I don't need to process all of them, and processing them all would take a long time that I'd rather avoid.
The images are arranged in folders like this...
folder1/RawData
folder2/RawData
folder3/RawData
...
folderN/RawData
... (2 Replies)
I have two files.
File 1 is a two-column index file, e.g.
comp11084_c0_seq6:130-468(-) comp12746_c0_seq3:140-478(+)
comp11084_c0_seq3:201-539(-) comp12746_c0_seq2:191-529(+)
File 2 is a sequence file with headers named with the same terms that populate file 1. ... (1 Reply)
Hello to all,
I'm trying to print the value corresponding to the words A, B, C, D, E. These words could appear sometimes and sometimes not inside each group of lines. Each group of lines begins with "ZYX".
My issue with the current code is that it should print values for 3 groups and only is... (6 Replies)
Hi, I have some data I have taken from the internet in the following scheme:
name
direction
webpage
phone number
open hours
menu url
book url
name
...
Of course, the only mandatory line is the name, which is the one I want to sort by.
I have the following sed & awk script that... (3 Replies)
Discussion started by: devmsv
sort
sort(3pm) Perl Programmers Reference Guide sort(3pm)
NAME
sort - perl pragma to control sort() behaviour
SYNOPSIS
use sort 'stable'; # guarantee stability
use sort '_quicksort'; # use a quicksort algorithm
use sort '_mergesort'; # use a mergesort algorithm
use sort 'defaults'; # revert to default behavior
no sort 'stable'; # stability not important
use sort '_qsort'; # alias for quicksort
my $current = sort::current(); # identify prevailing algorithm
DESCRIPTION
With the "sort" pragma you can control the behaviour of the builtin "sort()" function.
In Perl versions 5.6 and earlier the quicksort algorithm was used to implement "sort()", but in Perl 5.8 a mergesort algorithm was also
made available, mainly to guarantee worst case O(N log N) behaviour: the worst case of quicksort is O(N**2). In Perl 5.8 and later,
quicksort defends against quadratic behaviour by shuffling large arrays before sorting.
A stable sort means that for records that compare equal, the original input ordering is preserved. Mergesort is stable, quicksort is not.
Stability will matter only if elements that compare equal can be distinguished in some other way. That means that simple numerical and
lexical sorts do not profit from stability, since equal elements are indistinguishable. However, with a comparison such as
{ substr($a, 0, 3) cmp substr($b, 0, 3) }
stability might matter because elements that compare equal on the first 3 characters may be distinguished based on subsequent characters.
In Perl 5.8 and later, quicksort can be stabilized, but doing so will add overhead, so it should only be done if it matters.
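The effect of stability can also be seen outside Perl, with the command-line sort discussed earlier in this thread. The sketch below is an analogy, not part of the Perl pragma: it assumes a sort that supports the `-s` (stable) option, as GNU coreutils does, and uses made-up sample data. The key `-k1.1,1.3` compares only the first three characters, much like the "substr" comparison above.

```shell
#!/bin/sh
# Made-up sample: "abcz" and "abca" compare equal on their first 3 characters.
input='abcz
abd1
abca'

# Stable: lines with equal keys keep their input order.
stable=$(printf '%s\n' "$input" | LC_ALL=C sort -s -k1.1,1.3)

# Not stable: ties are broken by a last-resort comparison of the whole line.
unstable=$(printf '%s\n' "$input" | LC_ALL=C sort -k1.1,1.3)

printf 'stable:\n%s\n' "$stable"
printf 'unstable:\n%s\n' "$unstable"
```

With `-s`, "abcz" stays ahead of "abca" because that was their input order; without it, the tie is resolved by the remaining characters.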
The best algorithm depends on many things. On average, mergesort does fewer comparisons than quicksort, so it may be better when
complicated comparison routines are used. Mergesort also takes advantage of pre-existing order, so it would be favored for using "sort()" to
merge several sorted arrays. On the other hand, quicksort is often faster for small arrays, and on arrays of a few distinct values,
repeated many times. You can force the choice of algorithm with this pragma, but this feels heavy-handed, so the subpragmas beginning with
a "_" may not persist beyond Perl 5.8. The default algorithm is mergesort, which will be stable even if you do not explicitly demand it.
But the stability of the default sort is a side-effect that could change in later versions. If stability is important, be sure to say so
with a
use sort 'stable';
The "no sort" pragma doesn't forbid what follows, it just leaves the choice open. Thus, after
no sort qw(_mergesort stable);
a mergesort, which happens to be stable, will be employed anyway. Note that
no sort "_quicksort";
no sort "_mergesort";
have exactly the same effect, leaving the choice of sort algorithm open.
CAVEATS
This pragma is not lexically scoped: its effect is global to the program it appears in. That means the following will probably not do what
you expect, because both pragmas take effect at compile time, before either "sort()" happens.
{ use sort "_quicksort";
  print sort::current . "\n";
  @a = sort @b;
}
{ use sort "stable";
  print sort::current . "\n";
  @c = sort @d;
}
# prints:
# quicksort stable
# quicksort stable
You can achieve the effect you probably wanted by using "eval()" to defer the pragmas until run time. Use the quoted argument form of
"eval()", not the BLOCK form, as in
eval { use sort "_quicksort" }; # WRONG
or the effect will still be at compile time. Reset to default options before selecting other subpragmas (in case somebody carelessly left
them on) and after sorting, as a courtesy to others.
{ eval 'use sort qw(defaults _quicksort)'; # force quicksort
  eval 'no sort "stable"'; # stability not wanted
  print sort::current . "\n";
  @a = sort @b;
  eval 'use sort "defaults"'; # clean up, for others
}
{ eval 'use sort qw(defaults stable)'; # force stability
  print sort::current . "\n";
  @c = sort @d;
  eval 'use sort "defaults"'; # clean up, for others
}
# prints:
# quicksort
# stable
Scoping for this pragma may change in future versions.
perl v5.8.0 2002-06-01 sort(3pm)