Hi, I would like to sort a CSV file. It has 50 fields and approximately 1,400,000 lines. I want to sort by three columns as follows: sort on column 5; if the entries are equal, sort on column 3; if those are also equal, sort on column 6.
It seems that this is possible for 5 then 3 i.e.
but it doesn't seem to work for 5 then 3 then 6 i.e.
(that is assuming this is doing what I think).
I know how to write the subroutine for this in Perl; it works for smaller files, but when I try to run it on my large file it crashes due to memory issues.
My question:
Would I need to write a subroutine to do this?
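A minimal sketch of the three-key sort using the standard `sort` utility, assuming no field contains a quoted or embedded comma. The six-column sample data and the variable names are made up for illustration; the real file has 50 fields. Each `-kN,N` limits a key to a single column, whereas a bare `-k5` means "from column 5 to the end of the line", which may be why a third key appears to have no effect.

```shell
#!/bin/sh
# Hypothetical six-column sample standing in for the 50-field file.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
r1,x,b,x,2,z
r2,x,a,x,2,z
r3,x,a,x,1,z
r4,x,a,x,2,y
EOF

# -t, sets the field delimiter.
# Keys are compared in the order given: column 5, then column 3, then column 6.
# LC_ALL=C forces plain byte-order comparison so the result is reproducible.
sorted=$(LC_ALL=C sort -t, -k5,5 -k3,3 -k6,6 "$tmp")
printf '%s\n' "$sorted"

rm -f "$tmp"
```

If a column holds numbers, append `n` to its key (for example `-k5,5n`). GNU sort merges through temporary files rather than holding the whole input in memory, so a file of this size should be manageable, though that is worth checking on your own system.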
I have a large csv file that looks like this:
14 ,M0081,+000000001,200302,+00000100500,
14 ,M0081,+000000004,200301,+00000100500,
14 ,M0081,+000000005,200305,+00000100500,
14 ,M0081,+000000000,200205,+00000100500,
14 ,M0081,+000000008,200204,+00000100500,
... (2 Replies)
HI ALL,
i have a problem when i do a sort sum with many fields.
Is there a limit for fields?
Do you know a solution?
thanks in advance.
the shell is:
# SORT1
SORT1_rcode=777
if ; then
echo "USE $DARSEQ/OTPU.FTPEPREC RECORD F,1000 " > $DARPARSRT/TPEKL508.SORT1_$$.srt
... (6 Replies)
Hi I have following fixed width file and I have to sort on 2 fields
ABC 111222333002555 77788
ABC 111222333004555 77788
ABC 111222333001555 77788
ABC 111222333003555 77788
ABC 111222333005555 77788
one is from field1 to field 3 "ABC" and another is on 14 to 16 "002" (based on first... (1 Reply)
Hi experts,
I am trying sort command with my data but still not getting the expected results.
For example, I have 5 fields data here
c,18:12:45,c,c,c
d,12:34:34,d,d,d
a,13:50:10,a,a,a
b,13:50:50,b,b,b
a,13:50:50,a,a,a
b,14:10:01,b,b,b
c,10:12:45,c,c,c
I want to get
... (3 Replies)
Hey,
I have a file i want to sort. It contains these kind of lines:
FirstName LastName buyID buyType
Eg:
John Doe 22 Car
Jane Simpson 4 Headset
John Doe 11 Telephone
Now if I use the sort command on it
cat purchases.txt | sort -k1,1 -k2,2
it would also sort the third and 4th field:... (4 Replies)
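A possible explanation, sketched under the assumption that GNU or BSD `sort` is in use: when all the given keys compare equal, `sort` falls back to comparing the whole line, so the remaining fields end up ordered as well. The `-s` (stable) option, which is not part of POSIX, turns off that last-resort comparison so tied lines keep their input order. The temporary file and variable names below are illustrative.

```shell
#!/bin/sh
# Sample lines taken from the post above.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
John Doe 22 Car
Jane Simpson 4 Headset
John Doe 11 Telephone
EOF

# -s keeps lines with equal keys in their original relative order,
# so the two "John Doe" lines are not reordered by buyID or buyType.
stable=$(LC_ALL=C sort -s -k1,1 -k2,2 "$tmp")
printf '%s\n' "$stable"

rm -f "$tmp"
```

Without `-s`, the line with buyID 11 would come before the line with buyID 22, because the tie is broken by comparing the full lines.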
I have a file with contents below
123,502
123,506
123,702
234,101
235,104
456,104
456,100
i want to sort such that i get a unique value in column A, and for those with multiple value in A, i want the lowest value in B.
output should be
123,502
234,101
235,104
456,100 (3 Replies)
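One way this could be done, as a sketch rather than the thread's accepted answer: sort by column A and then numerically by column B, and let `awk` keep only the first line seen for each value of A. The temporary file and variable names are illustrative.

```shell
#!/bin/sh
# Input taken from the post above.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
123,502
123,506
123,702
234,101
235,104
456,104
456,100
EOF

# Sort on column 1, then numerically on column 2, so the lowest B comes
# first within each A. awk then prints a line only the first time its
# column-1 value is seen.
lowest=$(LC_ALL=C sort -t, -k1,1 -k2,2n "$tmp" | awk -F, '!seen[$1]++')
printf '%s\n' "$lowest"

rm -f "$tmp"
```

The same pattern should carry over to a file with more columns by changing the second key to the column that holds the value to minimise.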
I have another file with three columns A,B,C as below
123,1,502
123,2,506
123,3,702
234,4,101
235,5,104
456,6,104
456,7,100
i want to sort such that i get a unique value in column A, and for those with multiple value in A, i want the lowest value in C.
output should be
Code:... (3 Replies)
hi,
I have a file, stg_ff.txt,
which has 10 fields and contains millions of records.
I need to cat the first 10 rows of the file after sorting on its first two fields.
Can anybody help me with this?
regards
Angel (4 Replies)
Please advise on this.
Input file
100,vvvt
201,unb
100,sos
301,abc
99,gang
desired output
99,gang
100,vvvt
100,sos
201,unb
301,abc
That is, if the first fields are the same (here 100), do not reorder those lines. (4 Replies)
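A sketch of one approach, assuming a `sort` that supports the non-POSIX `-s` (stable) option, as GNU and BSD `sort` do: sort numerically on the first field only and let `-s` preserve the input order of lines whose first field is equal. The temporary file and variable names are illustrative.

```shell
#!/bin/sh
# Input taken from the post above.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
100,vvvt
201,unb
100,sos
301,abc
99,gang
EOF

# -k1,1n compares only the first field, numerically (so 99 sorts before 100).
# -s leaves lines with the same first field in their original order.
kept=$(LC_ALL=C sort -s -t, -k1,1n "$tmp")
printf '%s\n' "$kept"

rm -f "$tmp"
```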
Hi
I have a file as below
<field1> <field2> <field3> ... <field_num1> <field_num2>
I am trying to sort on the difference between <field_num1> and <field_num2> in descending order and print all fields.
I tried this and it doesn't sort on the difference. I would appreciate your help.
cat... (9 Replies)
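Since `sort` cannot compute a key, one common pattern is decorate, sort, undecorate: prepend the difference with `awk`, sort on it, then strip it again. The sketch below assumes whitespace-separated fields, that the two numeric fields are the last two on each line, and that the difference wanted is `<field_num1>` minus `<field_num2>`. The sample data and variable names are made up.

```shell
#!/bin/sh
# Hypothetical sample: two text fields followed by the two numeric fields.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
a x 10 4
b y 7 6
c z 20 5
EOF

# 1. awk prepends (second-to-last field - last field) and a tab to each line.
# 2. sort orders on that first field, numerically, in reverse.
# 3. cut drops the helper column (tab is cut's default delimiter).
by_diff=$(awk '{ printf "%s\t%s\n", $(NF-1) - $NF, $0 }' "$tmp" |
          LC_ALL=C sort -k1,1nr |
          cut -f2-)
printf '%s\n' "$by_diff"

rm -f "$tmp"
```

Because the original lines pass through unchanged after the tab, every field is printed exactly as it appeared in the input.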
Discussion started by: newstart
sort
sort(3pm) Perl Programmers Reference Guide sort(3pm)
NAME
sort - perl pragma to control sort() behaviour
SYNOPSIS
use sort 'stable'; # guarantee stability
use sort '_quicksort'; # use a quicksort algorithm
use sort '_mergesort'; # use a mergesort algorithm
use sort 'defaults'; # revert to default behavior
no sort 'stable'; # stability not important
use sort '_qsort'; # alias for quicksort
my $current;
BEGIN {
$current = sort::current(); # identify prevailing algorithm
}
DESCRIPTION
With the "sort" pragma you can control the behaviour of the builtin "sort()" function.
In Perl versions 5.6 and earlier the quicksort algorithm was used to implement "sort()", but in Perl 5.8 a mergesort algorithm was also
made available, mainly to guarantee worst case O(N log N) behaviour: the worst case of quicksort is O(N**2). In Perl 5.8 and later,
quicksort defends against quadratic behaviour by shuffling large arrays before sorting.
A stable sort means that for records that compare equal, the original input ordering is preserved. Mergesort is stable, quicksort is not.
Stability will matter only if elements that compare equal can be distinguished in some other way. That means that simple numerical and
lexical sorts do not profit from stability, since equal elements are indistinguishable. However, with a comparison such as
{ substr($a, 0, 3) cmp substr($b, 0, 3) }
stability might matter because elements that compare equal on the first 3 characters may be distinguished based on subsequent characters.
In Perl 5.8 and later, quicksort can be stabilized, but doing so will add overhead, so it should only be done if it matters.
The best algorithm depends on many things. On average, mergesort does fewer comparisons than quicksort, so it may be better when
complicated comparison routines are used. Mergesort also takes advantage of pre-existing order, so it would be favored for using "sort()"
to merge several sorted arrays. On the other hand, quicksort is often faster for small arrays, and on arrays of a few distinct values,
repeated many times. You can force the choice of algorithm with this pragma, but this feels heavy-handed, so the subpragmas beginning with
a "_" may not persist beyond Perl 5.8. The default algorithm is mergesort, which will be stable even if you do not explicitly demand it.
But the stability of the default sort is a side-effect that could change in later versions. If stability is important, be sure to say so
with a
use sort 'stable';
The "no sort" pragma doesn't forbid what follows, it just leaves the choice open. Thus, after
no sort qw(_mergesort stable);
a mergesort, which happens to be stable, will be employed anyway. Note that
no sort "_quicksort";
no sort "_mergesort";
have exactly the same effect, leaving the choice of sort algorithm open.
CAVEATS
As of Perl 5.10, this pragma is lexically scoped and takes effect at compile time. In earlier versions its effect was global and took
effect at run-time; the documentation suggested using "eval()" to change the behaviour:
{ eval 'use sort qw(defaults _quicksort)'; # force quicksort
  eval 'no sort "stable"'; # stability not wanted
  print sort::current . "\n";
  @a = sort @b;
  eval 'use sort "defaults"'; # clean up, for others
}
{ eval 'use sort qw(defaults stable)'; # force stability
  print sort::current . "\n";
  @c = sort @d;
  eval 'use sort "defaults"'; # clean up, for others
}
Such code no longer has the desired effect, for two reasons. Firstly, the use of "eval()" means that the sorting algorithm is not changed
until runtime, by which time it's too late to have any effect. Secondly, "sort::current" is also called at run-time, when in fact the
compile-time value of "sort::current" is the one that matters.
So now this code would be written:
{ use sort qw(defaults _quicksort); # force quicksort
  no sort "stable"; # stability not wanted
  my $current;
  BEGIN { $current = sort::current; }
  print "$current\n";
  @a = sort @b;
  # Pragmas go out of scope at the end of the block
}
{ use sort qw(defaults stable); # force stability
  my $current;
  BEGIN { $current = sort::current; }
  print "$current\n";
  @c = sort @d;
}
perl v5.18.2 2013-11-04 sort(3pm)