Sort with conditions


 
# 8  
Old 02-25-2016
Don,

Sorry for the confusion.

Please, I need it sorted by columns 1 and 7 after the duplicate records have been removed.

Thanks a lot
# 9  
Old 02-25-2016
This still doesn't really make sense. You say you want the output sorted by columns 1 and 7, but once what you consider to be duplicates have been removed, each column 1 value appears on only one line, so a secondary sort key on column 7 has no effect on the final output. And the output you said you wanted in post #6 is sorted in decreasing order on column 1, not increasing order.

To get the sample output you showed us in post #6, try:
Code:
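#!/bin/ksh
# Sort numerically in decreasing order on column 1, then on column 7.
sort -nr -k1,1 -k7,7 file |
awk '
# Skip repeats of the last key printed and any out-of-range record.
$1 == d || $3 > 1 || $4 > 6.26 {
	next
}
# First acceptable record for this key: remember the key and print the line.
d = $1'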

To get the output you said you wanted (sorted in increasing order on column 1), try:
Code:
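#!/bin/ksh
# Increasing order on column 1; within each key, largest column 7 first.
sort -n -k1,1 -k7,7r file |
awk '
# Skip repeats of the last key printed and any out-of-range record.
$1 == d || $3 > 1 || $4 > 6.26 {
	next
}
# First acceptable record for this key: remember the key and print the line.
d = $1'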
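
A small sketch of the sort-then-filter approach on made-up data (the `sample` records and the `result` variable are hypothetical, not from this thread). It spells out `n` on both keys, because a key that carries its own modifier, such as `-k7,7r`, does not inherit the global `-n` and is compared as text. That makes no difference for fixed-width, zero-padded values like the ones in column 7 here, but it could for other data. It also prints from an action instead of using `d = $1` as a pattern, since that pattern is false (and prints nothing) when column 1 is 0.

```shell
#!/bin/sh
# Hypothetical records: column 1 is the key, columns 3 and 4 are the
# range-checked values, column 7 decides which duplicate wins.
sample='20 1 1 0.50 x y 300
10 1 1 0.40 x y 100
10 2 1 0.90 x y 200
20 2 9 0.10 x y 400'

# Numeric ascending on column 1, numeric descending on column 7,
# then keep the first acceptable record seen for each key.
result=$(printf '%s\n' "$sample" |
sort -k1,1n -k7,7nr |
awk '
$1 == d || $3 > 1 || $4 > 6.26 { next }   # repeat key or out of range
{ d = $1; print }                          # first good record for this key
')
printf '%s\n' "$result"
```

Here key 10 keeps its record with 200 in column 7, and key 20 keeps the record with 300 because the 400 record has column 3 out of range.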

And, if you really don't need the output sorted, just using awk and skipping the sort will run faster:
Code:
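awk '
# Skip out-of-range records and records that do not beat the saved column 7.
$3 > 1 || $4 > 6.26 || ($1 in f1 && $7 <= f7[$1]) {
	next
}
# Best record so far for this key: save the line and its column 7 value.
{	f1[$1] = $0
	f7[$1] = $7
}
# Print the saved record for each key (in no particular order).
END {	for(k in f1)
		print f1[k]
}' file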
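
Note that `for (k in f1)` visits the keys in an unspecified order, so the awk-only output can come out in any order. A sketch on hypothetical data (`sample` and `best` are illustrative names, not from this thread) that pipes the result through `sort` only to make the order repeatable:

```shell
#!/bin/sh
# Hypothetical records: key in column 1, checks on columns 3 and 4,
# largest column 7 wins among the acceptable records for a key.
sample='20 1 1 0.50 x y 300
10 1 1 0.40 x y 100
10 2 1 0.90 x y 200
20 2 9 0.10 x y 400'

best=$(printf '%s\n' "$sample" |
awk '
# Drop out-of-range records and records that do not beat the saved one.
$3 > 1 || $4 > 6.26 || ($1 in f1 && $7 <= f7[$1]) { next }
{ f1[$1] = $0; f7[$1] = $7 }
END { for (k in f1) print f1[k] }
' | sort -k1,1n)   # sort only so the output order is deterministic
printf '%s\n' "$best"
```

The input is read once and never sorted; the trailing `sort` only touches the much smaller result.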

This User Gave Thanks to Don Cragun For This Post:
# 10  
Old 02-26-2016
Don,

Thanks a lot for your help...

Is it possible to apply the conditions on columns 3 and 4 only to duplicate records, and not to single records?
Code:
3653952085          2          1       25.0       2544  36539520852 051085342
3653952085          1          1       0.42       2544  36539520851 051082559
3653952073          2         14       1.58       2544  36539520732 051090142
3653952073          1          1       1.73       2544  36539520731 051083656
3653952061          1          1       0.58       2544  36539520611 051083819
3653952061          2          1       0.94       2544  36539520612 051090257
3653952049          2          7       0.25       2544  36539520491 051091125
3653952049          3          1       7.35       2544  36539520491 051091132
3653952049          1          1       0.22       2544  36539520491 051091118
3653952050          1          1       8.20       2544  36539520491 051091122
3653952051          1          5       0.22       2544  36539520491 051091123

Desired output:

Code:
3653952085          1          1       0.42       2544  36539520851 051082559
3653952073          1          1       1.73       2544  36539520731 051083656
3653952061          2          1       0.94       2544  36539520612 051090257
3653952049          1          1       0.22       2544  36539520491 051091118
3653952050          1          1       8.20       2544  36539520491 051091122
3653952051          1          5       0.22       2544  36539520491 051091123

using awk.

Thanks again
# 11  
Old 02-26-2016
Hello jiam912,
Of course it is possible to meet your new requirements with awk. Why don't you try modifying the awk script I suggested in post #9 in this thread and see if you can get it to do what you want? If you aren't able to make it work, show us what you have tried, and we'll help you fix it.
# 12  
Old 02-27-2016
Hi Don,

Here is what I did to get the desired output. It is the long way around, but it works.

If there is a shorter and faster way, please help me.

Using this code I got all the duplicated records from my complete list:

Code:
awk '{if (x[$1]) { x_count[$1]++; print $0; if (x_count[$1] == 1) { print x[$1] } } x[$1] = $0}' input > file2

I used your code (to remove the duplicates with conditions):
Code:
awk '
$3 > 1 || $4 >= 6.25 || ($1 in f1 && $7 <= f7[$1]) {
next
}
{    f1[$1] = $0
    f7[$1] = $7
}
END { for (k in f1)
    print f1[k]    
    }' file2  | sort -nk6,6 > file3

Get the records rejected (by the conditions):
Code:
grep -vFf file3 file2 > file4

Create the error-free output:
Code:
grep -vFf file4 input > output

Thanks again
# 13  
Old 02-27-2016
Hi jiam912,
Interesting solution.

Still no sort, but this also seems to work:
Code:
#!/bin/ksh
awk '
$3 <= 1 && $4 <= 6.26 && (!($1 in f1) || $7 > f7[$1]) {
	f1[$1] = $0
	f7[$1] = $7
}
($3 > 1 || $4 > 6.26) && !($1 in f1) && (!($1 in d1) || $7 > d7[$1]) {
	d1[$1] = $0
	d7[$1] = $7
}
END {	for(k in f1)
		print f1[k]
	for(k in d1)
		if(!(k in f1))
			print d1[k]
}' input

The f*[] arrays gather the records with the largest column 7 value for each column 1 value ignoring records that have values in columns 3 or 4 that are out of range (just like in my previous script, but with the logic reversed).

The d*[] arrays gather an abbreviated list of the records with the largest column 7 value for each column 1 value, considering only records whose column 3 or column 4 value is out of range. (I say abbreviated because once the script has seen an in-range record for a given column 1 value, it stops gathering out-of-range records for that value.)

At the end it prints the accumulated f1[] array entries and then prints the accumulated d1[] array entries that didn't have an overriding entry in f1[].

It still just needs one process to do the work and still only needs to read the input once.
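
As a check, the script can be run on the sample from post #10. This is a sketch, assuming the first rule compares against `f7[$1]` (the array the script fills); the variable names `input` and `actual` are only for illustration, and the final `sort` is there because `for (k in ...)` returns keys in no particular order.

```shell
#!/bin/sh
# Sample input copied from post #10.
input='3653952085          2          1       25.0       2544  36539520852 051085342
3653952085          1          1       0.42       2544  36539520851 051082559
3653952073          2         14       1.58       2544  36539520732 051090142
3653952073          1          1       1.73       2544  36539520731 051083656
3653952061          1          1       0.58       2544  36539520611 051083819
3653952061          2          1       0.94       2544  36539520612 051090257
3653952049          2          7       0.25       2544  36539520491 051091125
3653952049          3          1       7.35       2544  36539520491 051091132
3653952049          1          1       0.22       2544  36539520491 051091118
3653952050          1          1       8.20       2544  36539520491 051091122
3653952051          1          5       0.22       2544  36539520491 051091123'

actual=$(printf '%s\n' "$input" |
awk '
# In-range record: keep the one with the largest column 7 for each key.
$3 <= 1 && $4 <= 6.26 && (!($1 in f1) || $7 > f7[$1]) {
	f1[$1] = $0
	f7[$1] = $7
}
# Out-of-range record for a key with no in-range record so far.
($3 > 1 || $4 > 6.26) && !($1 in f1) && (!($1 in d1) || $7 > d7[$1]) {
	d1[$1] = $0
	d7[$1] = $7
}
END {	for(k in f1)
		print f1[k]
	for(k in d1)
		if(!(k in f1))
			print d1[k]
}' | sort -k1,1n)   # sort only to get a repeatable order
printf '%s\n' "$actual"
```

Apart from the order, this yields the six records listed as the desired output in post #10.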

Hope this helps.

Last edited by Don Cragun; 02-27-2016 at 10:10 AM.. Reason: Fix auto spell-check fix error.
# 14  
Old 02-27-2016
Don,

Thanks a lot, it works great. Thanks also for the explanation.