Sorting unique by column


 
# 1  
Old 09-08-2014
I am trying to sort a 4-column, tab-delimited table and make it unique by the 1st column, merging the values of the other columns, e.g.:
Code:
chr10:112174128 rs2255141       2E-10   Cholesterol, total
chr10:112174128 rs2255141       7E-16   LDL
chr10:17218291  rs10904908      3E-11   HDL Cholesterol
chr10:17218291  rs970548        8E-9    TG
chr1:109275684  rs629301        2E-170  Cholesterol, totat

I want output like this:

Code:
chr10:112174128 rs2255141       2E-10/7E-16   Cholesterol, total; LDL
chr10:17218291  rs10904908/rs970548      3E-11/8E-9   HDL Cholesterol; TG
chr1:109275684  rs629301        2E-170  Cholesterol, total

Can you help?
# 2  
Old 09-08-2014
Any attempts from your side?
# 3  
Old 09-08-2014
Yes,

Code:
awk 'BEGIN{printf "\t"}{A[$1] = A[$1] ?  A[$1] OFS $4 : $4}FNR==1{printf FILENAME OFS}END{printf RS; for(i in A)print i, A[i]}' OFS='\t' myfile.txt | sort

# 4  
Old 09-08-2014
Try
Code:
awk     '       {F2[$1]=F2[$1]"/"$2
                 F3[$1]=F3[$1]"/"$3
                 F4[$1]=F4[$1]";"$4}
         END    {for (S1 in F2)
                         print  S1,
                                substr(F2[S1],2),
                                substr(F3[S1],2),
                                substr(F4[S1],2)}
        ' FS="\t" OFS="\t" file | sort
chr10:112174128    rs2255141/rs2255141    2E-10/7E-16    Cholesterol, total;LDL
chr10:17218291     rs10904908/rs970548    3E-11/8E-9     HDL Cholesterol;TG
chr1:109275684     rs629301               2E-170         Cholesterol, totat

# 5  
Old 09-08-2014
If we sort first, we can save memory, because awk then only has to hold one group at a time:

Code:
sort infile | awk '
F1 && F1!=$1 {print F1,F2,F3,F4; F2=F3=F4=x}
{ F1=$1
  F2=(F2?F2"/":x)$2
  F3=(F3?F3"/":x)$3
  F4=(F4?F4";":x)$4 }
END {print F1,F2,F3,F4}' FS='\t' OFS='\t'
chr1:109275684  rs629301                2E-170          Cholesterol, totat
chr10:112174128 rs2255141/rs2255141     2E-10/7E-16     Cholesterol, total;LDL
chr10:17218291  rs10904908/rs970548     3E-11/8E-9      HDL Cholesterol;TG
