UNIX for Dummies Questions & Answers: If you're not sure where to post a UNIX or Linux question, post it here. All UNIX and Linux newbies welcome!
Code:
Hope this is what you intended.

Thanks for the reply! The above commands take far too long, though. I am testing with a sample of 10 billion records, and with so many processes and kernel data structures involved, they would take days to complete.

I found this UNIX command useful:
Code:
cat filename | sort | uniq -c | sort -nr
What if I am using a tab-delimited text file with multiple columns? Example:
Code:
S100A16        hsa-miR-125a-3p  S100A16        0.0011959         0.768059
PBXIP1         hsa-miR-125a-3p  PBXIP1         0.0199898         0.700326
CYB5R3         hsa-miR-125a-3p  CYB5R3         0.0081174         0.748953
BEST3          hsa-miR-125a-3p  BEST3          0.00148927        0.756234
FAM101A        hsa-miR-125a-3p  FAM101A        0.0196212         0.783555
KIAA0195       hsa-miR-125a-3p  KIAA0195       0.0019755         0.747427
LLGL2          hsa-miR-125a-3p  LLGL2          0.0248212         0.876563
FBLN5          hsa-miR-125a-3p  FBLN5          0.0162988         0.776446
IFITM3         hsa-miR-125a-3p  IFITM3         0.00896808        0.478704
SSH3           hsa-miR-125a-3p  SSH3           0.0301693         0.836054
EXTERNAL_NAME  SEQ              EXTERNAL_NAME  p-value(1 vs. 2)  Ratio(1 vs. 2)
EMILIN1        hsa-miR-369-5p   EMILIN1        0.0254294         0.720094
ADD3           hsa-miR-369-5p   ADD3           0.0184075         0.742096
AIFM2          hsa-miR-369-5p   AIFM2          0.00646348        0.829228
GPT2           hsa-miR-369-5p   GPT2           0.00473291        0.706895
and I want the output to read
Code:
10 hsa-miR-125a-3p
4 hsa-miR-369-5p
1 SEQ
Initially, I made a new file containing only the column of interest using perl (column 1 here), named it filenam_list, and then ran
Code:
cat filenam_list | sort | uniq -c | sort -nr > filename_counts
rm filenam_list
Is there a more efficient way of doing this? I'm sure there has to be. I repeated the procedure on 7 files and I have to do it again.

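One way to skip the intermediate list file is to let cut pull the column straight into the pipeline. This is only a sketch: it assumes the fields are separated by real tab characters (the default delimiter for cut) and that the column of interest is the second one, as the desired output above suggests. The function name count_col2 and the _counts suffix are made up for illustration.

```shell
# count_col2 FILE...: for each tab-delimited FILE, write "count value" lines for
# column 2 to FILE_counts, most frequent first. No intermediate list file is kept.
count_col2() {
    for f in "$@"; do
        cut -f2 "$f" | sort | uniq -c | sort -nr > "${f}_counts"
    done
}

# Example call (file names are placeholders):
# count_col2 file1 file2 file3
```

Passing all seven files in one call avoids repeating the procedure by hand.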
I think I found an answer to my own question. You specify the column of interest
Quote:

If you don't have anything against (n|g)awk, simply do:
Code:
awk '{_[$2]+=1}END{for(i in _) print _[i], i}' file
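Two things to keep in mind with the one-liner above: awk visits array keys in an unspecified order in a for (i in ...) loop, so the counts do not come out sorted, and the default field splitting breaks on any whitespace rather than only on tabs. A hedged variant that addresses both could look like the sketch below. It assumes the input really is tab-delimited; the function name count_column and its optional column argument are hypothetical.

```shell
# count_column FILE [COLUMN]: print "count value" pairs for one tab-delimited
# column, most frequent first. COLUMN is 1-based and defaults to 2.
count_column() {
    awk -F '\t' -v col="${2:-2}" '
        { count[$col]++ }                            # tally each distinct value
        END { for (v in count) print count[v], v }   # key order is unspecified
    ' "$1" | sort -k1,1nr -k2                        # so sort by count, then value
}

# Example call (file name is a placeholder):
# count_column file > file_counts
```

Since awk keeps only one counter per distinct value in memory, the only sort left runs over the short summary instead of the whole file.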