Request to check: remove duplicates and write systematically


 
# 1  
Old 07-23-2012

Hi all,

I have a file with the following input. It contains 5 columns:

gene name, drug, drug ID, disease, approved

Quote:
AGF X 1 disease1 approved
AGF h 2 disease2 not approved
AGF
AGF
XYZ
XYZ
CBK
CBK
BGY
BGY
BGY
BGY
The same gene is repeated many times with different data in columns 2, 3, 4 and 5.

I want to arrange the data so that each gene appears only once in column 1 (no repeated entries), while columns 2, 3, 4 and 5 remain as they are.

So the output should look like this:


Quote:
AGF X 1 disease1 approved
AGF h 2 disease2 not approved


XYZ

CBK

BGY

BGY

Kindly let me know how to script this.
# 2  
Old 07-24-2012
Code:
awk '!arr[$0]++' inputfile | sort > outputfile

Please try that.
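
For reference, a minimal sketch of what this one-liner does, using a made-up three-line input (the sample rows and the function name `dedupe_lines` are only for illustration). `!arr[$0]++` prints a line the first time its full text is seen and skips exact repeats, so rows that share a gene name but differ in any other column are all kept.

```shell
# Print each distinct line once, keeping the first occurrence.
dedupe_lines() {
    awk '!arr[$0]++' "$@"
}

# Hypothetical input: the first line appears twice.
printf 'AGF X 1\nAGF h 2\nAGF X 1\n' | dedupe_lines
```

Only the exact repeat is dropped here; both AGF rows with different data remain.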
# 3  
Old 07-24-2012
Request to check

Hi all

Sorry, the output should contain each entry in the first column only once, like this:
Quote:
AGF X 1 disease1 approved
h 2 disease2 not approved


XYZ

CBK

BGY

BGY
---------- Post updated at 10:32 PM ---------- Previous update was at 10:22 PM ----------

Hi Jim,

The output still contains repeated entries; it has just been sorted alphabetically. I used this command:

Quote:
bash-3.2$ awk '!arr[$0]++' TTDtargets2.txt | sort > TTDtargets3.txt
bash-3.2$
It shows:
Quote:
1,3-Beta-Glucan synthase Anidulafungin DAP000546 Fungal infections Approved
1,3-Beta-Glucan synthase Caspofungin DAP000547 Fungal infections Approved
1,3-Beta-Glucan synthase Cilofungin DCL000331 Candida infections Discontinued
1,3-Beta-Glucan synthase Eraxis/Vfend DCL000522 Beta-D Glucan Synthase Inhibitor, Cyp P450 Mediated Alpha-lanosterol Demethylation Phase III
1,3-Beta-Glucan synthase Micafungin DAP000548 Fungal infections Approved
16S rRNA
I want the output to be:

Quote:
1,3-Beta-Glucan synthase Anidulafungin DAP000546 Fungal infections Approved
Caspofungin DAP000547 Fungal infections Approved
Cilofungin DCL000331 Candida infections Discontinued
Eraxis/Vfend DCL000522 Beta-D Glucan Synthase Inhibitor, Cyp P450 Mediated Alpha-lanosterol Demethylation Phase III
Micafungin DAP000548 Fungal infections Approved
16S rRNA
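
One possible approach, sketched under assumptions not stated in the thread: the file is tab-separated (gene names such as "1,3-Beta-Glucan synthase" contain spaces), and rows for the same gene are already adjacent, for example after `sort`. The function name `blank_repeats` and the third sample row are hypothetical.

```shell
# Blank column 1 whenever it repeats the value from the row above.
# Assumes tab-separated input with rows for the same gene grouped together.
blank_repeats() {
    awk 'BEGIN { FS = OFS = "\t" }
         {
             key = $1                      # remember the gene name before editing
             if (NR > 1 && key == prev)    # same gene as the previous row
                 $1 = ""                   # leave the first column empty
             prev = key
             print
         }' "$@"
}

# Hypothetical sample: two AGF rows followed by one XYZ row.
printf 'AGF\tX\t1\tdisease1\tapproved\nAGF\th\t2\tdisease2\tnot approved\nXYZ\tq\t3\tdisease3\tapproved\n' | blank_repeats
```

In the output, the second AGF row starts with an empty first field (a leading tab), while the other columns are unchanged. On the real file this could be tried as `sort TTDtargets2.txt | blank_repeats > TTDtargets3.txt`, provided the file really is tab-separated.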
