Full Discussion: Find common entries
Post 302727777 by Don Cragun on Tuesday 6th of November 2012 11:08:21 PM
11-07-2012

Quote:
Dear Don

Thanks for your help. I checked this, but there are several entries common to the first file and the second file.

In fact, somebody even gave me code that found matching entries between the two files, but it does not work on my system: the output is unchanged here, although it works on his system. A bit strange!

Below are the code and the output he provided, which I cannot reproduce on my system:

Code:
$ awk 'NR==FNR{X[$0]=$0;next}{s=$1;$1="";for(i in X){if($0 ~ i){gsub(i,i" (matched)",$0)}};$0=s""$0}1' file1 file2
FHIT Adenosine (matched) Monotungstate Not Available,T2D Ado-P-Ch2-P-Ps-Ado Not Available,
CHRM1 Trospium (matched) Sanctura T2D Oxyphenonium (matched) Antrenyl T2D
PDE3B 5r-6-4-2-3-Iodobenzyl-3-Oxocyclohex-1-En-1-YlAminoPhenyl-5-Methyl-4,5-Dihydropyridazin-32h-One Not Available,T1D Hg9a-9, Nonanoyl-N-Hydroxyethylglucamide Not Available,
HSP90AA19-Butyl-8-2,5-Dimethoxy-Benzyl-9h-Purin-6-Ylamine Not Available,T2D 8-2-Chloro-3,4,5-Trimethoxy-Benzyl-2-Fluoro-9-Pent-4-Ylnyl-9h-Purin-6-Ylamine Not Available,T2D
ESR1 Chlorotrianisene (matched) Anisene,BD Conjugated Estrogens (matched) Conestoral,BD
INS M-Cresol Not Available,
FAH Acetoacetic Acid Not Available,BD 4-Hydroxy-Methyl-Phosphinoyl-3-Oxo-Butanoic Acid Not Available,
LPL Tyloxapol (matched) Alevaire,
ADAM17 3S-1-4-BUT-2-YN-1-YLOXYPHENYLSULFONYLPYRROLIDINE-3-THIOL Not Available T2D 3-4-but-2-yn-1-yloxyphenylsulfonylpropane-1-thiol Not Available T2D
GUCY1A2 Nitric Oxide (matched) INOmax,RA Isosorbide Mononitrate (matched) Conpin,
B4GALT1 6-Aminohexyl-Uridine-C1,5'-Diphosphate Not Available,
LCK 4-2-Acetylamino-2-3-Carbamoyl-2-Cyclohexylmethoxy-6,7,8,9-Tetrahydro-5h-Benzocyclohepten-5ylcarbamoyl-Ethyl-2-Phosphono-Phenyl-Phosphonic Acid Not Available,T1D 4-2-Acetylamino-2-1-3-Carbamoyl-4-Cyclohexylmethoxy-Phenyl-Ethylcarbamoyl-Ethyl-2-Phosphono-Phenoxy-Acetic Acid Not Available,T1D
GMDS Guanosine-5'-Diphosphate-Rhamnose Not Available,
LCT D-Gluconhydroximo-1,5-Lactam Not Available T2D Gluconolactone Not Available T2D
CALM1 3''-Beta-Chloroethyl-2'',4''-Dioxo-3, 5''-Spiro-Oxazolidino-4-Deacetoxy-Vinblastine (matched) Not Available T2D Prenylamine Bismethin,
RET 4-BROMO-2-FLUORO-N-4E-6-METHOXY-7-1-METHYLPIPERIDIN-4-YLMETHOXYQUINAZOLIN-41H-YLIDENEANILINE Not Available,
CYP1A2 2-PHENYL-4H-BENZOHCHROMEN-4-ONE Not Available,
PPARA Clofibrate (matched) Amotril,CD Gemfibrozil (matched) Bolutol,
TGFBR1 4-3-Pyridin-2-Yl-1h-Pyrazol-4-YlQuinoline Not Available,T2D Naphthyridine Inhibitor Not Available,T2D
PPARD 11E-OCTADEC-11-ENOIC ACID Not Available T2D 2S-2-3-2-fluoro-4-trifluoromethylphenylcarbonylaminomethyl-4-methoxybenzylbutanoic acid Not Available T2D
CSNK1G3 2Z-4-AMINO-2-4-METHOXYPHENYLIMINO-2,3-DIHYDRO-1,3-THIAZOL-5-YL4-METHOXYPHENYLMETHANONE Not Available T2D 4-AMINO-2-3-CHLOROANILINO-1,3-THIAZOL-5-YL4-FLUOROPHENYLMETHANONE Not Available,
NR3C1 Flunisolide (matched) Aerobid T2D Diflorasone (matched) Florone T2D
CTSD 1h-Benoximidazole-2-Carboxylic Acid Not Available T2D N-Aminoethylmorpholine Not Available T2D
TLL2 Carbobenzoxy-Pro-Lys-Phe-YPo2-Ala-Pro-Ome Not Available,
TYR Monobenzone (matched) AgeRite Alba,
HSD11B1 3,3-dimethylpiperidin-1-yl6-3-fluoro-4-methylphenylpyridin-2-ylmethanone Not Available,RA 5S-2-1S-1-4-fluorophenylethylamino-5-1-hydroxy-1-methylethyl-5-methyl-1,3-thiazol-45H-one Not Available,RA
C5 Eculizumab (matched) Soliris,
FGF1 Sucrose Octasulfate Not Available T2D Naphthalene Trisulfonate Not Available T2D
SORD Cp-166572, 2-Hydroxymethyl-4-4-N,N-Dimethylaminosulfonyl-1-Piperazino-Pyrimidine Not Available,
EGFR Gefitinib (matched) Iressa,T2D Panitumumab (matched) Vectibix,T2D
EPHB4 N-5-chloro-1,3-benzodioxol-4-yl-6-methoxy-7-3-piperidin-1-ylpropoxyquinazolin-4-amine Not Available T2D N'-5-CHLORO-1,3-BENZODIOXOL-4-YL-N-3,4,5- TRIMETHOXYPHENYLPYRIMIDINE-2,4-DIAMINE Not Available T2D
TPR N-1s-4-Bis2-ChloroethylAmino-1-Methylbutyl-N-6-Chloro-2-Methoxy-9-AcridinylAmine Not Available T2D Trypanothione Not Available,


CCL5 Heparin (matched) Disaccharide I-S Not Available,T1D Heparin (matched) Disaccharide Iii-S Not Available,

I tried it on different machines here and it still does not work.

I also found that Adenosine Monotungstate is not present in the first file, yet it is shown as matched because Adenosine is present. That is wrong in my case, as I have to match whole words in the first file with whole words in the columns of the second file.

Let me know if you come across any solution.
This is all very interesting, but it has absolutely nothing to do with the data that you uploaded in the file named "first file.txt" in message #9 in this thread. If you look at "first file.txt", you will find that not one of Adenosine, Trospium, Estrogens, Tyloxapol, Oxide, Mononitrate, Vinblastine, Clofibrate, Gemfibrozil, Flunisolide, Diflorasone, Monobenzone, Eculizumab, Gefitinib, Panitumumab, and Heparin is in that file (and every one of them is part of a field that contains "(matched)" in the text from your last message quoted above).
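A check like this can be sketched in a few lines of shell. The file contents below are a hypothetical stand-in (the real "first file.txt" is not reproduced in this message), only a subset of the names is tested, and `grep -w` is assumed to be available (it is in GNU and BSD grep):

```shell
# Hypothetical stand-in for "first file.txt"; replace with the real file.
tmp=$(mktemp -d)
printf '%s\n' 'Metformin' 'Insulin Glargine' > "$tmp/first file.txt"

# A few of the names that carry "(matched)" in the quoted output.
missing=""
for name in Adenosine Trospium Tyloxapol Heparin; do
    # -F fixed string, -w whole word, -i ignore case, -q no output
    if ! grep -Fwiq -- "$name" "$tmp/first file.txt"; then
        missing="$missing $name"
    fi
done
echo "not found:$missing"
```

Any name that is reported as not found cannot have been matched against that file.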

Obviously, the file1 and file2 used in this test are not the "first file.txt" and "Secondfile.txt" that you told me to use. Please try my code with the actual data that was used for this test. If you don't have the data files that produced the output you're showing in this message, don't be surprised when my script produces different results.
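As for the whole-word requirement in the quoted message: the quoted one-liner tests `$0 ~ i`, which is a regular-expression substring match, so "Adenosine" also hits "Adenosine Monotungstate". A minimal sketch of an exact comparison follows. It is not the script from earlier in this thread; it assumes the second file is tab-separated with the key in column 1 and that a match means a whole field equals a whole line of the first file, ignoring case. The sample data is hypothetical:

```shell
tmp=$(mktemp -d)
# Hypothetical sample data; tab-separated columns are an assumption.
printf '%s\n' 'Adenosine' 'Trospium' > "$tmp/file1"
printf 'FHIT\tAdenosine Monotungstate\tNot Available\n' > "$tmp/file2"
printf 'CHRM1\tTrospium\tSanctura\n' >> "$tmp/file2"

result=$(awk -F '\t' -v OFS='\t' '
    NR == FNR { seen[tolower($0)]; next }   # first file: remember each whole line
    {
        for (i = 2; i <= NF; i++)           # column 1 is the key, so skip it
            if (tolower($i) in seen)        # exact whole-field comparison
                $i = $i " (matched)"
        print
    }' "$tmp/file1" "$tmp/file2")
printf '%s\n' "$result"
```

With this sample data only the Trospium field is marked; "Adenosine Monotungstate" is left alone because the field as a whole is not a line of the first file.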

I wish you luck, but I will not be able to provide any more assistance on this topic.
 
