Comparing multiple fields from 2 files using awk

# 8  
08-27-2012
Still looking for help

Hi,
I am wondering if anyone has an answer to the question I posted above.
I am trying to compare 2 files that are not the same size.
Each line of file1 has to be compared to each line of file2. I know how to do it in Perl using 2 loops: load file2 into an array, then, for each line of file1 in turn, loop through every line of file2. However, I am not able to do this in awk.
@agama, regarding "This assumes that there are the exact same number of lines in each file, and that records align between both files.": this assumption does not hold for my files.
Thanks
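
For reference, the two-loop approach described above (load file2 into an array, then check every file2 line against each line of file1) might be sketched in awk roughly as follows. This is only a sketch under assumptions: the thread does not say which fields have to match, so the comparison on fields 1 and 2, the whitespace field separator, the output format, and the function name `compare_files` are all hypothetical.

```shell
#!/bin/sh
# compare_files FILE1 FILE2
# Hypothetical helper: prints every pair of lines whose fields 1 and 2 agree.
# Which fields to compare is an assumption; adjust the condition as needed.
compare_files() {
    awk '
        # First input (file2): keep every line in an array, indexed 1..n.
        NR == FNR { line2[++n] = $0; next }

        # Second input (file1): inner loop over all stored file2 lines.
        {
            for (i = 1; i <= n; i++) {
                split(line2[i], f)      # split the stored line on FS
                if ($1 == f[1] && $2 == f[2])
                    print $0 " matches " line2[i]
            }
        }
    ' "$2" "$1"    # file2 is read first so that it can be stored
}
```

It would be called as `compare_files file1 file2`. The two files do not need the same number of lines, and their records do not need to align. The whole of file2 is held in memory, and the `NR == FNR` test misbehaves if file2 is empty, so treat this as a starting point rather than a finished solution.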
