Sponsored Content
Top Forums Shell Programming and Scripting Join fields from files with duplicate lines Post 302743321 by xan.amini on Wednesday 12th of December 2012 12:43:26 PM
Old 12-12-2012
Code Join fields from files with duplicate lines

I have two files,

file1.txt:
Code:
1 abc
2 def
2 dgh
3 ijk
4 lmn

file2.txt
Code:
1 opq
2 rst
3 uvw

My desired output is:
Code:
1 abc opq
2 def rst
2 dgh rst
3 ijk uvw

I have tried using the 'join' command (ie. 'join file1.txt file2.txt') but it returns the error that file1.txt is not sorted because of the repeated value (ie. 2).

Does anyone know how to solve this? Thanks!

Last edited by Franklin52; 12-13-2012 at 04:00 AM.. Reason: Please use code tags for data and code samples
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

join on a file with multiple lines, fields

I've looked at the join command which is able to perform what I need on two rows with a common field, however if I have more than two rows I need to join all of them. Thus I have one file with multiple rows to be joined on an index number: 1 randomtext1 2 rtext2 2 rtext3 3 rtext4 3 rtext5... (5 Replies)
Discussion started by: crimper
5 Replies

2. Shell Programming and Scripting

join lines on line break in files

i had a file where lines appear to be broken when they shouldn't eg Line 1. kerl abc sdskd sdsjkdlsd sdsdksd \ Line 2. ksdkks sdnjs djsdjsd i can do a shift join to combine the lines but i there are plenty of files with this issue Line 1. kerl abc sdskd sdsjkdlsd sdsdksd ksdkks sdnjs... (6 Replies)
Discussion started by: mad_man12
6 Replies

3. Shell Programming and Scripting

Find duplicate based on 'n' fields and mark the duplicate as 'D'

Hi, In a file, I have to mark duplicate records as 'D' and the latest record alone as 'C'. In the below file, I have to identify if duplicate records are there or not based on Man_ID, Man_DT, Ship_ID and I have to mark the record with latest Ship_DT as "C" and other as "D" (I have to create... (7 Replies)
Discussion started by: machomaddy
7 Replies

4. Shell Programming and Scripting

awk program to join 2 fields of different files

Hello Friends, I just need a small help, I need an awk program which can join 2 fields of different files which are having one common field into one file. File - 1 FileName~Size File- 2 FileName~Date I need the output file in the following way O/P- File FileName~Date~Size For... (4 Replies)
Discussion started by: abhisheksunkari
4 Replies

5. Shell Programming and Scripting

Join fields comparing 4 fields using awk

Hi All, I am looking for an awk script to do the following Join the fields together only if the first 4 fields are same. Can it be done with join function in awk?? a,b,c,d,8,,, a,b,c,d,,7,, a,b,c,d,,,9, a,b,p,e,8,,, a.b,p,e,,9,, a,b,p,z,,,,9 a,b,p,z,,8,, desired output: ... (1 Reply)
Discussion started by: aksijain
1 Replies

6. Shell Programming and Scripting

Join lines from two files based on match

I have two files. File1 >gi|11320906|gb|AF197889.1|_Buchnera_aphidicola ATGAAATTTAAGATAAAAAATAGTATTTT >gi|11320898|gb|AF197885.1|_Buchnera_aphidicola ATGAAATTTAATATAAACAATAAAA >gi|11320894|gb|AF197883.1|_Buchnera_aphidicola ATGAAATTTAATATAAACAATAAAATTTTT File2 AF197885 Uroleucon aeneum... (2 Replies)
Discussion started by: pathunkathunk
2 Replies

7. Windows & DOS: Issues & Discussions

Remove duplicate lines from text files.

So, I have text files, one "fail.txt" And one "color.txt" I now want to use a command line (DOS) to remove ANY line that is PRESENT IN BOTH from each text file. Afterwards there shall be no duplicate lines. (1 Reply)
Discussion started by: pasc
1 Replies

8. Shell Programming and Scripting

Remove duplicate lines from file based on fields

Dear community, I have to remove duplicate lines from a file contains a very big ammount of rows (milions?) based on 1st and 3rd columns The data are like this: Region 23/11/2014 09:11:36 41752 Medio 23/11/2014 03:11:38 4132 Info 23/11/2014 05:11:09 4323... (2 Replies)
Discussion started by: Lord Spectre
2 Replies

9. Shell Programming and Scripting

Join files on multiple fields

Hello all, I want to join 2 tabbed files on the first 2 fields, and filling the missing values with 0. The 3rd column in each file is constant for the entire file. file1 12658699 ST5 XX2720 0 1 0 1 53039541 ST5 XX2720 1 0 1.5 1 file2 ... (6 Replies)
Discussion started by: sheetalk
6 Replies

10. Shell Programming and Scripting

Join and merge multiple files with duplicate key and fill void columns

Join and merge multiple files with duplicate key and fill void columns Hi guys, I have many files that I want to merge: file1.csv: 1|abc 1|def 2|ghi 2|jkl 3|mno 3|pqr file2.csv: (5 Replies)
Discussion started by: yjacknewton
5 Replies
JOIN(1) 						      General Commands Manual							   JOIN(1)

NAME
join - relational database operator SYNOPSIS
join [-an] [-e s] [-o list] [-tc] file1 file2 DESCRIPTION
Join forms, on the standard output, a join of the two relations specified by the lines of file1 and file2. If file1 is `-', the standard input is used. File1 and file2 must be sorted in increasing ASCII collating sequence on the fields on which they are to be joined, normally the first in each line. There is one line in the output for each pair of lines in file1 and file2 that have identical join fields. The output line normally con- sists of the common field, then the rest of the line from file1, then the rest of the line from file2. Fields are normally separated by blank, tab or newline. In this case, multiple separators count as one, and leading separators are dis- carded. These options are recognized: -an In addition to the normal output, produce a line for each unpairable line in file n, where n is 1 or 2. -e s Replace empty output fields by string s. -o list Each output line comprises the fields specified in list, each element of which has the form n.m, where n is a file number and m is a field number. -tc Use character c as a separator (tab character). Every appearance of c in a line is significant. SEE ALSO
sort(1), comm(1), awk(1). BUGS
With default field separation, the collating sequence is that of sort -b; with -t, the sequence is that of a plain sort. The conventions of join, sort, comm, uniq, look and awk(1) are wildly incongruous. 7th Edition April 29, 1985 JOIN(1)
All times are GMT -4. The time now is 07:33 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy