Sponsored Content
Top Forums Shell Programming and Scripting Merge two files with different lengths Post 302796587 by ClaraW on Friday 19th of April 2013 09:19:03 PM
Old 04-19-2013
Merge two files with different lengths

Hi there,

I have two very long files like:

file1: two fields

Code:
1 123 
1 125
1 234
2 123
2 234
2 300
2 312
3 10
3 215
4 56
...

file2: 6 fields

Code:
1 123 0 1 0 0
1 126 2 1 0 0
2 123 0 1 0 1
2 138 1 1 1 1
2 300 0 1 2 3
2 311 2 4 6 0
3 120 3 4 1 0
3 215 1 1 2 1
3 216 0 2 1 5
3 345 8 0 1 0
3 357 0 1 1 1
3 500 2 1 0 1
4 17  6 1 0 2
4 70  0 1 0 1
...

The numbers of lines in file1 and file2 are not equal.

I want to get an output file like

file3: 6 fields, and the first two fields are exactly the same as the first two fields in file1. For example, the line with the first two field "1 123" has a match in file2: "1 123 0 1 0 0", then print the whole line in file2:"1 123 0 1 0 0" to file3. If one line in file1 does not have a match in file2, e.g. "1 125", then print "1 125 0 0 0 0" to file3.

Code:
1 123 0 1 0 0
1 125 0 0 0 0 
1 234 0 0 0 0
2 123 0 1 0 1
2 234 0 0 0 0
2 300 0 1 2 3
2 312 0 0 0 0
3 10  0 0 0 0
3 215 1 1 2 1
4 56  0 0 0 0
...

I am wondering if this can be done using awk or join or any other in linux? Since the files are very large, I really want it to be fast. Thanks a lot~~~

Note: Field 2 in both file1 and file2 has only number values, but field 1 in both files may have characters too. The two fields are sorted. And in both files, this kind of situation will not happen, no duplicates.

Code:
1 123
1 123
...

Also we do not need to consider the lines in file2 which do not have any match in file1, for example "1 126 2 1 0 0" (no match in file1), then this line should not be added to file3.

Last edited by ClaraW; 04-26-2013 at 08:38 PM..
 

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

Using grep to find strings of certain lengths?

I am trying to use grep to find strings of certain lengths that all start with the same letter. Is this possible?:confused: (4 Replies)
Discussion started by: crabtruck
4 Replies

2. Shell Programming and Scripting

Merge files of differrent size with one field common in both files using awk

hi, i am facing a problem in merging two files using awk, the problem is as stated below, file1: A|B|C|D|E|F|G|H|I|1 M|N|O|P|Q|R|S|T|U|2 AA|BB|CC|DD|EE|FF|GG|HH|II|1 .... .... .... file2 : 1|Mn|op|qr (2 Replies)
Discussion started by: shashi1982
2 Replies

3. Solaris

limit on Solaris username lengths?

Hi this question applies to Solaris 8,9,10 and opensolaris as in my environment it applies to all of these Is there a limit on the size of the username (in /etc/passwd) or indeed does there come a point where, like the 8 character limitation of passwords, the system receives the input but... (6 Replies)
Discussion started by: hcclnoodles
6 Replies

4. Shell Programming and Scripting

Read lines with different lengths in while loop

Hi there ! I need to treat files with variable line length, and process the tab-delimited words of each line. The tools I know are some basic bash scripting and sed ... I haven't got to python or perl yet. So my file looks like this obj1 0.01953 0.34576 0.04418 0.01249 obj2 0.78140... (7 Replies)
Discussion started by: jossojjos
7 Replies

5. Shell Programming and Scripting

Merging data from 2 files of different lengths?

Hi all, Sorry if someone has answered something like this already, but I have a problem. I am not brilliant with "awk" but think it should be the command to use to get what I am after. I have 2 files: job-file (several hundred lines like): 1018003,LONG MU WAN,1113S 1018004,LONG MU... (4 Replies)
Discussion started by: sgb2301
4 Replies

6. Shell Programming and Scripting

Checking in a directory how many files are present and basing on that merge all the files

Hi, My requirement is,there is a directory location like: :camp/current/ In this location there can be different flat files that are generated in a single day with same header and the data will be different, differentiated by timestamp, so i need to verify how many files are generated... (10 Replies)
Discussion started by: srikanth_sagi
10 Replies

7. Shell Programming and Scripting

Merge files and generate a resume in two files

Dear Gents, Please I need your help... I need small script :) to do the following. I have a thousand of files in a folder produced daily. I need first to merge all files called. txt (0009.txt, 0010.txt, 0011.txt) and and to output a resume of all information on 2 separate files in csv... (14 Replies)
Discussion started by: jiam912
14 Replies

8. Shell Programming and Scripting

Paste files of varying lengths

I have three files of varying lengths and different number of columns. How can I paste all three with all columns aligned? File1 ---- 123 File2 ---- 234 345 678 File3 ---- 456 789 Output should look like: 123 234 456 345 789 (6 Replies)
Discussion started by: Un1xNewb1e
6 Replies

9. UNIX for Advanced & Expert Users

UTF-8,16,32 character lengths using awk

Hi All, I am trying to obtain count of characters using awk, but "length" function returns a value of 1 for 2-byte or 3-byte characters as well unlike wc -c command. I have tried to use the below commands within awk function, but it does not seem to work { cmd="wc -c "stringtocheck ( cmd )... (6 Replies)
Discussion started by: tostay2003
6 Replies

10. Programming

Simple C program to count word lengths

So my program is not working and I keep changing it to figure out why. So I have two questions, can I do tracing similar to bash, and also what is wrong with this. The idea is simple, I want to count "word" lengths, with the loose definition of word not being a space, tab, or newline. Here is... (11 Replies)
Discussion started by: Riker1204
11 Replies
All times are GMT -4. The time now is 12:13 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy