Sponsored Content
Top Forums Shell Programming and Scripting Merge two columns from two files into one if another column matches Post 302723235 by pbluescript on Monday 29th of October 2012 05:54:37 PM
Old 10-29-2012
Thanks for the help everyone!

All the options worked except for bipinajith's.
It turns out this was the fastest option:
Code:
awk '{p=$2; getline<f} !A[$2=p $2]++' f=file2 file1

It churned through two 300+million lines files in about four and a half minutes.

---------- Post updated at 05:54 PM ---------- Previous update was at 05:51 PM ----------

Quote:
Originally Posted by RudiC
I'd really appreciate to know how far this solution
can be driven, i.e. how many lines will make the awk arrays explode...
It worked just fine. Took 8.5 minutes and required 33GB of RAM. The best solution from Scrutinizer took half the time and 9GB of RAM.
This User Gave Thanks to pbluescript For This Post:
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Joining columns from two files, if the key matches

I am trying to join/paste columns from two files for the rows with matching first field. Any help will be appreciated. Files can not be sorted and may not have all rows in both files. Thanks. File1 aaa 111 bbb 222 ccc 333 File2 aaa sss mmmm ccc kkkk llll ddd xxx yyy Want to... (1 Reply)
Discussion started by: sk_sd
1 Replies

2. Shell Programming and Scripting

merge the two files which has contain columns

Hi may i ask how to accomplish this task: I have 2 files which has multiple columns first file 1 a 2 b 3 c 4 d second file 14 a 9 .... 13 b 10.... 12 c 11... 11 d 12... I want to merge the second file to first file that will looks like this ... (2 Replies)
Discussion started by: jao_madn
2 Replies

3. Shell Programming and Scripting

Merge columns of different files

Hi, I have tab limited file 1 and tab limited file 2 The output should contain common first column vales and corresponding 2nd column values; AND also unique first column value with corresponding 2nd column value of the file that contains it and 0 for the second file. the output should... (10 Replies)
Discussion started by: polsum
10 Replies

4. Shell Programming and Scripting

How to merge multiple rows into single row if first column matches ?

Hi, Can anyone suggest quick way to get desired output? Sample input file content: A 12 9 A -0.3 2.3 B 1.0 -4 C 34 1000 C -111 900 C 99 0.09 Output required: A 12 9 -0.3 2.3 B 1.0 -4 C 34 1000 -111 900 99 0.09 Thanks (3 Replies)
Discussion started by: cbm_000
3 Replies

5. UNIX for Dummies Questions & Answers

How do I merge multiple columns into one column?

Hi all, I'm looking for a way to merge multiple columns (from one file) into a single column in an output file. The file I have looks somewhat like this: @HWI-ST212 1:N:0 AGTCCTACCGGGAGT + @@@DDDDDHHHHHII @HWI-ST212 1:N:0 CGTTTAAAAATTTCT + @;@B;DDDDH?:F;F... (4 Replies)
Discussion started by: Vnguyen
4 Replies

6. Shell Programming and Scripting

How to get difference of the same column between two files when other column matches?

File 1: 20130416,235800,10.78.25.104,BR2-loc,60.0,1624,50.0,0,50.0,0 20130416,235800,10.78.25.104,BR1-LOC,70.0,10,50.0,0,70.0,0 20130416,235800,10.78.25.104,Hub_None,60.0,15,60.0,0,50.0,0 File 2: 20130417,000200,10.78.25.104,BR2-loc,60.0,1626,50.0,0,50.0,0... (3 Replies)
Discussion started by: Lakshmikumari
3 Replies

7. Shell Programming and Scripting

Merge columns on different files

Hello, I have two files that have this format: file 1 86.82 0.00 86.82 43.61 86.84 0.00 86.84 43.61 86.86 0.00 86.86 43.61 86.88 0.00 86.88 43.61 file 2 86.82 0.22 86.84 0.22 86.86 0.22 86.88 0.22 I would like to merge these two files such that the final file looks like... (5 Replies)
Discussion started by: kayak
5 Replies

8. Shell Programming and Scripting

Seperated by columns, merge in a file, sort them on common column

Hi All, I have 4 files in below format. I took them as an example. File 1: Cut from position 1-4 then 6-7 then 8-14 then rest left and make them as columns in one new file. Inserting character H to the initial of all line like HCTOT. CTOT 456787897 Low fever CTOR 556712345 High fever... (2 Replies)
Discussion started by: Mannu2525
2 Replies

9. UNIX for Beginners Questions & Answers

Matches columns from two different files in shell script

Hi friends, i want to compare first columns from two different files ,if equal print the file2's second column else print the zero.Please help me... file1: a b c d efile2: a 1 c 20 e 30 desired output: 1 0 20 0 30 Please use CODE tags as required by forum rules! Please post in... (1 Reply)
Discussion started by: bhaskar illa
1 Replies

10. Shell Programming and Scripting

awk add all columns if column 1 name matches

Hi - I want to add all columns if column1 name matches. TOPIC1 5 1 4 TOPIC2 3 2 1 TOPIC3 7 2 5 TOPIC1 6 3 3 TOPIC2 4 1 3 TOPIC3 9 5 4 . . . . . . . . . . . . Result should look like TOPIC1 11 4 7 TOPIC2 7 3 4 (1 Reply)
Discussion started by: oraclermanpt
1 Replies
GETLINE(3)						     Linux Programmer's Manual							GETLINE(3)

NAME
getline, getdelim - delimited string input SYNOPSIS
#include <stdio.h> ssize_t getline(char **lineptr, size_t *n, FILE *stream); ssize_t getdelim(char **lineptr, size_t *n, int delim, FILE *stream); Feature Test Macro Requirements for glibc (see feature_test_macros(7)): Before glibc 2.10: getline(), getdelim(): _GNU_SOURCE Since glibc 2.10: getline(), getdelim(): _POSIX_C_SOURCE >= 200809L || _XOPEN_SOURCE >= 700 DESCRIPTION
getline() reads an entire line from stream, storing the address of the buffer containing the text into *lineptr. The buffer is null-termi- nated and includes the newline character, if one was found. If *lineptr is NULL, then getline() will allocate a buffer for storing the line, which should be freed by the user program. (In this case, the value in *n is ignored.) Alternatively, before calling getline(), *lineptr can contain a pointer to a malloc(3)-allocated buffer *n bytes in size. If the buffer is not large enough to hold the line, getline() resizes it with realloc(3), updating *lineptr and *n as necessary. In either case, on a successful call, *lineptr and *n will be updated to reflect the buffer address and allocated size respectively. getdelim() works like getline(), except a line delimiter other than newline can be specified as the delimiter argument. As with getline(), a delimiter character is not added if one was not present in the input before end of file was reached. RETURN VALUE
On success, getline() and getdelim() return the number of characters read, including the delimiter character, but not including the termi- nating null byte. This value can be used to handle embedded null bytes in the line read. Both functions return -1 on failure to read a line (including end-of-file condition). ERRORS
EINVAL Bad arguments (n or lineptr is NULL, or stream is not valid). VERSIONS
These functions are available since libc 4.6.27. CONFORMING TO
Both getline() and getdelim() were originally GNU extensions. They were standardized in POSIX.1-2008. EXAMPLE
#define _GNU_SOURCE #include <stdio.h> #include <stdlib.h> int main(void) { FILE *fp; char *line = NULL; size_t len = 0; ssize_t read; fp = fopen("/etc/motd", "r"); if (fp == NULL) exit(EXIT_FAILURE); while ((read = getline(&line, &len, fp)) != -1) { printf("Retrieved line of length %zu : ", read); printf("%s", line); } free(line); exit(EXIT_SUCCESS); } SEE ALSO
read(2), fgets(3), fopen(3), fread(3), gets(3), scanf(3), feature_test_macros(7) COLOPHON
This page is part of release 3.25 of the Linux man-pages project. A description of the project, and information about reporting bugs, can be found at http://www.kernel.org/doc/man-pages/. GNU
2010-06-12 GETLINE(3)
All times are GMT -4. The time now is 11:05 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy