Merging multiple files using lines from one file

Post 302704879 by Don Cragun on Sunday 23rd of September 2012 10:32:26 PM

Assuming that your version of awk has lots of memory and no limits on output line lengths, that your system has a LARGE value for ARG_MAX, and that your shell doesn't limit the number of arguments you can pass to an application, the following will create a file with the contents you've requested:

Code:

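#!/bin/ksh
awk -v printkey=1 '
# 1st file (list): remember each key and the order in which it appears.
FNR == NR {
        key[$1] = ++kc
        next
}
# Remaining files: append field 2 to the output line for any listed key.
$1 in key {
        if(out[key[$1]] != "")
                out[key[$1]] = out[key[$1]] FS $2
        else    out[key[$1]] = (printkey > 0) ? $1 FS $2 : $2
}
# Print the merged lines in list order.
END {
        for(i = 1; i <= kc; i++)
                print out[i]
}' list b.? b.?? b.??? b.???? > out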
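If you want a rough idea of whether the ARG_MAX assumption holds before you start, something like the following may help. It is only a sketch: it assumes the files are named b.1 through b.3000 (inferred from the patterns above, not stated anywhere), and it counts only the file name arguments, while the real limit also has to cover your environment and some per-argument overhead.

```shell
#!/bin/sh
# Rough estimate only: compares ARG_MAX with the bytes needed for the
# file name arguments. The names b.1 .. b.3000 are an assumption.
limit=$(getconf ARG_MAX 2>/dev/null || echo unknown)

count=3000      # number of b.* files (figure from the post)
i=1
need=0
while [ "$i" -le "$count" ]; do
        name="b.$i"
        # Each argument costs its length plus a terminating NUL byte.
        need=$((need + ${#name} + 1))
        i=$((i + 1))
done

echo "ARG_MAX: $limit bytes"
echo "file name arguments need about: $need bytes"
```

If the second number is anywhere near the first, plan on splitting the work into several awk runs.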

With 3000 input files of 50000 lines each, this awk program is going to take quite a while to complete. I would expect it to run into line length or memory limits, which will necessitate running the awk program multiple times on smaller sets of the b.* files, with the output from each run saved in a temp file. The paste utility can then be used to join the temp files into a single output file. (Note that in this case the 1st invocation of awk needs to have printkey=1 and all remaining invocations of awk need to have printkey=0 (or unset) so the key will appear only once in each output line.)
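The batch-and-paste approach described above could look roughly like this. Treat it as a sketch under assumptions: the batch size, the part.N temp file names, and the tiny sample data (three keys, five files) are all made up for the demonstration, and the script uses only b.? because the sample has single-digit suffixes.

```shell
#!/bin/sh
# Sketch only: the batch size, the part.N names and the sample data are
# assumptions made for this example, not part of the original post.
work=$(mktemp -d) && cd "$work" || exit 1

# Tiny stand-in data: a key list and five two-column files b.1 .. b.5.
printf '%s\n' k1 k2 k3 > list
for n in 1 2 3 4 5; do
        printf 'k1 a%s\nk2 b%s\nk3 c%s\n' "$n" "$n" "$n" > "b.$n"
done

# Same logic as the awk program in the post.
merge='
FNR == NR { key[$1] = ++kc; next }
$1 in key {
        if (out[key[$1]] != "")
                out[key[$1]] = out[key[$1]] FS $2
        else    out[key[$1]] = (printkey > 0) ? $1 FS $2 : $2
}
END { for (i = 1; i <= kc; i++) print out[i] }'

batch=2         # files per awk run (hypothetical; tune to your limits)
printkey=1      # only the first run prints the key column
part=0
parts=

run_batch() {
        part=$((part + 1))
        awk -v printkey="$printkey" "$merge" list "$@" > "part.$part"
        parts="$parts part.$part"
        printkey=0
}

files=
count=0
for f in b.?; do        # add b.?? b.??? b.???? for longer suffixes
        files="$files $f"
        count=$((count + 1))
        if [ "$count" -eq "$batch" ]; then
                run_batch $files        # unquoted on purpose: names have no spaces
                files=
                count=0
        fi
done
if [ -n "$files" ]; then
        run_batch $files
fi

# awk joined fields with a space (default FS), so make paste do the same
# instead of using its default tab.
paste -d ' ' $parts > out
cat out
```

With the sample data the first output line is "k1 a1 a2 a3 a4 a5". Keep in mind that paste opens all of its input files at once, so the number of temp files should stay well below your open-file limit.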

Note also that there will be more than 6000 bytes on each line of output, so with 3000 lines this will be more than 18 MB (assuming 1 byte of output per field and not counting the line number at the start of the line); your file size may be MUCH larger depending on the contents of your input files. On many systems you won't be able to do much of anything with this output file except cut fields out of it for further processing.
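The size estimate above works out as follows. The variable names are just for illustration; the 1-byte field width is the minimum assumed in the estimate, so change field_bytes to something closer to your real data to get a more realistic number.

```shell
#!/bin/sh
files=3000      # input files, one output field per file
lines=3000      # output lines (figure used in the post)
field_bytes=1   # assumed minimum width of each field

# Each field is followed by one separator (or the newline at end of line).
per_line=$((files * (field_bytes + 1)))
total=$((per_line * lines))

echo "bytes per line: $per_line"
echo "total bytes:    $total"
```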

Good luck!
