Shell Programming and Scripting: post 302517567 by cyz700 on Wednesday 27th of April 2011 06:50:58 AM
Join 3 or more files using matching column

Dear Forum,

The full title of this topic would be: "Join 3 or more files using a matching column, without a full list of keys in any of these columns"

I have several files, typically 3 or 4, which I need to join, something like a FULL JOIN in SQL: all combinations of matches should be printed to an output file, including lines that have no match in any other file. I have been using MySQL, where the FULL JOIN statement does not exist. Some workarounds do the job, at least for 3 files, but sometimes I get duplicated rows, or the same row repeated several times. Most importantly, MySQL is slow with big files.

Here is a single-column example; I hope I can then extend the solution to multi-column cases:
File1
col1
aaa
bbb
abb
fff

File2
col1
bbb
abb
ccc
fff

File3
aaa
ccc
dce
fff

Output
col1_file1 col1_file2 col1_file3
aaa null aaa
bbb bbb null
abb abb null
null ccc ccc
null null dce
fff fff fff

Ideally, I could easily add more files to the script!
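
To make the requirement concrete, here is a minimal sketch of the join described above, written in Python rather than shell. It rests on several assumptions: each file has been read into a list with one key per line, header lines such as `col1` have already been dropped, and keys are unique within a file. The name `full_join` is illustrative and does not come from any existing tool. Rows come out in order of first appearance, so `fff` is printed before `ccc` and `dce` rather than last as in the sample output.

```python
def full_join(columns, missing="null"):
    """Full outer join of several key lists on the key itself.

    columns: one list of keys per input file (assumed unique within a file).
    Returns one row per distinct key; each row has one cell per file,
    holding the key if that file contains it, otherwise `missing`.
    """
    key_sets = [set(col) for col in columns]  # fast membership test per file
    seen = set()
    rows = []
    for col in columns:
        for key in col:
            if key in seen:
                continue  # key already emitted from an earlier file
            seen.add(key)
            rows.append([key if key in ks else missing for ks in key_sets])
    return rows


# Sample data from the post, header lines omitted.
file1 = ["aaa", "bbb", "abb", "fff"]
file2 = ["bbb", "abb", "ccc", "fff"]
file3 = ["aaa", "ccc", "dce", "fff"]

print("col1_file1 col1_file2 col1_file3")
for row in full_join([file1, file2, file3]):
    print(" ".join(row))
```

Adding a fourth file only means passing one more list to `full_join`. Because each key is looked up in a set, the work grows roughly linearly with the total number of lines, which may help with big files.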

I would appreciate your ideas!

cyz
 
