04-27-2011
See man join. With join, you must use flat files, as it seeks in anticipation of Cartesian products. The files need to be header-free, delimited, and sorted. If this is a many-to-one or one-to-one deal (a simple merge), you could use my m1join.c tool, with all the data piped through the sort and header removal:
https://www.unix.com/shell-programmin...ity-linux.html
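For plain join, the preparation described above (strip the header, sort on the key, then join) might look roughly like the sketch below. The file names, the sample rows, and the semicolon delimiter are made up for illustration, and LC_ALL=C is set only so that sort and join agree on ordering.

```shell
#!/bin/sh
# Sketch only: hypothetical sample data, not taken from the thread.
export LC_ALL=C
tmp=$(mktemp -d)

# Two semicolon-delimited files, each with a header line and unsorted rows.
printf '%s\n' 'id;name' '2;bob' '1;alice' '3;carol' > "$tmp/names.txt"
printf '%s\n' 'id;city' '3;Oslo' '1;Rome' > "$tmp/cities.txt"

# Drop the header (tail -n +2) and sort each file on the key field.
tail -n +2 "$tmp/names.txt"  | sort -t ';' -k 1,1 > "$tmp/names.sorted"
tail -n +2 "$tmp/cities.txt" | sort -t ';' -k 1,1 > "$tmp/cities.sorted"

# Join on field 1 (the default), using ';' for input and output.
result=$(join -t ';' "$tmp/names.sorted" "$tmp/cities.sorted")
printf '%s\n' "$result"

rm -rf "$tmp"
```

Only keys present in both files appear in the output; row 2 has no match in the second file, so it is dropped.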
Another alternative is the JDBC and unixODBC drivers that treat flat text or CSV files as database tables, so you can express the join in SQL and run it through jisql (JDBC) or isql (unixODBC).
This User Gave Thanks to DGPickett For This Post:
JOIN(1) User Commands JOIN(1)
NAME
join - join lines of two files on a common field
SYNOPSIS
join [OPTION]... FILE1 FILE2
DESCRIPTION
For each pair of input lines with identical join fields, write a line to standard output. The default join field is the first, delimited
by blanks.
When FILE1 or FILE2 (not both) is -, read standard input.
-a FILENUM
also print unpairable lines from file FILENUM, where FILENUM is 1 or 2, corresponding to FILE1 or FILE2
-e EMPTY
replace missing input fields with EMPTY
-i, --ignore-case
ignore differences in case when comparing fields
-j FIELD
equivalent to '-1 FIELD -2 FIELD'
-o FORMAT
obey FORMAT while constructing output line
-t CHAR
use CHAR as input and output field separator
-v FILENUM
like -a FILENUM, but suppress joined output lines
-1 FIELD
join on this FIELD of file 1
-2 FIELD
join on this FIELD of file 2
--check-order
check that the input is correctly sorted, even if all input lines are pairable
--nocheck-order
do not check that the input is correctly sorted
--header
treat the first line in each file as field headers, print them without trying to pair them
-z, --zero-terminated
line delimiter is NUL, not newline
--help
display this help and exit
--version
output version information and exit
Unless -t CHAR is given, leading blanks separate fields and are ignored, else fields are separated by CHAR. Any FIELD is a field number
counted from 1. FORMAT is one or more comma or blank separated specifications, each being 'FILENUM.FIELD' or '0'. Default FORMAT outputs
the join field, the remaining fields from FILE1, the remaining fields from FILE2, all separated by CHAR. If FORMAT is the keyword 'auto',
then the first line of each file determines the number of fields output for each line.
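
As a rough illustration of how -a, -v, -e and -o combine, the sketch below builds a full outer join and an "only in file 1" listing. The file names and contents are invented for the example; the options are the ones documented above.

```shell
#!/bin/sh
# Sketch only: hypothetical two-column, blank-delimited inputs, already sorted.
export LC_ALL=C
tmp=$(mktemp -d)
printf '%s\n' 'a 1' 'b 2' > "$tmp/left.txt"
printf '%s\n' 'b x' 'c y' > "$tmp/right.txt"

# Full outer join: keep unpairable lines from both files (-a 1 -a 2),
# print join field, then field 2 of each file (-o 0,1.2,2.2),
# and fill fields that are missing with NA (-e NA).
outer=$(join -a 1 -a 2 -e NA -o 0,1.2,2.2 "$tmp/left.txt" "$tmp/right.txt")
printf '%s\n' "$outer"

# Lines of file 1 that have no partner in file 2.
only_left=$(join -v 1 "$tmp/left.txt" "$tmp/right.txt")
printf '%s\n' "$only_left"

rm -rf "$tmp"
```

Without -o, an unpairable line is printed with only its own fields, so -e has nothing to fill; that is why the two options are used together here.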
Important: FILE1 and FILE2 must be sorted on the join fields. E.g., use "sort -k 1b,1" if 'join' has no options, or use "join -t ''" if
'sort' has no options. Note, comparisons honor the rules specified by 'LC_COLLATE'. If the input is not sorted and some lines cannot be
joined, a warning message will be given.
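
The sorting requirement can be sketched as follows, assuming GNU join (for --check-order) and invented file names and data. The first file is deliberately left unsorted to show the check failing, then sorted with the command suggested above.

```shell
#!/bin/sh
# Sketch only: hypothetical inputs; f1.txt is deliberately out of order.
export LC_ALL=C
tmp=$(mktemp -d)
printf '%s\n' 'b 2' 'a 1' > "$tmp/f1.txt"
printf '%s\n' 'a x' 'b y' > "$tmp/f2.txt"

# With --check-order, join is expected to exit non-zero on unsorted input.
if join --check-order "$tmp/f1.txt" "$tmp/f2.txt" >/dev/null 2>&1; then
    unsorted_ok=yes
else
    unsorted_ok=no
fi
printf 'unsorted input accepted: %s\n' "$unsorted_ok"

# Sort on the join field the way join compares it (ignoring leading blanks).
sort -k 1b,1 "$tmp/f1.txt" > "$tmp/f1.sorted"
sorted=$(join "$tmp/f1.sorted" "$tmp/f2.txt")
printf '%s\n' "$sorted"

rm -rf "$tmp"
```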
AUTHOR
Written by Mike Haertel.
REPORTING BUGS
GNU coreutils online help: <http://www.gnu.org/software/coreutils/>
Report join translation bugs to <http://translationproject.org/team/>
COPYRIGHT
Copyright (C) 2017 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law.
SEE ALSO
comm(1), uniq(1)
Full documentation at: <http://www.gnu.org/software/coreutils/join>
or available locally via: info '(coreutils) join invocation'
GNU coreutils 8.28 January 2018 JOIN(1)