02-07-2005
Great! That's exactly what I want. Two more questions:
First, all of my flat files are huge; each contains around 24 million records. If I join them together in one command, is it possible to run out of memory? In other words, does the script you wrote for me consume a lot of memory?
Second, I presume the data in the flat files might contain tabs, so can I use another character as the delimiter, e.g. Ctrl-B? How do I specify it in the sed command?
I appreciate your help.
xli
JOIN(1) General Commands Manual JOIN(1)
NAME
join - relational database operator
SYNOPSIS
join [ options ] file1 file2
DESCRIPTION
Join forms, on the standard output, a join of the two relations specified by the lines of file1 and file2. If one of the file names is -, the
standard input is used.
File1 and file2 must be sorted in increasing ASCII collating sequence on the fields on which they are to be joined, normally the first in
each line.
There is one line in the output for each pair of lines in file1 and file2 that have identical join fields. The output line normally
consists of the common field, then the rest of the line from file1, then the rest of the line from file2.
Input fields are normally separated by spaces or tabs; output fields by space. In this case, multiple separators count as one, and leading
separators are discarded.
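The default behaviour described above can be sketched as follows. This is a minimal illustration, assuming a POSIX sh and a POSIX-compatible join (for example GNU coreutils); the file names and their contents are made up for the example.

```shell
#!/bin/sh
# Hypothetical sample data: the file names and records are invented.
export LC_ALL=C                 # byte-wise collation, so the sort order is predictable
tmp=$(mktemp -d)

# Both files are already sorted on field 1, the default join field.
printf '%s\n' 'alice 30' 'bob 25' 'carol 41'         > "$tmp/ages"
printf '%s\n' 'alice paris' 'carol oslo' 'dave rome' > "$tmp/cities"

# Only keys present in both files (alice, carol) produce output lines:
# join field first, then the rest of file1's line, then the rest of file2's.
basic=$(join "$tmp/ages" "$tmp/cities")
printf '%s\n' "$basic"

rm -rf "$tmp"
```

Lines for bob and dave are dropped because they have no partner in the other file.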
The following options are recognized, with POSIX syntax.
-a n In addition to the normal output, produce a line for each unpairable line in file n, where n is 1 or 2.
-v n Like -a, omitting output for paired lines.
-e s Replace empty output fields by string s.
-1 m
-2 m Join on the mth field of file1 or file2.
-jn m Archaic equivalent for -n m.
-ofields
Each output line comprises the designated fields. The comma-separated field designators are either 0, meaning the join field, or
have the form n.m, where n is a file number and m is a field number. Archaic usage allows separate arguments for field designators.
-tc Use character c as the only separator (tab character) on input and output. Every appearance of c in a line is significant.
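As a rough sketch of how these options combine, the example below uses a control character (Ctrl-B, octal 002) as the separator, which can help when the data itself may contain tabs. It assumes a POSIX sh and a POSIX-compatible join and tr; the file names, records, and the NULL filler string are hypothetical.

```shell
#!/bin/sh
# Hypothetical data; Ctrl-B is chosen only because it is unlikely to occur in text.
export LC_ALL=C
tmp=$(mktemp -d)
sep=$(printf '\002')            # Ctrl-B as a one-character string

printf '%s\n' "1${sep}abc" "2${sep}ghi" "3${sep}mno" > "$tmp/left"
printf '%s\n' "1${sep}123" "3${sep}789" "4${sep}xyz" > "$tmp/right"

# -t  : use Ctrl-B as the only field separator
# -a 1: also print lines of the first file that have no partner
# -e  : fill empty output fields with the string NULL
# -o  : output the join field, then field 2 of each file
# tr only makes the separator visible for display.
outer=$(join -t "$sep" -a 1 -e NULL -o 0,1.2,2.2 "$tmp/left" "$tmp/right" | tr '\002' '|')
printf '%s\n' "$outer"

# -v 2: print only the unpairable lines of the second file
unmatched=$(join -t "$sep" -v 2 "$tmp/left" "$tmp/right" | tr '\002' '|')
printf '%s\n' "$unmatched"

rm -rf "$tmp"
```

The -e string only takes effect for fields named in -o, which is why the two options are used together here.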
EXAMPLES
sort /adm/users | join -t: -a 1 -e "" - bdays
Add birthdays to password information, leaving unknown birthdays empty. The layout of /adm/users is given in users(6); bdays contains
sorted, colon-separated lines that begin with the same join field.
tr : ' ' </adm/users | sort -k 3,3 >temp
join -1 3 -2 3 -o 1.1,2.1 temp temp | awk '$1 < $2'
Print all pairs of users with identical userids.
SOURCE
/sys/src/cmd/join.c
SEE ALSO
sort(1), comm(1), awk(1)
BUGS
With default field separation, the collating sequence is that of sort -b -ky,y; with -t, the sequence is that of sort -tx -ky,y.
One of the files must be randomly accessible.
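The collating note above means each input should be sorted on its own join field with the same separator that join will use. A minimal sketch, assuming a POSIX sh with POSIX sort and join; the file names and records are hypothetical:

```shell
#!/bin/sh
# Hypothetical data: "orders" carries the key in field 2, "names" in field 1.
export LC_ALL=C                 # keep sort and join on the same collating sequence
tmp=$(mktemp -d)

printf '%s\n' 'x:20:b' 'y:10:a' 'z:30:c' > "$tmp/orders"
printf '%s\n' '30:thirty' '10:ten'       > "$tmp/names"

# Sort each file on its join field only, with the same -t separator as join.
sort -t: -k2,2 "$tmp/orders" > "$tmp/orders.sorted"
sort -t: -k1,1 "$tmp/names"  > "$tmp/names.sorted"

# Join field 2 of the first file against field 1 of the second.
joined=$(join -t: -1 2 -2 1 "$tmp/orders.sorted" "$tmp/names.sorted")
printf '%s\n' "$joined"

rm -rf "$tmp"
```

Sorting and joining under different locales or on different keys is a common cause of silently missing output lines.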