08-18-2011
Merging two text files by a column
So I have two text files. The first one looks like this:
refsnp_id chr_name chrom_start
1 rs1000000 12 126890980
2 rs10000010 4 21618674
3 rs10000012 4 1357325
4 rs10000013 4 37225069
5 rs1000002 3 183635768
And the second one looks like this:
AUC rs1000000 0.03 0.1240
AUC rs10000010 0.03 0.1462
AUC rs10000012 0.00 0.8628
AUC rs10000013 0.00 0.5459
AUC rs1000002 0.00 0.6439
AUC rs10000023 0.03 0.1337
AUC rs10000027 0.00 0.7142
AUC rs10000030 0.00 0.7634
AUC rs1000003 0.02 0.2226
The second columns of the two files match, but not completely: some IDs in the second file have no corresponding line in the first file. I want to merge the two files on their second columns, adding the third and fourth columns of the first file to the second file wherever the second columns match. How do I go about doing that? Thanks!
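One possible approach, sketched with awk rather than taken from any reply in this thread: read the first file into a lookup table keyed on its second column, then walk the second file and append the stored columns wherever the key is found. The file names `file1.txt`, `file2.txt`, and `merged.txt` are hypothetical, and the sketch assumes whitespace-separated fields and a single header line in the first file. The sample rows are a subset of those shown above.

```shell
# Hypothetical file names; the rows are a subset of the samples in the question.
cat > file1.txt <<'EOF'
refsnp_id chr_name chrom_start
1 rs1000000 12 126890980
2 rs10000010 4 21618674
EOF
cat > file2.txt <<'EOF'
AUC rs1000000 0.03 0.1240
AUC rs10000010 0.03 0.1462
AUC rs10000023 0.03 0.1337
EOF

# First pass (NR == FNR): skip the header of file1 and store columns 3 and 4,
# keyed by the SNP id in column 2.
# Second pass: print each file2 line, appending the stored columns on a match;
# lines whose id is absent from file1 are printed unchanged.
awk 'NR == FNR { if (FNR > 1) pos[$2] = $3 " " $4; next }
     ($2 in pos) { print $0, pos[$2]; next }
     { print }' file1.txt file2.txt > merged.txt
```

Unlike `join`, this does not need either file to be sorted, but it does hold the whole first file in memory. To drop the unmatched lines instead of keeping them, remove the final `{ print }` rule.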
fold(1) User Commands fold(1)
NAME
fold - filter for folding lines
SYNOPSIS
fold [-bs] [-w width | -width] [file...]
DESCRIPTION
The fold utility is a filter that will fold lines from its input files, breaking the lines to have a maximum of width column positions (or
bytes, if the -b option is specified). Lines will be broken by the insertion of a NEWLINE character such that each output line (referred to
later in this section as a segment) is the maximum width possible that does not exceed the specified number of column positions (or bytes).
A line will not be broken in the middle of a character. The behavior is undefined if width is less than the number of columns any single
character in the input would occupy.
If the CARRIAGE-RETURN, BACKSPACE, or TAB characters are encountered in the input, and the -b option is not specified, they will be treated
specially:
BACKSPACE          The current count of line width will be decremented by one, although the count never will become negative. fold
                   will not insert a NEWLINE character immediately before or after any BACKSPACE character.
CARRIAGE-RETURN    The current count of line width will be set to 0. fold will not insert a NEWLINE character immediately before or
                   after any CARRIAGE-RETURN character.
TAB                Each TAB character encountered will advance the column position pointer to the next tab stop. Tab stops will be at
                   each column position n such that n modulo 8 equals 1.
OPTIONS
The following options are supported:
-b                 Counts width in bytes rather than column positions.
-s                 If a segment of a line contains a blank character within the first width column positions (or bytes), breaks the line after
                   the last such blank character meeting the width constraints. If there is no blank character meeting the requirements, the
                   -s option will have no effect for that output segment of the input line.
-w width|-width    Specifies the maximum line length, in column positions (or bytes if -b is specified). If width is not a positive decimal
                   number, an error is returned. The default value is 80.
OPERANDS
The following operand is supported:
file A path name of a text file to be folded. If no file operands are specified, the standard input will be used.
EXAMPLES
Example 1: Submitting a file of possibly long lines to the line printer
An example invocation that submits a file of possibly long lines to the line printer (under the assumption that the user knows the line
width of the printer to be assigned by lp(1)):
example% fold -w 132 bigfile | lp
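A smaller illustration of the -w and -s options described above, offered as a sketch rather than as part of the original manual page. The input strings and the output file names `hard.txt` and `soft.txt` are made up for the example.

```shell
# Without -s, the line is broken strictly every 4 column positions,
# giving the segments "abcd", "efgh" and "ij".
printf 'abcdefghij\n' | fold -w 4 > hard.txt

# With -s, each break falls after the last blank that fits within the
# width, so words are not split: three segments, the last being "foo".
printf 'hello world foo\n' | fold -s -w 8 > soft.txt
```

With -s the blank stays at the end of the segment it closes, so the first two lines of `soft.txt` end in a trailing space.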
ENVIRONMENT VARIABLES
See environ(5) for descriptions of the following environment variables that affect the execution of fold: LANG, LC_ALL, LC_CTYPE,
LC_MESSAGES, and NLSPATH.
EXIT STATUS
The following exit values are returned:
0 All input files were processed successfully.
>0 An error occurred.
ATTRIBUTES
See attributes(5) for descriptions of the following attributes:
+-----------------------------+-----------------------------+
|       ATTRIBUTE TYPE        |       ATTRIBUTE VALUE       |
+-----------------------------+-----------------------------+
|Availability                 |SUNWcsu                      |
+-----------------------------+-----------------------------+
|CSI                          |enabled                      |
+-----------------------------+-----------------------------+
|Interface Stability          |Standard                     |
+-----------------------------+-----------------------------+
SEE ALSO
cut(1), pr(1), attributes(5), environ(5), standards(5)
NOTES
fold and cut(1) can be used to create text files out of files with arbitrary line lengths. fold should be used when the contents of long
lines need to be kept contiguous. cut should be used when the number of lines (or records) needs to remain constant.
fold is frequently used to send text files to line printers that truncate, rather than fold, lines wider than the printer is able to print
(usually 80 or 132 column positions).
fold may not work correctly if underlining is present.
SunOS 5.10 1 Feb 1995 fold(1)