Sponsored Content
Top Forums UNIX for Dummies Questions & Answers File merging based on column patterns Post 302958967 by dovah on Tuesday 27th of October 2015 03:32:32 PM
Old 10-27-2015
File merging based on column patterns

Hello Smilie

I am in this situation:

Input: two tab-delimited files, `File1` and `File2`. `File2` (`$2`) has to be parsed by patterns found in `File1` (`$1`).

Expected output: tab-delimited file, `File3`. `File3` has to contain the same rows as `File2`, plus the corresponding value in `File1` if the pattern is matched (end of the line, tab-separated).

File1 (tab-delimited) :

Code:
 
    ABC1    1    3    4
    ABC2    4    3    3
    ABC3    3    2    3
    ABC4    3    3    3

File2 (tab-delimited):
Code:
    text1   ABC1AB   text2   text3
    text2   ABC2AB   text1   
    text3   ABC1CDE   text2
    text4   ABC5AB   text3   text4

File3 (desired output, tab-delimited):

Code:
    text1   ABC1AB   text2   text3    1    3    4
    text2   ABC2AB   text1    4    3    3
    text3   ABC1CDE   text2    1    3    4
    text4   ABC5AB   text3   text4

I know this should be quite an easy task, could be done by awk/grep, but I didn't succeed to get it work :/

Any help is welcome, thanks in advance
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

merging column from two files based on identifier

Hi, I have two files consisting of two columns. So I want to merge column 2 if column 1 is the same. So heres an example of what I mean. FILE1 driver 444 car 333 hat 222 FILE2 driver 333 car 666 hat 999 So I want to merge the column 2's together so... (4 Replies)
Discussion started by: phil_heath
4 Replies

2. Shell Programming and Scripting

Merging 2 files based on a common column

Hi All, I do have 2 files file 1 has 4 tab delimited columns 234 a c dfgyu 294 b g fih 302 c h jzh 328 z c san 597 f g son File 2 has 2 tab delimted columns 234 23 302 24 597 24 I want to merge file 2 with file 1 based on the data common in both files which is the first column so... (6 Replies)
Discussion started by: Lucky Ali
6 Replies

3. Shell Programming and Scripting

merging two files based on first column

I had two files file1 and file2. I want a o/p file(file3) like below using first column as ref. Pls give suggestion ass join is not working as the number of lines in each file is nealry 5 C? file1 --------------------- 404000324810001 Y 404000324810004 N 404000324810008 Y 404000324810009 N... (1 Reply)
Discussion started by: p_sai_ias
1 Replies

4. Shell Programming and Scripting

File merging based on different counter loop

hello, File 1 main Group sub group MIT VAR_1D_DATA_TYPE 23-03-2012 MIT VAR_1D_DATA_TYPE 22-03-2012 MIT VAR_10D_DATA_TYPE 23-03-2012 MIT VAR_10D_DATA_TYPE 22-03-2012 MIT ... (0 Replies)
Discussion started by: manas_ranjan
0 Replies

5. Shell Programming and Scripting

Merging columns based on one or more column in two files

I have two files. FileA.txt 30910 rs7468327 36587 rs10814410 91857 rs9408752 105797 rs1133715 146659 rs2262038 152695 rs2810979 181843 rs3008128 182129 rs3008131 192118 rs3008170 FileB.txt 30910 1.9415219673 0 36431 1.3351312477 0.0107191428 36587 1.3169171182... (2 Replies)
Discussion started by: genehunter
2 Replies

6. UNIX for Dummies Questions & Answers

merging rows into new file based on rows and first column

I have 2 files, file01= 7 columns, row unknown (but few) file02= 7 columns, row unknown (but many) now I want to create an output with the first field that is shared in both of them and then subtract the results from the rest of the fields and print there e.g. file 01 James|0|50|25|10|50|30... (1 Reply)
Discussion started by: A-V
1 Replies

7. UNIX for Dummies Questions & Answers

Merging lines based on one column

Hi, I have a file which I'd like to merge lines based on duplicates in one column while keeping the info for other columns. Let me simplify it by an example: File ESR1 ANASTROZOLE NA FDA_approved ESR1 CISPLATIN NA FDA_approved ESR1 DANAZOL agonist NA ESR1 EXEMESTANE NA FDA_approved... (3 Replies)
Discussion started by: JJ001
3 Replies

8. UNIX for Dummies Questions & Answers

Split 1 column into numerous columns based on patterns

Hi, I have a text file 'Item_List.txt' containing only 1 column. This column lists different products, each separated by the same generic string header "NEW PRODUCT, VERSION 1.1". After this the name of the product is given, then a delimiter string "PRODUCT FIELD", and then the name of the... (11 Replies)
Discussion started by: mmab
11 Replies

9. UNIX for Beginners Questions & Answers

Merging multiple lines into single line based on one column

I Want to merge multiple lines based on the 1st field and keep into single record. SRC File: AAA_POC_DB.TAB1 AAA_POC_DB.TAB2 AAA_POC_DB.TAB3 AAA_POC_DB.TAB4 BBB_POC_DB.TAB1 BBB_POC_DB.TAB2 CCC_POC_DB.TAB6 OUTPUT ----------------- 'AAA_POC_DB','TAB1','TAB2','TAB3','TAB4'... (10 Replies)
Discussion started by: raju2016
10 Replies

10. UNIX for Beginners Questions & Answers

Merging rows based on same ID in First column.

Hellow, I have a tab-delimited file with 3 columns : BINPACKER.13259.1.p2 SSF48239 BINPACKER.13259.1.p2 PF13243 BINPACKER.13259.1.p2 G3DSA:1.50.10.20 BINPACKER.13259.2.p2 SSF48239 BINPACKER.13259.2.p2 PF13243 BINPACKER.13259.2.p2 G3DSA:1.50.10.20... (7 Replies)
Discussion started by: anjaliANJALI
7 Replies
paste(1)						      General Commands Manual							  paste(1)

Name
       paste - merge file data

Syntax
       paste file1 file2...
       paste -dlist file1 file2...
       paste -s [-dlist] file1 file2...

Description
       In  the	first  two forms, concatenates corresponding lines of the given input files file1, file2, etc.	It treats each file as a column or
       columns of a table and pastes them together horizontally (parallel merging).

       In the last form, the command combines subsequent lines of the input file (serial merging).

       In all cases, lines are glued together with the tab character, or with characters from an optionally specified  list.   Output  is  to  the
       standard output, so it can be used as the start of a pipe, or as a filter, if - is used in place of a file name.

Options
       -       Used in place of any file name, to read a line from the standard input.	(There is no prompting).

       -dlist  Replaces  characters  of  all but last file with nontabs characters (default tab).  One or more characters immediately following -d
	       replace the default tab as the line concatenation character.  The list is used circularly, i. e. when exhausted, it is reused.	In
	       parallel  merging  (i. e. no -s option), the lines from the last file are always terminated with a new-line character, not from the
	       list.  The list may contain the special escape sequences: 
 (new-line), 	 (tab), \ (backslash), and  (empty string, not a null
	       character).   Quoting  may  be  necessary,  if characters have special meaning to the shell (for example, to get one backslash, use
	       -d"\\" ).
	       Without this option, the new-line characters of each but the last file (or last line in case of the -s option) are  replaced  by  a
	       tab character.  This option allows replacing the tab character by one or more alternate characters (see below).

       -s      Merges  subsequent  lines  rather  than	one  from  each input file.  Use tab for concatenation, unless a list is specified with -d
	       option.	Regardless of the list, the very last character of the file is forced to be a new-line.

Examples
       ls | paste -d" " -
       list directory in one column
       ls | paste - - - -
       list directory in four columns
       paste -s -d"	
" file
       combine pairs of lines into lines

Diagnostics
       line too long
		 Output lines are restricted to 511 characters.

       too many files
		 Except for -s option, no more than 12 input files may be specified.

See Also
       cut(1), grep(1), pr(1)

																	  paste(1)
All times are GMT -4. The time now is 07:30 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy