Merge records based on multiple columns


 
# 1  
Old 04-15-2014

Hi,

I have a file with 16 columns. The first 14 are key columns, the 15th is an order column, and the 16th holds the information. I need to concatenate the 16th column for all records that share the same values in columns 1-14, in the order given by the 15th column. Here is an example file:

Input File (multiple records like this)
Code:
"A1"	"A2"	"A3"	"A4"	"A5"	"A6"	"A7"	"A8"	"A9"	"A10"	"A11"	"A12"	"A13"	"A14"	"0001"	"Once Upon A Time"
"A1"	"A2"	"A3"	"A4"	"A5"	"A6"	"A7"	"A8"	"A9"	"A10"	"A11"	"A12"	"A13"	"A14"	"0003"	"He was very"
"A1"	"A2"	"A3"	"A4"	"A5"	"A6"	"A7"	"A8"	"A9"	"A10"	"A11"	"A12"	"A13"	"A14"	"0002"	"There was a crow"
"A1"	"A2"	"A3"	"A4"	"A5"	"A6"	"A7"	"A8"	"A9"	"A10"	"A11"	"A12"	"A13"	"A14"	"0004"	"Thirsty"

Required Output
Code:
"A1"	"A2"	"A3"	"A4"	"A5"	"A6"	"A7"	"A8"	"A9"	"A10"	"A11"	"A12"	"A13"	"A14"	"Once Upon A Time There was a crow He was very Thirsty"

Moderator's Comments:
Mod Comment Please use code tags next time for your code and data. Thanks
# 2  
Old 04-15-2014
So a few questions to start:-
  • What have you tried so far?
  • What errors/output are you getting?
  • What OS and version are you running on?
  • What are your preferred tools (ksh, bash, awk, perl, etc.)
Most importantly, what have you tried so far?

We all give suggestions freely, but you need to make it easy for us to help you and show that you have made an effort, rather than just hoping for a reply with a fully tailored solution. Depending on your set-up and preferences, there may be various ways to approach this.



Robin
# 3  
Old 04-15-2014
Re: Merge records based on multiple columns

Hi,

I am not an expert shell programmer, but I have tried a few options:

1) A for loop that reads each row and uses a chain of greps to find the matching columns

2) I have also gone through the thread below (on this website itself) and tried to adapt the given solution to my requirement


shell-programming-and-scripting
208027-merge-multiple-lines-same-file-common-key-using-awk.html

but ordering on the basis of the 15th column is still an issue, and I am also getting double quotes inside the concatenated last field.

I am OK with any solution in ksh, bash or awk. Perl or any other language would be a last resort and is not preferred for now.

I am not sure about the OS version either, as I am out of the office and can't check.

Let me know if this information is helpful.
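
The two problems mentioned above (ordering by the 15th column and stray double quotes in the merged field) can be looked at in isolation. Below is a minimal sketch in Python, not a definitive solution: `merge_fragments` is a hypothetical helper name, and it assumes the order values are zero-padded integers like "0001" and that each value is wrapped in exactly one pair of double quotes.

```python
def merge_fragments(rows):
    """rows: (order, text) pairs whose values still carry their double quotes."""
    # Strip the surrounding quotes and turn the order column into an integer,
    # so "0002" sorts before "0003" regardless of input order.
    cleaned = [(int(order.strip('"')), text.strip('"')) for order, text in rows]
    cleaned.sort(key=lambda pair: pair[0])
    # Re-quote once, around the whole joined string.
    return '"' + " ".join(text for _, text in cleaned) + '"'


rows = [
    ('"0001"', '"Once Upon A Time"'),
    ('"0003"', '"He was very"'),
    ('"0002"', '"There was a crow"'),
    ('"0004"', '"Thirsty"'),
]
print(merge_fragments(rows))
```

Removing the quotes before joining and adding a single pair afterwards is what keeps quotes out of the middle of the merged field.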
# 4  
Old 04-17-2014
python

Code:
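import re
from collections import defaultdict

# Map each 14-column key to its list of (order, text) pairs.
groups = defaultdict(list)
with open("a.txt") as infile:
    for line in infile:
        # Every field is double-quoted; capture the text between the quotes.
        fields = re.findall(r'"([^"]*)"', line)
        if len(fields) < 16:
            continue  # skip blank or malformed lines
        # Columns 1-14 form the key (slice 0:14, not 0:13).
        key = "\t".join('"%s"' % f for f in fields[:14])
        groups[key].append((int(fields[14]), fields[15]))

for key, parts in groups.items():
    # Sort by the numeric order column, then join the text fragments.
    merged = " ".join(text for _, text in sorted(parts, key=lambda p: p[0]))
    print('%s\t"%s"' % (key, merged))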
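
As an alternative to the regular expression, the standard `csv` module can do the quote handling. This is only a sketch under assumptions: the fields are tab-separated, every record has exactly 16 properly quoted columns, and the order column holds integers. The names `merge_records` and `format_rows` are made up for illustration.

```python
import csv
import io
from collections import defaultdict


def merge_records(lines):
    """Group records on columns 1-14 and join column 16 in column-15 order."""
    groups = defaultdict(list)
    for fields in csv.reader(lines, delimiter="\t", quotechar='"'):
        if len(fields) != 16:
            continue  # skip blank or malformed records
        groups[tuple(fields[:14])].append((int(fields[14]), fields[15]))
    merged = []
    for key, parts in groups.items():
        parts.sort(key=lambda p: p[0])  # numeric order of column 15
        merged.append(list(key) + [" ".join(text for _, text in parts)])
    return merged


def format_rows(rows):
    """Write rows back as tab-separated text with every field quoted."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", quoting=csv.QUOTE_ALL,
                        lineterminator="\n")
    writer.writerows(rows)
    return out.getvalue()


# Sample data rebuilt from the first post.
key = "\t".join('"A%d"' % i for i in range(1, 15))
sample = [
    key + '\t"0001"\t"Once Upon A Time"\n',
    key + '\t"0003"\t"He was very"\n',
    key + '\t"0002"\t"There was a crow"\n',
    key + '\t"0004"\t"Thirsty"\n',
]
print(format_rows(merge_records(sample)), end="")
```

To run it against a real file, pass an open file object instead of `sample`, for example `merge_records(open("a.txt", newline=""))`.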
