Join txt files with diff cols and rows
Posted by BNasir in Shell Programming and Scripting on Tuesday, 13 September 2011, 09:11:47 PM (post 302555085)

I am new to Unix/Linux, so this question may be a simple one.
I am trying to join two very large text files that have different numbers of columns and rows.
I want to keep all rows and all columns from both files in the joined file. The primary key that identifies each row is stored in the row itself (the first field in the example below).
Rows that exist in both files should be matched up and merged. Rows that appear in only one file should also be kept, with their data intact. In short, the joined file should contain as much data as possible from both inputs.
I hope this makes sense!

A small example of the files:

file 1 =
A 1 2 3 4
B 1 2
C 1 2 3 4 5

file 2 =
A 1 2 3 4 5
B 1 2 3
C 1 2 3
D 1 2 3 4 5 6
E 1

The joined file should contain =
A 1 2 3 4 5
B 1 2 3
C 1 2 3 4 5
D 1 2 3 4 5 6
E 1
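
One possible way to get this result is sketched below with awk. It is only a sketch under several assumptions that the post does not confirm: fields are separated by whitespace, the key is the first field, columns line up by position, and when both files have a value for the same key and column the first value read wins. The function name `merge_rows` is made up for illustration. The sketch holds every row in memory, so it may not suit very large files without more work.

```shell
#!/bin/sh
# merge_rows FILE...  (hypothetical helper name, not from the original post)
# Full outer join on the first field: every key from every input file is kept,
# and each output row is as wide as the widest input row seen for that key.
merge_rows() {
    awk '
    NF == 0 { next }                      # skip blank lines
    {
        key = $1
        if (!(key in width)) {            # first time this key is seen
            order[++nkeys] = key          # remember first-seen order for output
            width[key] = 0
        }
        for (i = 2; i <= NF; i++) {
            # assumption: the first value seen for a (key, column) pair wins
            if (!((key, i) in cell)) cell[key, i] = $i
        }
        if (NF > width[key]) width[key] = NF
    }
    END {
        for (k = 1; k <= nkeys; k++) {
            key = order[k]
            line = key
            for (i = 2; i <= width[key]; i++) line = line " " cell[key, i]
            print line
        }
    }' "$@"
}
```

With the two example files saved as `file1.txt` and `file2.txt` (assumed names), `merge_rows file1.txt file2.txt > joined.txt` would write the merged rows in the order the keys first appear. The standard `join -a 1 -a 2` command also keeps unmatched rows from both files, but it needs input sorted on the key and appends the second file's fields after the first file's instead of merging them by position.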
 
