Top Forums > Shell Programming and Scripting > Paste columns based on common column: multiple files
Post 303009584 by RudiC on Saturday 16th of December 2017 10:02:43 AM
Well, not one of my magic moments. Try instead
Code:
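awk '
BEGIN           {TF = ARGC - 1                          # total number of input files
                }

                {if (!LINE[$2]) SEQ[++SN] = $2          # first time this key (column 2) is seen: remember its order
                 LINE[$2] = LINE[$2] $0 " "             # append the whole line to the row collected for this key
                 CNT[$2]++                              # count occurrences (assumes a key occurs at most once per file)
                }
END             {for (s=1; s<=SN; s++) if (CNT[SEQ[s]] == TF) print LINE[SEQ[s]]       # print only keys found in all files
                }
'  HGWAS?/merged_info_CHR1.info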
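
To see what the script does, here is a small self-contained run. The directory names HGWAS1 to HGWAS3 are chosen to match the glob above, but the file contents (three columns, key in column 2) are made up for illustration and are not the real data from this thread.

```shell
#!/bin/sh
# Hypothetical sample data: key in column 2; rs3 is missing from HGWAS2.
tmp=$(mktemp -d) || exit 1
mkdir "$tmp/HGWAS1" "$tmp/HGWAS2" "$tmp/HGWAS3"
printf '%s\n' 'a1 rs1 0.1' 'a1 rs2 0.2' 'a1 rs3 0.3' > "$tmp/HGWAS1/merged_info_CHR1.info"
printf '%s\n' 'b2 rs2 0.5' 'b2 rs1 0.4'              > "$tmp/HGWAS2/merged_info_CHR1.info"
printf '%s\n' 'c3 rs1 0.7' 'c3 rs3 0.9' 'c3 rs2 0.8' > "$tmp/HGWAS3/merged_info_CHR1.info"

# Same awk program as above, run from inside the temporary directory.
result=$(cd "$tmp" && awk '
BEGIN   {TF = ARGC - 1}
        {if (!LINE[$2]) SEQ[++SN] = $2
         LINE[$2] = LINE[$2] $0 " "
         CNT[$2]++
        }
END     {for (s=1; s<=SN; s++) if (CNT[SEQ[s]] == TF) print LINE[SEQ[s]]}
' HGWAS?/merged_info_CHR1.info)

rm -rf "$tmp"
printf '%s\n' "$result"
```

Only the rows for rs1 and rs2 are printed, in the order the keys first appear in HGWAS1; rs3 is dropped because it occurs in just two of the three files.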

Only if you're running out of memory with too large or too many files should you fall back to the post #9 version, which reads the files twice but saves some memory.
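
Post #9 is not reproduced here, so the following is only a guess at the general two-pass idea, not RudiC's actual code. The function name two_pass, the FILENO counter and the sample files are all hypothetical. The first pass only counts keys; the second pass stores lines just for keys that occur in every file, so lines with non-common keys never occupy memory.

```shell
#!/bin/sh
# Hypothetical two-pass variant: every file is passed to awk twice.
# Assumes no input file is empty (FNR == 1 is used to detect a new file)
# and that a key occurs at most once per file.
two_pass() {
    awk '
    BEGIN          {TF = (ARGC - 1) / 2}             # number of distinct files
    FNR == 1       {FILENO++}                         # a new file starts
    FILENO <= TF   {CNT[$2]++; next}                  # pass 1: only count keys
    CNT[$2] == TF  {if (!($2 in LINE)) SEQ[++SN] = $2
                    LINE[$2] = LINE[$2] $0 " "}       # pass 2: keep common keys only
    END            {for (s=1; s<=SN; s++) print LINE[SEQ[s]]}
    ' "$@" "$@"
}

# Made-up sample data, key in column 2; rs3 is missing from the second file.
tmp=$(mktemp -d) || exit 1
printf '%s\n' 'a1 rs1 0.1' 'a1 rs3 0.3' > "$tmp/f1"
printf '%s\n' 'b2 rs1 0.4'              > "$tmp/f2"
merged=$(two_pass "$tmp/f1" "$tmp/f2")
rm -rf "$tmp"
printf '%s\n' "$merged"
```

With the files from this thread, the call would presumably look like `two_pass HGWAS?/merged_info_CHR1.info`. The price for the lower memory use is that every file is read twice.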
 

paste(1)                              General Commands Manual                              paste(1)

NAME
       paste - Joins corresponding lines of several files or subsequent lines in one file

SYNOPSIS
       paste [-d list] [-s] file...

STANDARDS
       Interfaces documented on this reference page conform to industry standards as follows:

       paste: XCU5.0

       Refer to the standards(5) reference page for more information about industry standards and associated tags.

OPTIONS
       -d list
              Replaces the delimiter that separates lines in the output (tab by default) with one or more characters from list. If list contains more than one character, then the characters are repeated in order until the end of the output. In parallel merging, the lines from the last file always end with a newline character, instead of one from list.

              The following special characters can be used in list:

              \n     Newline character
              \t     Tab
              \\     Backslash
              \0     Empty string (not a null character)

              [Tru64 UNIX] An extended character can also be used.

              You must quote characters that have special meaning to the shell.

       -s     Merges all lines from each input file into one line of output (serial merging). Using this option, the paste command merges all lines in the first input file, forcing a newline at the end. The command then continues with the next input file, continuing in the same manner until all input files have been completed. A tab separates the input lines unless you use the -d option. Regardless of the list, the last character of the output is a newline character.

OPERANDS
       file   The name of an input file. You may specify up to 12 files, including hyphens. If you specify a -, paste reads standard input recursively, one line for each -.

DESCRIPTION
       Specifying the -d option or no options causes the paste command to treat each file as a column, joining them horizontally with a tab character by default (parallel merging).

       Using the -s option, the paste command combines all lines of each input file into one output line (serial merging). These lines are joined with the tab character by default.

       Output lines can be any length.

       [Tru64 UNIX] The output of pr -t -m is similar to the output produced by the paste command, but pr with its options creates extra spaces, tabs, and lines for an enhanced page layout.

RESTRICTIONS
       If the -s option is not used, it is an error if any specified file cannot be opened.

EXIT STATUS
       The following exit values are returned:

       0      Successful completion.
       >0     An error occurred.

EXAMPLES
       1.  To paste several columns of data together, enter:

                paste names places dates > npd

           This creates a file named npd that contains the data from names in one column, places in another, and dates in a third. The columns are separated by tab characters. File npd then contains:

                rachel    New York      28 February
                jerzy     Warsaw        27 April
                mata      Nairobi       21 June
                michel    Boca Raton    27 July
                segui     Managua       18 November

           A tab character separates the name, place, and date on each line.

       2.  To separate the columns with a character other than a tab (sh only), enter:

                paste -d"!@" names places dates > npd

           This alternates the exclamation point (!) and the at sign (@) as the column separators. If names, places, and dates are the same as in Example 1, then npd contains:

                rachel!New York@28 February
                jerzy!Warsaw@27 April
                mata!Nairobi@21 June
                michel!Boca Raton@27 July
                segui!Managua@18 November

       3.  To display the standard input in multiple columns, enter:

                ls | paste - - - -

           This lists the current directory in four columns. Each hyphen (-) tells the paste command to create a column containing data read from the standard input. The first line is put in the first column, the second line in the second column, ... and then the fifth line in the first column, and so on.

           This is equivalent to

                ls | paste -d"\t\t\t\n" -s -

           which fills the columns across the page with subsequent lines from the standard input. The -d defines the character to insert after each column: a tab character (\t) after the first three columns, and a newline character (\n) after the fourth. Without the -d option, paste -s - displays all of the input as one line with a tab between each column.

       4.  To merge the lines of the file names above into one output line, enter:

                paste -s names

           This results in:

                rachel    jerzy    mata    michel    segui

ENVIRONMENT VARIABLES
       The following environment variables affect the execution of paste:

       LANG   Provides a default value for the internationalization variables that are unset or null. If LANG is unset or null, the corresponding value from the default locale is used. If any of the internationalization variables contain an invalid setting, the utility behaves as if none of the variables had been defined.

       LC_ALL If set to a non-empty string value, overrides the values of all the other internationalization variables.

       LC_CTYPE
              Determines the locale for the interpretation of sequences of bytes of text data as characters (for example, single-byte as opposed to multibyte characters in arguments and input files).

       LC_MESSAGES
              Determines the locale for the format and contents of diagnostic messages written to standard error.

       NLSPATH
              Determines the location of message catalogues for the processing of LC_MESSAGES.

SEE ALSO
       Commands: cut(1), grep(1), fold(1), join(1), pr(1)

       Standards: standards(5)
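
The EXAMPLES section states that four hyphens in parallel mode and a serial merge with a tab, tab, tab, newline delimiter list give the same result. A minimal sketch to check this, assuming a paste that accepts the backslash escapes in the -d list; the eight numbered lines are a made-up stand-in for the output of ls:

```shell
#!/bin/sh
# Eight input lines stand in for the output of ls.
input=$(printf '%s\n' 1 2 3 4 5 6 7 8)

# Parallel merge: four columns, all read from standard input.
parallel=$(printf '%s\n' "$input" | paste - - - -)

# Serial merge with a delimiter list that cycles tab, tab, tab, newline.
serial=$(printf '%s\n' "$input" | paste -s -d '\t\t\t\n' -)

printf '%s\n' "$parallel"
if [ "$parallel" = "$serial" ]; then
    echo "outputs are identical"
fi
```

Both commands print two lines of four tab-separated fields each, 1 to 4 on the first line and 5 to 8 on the second.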