Post 303015776 by nans on Thursday 12th of April 2018 05:28:57 AM
Merging and Adding colon to columns

Hello,
I have a tab-delimited file that looks like this:

Code:
CHROM    POS    ID    REF    ALT    ID    HGVS_C    HGVS_P
1    17319011    rs2076603    G    A    NM_022089.3,NM_001141973.2,NM_001141974.2    c.1815C>T,c.1800C>T,c.1800C>T    p.Pro605Pro,p.Pro600Pro,p.Pro600Pro
1    20960230    rs45530340    C    T    NM_032409.2,NR_106732.1    c.189C>T,n.59C>T    p.Leu63Leu,.
1    20964328    rs2298298    A    G    NM_032409.2,NR_106732.1,NR_046507.1    c.388-7A>G,n.*4047A>G,n.*4822T>C    .,.,.
1    20972048    rs3131713    G    A    NM_032409.2,NR_046507.1    c.960-5G>A,n.3981+30C>T    .,.
1    43395635    rs2229682    C    T    NM_006516.2    c.588G>A    p.Pro196Pro
1    43396414    rs11537641    G    A    NM_006516.2    c.399C>T    p.Cys133Cys
1    43408966    rs1385129    G    A    NM_006516.2    c.45C>T    p.Ala15Ala

I need the output file to look like this, where each value in the second ID column (column 6) is merged with the corresponding value in the last two columns. The columns do not hold a consistent number of values: sometimes there is just a single value, sometimes there are several, and sometimes the value is blank (shown as a dot).

Code:
CHROM    POS    ID    REF    ALT    ID:HGVS_C    ID:HGVS_P
1    17319011    rs2076603    G    A    NM_022089.3:c.1815C>T,NM_001141973.2:c.1800C>T,NM_001141974.2:c.1800C>T    NM_022089.3:p.Pro605Pro,NM_001141973.2:p.Pro600Pro,NM_001141974.2:p.Pro600Pro
1    20960230    rs45530340    C    T    NM_032409.2:c.189C>T,NR_106732.1:n.59C>T    NM_032409.2:p.Leu63Leu,.
1    20964328    rs2298298    A    G    NM_032409.2:c.388-7A>G,NR_106732.1:n.*4047A>G,NR_046507.1:n.*4822T>C    
1    20972048    rs3131713    G    A    NM_032409.2:c.960-5G>A,NR_046507.1:n.3981+30C>T    
1    43395635    rs2229682    C    T    NM_006516.2:c.588G>A    NM_006516.2:p.Pro196Pro
1    43396414    rs11537641    G    A    NM_006516.2:c.399C>T    NM_006516.2:p.Cys133Cys
1    43408966    rs1385129    G    A    NM_006516.2:c.45C>T    NM_006516.2:p.Ala15Ala

I tried this:

Code:
awk '{print $1"\t"$2"\t"$3"\t"$4"\t"$5"\t"$6":"$7"\t"$6":"$8}' input.txt > output.txt

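It works for rows with a single value (i.e. the last few rows), but I am not able to merge rows with multiple values (the first few rows), because the command joins the whole comma-separated fields instead of pairing up the individual values.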
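One possible direction, sketched here rather than offered as a finished solution, is to split columns 6, 7 and 8 on the commas and pair the values up by position. The function names merge_cols and merge are made up for illustration. The sketch assumes the three columns always list their values in the same order, that a "." placeholder stays as "." without an ID prefix, and that a column holding only "." placeholders becomes blank, as in the desired output above.

```shell
#!/bin/sh
# merge_cols: hypothetical helper; reads the tab-delimited table on stdin
# and writes the merged table to stdout.
merge_cols() {
    awk '
    # Pair ids[i] with vals[i] for i = 1..n and join the pairs with commas.
    # Assumption: "." (or a missing value) is kept as a bare "." placeholder,
    # and a column made up only of placeholders is returned as an empty string.
    function merge(ids, vals, n,    i, out, v, seen) {
        for (i = 1; i <= n; i++) {
            v = (vals[i] == "." || vals[i] == "") ? "." : ids[i] ":" vals[i]
            if (v != ".") seen = 1
            out = out (i > 1 ? "," : "") v
        }
        return seen ? out : ""
    }
    BEGIN { FS = OFS = "\t" }
    NR == 1 {
        # Header line: relabel the two merged columns.
        print $1, $2, $3, $4, $5, $6 ":" $7, $6 ":" $8
        next
    }
    {
        n = split($6, id, ",")   # transcript IDs
        split($7, c, ",")        # HGVS_C values, assumed to be in ID order
        split($8, p, ",")        # HGVS_P values, assumed to be in ID order
        print $1, $2, $3, $4, $5, merge(id, c, n), merge(id, p, n)
    }'
}
```

It would be run as merge_cols < input.txt > output.txt. Setting FS to a tab keeps empty fields intact, which matters if a column really is blank rather than a dot.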

Any help/suggestions are appreciated.

Thank you
 
