Shell Programming and Scripting: Post 303015776 by nans on Thursday 12th of April 2018 05:28:57 AM
Merging and Adding colon to columns

Hello,
I have a tab-delimited file that looks like this:

Code:
CHROM    POS    ID    REF    ALT    ID    HGVS_C    HGVS_P
1    17319011    rs2076603    G    A    NM_022089.3,NM_001141973.2,NM_001141974.2    c.1815C>T,c.1800C>T,c.1800C>T    p.Pro605Pro,p.Pro600Pro,p.Pro600Pro
1    20960230    rs45530340    C    T    NM_032409.2,NR_106732.1    c.189C>T,n.59C>T    p.Leu63Leu,.
1    20964328    rs2298298    A    G    NM_032409.2,NR_106732.1,NR_046507.1    c.388-7A>G,n.*4047A>G,n.*4822T>C    .,.,.
1    20972048    rs3131713    G    A    NM_032409.2,NR_046507.1    c.960-5G>A,n.3981+30C>T    .,.
1    43395635    rs2229682    C    T    NM_006516.2    c.588G>A    p.Pro196Pro
1    43396414    rs11537641    G    A    NM_006516.2    c.399C>T    p.Cys133Cys
1    43408966    rs1385129    G    A    NM_006516.2    c.45C>T    p.Ala15Ala

I need the output file to look like this, where the second ID column (column 6) is merged with each of the last two columns. The columns do not hold a consistent number of values: sometimes there is just a single value, sometimes several, and sometimes the value is blank.

Code:
CHROM    POS    ID    REF    ALT    ID:HGVS_C    ID:HGVS_P
1    17319011    rs2076603    G    A    NM_022089.3:c.1815C>T,NM_001141973.2:c.1800C>T,NM_001141974.2:c.1800C>T    NM_022089.3:p.Pro605Pro,NM_001141973.2:p.Pro600Pro,NM_001141974.2:p.Pro600Pro
1    20960230    rs45530340    C    T    NM_032409.2:c.189C>T,NR_106732.1:n.59C>T    NM_032409.2:p.Leu63Leu,.
1    20964328    rs2298298    A    G    NM_032409.2:c.388-7A>G,NR_106732.1:n.*4047A>G,NR_046507.1:n.*4822T>C    
1    20972048    rs3131713    G    A    NM_032409.2:c.960-5G>A,NR_046507.1:n.3981+30C>T    
1    43395635    rs2229682    C    T    NM_006516.2:c.588G>A    NM_006516.2:p.Pro196Pro
1    43396414    rs11537641    G    A    NM_006516.2:c.399C>T    NM_006516.2:p.Cys133Cys
1    43408966    rs1385129    G    A    NM_006516.2:c.45C>T    NM_006516.2:p.Ala15Ala

I tried this:

Code:
awk '{print $1"\t"$2"\t"$3"\t"$4"\t"$5"\t"$6":"$7"\t"$6":"$8}' input.txt > output.txt

It works for single values (i.e. the last 3-4 rows), but I am not able to merge the rows that hold multiple comma-separated values (the first few rows).

Any help/suggestions are appreciated.

Thank you
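
One possible approach, offered as a sketch rather than a definitive answer: split column 6 and each HGVS column on commas and pair the values by position. The names `merge_ids` and `pair` are made up for this illustration. The sketch assumes the file really is tab-separated, that the comma lists in columns 6, 7 and 8 are in the same order, and that a column consisting only of "." should come out empty while a "." inside a mixed list is kept, as in the desired output above.

```shell
#!/bin/sh
# merge_ids: hypothetical helper; reads the tab-delimited table on stdin
# and writes the merged table to stdout.
merge_ids() {
    awk '
    BEGIN { FS = OFS = "\t" }

    # pair(ids, vals): join the i-th ID with the i-th value as "ID:value".
    # Names after the extra spaces are local variables.
    function pair(ids, vals,    n, m, i, a, b, out, seen, piece) {
        n = split(ids, a, ",")
        m = split(vals, b, ",")
        out = ""
        seen = 0
        for (i = 1; i <= m; i++) {
            if (b[i] == "." || b[i] == "") {
                piece = "."                  # placeholder stays unprefixed
            } else if (i <= n) {
                piece = a[i] ":" b[i]
                seen = 1
            } else {
                piece = b[i]                 # assumption: no matching ID, keep value as is
                seen = 1
            }
            out = out (i > 1 ? "," : "") piece
        }
        # Assumption: a column holding only "." placeholders becomes empty.
        return seen ? out : ""
    }

    # The header line goes through the same logic and becomes ID:HGVS_C, ID:HGVS_P.
    { print $1, $2, $3, $4, $5, pair($6, $7), pair($6, $8) }
    '
}
```

With the function defined, it could be run as `merge_ids < input.txt > output.txt`. Setting `FS` to a tab matters here: with the default whitespace splitting, a blank column would shift the field numbers.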
 
