Print the row element till the next row element appear in a column


 
# 8  
Old 11-17-2012
Slight error in output when repeating values

Hi all,

I am getting a slight error in my output.

My input file is:

Code:
CALR    Antigen processing and presentation    CPSab
KIR2DL5A        
KIR2DS1        
KIR2DS2        
KIR2DS3        
KIR2DS5        
PSME1        
PSME2        
PTK2    Aspirin Blocks Signaling Pathway Involved in Platelet Activation    CPSab
SYK        
PIK3C2G    CCR3 signaling in Eosinophils    CPSab
PTK2        
CHUK    CD40L Signaling Pathway    CPSab
DUSP1        
IKBKAP        
MAP3K1        
TRAF6        
CCNE1    CDK Regulation of DNA Replication    CPSab
KITLG        
MCM5        
ORC4L        
PIK3C2G    CXCR4 Signaling Pathway    CPSab
PTK2        
CCNE1    Cyclin E Destruction Pathway    CPSab
CDC34        
TFDP1        
CCNE1    Cyclins and Cell Cycle Regulation    CPSab
CCNH        
CDC2        
TFDP1        
ACVR2A    Cytokine-cytokine receptor interaction    CPSab
AMH

But my output is:


Code:
CALR     Antigen processing and presentation     CPSab
KIR2DL5A     Antigen processing and presentation     CPSab
KIR2DS1     Antigen processing and presentation     CPSab
KIR2DS2     Antigen processing and presentation     CPSab
KIR2DS3     Antigen processing and presentation     CPSab
KIR2DS5     Antigen processing and presentation     CPSab
PSME1     Antigen processing and presentation     CPSab
PSME2     Antigen processing and presentation     CPSab
PTK2     Aspirin Blocks Signaling Pathway Involved in Platelet Activation     CPSab
SYK     Aspirin Blocks Signaling Pathway Involved in Platelet Activation     CPSab
PIK3C2G     CCR3 signaling in Eosinophils    CPS CPSab
PTK2     CCR3 signaling in Eosinophils    CPS CPSab
CHUK     CD40L Signaling Pathway     CPSab
DUSP1     CD40L Signaling Pathway     CPSab
IKBKAP     CD40L Signaling Pathway     CPSab
MAP3K1     CD40L Signaling Pathway     CPSab
TRAF6     CD40L Signaling Pathway     CPSab
CCNE1     CDK Regulation of DNA Replication    C CPSab
KITLG     CDK Regulation of DNA Replication    C CPSab
MCM5     CDK Regulation of DNA Replication    C CPSab
ORC4L     CDK Regulation of DNA Replication    C CPSab
PIK3C2G     CXCR4 Signaling Pathway    CPS CPSab
PTK2     CXCR4 Signaling Pathway    CPS CPSab
CCNE1     Cyclin E Destruction Pathway    C CPSab
CDC34     Cyclin E Destruction Pathway    C CPSab
TFDP1     Cyclin E Destruction Pathway    C CPSab
CCNE1     Cyclins and Cell Cycle Regulation    C CPSab
CCNH     Cyclins and Cell Cycle Regulation    C CPSab
CDC2     Cyclins and Cell Cycle Regulation    C CPSab
TFDP1     Cyclins and Cell Cycle Regulation    C CPSab

using this code:

Code:
awk ' { if($2 !~ /^ *$/) { a=substr($0,length($1)+1,index($0,$NF)-length($NF)); b=$NF; } print $1, a, b } ' BDchangeoutfile.txt >BDchangeoutfile2.txt

In the output, the repetition in the third column is a bit weird: the value is not copied properly from the row above until the next full row appears. Please check it.
# 9  
Old 11-17-2012
Looking more closely at the input file you provided in message #8 in this thread, I see that your field separators are exactly four spaces and that the separators are present in all input lines, even when fields 2 and 3 are empty strings. This makes the logic needed to get the output you want much simpler:
Code:
awk -F "    " 'BEGIN {  OFS = "    "}
{       if($2)  f2 = $2
        else    $2 = f2
        if($3)  f3 = $3
        else    $3 = f3
        print
}' BDchangeoutfile.txt >BDchangeoutfile2.txt

and produces the output:
Code:
CALR    Antigen processing and presentation    CPSab
KIR2DL5A    Antigen processing and presentation    CPSab
KIR2DS1    Antigen processing and presentation    CPSab
KIR2DS2    Antigen processing and presentation    CPSab
KIR2DS3    Antigen processing and presentation    CPSab
KIR2DS5    Antigen processing and presentation    CPSab
PSME1    Antigen processing and presentation    CPSab
PSME2    Antigen processing and presentation    CPSab
PTK2    Aspirin Blocks Signaling Pathway Involved in Platelet Activation    CPSab
SYK    Aspirin Blocks Signaling Pathway Involved in Platelet Activation    CPSab
PIK3C2G    CCR3 signaling in Eosinophils    CPSab
PTK2    CCR3 signaling in Eosinophils    CPSab
CHUK    CD40L Signaling Pathway    CPSab
DUSP1    CD40L Signaling Pathway    CPSab
IKBKAP    CD40L Signaling Pathway    CPSab
MAP3K1    CD40L Signaling Pathway    CPSab
TRAF6    CD40L Signaling Pathway    CPSab
CCNE1    CDK Regulation of DNA Replication    CPSab
KITLG    CDK Regulation of DNA Replication    CPSab
MCM5    CDK Regulation of DNA Replication    CPSab
ORC4L    CDK Regulation of DNA Replication    CPSab
PIK3C2G    CXCR4 Signaling Pathway    CPSab
PTK2    CXCR4 Signaling Pathway    CPSab
CCNE1    Cyclin E Destruction Pathway    CPSab
CDC34    Cyclin E Destruction Pathway    CPSab
TFDP1    Cyclin E Destruction Pathway    CPSab
CCNE1    Cyclins and Cell Cycle Regulation    CPSab
CCNH    Cyclins and Cell Cycle Regulation    CPSab
CDC2    Cyclins and Cell Cycle Regulation    CPSab
TFDP1    Cyclins and Cell Cycle Regulation    CPSab
ACVR2A    Cytokine-cytokine receptor interaction    CPSab
AMH    Cytokine-cytokine receptor interaction    CPSab
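
For anyone who wants to try the fill-down logic without the original file, here is a minimal self-contained sketch. It assumes the same four-space separators described above. The function name `fill_down` and the sample rows are made up for illustration. One small deviation from the script above: the fields are compared against the empty string rather than tested for truth, so a field containing a literal `0` would still be treated as a value.

```shell
# fill_down: hypothetical helper name, not part of the original thread.
# Assumes fields are separated by exactly four spaces.
fill_down() {
    awk -F '    ' 'BEGIN { OFS = "    " }
    {
        # Remember the last non-empty field 2 and field 3;
        # copy them into rows where those fields are empty or missing.
        if ($2 != "") f2 = $2
        else          $2 = f2
        if ($3 != "") f3 = $3
        else          $3 = f3
        print
    }' "$@"
}

# Usage: reads the named files, or standard input when no file is given.
printf '%s\n' \
    'CALR    Antigen processing and presentation    CPSab' \
    'KIR2DL5A        ' \
    'PTK2    Aspirin Blocks Signaling Pathway Involved in Platelet Activation    CPSab' \
    'SYK        ' | fill_down
```

Rows that already have all three fields are printed unchanged, because awk only rebuilds the record with `OFS` when a field is assigned.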

# 10  
Old 11-17-2012
Can you try this? It should work even in the worst case.
Code:
perl -alne '{if($#F>0){@a=@F[1..$#F];} print "$F[0] @a";}' input


Last edited by msabhi; 11-17-2012 at 03:37 PM..
# 11  
Old 11-17-2012
Code:
awk 'NF==1{sub($1,$1 x)}NF>1{x=$0;sub($1,z,x)}1' yourfile
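
The one-liner above is quite terse, so here is a longer sketch of what appears to be the same idea: remember everything after the first field of the last full row, and append it to rows that contain only a key. This is not the poster's code. It uses `substr` instead of `sub`, because `sub` treats `$1` as a regular expression and gives `&` a special meaning in the replacement text. The names `carry_rest` and `rest` are made up, and the sketch assumes that keys contain no whitespace and that lines have no leading whitespace.

```shell
# carry_rest: hypothetical helper name, not part of the original thread.
# Assumes the key is the first whitespace-delimited word and starts in column 1.
carry_rest() {
    awk '
    NF > 1 {
        # A full row: remember everything after the first field,
        # including the separator that follows it.
        rest = substr($0, length($1) + 1)
        print
        next
    }
    NF == 1 {
        # A key-only row: append the remembered remainder.
        print $1 rest
        next
    }
    { print }   # blank lines pass through unchanged
    ' "$@"
}

# Usage
printf '%s\n' \
    'CHUK    CD40L Signaling Pathway    CPSab' \
    'DUSP1        ' \
    'TRAF6        ' | carry_rest
```

Because the remainder is copied verbatim, this version keeps whatever separator the input uses between columns.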
