Print . in blank fields to prevent fields from shifting


 
10-21-2016

The code below, kindly provided by @Don Cragun, works great; the lines in bold print the current output. Because some of the printed fields can be blank, the fields that follow them appear shifted. I cannot seem to add a . to the blank fields as in the desired output. Basically, if there is nothing in a field, print . in it; otherwise print what the script matches. Thank you.

script
Code:
for file in /home/cmccabe/Desktop/concordance/comparison/update/*.txt ; do
    file1=${file##*/}               # strip off directory
    getprefix=${file1%%_*.txt}      # text before the first underscore
    # look for the matching reference file
    file1=$(printf '%s\n' "/home/cmccabe/Desktop/concordance/reference/files/${getprefix}_"*.txt)
    if [[ -f "$file1" ]]
    then
        awk '
        BEGIN {
            FS = OFS = "\t"
        }
        NR == 1 {
            outfile = FILENAME      # results overwrite the first input file
        }
        FNR == NR {
            # first file: remember each line, keyed on fields 1-3, in input order
            o[i[++ic] = $1 OFS $2 OFS $3] = $0
        }
        {
            # replace a remembered line when fields 2, 4 and 5 match its key
            if (($2 OFS $4 OFS $5) in o)
                o[$2 OFS $4 OFS $5] = $1 OFS $2 OFS $4 OFS $5 OFS $6 OFS $7 OFS $8 OFS $9 OFS $10 OFS $11 OFS $12 OFS $13 OFS $14 OFS $15 OFS $16 OFS $17 OFS $18 OFS $19
        }
        END {
            for (j = 1; j <= ic; j++)
                print o[i[j]] > outfile
        }' "$file" "$file1"
    fi
done

current output
Code:
Missing in IDP but found in Reference:                                         
2    166848646    G    A    exonic    SCN1A    68    13    16;20    0;0    17;15    0;0    0;0    0;0        c.[5139C>T]+[=]    52.94        Not low     found 
12    52200340    A    C    exonic    SCN8A    4129    28.3    1560;1672    413;453    0;0    0;0    0;2    31;0        c.[5070A>C]+[=]    20.97        Not low     Not found 
13    77570076    -    A    exonic    CLN5    2762    26.6    2060;702    0;0    0;0    0;0    2050;696    0;0        c.526_527insA    99.42    TP    Not low     Not found 
7    148106478    -    GT    intronic    CNTNAP2    4051    28.5    0;1    0;0    0;0    2220;1829    1085;887    0;1    rs60451214    c.3716-5_3716-4insGT    48.68        Not low     Not found 
9    138678036    TGCCC    -    intronic    KCNT1    834    23.1    0;0    0;0    0;31    0;1    0;0    0;802    rs141359570    c.3178-7_3178-3delTGCCC    96.16        Not low     Not found 
7    148106476    -    TT    intronic    CNTNAP2    4052    28.8    0;0    5;0    0;0    2221;1826    1081;884    0;0    rs61232377    c.3716-7_3716-6insTT    48.49        Not low     Not found 
2    166245425    C    T    exonic    SCN2A    49    12.6    0;0    13;9    0;0    18;9    0;0    0;0        c.[5109C>T]+[=]    55.1        Not low     found
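
To see which columns are actually empty in output like the above, one option is to count the tab-separated fields on each line and list the empty ones. This is only a sketch: `show_blank_fields` is a hypothetical helper name, and it assumes the output really is tab-delimited, as the script's `FS = OFS = "\t"` suggests.

```shell
# Hypothetical helper: for each line, report the number of tab-separated
# fields and the positions of the fields that are completely empty.
show_blank_fields() {
    awk -F'\t' '{
        blanks = ""
        for (k = 1; k <= NF; k++)
            if ($k == "") blanks = blanks " " k
        print "line " NR ": " NF " fields; empty:" (blanks == "" ? " none" : blanks)
    }' "$@"
}

# Made-up, shortened record with empty 4th and 7th fields
printf '2\t166848646\t0;0\t\tc.[5139C>T]+[=]\t52.94\t\tNot low\n' | show_blank_fields
# prints: line 1: 8 fields; empty: 4 7
```

If every line reports the same field count, the columns are not really missing; the blanks are empty strings between two consecutive tabs, and only the display makes the later fields look shifted.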

desired output
Code:
Missing in IDP but found in Reference:                                                                             
CHR    POS    REF    ALT    FUNC    GENE    COVERAGE    PHRED    A[#F,#R]    C[#F,#R]    G[#F,#R]    T[#F,#R]    INS[#F,#R]    DEL[#F,#R]    SNP    MUT    FREQ    SANGER    REGION    TVC 
2    166848646    G    A    exonic    SCN1A    68    13    16;20    0;0    17;15    0;0    0;0    0;0    .    c.[5139C>T]+[=]    52.94    .    Not low     found 
12    52200340    A    C    exonic    SCN8A    4129    28.3    1560;1672    413;453    0;0    0;0    0;2    31;0    .    c.[5070A>C]+[=]    20.97    .    Not low     Not found 
13    77570076    -    A    exonic    CLN5    2762    26.6    2060;702    0;0    0;0    0;0    2050;696    0;0    .    c.526_527insA    99.42    TP    Not low     Not found 
7    148106478    -    GT    intronic    CNTNAP2    4051    28.5    0;1    0;0    0;0    2220;1829    1085;887    0;1    rs60451214    c.3716-5_3716-4insGT    48.68    .    Not low     Not found 
9    138678036    TGCCC    -    intronic    KCNT1    834    23.1    0;0    0;0    0;31    0;1    0;0    0;802    rs141359570    c.3178-7_3178-3delTGCCC    96.16    .    Not low     Not found 
7    148106476    -    TT    intronic    CNTNAP2    4052    28.8    0;0    5;0    0;0    2221;1826    1081;884    0;0    rs61232377    c.3716-7_3716-6insTT    48.49    .    Not low     Not found 
2    166245425    C    T    exonic    SCN2A    49    12.6    0;0    13;9    0;0    18;9    0;0    0;0    .    c.[5109C>T]+[=]    55.1    .    Not low     found


Last edited by cmccabe; 10-21-2016 at 06:29 PM. Reason: fixed format
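
One possible way to get the desired output is to post-process the file the script writes and put a . in every empty tab-separated field. The sketch below is not part of the original script: `fill_blank_fields` is a hypothetical helper name, and it assumes that real records always have a value in field 2 (POS), so that the "Missing in IDP but found in Reference:" title line is left alone.

```shell
# Hypothetical helper: replace each empty (or space-only) tab-separated
# field with "." on lines that look like records.
fill_blank_fields() {
    awk 'BEGIN { FS = OFS = "\t" }
    {
        # Assumption: a real record always has a non-empty second field,
        # so title lines with only one filled field pass through unchanged.
        if ($2 != "")
            for (k = 1; k <= NF; k++)
                if ($k ~ /^ *$/) $k = "."
        print
    }' "$@"
}

# Made-up, shortened record with two empty fields
printf '2\t166848646\t0;0\t\tc.[5139C>T]+[=]\t52.94\t\tNot low\n' | fill_blank_fields
```

Assigning to a field makes awk rebuild the line with OFS, so the tabs are preserved. The same loop could probably also be applied inside the script's END block to each stored record before it is printed, but that variant is untested here.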
 