awk to remove row 1 and blanks


 
# 1  
Old 10-26-2016

I am trying to remove $1 (the first column) from the file, along with the blank it leaves at the start of each line. Thank you :).

file
Code:
R_Index    Chr    Start    End    Ref    Alt    Func.IDP.refGene    Gene.IDP.refGene    GeneDetail.IDP.refGene    Inheritence    ExonicFunc.IDP.refGene    AAChange.IDP.refGene    avsnp147    PopFreqMax    1000G_ALL    1000G_AFR    1000G_AMR    1000G_EAS    1000G_EUR    1000G_SAS    ExAC_ALL    ExAC_AFR    ExAC_AMR    ExAC_EAS    ExAC_FIN    ExAC_NFE    ExAC_OTH    ExAC_SAS    ESP6500siv2_ALL    ESP6500siv2_AA    ESP6500siv2_EA    CG46    dpsi_max_tissue    dpsi_zscore    SIFT_score    SIFT_pred    Polyphen2_HDIV_score    Polyphen2_HDIV_pred    Polyphen2_HVAR_score    Polyphen2_HVAR_pred    LRT_score    LRT_pred    MutationTaster_score    MutationTaster_pred    MutationAssessor_score    MutationAssessor_pred    CLINSIG    CLNDBN    CLNACC    CLNDSDB    CLNDSDBID    Quality    Reads    Zygosity    Phred    Classification    HGMD    Sanger
1    chr1    2234824    2234824    C    T    exonic    SKI    .    .    nonsynonymous SNV    SKI:NM_003036.3:exon3:c.1196C>T:p.A399V    rs141862996    0.0045    0.0012    0.0045    .    .    .    .    0.0004    0.0043    .    .    .    0.0002    .    .    0.0014    0.0034    0.0003    .    0.0834    0.430    0.29    T    0.993    D    0.978    D    .    D    0.995    D    1.7    L    Uncertain significance    not_specified    RCV000195594.1    MedGen    CN169374    GOOD    390    het    20    .    .    .
2    chr1    2235405    2235405    C    T    exonic    SKI    .    .    synonymous SNV    SKI:NM_003036.3:exon4:c.1338C>T:p.L446L    rs148632347    0.0053    0.0014    0.0053    .    .    .    .    0.0003    0.0039    .    .    .    .    .    .    0.0012    0.0036    .    .    -0.1582    -0.600    .    .    .    .    .    .    .    .    .    .    .    .    Benign    not_specified    RCV000195511.1    MedGen    CN169374    GOOD    400    het    27    .    .    .

awk
Code:
awk -F'\t' -v OFS='\t' '!($1="")' file | sed -e '/^ *$/d'  > out
awk -F'\t' -v OFS='\t' '!($1="")' file | awk 'NF'  > out
awk -F'\t' -v OFS='\t' '!($1="")' file | awk 'NF < 0'  > out

All the attempts above seem to remove $1 but not the blank spaces.

current output
Code:
    Chr    Start    End    Ref    Alt    Func.IDP.refGene    Gene.IDP.refGene    GeneDetail.IDP.refGene    Inheritence    ExonicFunc.IDP.refGene    AAChange.IDP.refGene    avsnp147    PopFreqMax    1000G_ALL    1000G_AFR    1000G_AMR    1000G_EAS    1000G_EUR    1000G_SAS    ExAC_ALL    ExAC_AFR    ExAC_AMR    ExAC_EAS    ExAC_FIN    ExAC_NFE    ExAC_OTH    ExAC_SAS    ESP6500siv2_ALL    ESP6500siv2_AA    ESP6500siv2_EA    CG46    dpsi_max_tissue    dpsi_zscore    SIFT_score    SIFT_pred    Polyphen2_HDIV_score    Polyphen2_HDIV_pred    Polyphen2_HVAR_score    Polyphen2_HVAR_pred    LRT_score    LRT_pred    MutationTaster_score    MutationTaster_pred    MutationAssessor_score    MutationAssessor_pred    CLINSIG    CLNDBN    CLNACC    CLNDSDB    CLNDSDBID    Quality    Reads    Zygosity    Phred    Classification    HGMD    Sanger
    chr1    2234824    2234824    C    T    exonic    SKI    .    .    nonsynonymous SNV    SKI:NM_003036.3:exon3:c.1196C>T:p.A399V    rs141862996    0.0045    0.0012    0.0045    .    .    .    .    0.0004    0.0043    .    .    .    0.0002    .    .    0.0014    0.0034    0.0003    .    0.0834    0.430    0.29    T    0.993    D    0.978    D    .    D    0.995    D    1.7    L    Uncertain significance    not_specified    RCV000195594.1    MedGen    CN169374    GOOD    390    het    20    .    .    .
    chr1    2235405    2235405    C    T    exonic    SKI    .    .    synonymous SNV    SKI:NM_003036.3:exon4:c.1338C>T:p.L446L    rs148632347    0.0053    0.0014    0.0053    .    .    .    .    0.0003    0.0039    .    .    .    .    .    .    0.0012    0.0036    .    .    -0.1582    -0.600    .    .    .    .    .    .    .    .    .    .    .    .    Benign    not_specified    RCV000195511.1    MedGen    CN169374    GOOD    400    het    27    .    .    .

desired output
Code:
Chr    Start    End    Ref    Alt     Func.IDP.refGene    Gene.IDP.refGene    GeneDetail.IDP.refGene     Inheritence    ExonicFunc.IDP.refGene    AAChange.IDP.refGene     avsnp147    PopFreqMax    1000G_ALL    1000G_AFR    1000G_AMR     1000G_EAS    1000G_EUR    1000G_SAS    ExAC_ALL    ExAC_AFR     ExAC_AMR    ExAC_EAS    ExAC_FIN    ExAC_NFE    ExAC_OTH    ExAC_SAS     ESP6500siv2_ALL    ESP6500siv2_AA    ESP6500siv2_EA    CG46     dpsi_max_tissue    dpsi_zscore    SIFT_score    SIFT_pred     Polyphen2_HDIV_score    Polyphen2_HDIV_pred    Polyphen2_HVAR_score     Polyphen2_HVAR_pred    LRT_score    LRT_pred    MutationTaster_score     MutationTaster_pred    MutationAssessor_score     MutationAssessor_pred    CLINSIG    CLNDBN    CLNACC    CLNDSDB     CLNDSDBID    Quality    Reads    Zygosity    Phred    Classification     HGMD    Sanger
chr1    2234824    2234824    C    T    exonic     SKI    .    .    nonsynonymous SNV     SKI:NM_003036.3:exon3:c.1196C>T:p.A399V    rs141862996    0.0045     0.0012    0.0045    .    .    .    .    0.0004    0.0043    .    .     .    0.0002    .    .    0.0014    0.0034    0.0003    .    0.0834     0.430    0.29    T    0.993    D    0.978    D    .    D    0.995     D    1.7    L    Uncertain significance    not_specified     RCV000195594.1    MedGen    CN169374    GOOD    390    het    20    .     .    .
chr1    2235405    2235405    C    T    exonic    SKI    .     .    synonymous SNV    SKI:NM_003036.3:exon4:c.1338C>T:p.L446L     rs148632347    0.0053    0.0014    0.0053    .    .    .    .     0.0003    0.0039    .    .    .    .    .    .    0.0012    0.0036     .    .    -0.1582    -0.600    .    .    .    .    .    .    .    .     .    .    .    .    Benign    not_specified    RCV000195511.1     MedGen    CN169374    GOOD    400    het    27    .    .    .


Last edited by cmccabe; 10-26-2016 at 03:45 PM. Reason: fixed format
# 2  
Old 10-26-2016
Hi,

Code:
cat xx
xft    ui    ui    uk

Code:
awk -F'\t' '{$1=x;sub(/^\t/,y)}1' OFS='\t' xx

Gives desired output:
Code:
ui    ui    uk

Code:
awk '{$1=x;sub(/^\t/,y)}1' OFS='\t' file

I doubt your file is tab-delimited.
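One way to check that doubt is to count the lines that actually contain a tab character. This is only a sketch: the helper name count_tab_lines and the two sample lines are made up for illustration and are not part of the original thread.

```shell
# Hypothetical helper: count input lines that contain at least one tab.
count_tab_lines() {
    awk 'index($0, "\t") { n++ } END { print n + 0 }'
}

# Made-up sample lines: the first is tab-separated, the second uses four spaces.
printf 'a\tb\tc\n' | count_tab_lines
printf 'a    b    c\n' | count_tab_lines
```

A count of 0 for a real file would suggest that the columns are padded with spaces, in which case -F'\t' leaves each line as a single field.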
This User Gave Thanks to greet_sed For This Post:
# 3  
Old 10-26-2016
You say your input and output files are <tab> delimited, but there are no <tab> characters in anything you have posted. Setting the 1st field to an empty string does not delete a field; it just deletes the contents of that field. To get rid of a field, you have to get rid of the data in the field and the following field delimiter.
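The difference between emptying a field and deleting it can be seen in a small sketch. The sample line below is assumed to be genuinely tab-separated, and the variable names are hypothetical.

```shell
# Hypothetical one-line sample with tab-separated fields.
line=$(printf 'xft\tui\tui\tuk')

# Emptying $1 rebuilds the record with OFS, so the tab after field 1 remains.
emptied=$(printf '%s\n' "$line" | awk -F'\t' -v OFS='\t' '{ $1 = "" } 1')

# Removing the first field together with its trailing tab deletes the column.
deleted=$(printf '%s\n' "$line" | awk '{ sub(/^[^\t]*\t/, "") } 1')

# Brackets make the leading tab in the first result visible.
printf '[%s]\n' "$emptied"
printf '[%s]\n' "$deleted"
```

The first result still begins with the delimiter, which is the leading blank seen in the posted output; the second does not.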

If we replace every occurrence of 4 adjacent <space>s in your sample input and output with a <tab>, and then remove all remaining <space> characters from your sample output file, the following seems to do what you want:
Code:
awk '{	sub(/^[^\t]*\t/, "")	# Get rid of 1st field.
	gsub(/ /, "")		# Get rid of all remaining <space>s.
}
1				# Print updated line.
' file

As always, if you want to try this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk or nawk.
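The preprocessing assumed above (turning each run of four spaces into a tab) could be folded into the same awk call. This is a sketch under the assumption that fields are separated by exactly four spaces; the sample line is shortened and made up for illustration.

```shell
# Hypothetical sample: fields separated by runs of four spaces.
sample='1    chr1    2234824    C'

# Replace each run of four spaces with a tab, then delete the first field and its tab.
result=$(printf '%s\n' "$sample" | awk '{ gsub(/    /, "\t"); sub(/^[^\t]*\t/, "") } 1')

printf '%s\n' "$result"
```

Unlike removing every space, this leaves single spaces inside a value (such as "nonsynonymous SNV") untouched.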
This User Gave Thanks to Don Cragun For This Post:
# 4  
Old 10-26-2016
Code:
awk 'NF && sub(".*" $2, $2)' infile

This User Gave Thanks to rdrtx1 For This Post:
# 5  
Old 10-27-2016
Quote:
Originally Posted by rdrtx1
Code:
awk 'NF && sub(".*" $2, $2)' infile

This will delete more than desired on lines that have a repeat of $2 somewhere on the line.
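A small sketch of that failure, using a made-up space-separated line in which the value of $2 occurs twice. The anchored alternative is an assumption for whitespace-separated input, not something posted in the thread.

```shell
# Hypothetical sample line in which the value of $2 ("chr1") appears twice.
line='1 chr1 100 chr1 200'

# The greedy ".*" matches up to the LAST occurrence of $2, so "chr1 100" is lost.
greedy=$(printf '%s\n' "$line" | awk 'NF && sub(".*" $2, $2)')

# Anchoring at the start of the line removes only the first field and its separator.
anchored=$(printf '%s\n' "$line" | awk 'NF { sub(/^[^ \t]+[ \t]+/, ""); print }')

printf 'greedy:   %s\n' "$greedy"
printf 'anchored: %s\n' "$anchored"
```

Because $2 is used as a regular expression in the greedy version, values containing characters such as "." or "+" can also match more than intended.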
This User Gave Thanks to Scrutinizer For This Post:
# 6  
Old 10-27-2016
Thank you all :)