Post 302986963 by cmccabe on Friday 2nd of December 2016 02:19:17 PM
Fields shifting in file, due to null values?

The code below runs and creates an output file with three sections. The first two sections are fine, but the third section does not get a . in every blank field. I don't know whether that is what makes the last two fields in the current output shift to a new line, but I cannot seem to solve it. They should not be on a new line; maybe the spaces in the fields are the cause. The for loop that assigns . to empty fields seems to handle some of the nulls, but not all of them. Thank you :).


Code:
# update found in reference missing in IDP
for file in /home/cmccabe/Desktop/concordance/comparison/update/*.txt ; do
    file1=${file##*/}              # strip off directory
    getprefix=${file1%%_*.txt}     # file name up to the first underscore
    # look for the matching reference file
    file1=$(printf '%s\n' "/home/cmccabe/Desktop/concordance/reference/files/${getprefix}_"*.txt)
    if [[ -f "$file1" ]]
    then
        awk '
        BEGIN { FS = OFS = "\t" }
        NR == 1 { outfile = FILENAME }    # results overwrite the first input file
        FNR == NR {                       # first file: save each line as read (before the blank fill below)
            o[i[++ic] = $1 OFS $2 OFS $3] = $0
        }
        {                                 # every line: put . in blank fields 1-19
            for (f = 1; f <= 19; f++)
                if ($f == "") $f = "."
        }
        {                                 # a line whose fields 2, 4, 5 match a saved key replaces that entry
            if (($2 OFS $4 OFS $5) in o)
                o[$2 OFS $4 OFS $5] = $1 OFS $2 OFS $4 OFS $5 OFS $6 OFS $7 OFS $8 OFS $9 OFS $10 OFS $11 OFS $12 OFS $13 OFS $14 OFS $15 OFS $16 OFS $17 OFS $18 OFS $19
        }
        END {                             # write the saved lines back in their original order
            for (j = 1; j <= ic; j++)
                print o[i[j]] > outfile
        }' "$file" "$file1"
    fi
done
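
One thing that may explain the missing dots: in the script above, a line from the first file is saved into o[] before the blank-filling loop runs, and awk does not update a saved copy of $0 when the fields are changed later. A minimal sketch of filling first and saving afterwards follows. The function name fill_then_store and the sample rows are made up for illustration, and the loop runs to NF rather than the fixed 19 used above.

```shell
# Hypothetical demo: fill empty tab-separated fields BEFORE saving the line.
fill_then_store() {
    awk 'BEGIN { FS = OFS = "\t" }
    {
        # Assigning to a field rebuilds $0, so the copy saved below has the dots.
        for (f = 1; f <= NF; f++)
            if ($f == "") $f = "."
        saved[NR] = $0
    }
    END { for (n = 1; n <= NR; n++) print saved[n] }'
}

# Made-up sample: row 1 has an empty middle field, row 2 has empty first and last fields.
printf 'a\t\tc\n\tb\t\n' | fill_then_store
```

If the save came first, the printed rows would still contain the empty fields, which matches the symptom of dots appearing in only some places.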

current output
Code:
Missing in IDP but found in Reference:	
CHR	POS	REF	ALT	FUNC	GENE	COVERAGE	PHRED	A[#F,#R]	C[#F,#R]	G[#F,#R]	T[#F,#R]	INS[#F,#R]	DEL[#F,#R]	SNP	MUT	FREQ	SANGER	REGION	TVC
9	138676398	-	C	exonic	KCNT1	97	13.5	0;0	0;97	0;0	0;0	0;24	0;0	.	c.2961_2962insC	24.74	FP
	Not low	 Not found
9	131337098	-	T	intronic	SPTAN1	1522	15.3	0;0	0;0	295;1227	0;0	1;277	0;0	.	c.504+4_504+5insT	18.27	
	Not low	 Not found
10	78944590	G	A	exonic	KCNMA1	2173	24.8	448;626	1;0	496;598	0;0	3;0	0;4	rs1131824	c.[687C>T]+[=]	49.42	
	Not low	 found
20	62038393	-	G	exonic	KCNQ2	140	13.6	0;0	0;0	132;8	0;0	63;2	0;0	.	c.2223_2224insC	46.43	FP
	Not low	 Not found
2	166848646	G	A	exonic	SCN1A	110	15.7	20;16	0;0	44;30	0;0	0;0	0;0	.	c.[5139C>T]+[=]	32.73	
	Not low	 found
2	166210776	C	T	exonic	SCN2A	3095	23.1	0;0	1158;1177	0;0	457;303	1;0	0;0	.	c.[2994C>T]+[=]	24.56	
	Not low	 found
9	138676400	-	C	exonic	KCNT1	98	13.5	0;0	0;0	0;98	0;0	0;19	0;0	.	c.2963_2964insC	19.39	FP
	Not low	 Not found
11	1780815	C	-	exonic	CTSD	187	12.9	0;0	9;117	0;0	0;0	0;0	0;61	rs141482597	c.283delG	32.62	RFP
	Not low	 Not found
16	10273906	-	G	exonic	GRIN2A	3252	16.7	0;2	0;0	586;2664	0;0	3;627	0;0	rs145961628	c.363_364insC	19.37	RFP
	Not low	 Not found
7	148106478	-	GT	intronic	CNTNAP2	4168	28.6	0;0	0;1	0;0	2199;1967	1129;997	0;1	rs60451214	c.3716-5_3716-4insGT	51.01	
	Not low	 Not found
18	53303101	C	G	exonic	TCF4	1822	20	2;0	0;0	739;1027	0;0	0;0	1;53	rs611326	c.[-48754C>G]+[-48754C>G]	96.93	
	Not low	 found
2	166901684	-	T	exonic	SCN1A	1540	14.4	313;1227	0;0	0;0	0;0	0;291	0;0	.	c.1530_1531insA	18.9	FP
	Not low	 Not found
7	148106476	-	TT	intronic	CNTNAP2	4170	28.6	0;0	0;1	0;0	2208;1961	1131;996	0;0	rs61232377	c.3716-7_3716-6insTT	51.01	
	Not low	 Not found
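
To test the guess about what causes the break, it may help to look at field counts and line endings instead of the rendered text. The sketch below assumes the file is tab-separated; inspect_tsv and the sample input are hypothetical, and whether a carriage return is really involved here is not established.

```shell
# Hypothetical helper: for each line print its number, its tab-separated
# field count, and CR if the line ends in a carriage return (otherwise -).
inspect_tsv() {
    awk -F'\t' '{ cr = ($0 ~ /\r$/) ? "CR" : "-"; print NR, NF, cr }' "$@"
}

# Made-up sample: the first line ends in a carriage return, the second does not.
printf 'x\ty\tFP\r\n\tNot low\tNot found\n' | inspect_tsv
```

Rows that report fewer fields than the header, or a CR flag, point at where the record is being split.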

desired output
Code:
Missing in IDP but found in Reference:	
CHR	POS	REF	ALT	FUNC	GENE	COVERAGE	PHRED	A[#F,#R]	C[#F,#R]	G[#F,#R]	T[#F,#R]	INS[#F,#R]	DEL[#F,#R]	SNP	MUT	FREQ	SANGER	REGION	TVC
9	138676398	-	C	exonic	KCNT1	97	13.5	0;0	0;97	0;0	0;0	0;24	0;0	.	c.2961_2962insC	24.74	FP	Not low	Not found
9	131337098	-	T	intronic	SPTAN1	1522	15.3	0;0	0;0	295;1227	0;0	1;277	0;0	.	c.504+4_504+5insT	18.27	.	Not low	Not found
10	78944590	G	A	exonic	KCNMA1	2173	24.8	448;626	1;0	496;598	0;0	3;0	0;4	rs1131824	c.[687C>T]+[=]	49.42	.	Not low	found
20	62038393	-	G	exonic	KCNQ2	140	13.6	0;0	0;0	132;8	0;0	63;2	0;0	.	c.2223_2224insC	46.43	FP	Not low	Not found
2	166848646	G	A	exonic	SCN1A	110	15.7	20;16	0;0	44;30	0;0	0;0	0;0	.	c.[5139C>T]+[=]	32.73	.	Not low	found
2	166210776	C	T	exonic	SCN2A	3095	23.1	0;0	1158;1177	0;0	457;303	1;0	0;0	.	c.[2994C>T]+[=]	24.56	.	Not low	found
9	138676400	-	C	exonic	KCNT1	98	13.5	0;0	0;0	0;98	0;0	0;19	0;0	.	c.2963_2964insC	19.39	FP	Not low	Not found
11	1780815	C	-	exonic	CTSD	187	12.9	0;0	9;117	0;0	0;0	0;0	0;61	rs141482597	c.283delG	32.62	RFP	Not low	Not found
16	10273906	-	G	exonic	GRIN2A	3252	16.7	0;2	0;0	586;2664	0;0	3;627	0;0	rs145961628	c.363_364insC	19.37	RFP	Not low	Not found
7	148106478	-	GT	intronic	CNTNAP2	4168	28.6	0;0	0;1	0;0	2199;1967	1129;997	0;1	rs60451214	c.3716-5_3716-4insGT	51.01	.	Not low	Not found
18	53303101	C	G	exonic	TCF4	1822	20	2;0	0;0	739;1027	0;0	0;0	1;53	rs611326	c.[-48754C>G]+[-48754C>G]	96.93	.	Not low	found
2	166901684	-	T	exonic	SCN1A	1540	14.4	313;1227	0;0	0;0	0;0	0;291	0;0	.	c.1530_1531insA	18.9	FP	Not low	Not found
7	148106476	-	TT	intronic	CNTNAP2	4170	28.6	0;0	0;1	0;0	2208;1961	1131;996	0;0	rs61232377	c.3716-7_3716-6insTT	51.01	.	Not low	Not found
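
A possible post-processing pass toward the desired output is sketched below. It is not the method used in the script above. It assumes tab-separated rows that should all have the same number of columns (20, to match the header), that any trailing carriage return is unwanted, and that spaces at the edges of a value can be dropped. The name normalize_tsv is hypothetical.

```shell
# Hypothetical clean-up filter: strip a trailing carriage return, trim spaces
# around each value, and write . into every empty or missing column.
normalize_tsv() {
    # First argument: expected column count (assumed default of 20).
    awk -v ncol="${1:-20}" 'BEGIN { FS = OFS = "\t" }
    {
        sub(/\r$/, "")                             # drop a trailing carriage return, if present
        for (f = 1; f <= ncol; f++) {
            if (f <= NF) gsub(/^ +| +$/, "", $f)   # trim spaces at the edges of existing values
            if ($f == "") $f = "."                 # empty or missing field becomes .
        }
        print
    }'
}

# Made-up sample: an empty field, a padded value, a carriage return, and a missing 4th column.
printf 'a\t\t b \r\n' | normalize_tsv 4
```

Spaces inside a value, such as the one in "Not low", are left alone because only the edges are trimmed.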

 
