awk to remove range of fields


 
# 1  
Old 07-07-2016

I am trying to cut a range of fields in awk. The code below seems to work for removing field 50, but what is the correct syntax for removing a range ($50-$62)? Thank you.

Code:
awk 'BEGIN{FS=OFS="\t"}{$50=""; gsub(/\t\t/,"\t")}1' test.vcf.hg19_multianno.txt > output.csv

Maybe:
Code:
awk 'BEGIN{FS=OFS="\t"}{$50:$62=""; gsub(/\t\t/,"\t")}1' test.vcf.hg19_multianno.txt > output.csv

# 2  
Old 07-07-2016
Code:
awk 'BEGIN{FS=OFS="\t"} {for (i=50; i<=62; i++) $i = ""; gsub(/\t+/,"\t")}1'

(untested)
This User Gave Thanks to RudiC For This Post:
# 3  
Old 07-07-2016
Works great... thank you.
# 4  
Old 07-07-2016
The code that you posted originally and the code suggested by RudiC will remove any empty fields from your input file in addition to the fields you want to remove. The following will only remove fields 50 through 62, inclusive:
Code:
awk '
BEGIN {	FS = OFS = "\t"
}
{	for(i = 1; i <= NF; i++)
		if(i < 50 || i > 62)
			printf("%s%s", $i, (i == NF) ? ORS : OFS)
}' test.vcf.hg19_multianno.txt > output.csv

The above code should do what you want, assuming that every input line has at least 63 fields. If some lines have fewer than 63 fields, slightly different logic is needed to ensure that each line is properly terminated and that no unneeded field separators appear in the output. That logic depends on whether empty fields should be added to the ends of short lines or omitted, so a clear description of the desired behavior is needed first.
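
One possible shape for that different logic, as a rough sketch only: it assumes short lines should be printed as they are, without padding, which the thread never settles. The function name `drop_range` and the variables `from`, `to` and `sep` are made up for illustration, and the demo uses fields 2-3 of tiny lines instead of 50-62.

```shell
# Sketch: remove fields from..to, terminating every line exactly once.
# Assumption: short lines are NOT padded with empty fields.
drop_range() {
  awk -v from="$1" -v to="$2" '
  BEGIN { FS = OFS = "\t" }
  {
    sep = ""                        # nothing goes before the first kept field
    for (i = 1; i <= NF; i++) {
      if (i >= from && i <= to) continue
      printf "%s%s", sep, $i
      sep = OFS                     # later kept fields are preceded by a tab
    }
    print ""                        # end the record, even if nothing was kept
  }'
}

# Scaled-down demo: drop fields 2-3 from a 5-field, a 3-field and a 1-field line.
demo_short=$(printf 'a\tb\tc\td\te\nx\ty\tz\nq\n' | drop_range 2 3)
printf '%s\n' "$demo_short"
```

For the real file this would be called as `drop_range 50 62 < test.vcf.hg19_multianno.txt > output.csv`. Writing the separator before each kept field, rather than after it, avoids a trailing tab when the line ends inside the removed range.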

As always, if you want to try this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk or nawk.
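
To see the difference between the two approaches on something small, here is a scaled-down comparison. The sample line, the 2-3 field range and the variable names `line`, `collapsed` and `kept` are invented for the demo; only the awk logic comes from the thread.

```shell
# Five tab-separated fields; field 4 is already empty in the input.
line=$(printf 'a\tb\tc\t\te')

# Blank-and-collapse (posts #1 and #2), removing fields 2-3:
# the gsub also swallows the empty field 4.
collapsed=$(printf '%s\n' "$line" |
  awk 'BEGIN{FS=OFS="\t"} {for (i=2; i<=3; i++) $i=""; gsub(/\t+/,"\t")}1')

# Print-what-you-keep (this post), removing fields 2-3:
# the empty field 4 survives.
kept=$(printf '%s\n' "$line" |
  awk 'BEGIN{FS=OFS="\t"} {for (i=1; i<=NF; i++) if (i<2 || i>3) printf("%s%s", $i, (i==NF) ? ORS : OFS)}')

# Count the fields that remain in each result.
printf '%s\n' "$collapsed" | awk -F'\t' '{print NF " fields after blank-and-collapse"}'
printf '%s\n' "$kept"      | awk -F'\t' '{print NF " fields after print-what-you-keep"}'
```

The first result has 2 fields and the second has 3, because collapsing runs of tabs cannot tell a field that was just blanked from one that was empty to begin with.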
These 2 Users Gave Thanks to Don Cragun For This Post:
# 5  
Old 07-07-2016
This might be a little safer if you have empty fields somewhere on the line, or lines with fewer than 62 fields:

Code:
awk -v F=50 -v T=62 '
BEGIN{FS=OFS="\t"}
NF>=F { b=T+1          # lines that end before field F are left untouched
  t=T<NF?T:NF
  for(i=F;i<NF-t+F;i++) $i=$(b++)
  NF=--i}1'
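
A scaled-down trace of this shift-and-truncate idea, offered as a sketch: F=3 and T=4 stand in for 50 and 62, and the sample lines and the variable name `shifted` are invented. The `NF >= F` guard keeps lines that end before field F untouched; without it, assigning NF would pad them with empty fields. The sketch also relies on the awk in use rebuilding the record when NF is assigned, as gawk does.

```shell
# Remove fields 3-4 from a 6-field, a 3-field and a 1-field line.
shifted=$(printf 'a\tb\tc\td\te\tf\nx\ty\tz\nq\n' |
  awk -v F=3 -v T=4 '
  BEGIN { FS = OFS = "\t" }
  NF >= F {
    b = T + 1                        # first field to the right of the range
    t = (T < NF) ? T : NF            # the range cannot extend past the last field
    for (i = F; i < NF - t + F; i++) # slide the surviving right-hand fields left
      $i = $(b++)
    NF = --i                         # drop the now-duplicated tail fields
  }
  1')
printf '%s\n' "$shifted"
```

Because fields are moved rather than blanked and collapsed, an empty field elsewhere on the line stays where it was.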

This User Gave Thanks to Chubler_XL For This Post:
# 6  
Old 07-08-2016
Quote:
Originally Posted by cmccabe
I am trying to cut a range of fields in awk. The code below seems to work for removing field 50, but what is the correct syntax for removing a range ($50-$62)? Thank you.

Code:
awk 'BEGIN{FS=OFS="\t"}{$50=""; gsub(/\t\t/,"\t")}1' test.vcf.hg19_multianno.txt > output.csv

Maybe:
Code:
awk 'BEGIN{FS=OFS="\t"}{$50:$62=""; gsub(/\t\t/,"\t")}1' test.vcf.hg19_multianno.txt > output.csv

Alternative?
Code:
perl -F'\t' -lane '$"="\t"; print "@F[0..48,62..$#F]"' test.vcf.hg19_multianno.txt > output.csv

This User Gave Thanks to Aia For This Post:
# 7  
Old 07-08-2016
Thank you all... works great.