awk to create separate files but not include specific field in output


 
# 1  
Old 05-09-2018

I am trying to use awk to create (in this example) 3 separate text files, one for each unique id in $1 of file that starts with the pattern aa. The contents of each row are used to populate each text file, except for $1, which is not needed. It seems I am close but not quite there. Thank you.

file (tab-delimited)
Code:
aa1110-0	12	47259533	47259533	G	A	Comment:heterozygous_snv
aa1110-1	11	23892795	23892799	G	C	Comment:heterozygous_snv
	2	7581601	7581601	T	A	Comment:heterozygous_snv
aa1110-2	1	237837422	237837422	C	TTC	Comment:substitution
	3	7583892	7583892	G	A	Comment: heterozygous snv
		19	23892788	23892799	G	-	Comment:deletion

awk

Code:
awk -F'\t' '/^aa/{                     # if line starts with aa
        if(!w)                          # if negate of w is true
           f=sprintf($1"%d.txt",++n);   # pre increment n, and set up variable f 
        w=1;                            # set variable w = 1
        print >f;                       # write record/row/line to file
        next                            # go to next line
     }
     {                                  # for which does not start with aa  
        close(f);                       # close file
        w=0                             # set w = 0 for next line with aa use newfile
     }
' file

The current output is two files with the rows in them, but $1 is included as well.
Here is one:
Code:
aa1110-0	12	47259533	47259533	G	A	Comment:heterozygous_snv
aa1110-1	11	23892795	23892799	G	C	Comment:heterozygous_snv

awk

Code:
awk '{for(i=2;i<=NF;i++){printf "%s ", $i >> $1".txt"};printf "\n" >> $1".txt"; close($1".txt")}' file

The current output is three files with no $1 in them, but each contains only one line.
Here is the same file as above:
Code:
12	47259533	47259533	G	A	Comment:heterozygous_snv

desired output (tab-delimited)
Code:
aa1110-0.txt
12	47259533	47259533	G	A	Comment:heterozygous_snv

aa1110-1.txt
11	23892795	23892799	G	C	Comment:heterozygous_snv
2	7581601	7581601	T	A	Comment:heterozygous_snv

aa1110-2.txt
1	237837422	237837422	C	TTC	Comment:substitution
3	7583892	7583892	G	A	Comment:heterozygous_snv
19	23892788	23892799	G	-	Comment:deletion


Last edited by cmccabe; 05-09-2018 at 02:45 PM.. Reason: fixed format
# 2  
Old 05-09-2018
Try this:

Code:
awk -F'\t' '
/^aa/{                             # if line starts with aa
   if(f) close(f)                  # close already open file
   f=sprintf($1"%d.txt",++n)       # name = $1 + counter n, e.g. aa1110-01.txt
}
f {                                # if file name created
   $1=""                           # blank field #1
   $0=substr($0, 2)                # strip blank #1 field
   print >f;                       # write record/row/line to file
}
' OFS='\t' file
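
The two assignments in the second block can be tried in isolation. This is a minimal sketch, not part of the script above: strip_first_field is a hypothetical helper name, and OFS is passed with -v here (instead of as a trailing operand) only because the demo reads from a pipe rather than a file.

```shell
# strip_first_field is a hypothetical helper name used only for this demo.
strip_first_field() {
    awk -F'\t' -v OFS='\t' '{
        $1 = ""              # assigning to a field rebuilds $0, joined with OFS
        $0 = substr($0, 2)   # drop the tab now sitting at the front of $0
        print
    }'
}

# One id row and one continuation row (empty field 1), shortened from the sample.
printf 'aa1110-1\t11\t23892795\n\t2\t7581601\n' | strip_first_field
```

Both rows come out starting at the second column. A row that begins with two tabs, like the last row of the sample input, would still keep one leading tab, because only the first character is stripped.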

This User Gave Thanks to Chubler_XL For This Post:
# 3  
Old 05-09-2018
The following seems to do what I think you want. It assumes that you don't want extra whitespace characters added to the ends of your output lines, that you want <tab>-delimited output from your <tab>-delimited input, and that you just want the contents of field 1 with .txt appended as the filename for each output file (with no sequence numbering added to the filenames):
Code:
awk '
BEGIN {	FS = OFS = "\t"
}
/^aa/ {	if(f != "")
		close(f)
	f = $1 ".txt"
}
{	for(i = 2; i <= NF; i++)
		printf("%s%s", $i, (i == NF) ? ORS : OFS) > f
}' file
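
Setting FS to a tab in the script above matters for the continuation rows, whose first field is empty. A small sketch of the difference, using one row from the sample input (the variable name row is just for this demo): with the default separator awk skips leading whitespace, so the chromosome column becomes $1. That likely explains why the second attempt in post #1 left only one line in each aa file, since rows like this one would have been appended to files named after that column instead.

```shell
# A continuation row from the sample input: empty field 1, then the data.
row=$(printf '\t2\t7581601\t7581601\tT\tA')

# Default FS: leading whitespace is skipped, so $1 is "2".
printf '%s\n' "$row" | awk '{ print NF, "[" $1 "]" }'          # prints: 5 [2]

# Tab as FS: the empty first field is kept, so $1 is "".
printf '%s\n' "$row" | awk -F'\t' '{ print NF, "[" $1 "]" }'   # prints: 6 []
```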

As always, if you want to try this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk or nawk.
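
For anyone who wants to reproduce the thread end to end, here is a self-contained sketch. It is a variant, not the script above verbatim: it removes field 1 with sub() on a copy of the record instead of looping over the fields, and it skips any rows that appear before the first aa row. It assumes the doubled tab at the start of the last sample row is real and should not leave a leading tab in the output, as the desired output suggests. The name sample.tsv is made up for the demo, and the output files are written to the current directory.

```shell
# Recreate the sample input from post #1 (the last row starts with two tabs).
{
    printf 'aa1110-0\t12\t47259533\t47259533\tG\tA\tComment:heterozygous_snv\n'
    printf 'aa1110-1\t11\t23892795\t23892799\tG\tC\tComment:heterozygous_snv\n'
    printf '\t2\t7581601\t7581601\tT\tA\tComment:heterozygous_snv\n'
    printf 'aa1110-2\t1\t237837422\t237837422\tC\tTTC\tComment:substitution\n'
    printf '\t3\t7583892\t7583892\tG\tA\tComment: heterozygous snv\n'
    printf '\t\t19\t23892788\t23892799\tG\t-\tComment:deletion\n'
} > sample.tsv

awk '
BEGIN { FS = "\t" }
/^aa/ {                          # an id row starts a new output file
    if (f != "") close(f)        # close the previous file, if any
    f = $1 ".txt"
}
f != "" {                        # ignore rows before the first id row
    row = $0
    sub(/^[^\t]*\t+/, "", row)   # drop field 1 and the tab(s) right after it
    print row > f
}' sample.tsv

# Show what was written.
for out in aa1110-0.txt aa1110-1.txt aa1110-2.txt; do
    echo "$out"
    cat "$out"
done
```

Because sub() works on the raw record, the original tabs between the remaining columns are preserved and no trailing separator is added.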
This User Gave Thanks to Don Cragun For This Post:
# 4  
Old 05-10-2018
Thank you both very much.