awk RS/ORS problem
Post 302968361 by yifangt on Tuesday 8th of March 2016 05:17:18 PM (Shell Programming and Scripting)

Hello,
I am trying to filter a FASTQ file (in short, every 4 lines form one record) by the GC content of the sequence (field 2), that is, the fraction of G/C characters in the string. The example script should pick out the records whose sequence (second field) has a GC content > 0.6.
One special thing: the "@" symbol is always the first character of the first line of each record, but it may also appear anywhere in the quality string (the fourth line) except the first position.
A sample input.file is:
Code:
@HWI-ST1410:193:C7847ANXX:3:1101:3144:2591
CCGCTTGGAGCGGATCAGGTAGTCGACCTGCTTAAGGAGGGC
+
BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@HWI-ST1410:193:C7847ANXX:3:1101:3050:2607
CAAAAAAAATTTTCTATTTTACATATACAATGAAGAACGTCACTG
+
BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFHHH
@HWI-ST1410:193:C7847ANXX:3:1101:3075:2609
CACTGTACTAAGCTTTGGCGCTGATTCCATAATTTCTTTCTC
+
BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@HWI-ST1410:193:C7847ANXX:3:1101:3098:2622
GGTACGTACACATAATCCGTTGACTAGCTCGATACGATTACG
+
BBBBBFFFFFFFFFFFFFFBFFFFFFFFFFFFFFFFFFFFFF
@HWI-ST1410:193:C7847ANXX:3:1101:3097:2667
CCCGGCGGGAGAGGGACGGCAGGCTCGTCGGCGCCACAATCG
+
BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
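
To make the GC criterion concrete, here is a minimal sketch that prints the G/C count, length and fraction for each sequence line. It uses only the first two sample records; the scratch directory and the names `seq`, `n` and `gc_report` are my own choices for illustration, not part of the original script.

```shell
# Copy the first two sample records into a scratch file.
tmpdir=$(mktemp -d)
cat > "$tmpdir/sample.fastq" <<'EOF'
@HWI-ST1410:193:C7847ANXX:3:1101:3144:2591
CCGCTTGGAGCGGATCAGGTAGTCGACCTGCTTAAGGAGGGC
+
BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@HWI-ST1410:193:C7847ANXX:3:1101:3050:2607
CAAAAAAAATTTTCTATTTTACATATACAATGAAGAACGTCACTG
+
BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFHHH
EOF

# Sequence lines are lines 2, 6, 10, ... of the file (NR % 4 == 2).
# gsub() returns the number of substitutions, which is the G/C count.
# The counting is done on a copy so the sequence itself is left untouched.
gc_report=$(awk 'NR % 4 == 2 && length($0) > 0 {
    seq = $0
    n = gsub(/[GC]/, "", seq)
    printf "%d/%d = %.3f\n", n, length($0), n / length($0)
}' "$tmpdir/sample.fastq")

printf '%s\n' "$gc_report"
rm -r "$tmpdir"
```

The first record comes out as 26/42 = 0.619 and would pass the 0.6 cutoff; the second comes out as 11/45 = 0.244 and would be dropped.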

So far, my script is:
Code:
awk 'BEGIN{RS="\n@"; FS="\n"; OFS="\n"} {s=$2; if (gsub(/[GC]/, "x", $2)/length($2)>0.6) print "@"$1, s, $3, $4}' input.file

My script does not do what I want: the first record always comes out with a double "@@" in front of its record/sequence name.
How should I deal with the first record, which is not preceded by the "\n@" used as RS? Also, sometimes the "@" symbol is NOT put back in front of $1 to restore the original string.
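
A likely explanation for the "@@": with RS="\n@" every record except the first has its leading "@" consumed as part of the separator, while the first record is not preceded by "\n@" and so keeps its "@" inside $1. Printing "@"$1 then doubles it. One possible fix, sketched below, is to strip that one leading "@" when NR == 1. This is only a sketch under the assumption stated in the post that "@" never starts a quality line; the scratch file and the names `seq`, `n` and `filtered` are illustrative, and the input is the first, second and fifth sample records.

```shell
tmpdir=$(mktemp -d)
cat > "$tmpdir/input.file" <<'EOF'
@HWI-ST1410:193:C7847ANXX:3:1101:3144:2591
CCGCTTGGAGCGGATCAGGTAGTCGACCTGCTTAAGGAGGGC
+
BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@HWI-ST1410:193:C7847ANXX:3:1101:3050:2607
CAAAAAAAATTTTCTATTTTACATATACAATGAAGAACGTCACTG
+
BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFHHH
@HWI-ST1410:193:C7847ANXX:3:1101:3097:2667
CCCGGCGGGAGAGGGACGGCAGGCTCGTCGGCGCCACAATCG
+
BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
EOF

filtered=$(awk 'BEGIN { RS = "\n@"; FS = "\n"; OFS = "\n" }
NR == 1 { sub(/^@/, "") }      # only the first record still carries its "@"
{
    seq = $2                   # count G/C on a copy so $2 stays intact
    n = gsub(/[GC]/, "", seq)
    if (length($2) > 0 && n / length($2) > 0.6)
        print "@" $1, $2, $3, $4
}' "$tmpdir/input.file")

printf '%s\n' "$filtered"
rm -r "$tmpdir"
```

With this input the first and the last record are printed, each with a single "@". The last record of the file still has a trailing newline inside it, which produces an empty fifth field; that is harmless here because only $1 to $4 are printed.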
I am using GNU Awk 4.0.1 under Linux 3.19.0-32-generic ~14.04.1 Ubuntu.
Thanks a lot for any clue!
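
A multi-character RS is treated as a regular expression by gawk, but POSIX leaves the behaviour unspecified when RS has more than one character, and the "\n@" split would also break if a quality line ever did start with "@". An alternative that avoids RS altogether is to count lines and treat every fourth line as the end of a record. The sketch below assumes strict 4-line records; the second record is a hypothetical one I made up (not from the sample) whose quality line starts with "@", and the names `head`, `seq`, `plus`, `qual` and `filtered_portable` are illustrative.

```shell
tmpdir=$(mktemp -d)
cat > "$tmpdir/input.file" <<'EOF'
@HWI-ST1410:193:C7847ANXX:3:1101:3144:2591
CCGCTTGGAGCGGATCAGGTAGTCGACCTGCTTAAGGAGGGC
+
BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@hypothetical_read
GGGGCCCCAT
+
@FFFFFFFFF
EOF

filtered_portable=$(awk '
NR % 4 == 1 { head = $0; next }   # header line, "@" kept as is
NR % 4 == 2 { seq  = $0; next }   # sequence line
NR % 4 == 3 { plus = $0; next }   # "+" separator line
{                                 # fourth line: quality, record is complete
    qual = $0
    tmp = seq
    n = gsub(/[GC]/, "", tmp)     # G/C count, taken on a copy
    if (length(seq) > 0 && n / length(seq) > 0.6) {
        print head; print seq; print plus; print qual
    }
}' "$tmpdir/input.file")

printf '%s\n' "$filtered_portable"
rm -r "$tmpdir"
```

Because the header is never split off or glued back on, there is no "@" to strip or restore, and both records above are printed unchanged.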
