More efficient awk parser


 
# 8  
Old 03-13-2015
Definitely not. There's a single quote too many.
Where does the NC last field come from?
# 9  
Old 03-13-2015
Lose the ' following FNR > 1:
Code:
awk -F'[_.>]' 'FNR > 1 {a=length($4);b=substr($4,1,a-1);print $2+0,b,b,substr($4,a),$5}' OFS='\t' ${id}.txt > ${id}_parse.txt

# 10  
Old 03-13-2015
My guess is that it parses the NC_004004.4 incorrectly. If I use a file with just the variants in it (no header), it works fine. The problem is that the input normally has a header that needs to be skipped. I thought I had it or was close, but it errors and I'm not that good at debugging yet. Thank you.

Code:
 
awk -F"[_.>]" '{a=length($4);b=substr($4,1,a-1);print $2+0,b,b,substr($4,a),$5}' OFS="\t" Test.txt > output.txt
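
To see why a header or extra columns upset this command, it can help to print what the separator class [_.>] actually produces. The sketch below assumes a record of the form NC_000013.10:g.20763477C>T, which is only inferred from the desired output; the real input file is not shown in the thread, and show_fields is a hypothetical helper name.

```shell
# Hypothetical sample record; the thread's real input file is not shown.
line='NC_000013.10:g.20763477C>T'

# Print each field produced by the separator class [_.>], one per line.
show_fields() {
  printf '%s\n' "$1" |
    awk -F'[_.>]' '{ for (i = 1; i <= NF; i++) printf "$%d=%s\n", i, $i }'
}

show_fields "$line"
```

Under that assumption, $2 holds the zero-padded chromosome number, $4 holds the position with the reference base attached, and $5 holds the alternate base. Any extra text on the line before or after the variant shifts or pollutes those fields.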

# 11  
Old 03-13-2015
I attached the output of the command, which runs, but the output doesn't look right. Thank you.

Code:
 awk -F'[_.>]' 'FNR > 1 {a=length($4);b=substr($4,1,a-1);print $2+0,b,b,substr($4,a),$5}' OFS='\t' ${id}.txt > ${id}_parse.txt

Code:
 Desired Output
13 20763477 20763477 C T
4 41749507 41749507 G T
4 41749410 41749410 C T

# 12  
Old 03-13-2015
Try
Code:
awk -F"[_.>]" '{X=$4+0; sub(X, "", $4); print $2+0, X, X, $4, $5}' OFS="\t" /tmp/Test.txt 
13      20763477        20763477        C       T
4       41749507        41749507        G       T
4       41749410        41749410        C       T

awk -F"[_.>\t]" 'FNR>1 {X=$4+0; sub(X, "", $4); print $2+0, X, X, $4, $5}' OFS="\t" /tmp/Target.txt 
4004    244     244     G       A
3924    288     288     C       A
3924    385     385     G       A
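
A self-contained way to check the header skip and the numeric coercion is sketched below. The input layout (one header line, then one variant string per line) and the names tmp and parsed are assumptions for illustration, not the poster's actual files. As a small variation, the sketch strips the digits from a copy of $4 with a regex literal instead of modifying $4 itself.

```shell
# Hypothetical input: one header line, then one variant per line.
tmp=$(mktemp)
printf '%s\n' 'Variant' \
  'NC_000013.10:g.20763477C>T' \
  'NC_000004.11:g.41749507G>T' > "$tmp"

# FNR > 1 skips the header; $4 + 0 keeps only the leading digits of "20763477C".
parsed=$(awk -F'[_.>]' -v OFS='\t' 'FNR > 1 {
  pos = $4 + 0              # numeric prefix, e.g. 20763477
  ref = $4
  sub(/^[0-9]+/, "", ref)   # what remains is the reference base, e.g. C
  print $2 + 0, pos, pos, ref, $5
}' "$tmp")
rm -f "$tmp"

printf '%s\n' "$parsed"
```

With this input the header never reaches the print statement, and each remaining line yields five tab-separated columns: chromosome, start, end, reference base, alternate base.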

# 13  
Old 03-14-2015
The output of the second awk skips the header but the first awk has the desired output.

The input file changes each time, so ${id}_position.txt stands in for its name. An example of the input file is attached. I tried:

Code:
 awk -F"[_.>]" 'FNR > 1 {X=$4+0; sub(X, "", $4); print $2+0, X, X, $4, $5}' OFS="\t" ${id}_position.txt > ${id}_parse.txt

but that didn't work.

Output of script
Code:
 4004	79	79	G	A		NC

Desired Output
Code:
 13     20763642     20763642     C     T

Thank you very much and have a nice weekend.
# 14  
Old 03-14-2015
I modified the code a bit and it works perfectly. Thank you for your help.


Code:
 awk 'NR==2 {split($2,a,"[_.>]");b=substr(a[4],1,length(a[4])-1);print a[2]+0,b,b,substr(a[4],length(a[4])),a[5]}' OFS="\t" ${id}_position.txt > ${id}_parse.txt
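
The idea of leaving the default field separator alone and splitting only the variant column can be sketched as follows. The two-column, tab-separated layout with a header row is an assumption (the attached input is not shown), as are the names tmp and split_out. Unlike the command above, this sketch uses NR > 1 so that every data line is processed, not only line 2.

```shell
# Hypothetical two-column input: header row, then an ID and a variant string.
tmp=$(mktemp)
printf 'Id\tVariant\n' > "$tmp"
printf 'sample1\tNC_000013.10:g.20763642C>T\n' >> "$tmp"

split_out=$(awk -v OFS='\t' 'NR > 1 {
  split($2, a, "[_.>]")           # a[2]="000013", a[4]="20763642C", a[5]="T"
  len = length(a[4])
  pos = substr(a[4], 1, len - 1)  # everything except the last character
  ref = substr(a[4], len)         # the last character
  print a[2] + 0, pos, pos, ref, a[5]
}' "$tmp")
rm -f "$tmp"

printf '%s\n' "$split_out"
```

Because only $2 is passed to split(), underscores, dots, or > characters in other columns cannot shift the pieces of the variant.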


As you mentioned in a post, what if a > is not present? How should that be parsed? I am encountering that more now, but will post in a new thread. Thank you.