awk to change value in field according to another
Post 303027227 by cmccabe on Tuesday 11th of December 2018 02:52:23 PM
I have been trying to understand your code and I am just not getting it yet, but I am trying. I know it may seem like I am not, but I assure you that I am and will continue to do so. I added comments to each line, along with some questions. My understanding is not complete, but hopefully it's a start. I apologize for the misleading output description: thousands of lines print, and I only showed a few to keep the post short. Thank you :)

Code:
#!/bin/sh
awk -v d=$# '   # does this define d as non-zero
BEGIN {	FS = "[\t_]"    # define FS as tab or underscore
	OFS = "\t"      # define output as tab delimited
}
FNR == NR {   # process same line in file 2 as file1 and start processing file2 or /home/cmccabe/folder/less/all_cdsV2
	m[$1, $4, ++c[$1, $4]] = $2 + 0    # split on FS and store $2 in array m for each $1 and $4, what does the + 0 do?
	M[$1, $4, c[$1, $4]] = $3 + 0      # split on FS and store $3 in array M for each $1 and $4, what does the + 0 do?
	if(d) printf("m[%s,%s,%d]=%s,M[%s,%s,%d]=%s\n", # debugging print
		$1, $4, c[$1, $4], m[$1, $4, c[$1, $4]],  # for m (m=min)?
		$1, $4, c[$1, $4], M[$1, $4, c[$1, $4]])  # for M (M=max)?
	next   # process next line
}
{	if(d) printf("FNR=%d:\"%s\"\n",FNR,$0)  # not sure what this does, I think it prints each line in $file?
	for(i = 1; i <= c[$1, $4]; i++) {     # start a loop using $4 and $1 value
		# commented-out debugging print (again not sure?)
		#if(d) printf("m[%d]=%d,M[%d]=%d,$2=%d\n",
		#	i, m[$1, $4, i],  # loop through each m in /home/cmccabe/folder/less/all_cdsV2 for each $1 and $4 of $file
		#	i, M[$1, $4, i],  # loop through each M in /home/cmccabe/folder/less/all_cdsV2 for each $1 and $4 of $file
		#	$2)  # not sure what this does?
		if(m[$1, $4, i] <= $2 && $2 <= M[$1, $4, i]) {  # if the matching m <= $2 and $2 <= M then
			$5 = "exon"  # put exon in $5
			break    # break out of the loop
		} else {
			if(m[$1, $4, i] > $2 + 0) { # if the matching m > $2 then
				if(m[$1, $4, i] - 10 <= $2 + 0) {  # if the matching m - 10 <= $2 then
					$5 = "splicing"  # put splicing in $5
					break     # break out of the loop
				} else {
					$5 = "intron"   # put intron in $5
					break  # break out of the loop
				}
			}
		}
	}
	if(i > c[$1, $4])     # what does this do, has intron not already been set?
		$5 = "intron"
}
1' "$1" "$2"   # parameters passed from for loop
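
The two idioms asked about in the comments above can be tried in isolation. This is only a minimal sketch: the function names plus_zero_demo and fnr_nr_demo and the sample values are made up for illustration and are not part of the script above.

```shell
#!/bin/sh
# Illustration only: function names and sample values are invented for this sketch.

# "+ 0" forces a numeric value: string constants compare as text until coerced.
plus_zero_demo() {
	awk 'BEGIN {
		a = "9"; b = "10"
		print (a < b)          # string comparison: "9" sorts after "1", prints 0
		print (a + 0 < b + 0)  # numeric comparison: 9 < 10, prints 1
	}'
}

# FNR == NR is true only while awk reads the first file on the command line,
# because NR keeps counting across files while FNR restarts at 1 for each file.
fnr_nr_demo() {
	dir=$(mktemp -d) || return 1
	printf 'a\nb\n' > "$dir/first"
	printf 'c\nd\n' > "$dir/second"
	awk 'FNR == NR { print "first:", $0; next }
	     { print "second:", $0 }' "$dir/first" "$dir/second"
	rm -rf "$dir"
}

plus_zero_demo
fnr_nr_demo
```

Run as is, this should print 0 and 1 for the two comparisons, then the two lines of the first file tagged first: and the two lines of the second file tagged second:. One caveat with FNR == NR: if the first file is empty, the condition is also true for every line of the second file. As for -v d=$#, $# expands to the number of arguments passed to the script, so d is 2 (non-zero) whenever both file names are given.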

Code:
  (only a few lines of the thousands to show the desired output results from the grep)
grep -E 'splicing|intron|exon' /home/cmccabe/folder/less/00-0000_output.txt
chr7	30062272	30062492 	FKBP14	splicing
chr7	30065867	30066087 	FKBP14	intron
chr7	30065964	30066184 	FKBP14	exon
chr7	94024268	94024488 	COL1A2	intron
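
For a small end-to-end check of how positions end up labelled like the lines above, here is a cut-down sketch of the same lookup. It is a restructuring for illustration, not the original code: the classify function name, the two sample files and the GENEA_1 style names in the first file are all assumptions, and the 10-base splicing window is taken from the script above.

```shell
#!/bin/sh
# Sketch only: file contents, gene names and the classify function name are invented.
classify() {
	awk '
	BEGIN { FS = "[\t_]"; OFS = "\t" }   # split on tab or underscore, write tabs
	FNR == NR {                          # first file: remember every interval per chr+gene
		n = ++c[$1, $4]
		m[$1, $4, n] = $2 + 0            # interval start
		M[$1, $4, n] = $3 + 0            # interval end
		next
	}
	{                                    # second file: label each position
		$5 = "intron"                    # default when no interval claims the position
		for (i = 1; i <= c[$1, $4]; i++) {
			if (m[$1, $4, i] <= $2 && $2 <= M[$1, $4, i]) { $5 = "exon"; break }
			if (m[$1, $4, i] > $2 + 0) { # first interval that starts after the position
				if (m[$1, $4, i] - 10 <= $2 + 0) $5 = "splicing"
				break
			}
		}
		print
	}' "$1" "$2"
}

dir=$(mktemp -d) || exit 1
printf 'chr1\t100\t200\tGENEA_1\nchr1\t300\t400\tGENEA_2\n' > "$dir/intervals"
printf 'chr1\t150\t160\tGENEA\nchr1\t295\t305\tGENEA\nchr1\t250\t260\tGENEA\n' > "$dir/positions"
classify "$dir/intervals" "$dir/positions"
rm -rf "$dir"
```

With the sample files this should label position 150 as exon, 295 as splicing (within 10 of the interval that starts at 300) and 250 as intron.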


Last edited by cmccabe; 12-12-2018 at 07:26 AM..
 
