Using awk to add length of matching characters between field in file


 
# 1  
Old 06-21-2018

The awk below produces the current output, which adds 1 to $3. However, I am trying to add the length of the matching characters between $5 and $6 to $3 instead. I tried using sub and storing the length in a variable, but could not get it to work. I added comments to each line of the awk; the description gives the rule for each line of the file, and the math is zero-based. Thank you :).

description
Code:
since line 1 has 4 matching characters between $5 and $6 (GAAA), 4 is added to $3
since line 2 has 5 matching characters between $5 and $6 (GAAAA), 5 is added to $3

file tab-delimited
Code:
id1	1	116268178		GAAA	GAAAA
id2	1	116268200		GAAAA	GAAAAA


current output tab-delimited
Code:
id1	1	116268179	116268179	GAAA	GAAAA
id2	1	116268201	116268201	GAAAA	GAAAAA

desired output tab-delimited
Code:
id1	1	116268181	116268181	GAAA	GAAAA
id2	1	116268204	116268204	GAAAA	GAAAAA

awk
Code:
awk 'BEGIN{FS=OFS="\t"}  # define fs and output
     FNR==NR{  # process each field in each line of file
         if(length($5) < length($6)) {  # condition 2
             sub($5,"",$6) && sub($6,"",$5)  # removing matching
             print $1,$2,$3+1,$3+1,"-",$6  # print desired output
             next
         }
     }' file > output


Last edited by cmccabe; 06-21-2018 at 09:40 AM.. Reason: added details
# 2  
Old 06-21-2018
See:
Code:
       match(s, r)
              the position in s where the regular expression r occurs, or 0 if it does not. The variables RSTART and RLENGTH are set to the position and length of the matched string.
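
As a small illustration of what match() sets, the sketch below applies it to the two sequence fields from the first sample line. The strings are hard-coded here purely for demonstration. Note that the second argument is treated as a regular expression, which is harmless for plain A/C/G/T text but would matter if the field could contain regex metacharacters.

```shell
# Show what match() returns and how RSTART/RLENGTH are set.
# "GAAAA" and "GAAA" are $6 and $5 from the first sample line.
result=$(awk 'BEGIN {
    pos = match("GAAAA", "GAAA")   # second argument is used as a regex
    print pos, RSTART, RLENGTH
}')
echo "$result"
```

Here the match starts at position 1 and is 4 characters long, so RLENGTH holds the count the question asks about.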

This User Gave Thanks to Scrutinizer For This Post:
# 3  
Old 06-21-2018
Hello cmccabe,

Could you please try the following and let me know if it helps?

Code:
 
awk 'BEGIN{FS=OFS="\t"} match($NF,$(NF-1)){$3+=RLENGTH-1;$3=$3 OFS $3} 1'  Input_file

Thanks,
R. Singh
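
One thing to check: if the fourth column of the input really is an empty field (as the doubled tab in the sample suggests, and as the use of $5 and $6 in the question implies), then `$3=$3 OFS $3` leaves that empty field in place, so the output gains an extra column. A hedged variant that writes the new value into $4 instead is sketched below. It assumes exactly six tab-separated fields per line; the sample lines are recreated with printf in a temporary file whose name is arbitrary.

```shell
# Recreate the two sample lines (tab-delimited, empty 4th field).
tmp=$(mktemp)
printf 'id1\t1\t116268178\t\tGAAA\tGAAAA\n'   >  "$tmp"
printf 'id2\t1\t116268200\t\tGAAAA\tGAAAAA\n' >> "$tmp"

# Add (match length - 1) to $3, then copy the result into the empty $4.
out=$(awk 'BEGIN{FS=OFS="\t"} match($6,$5){$3+=RLENGTH-1; $4=$3} 1' "$tmp")
rm -f "$tmp"
printf '%s\n' "$out"
```

If the real file has only one tab between $3 and the first sequence, the original one-liner with `$NF` and `$(NF-1)` is the better fit.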
# 4  
Old 06-22-2018
Shouldn't 116268178 + 4 be 116268182?
I have a doubt about the requirement.

Code:
/bin/awk 'BEGIN {
	FS=OFS="\t";
}
match($NF,$(NF-1)) {
	$3+=RLENGTH;
	$3=$3 OFS $3;
} 1' ./Input_file
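
The two readings differ by exactly one. A quick check on the first sample line, with the values hard-coded for illustration, suggests that the desired output in the first post (116268181) corresponds to adding the match length minus one, which fits the zero-based note in the question:

```shell
# Compare "+ RLENGTH" with "+ RLENGTH - 1" for the first sample line.
both=$(awk 'BEGIN {
    start = 116268178
    match("GAAAA", "GAAA")          # sets RLENGTH to 4
    print start + RLENGTH, start + RLENGTH - 1
}')
echo "$both"
```

The first number is what `$3+=RLENGTH` gives, the second what `$3+=RLENGTH-1` gives.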


Last edited by murugesandins; 06-22-2018 at 07:24 AM.. Reason: + 4 instead of + 4 -1 => depends => based on requirement.
# 5  
Old 07-02-2018
Thank you all :).