How to get index values for multiple matches in the same line with awk?


 
# 1  
Old 07-14-2015

Hi,

I know that

Code:
echo "bob alice robert alice" | awk '{print index($0,"alice")}'
5

will output the index of the first alice match. Is there any way to get the index of every match? For example:

Code:
echo "bob alice robert alice" | awk 'unknown magic'
5:18

Thanks for your time.
# 2  
Old 07-14-2015
Code:
[akshay@localhost tmp]$ cat position.awk
function usage()
{
	print "\n\tawk -f position.awk -vsearch='word_to_be_searched' infile\n"
	exit
}
BEGIN{
	if(!search)usage();
}
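# Run this action only for lines that contain the search string.
index($0,search){
   printf ("FILENAME : %s, LINE :  %s, Column : ", (FILENAME=="-"?"stdin":FILENAME), FNR )
   # m holds the position of the latest match within the original line.
   string=$0; m=0;
   while((n=index(string, search))>0)
   {
      m+=n;
      printf "%s ", m
      # Resume one character past the match start, so overlapping matches are found too.
      string=substr(string, n+1)
   }
   print ""
}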

OR

Code:
[akshay@localhost tmp]$ cat position.awk
function usage()
{
	print "\n\tawk -f position.awk -vsearch='word_to_be_searched' infile\n"
	exit
}
BEGIN{if(!search)usage(); sl = length(search) }
{
    si = 0
    pos = ""
    while ( sind = index($0,search) )
    {
        pos = pos (pos?" ":"") (si + sind)
        $0  = substr($0,sind + sl)
        si += (sind + sl - 1)
    }
}
pos { printf ("FILENAME : %s, LINE :  %s, Column : %s\n", (FILENAME=="-"?"stdin":FILENAME), FNR, pos ) }

Code:
[akshay@localhost tmp]$ echo "bob alice robert alice" | awk -f position.awk -vsearch='alice'
FILENAME : stdin, LINE :  1, Column : 5 18

Code:
[akshay@localhost tmp]$ cat testfile
bob alice robert alice
bob 1 alice robert alice
bob     2 alice robert alice
bob        3 alice robert alice

Code:
[akshay@localhost tmp]$ awk -f position.awk -vsearch='alice' testfile
FILENAME : testfile, LINE :  1, Column : 5 18 
FILENAME : testfile, LINE :  2, Column : 7 20 
FILENAME : testfile, LINE :  3, Column : 11 24 
FILENAME : testfile, LINE :  4, Column : 14 27


Last edited by Akshay Hegde; 07-14-2015 at 03:35 AM..
These 3 Users Gave Thanks to Akshay Hegde For This Post:
# 3  
Old 07-14-2015
In case you expect EXACTLY two matches, try
Code:
awk -vsrch=alice '{match($0,srch".*"srch); print RSTART, RSTART+RLENGTH-length(srch)}' file
5 18
7 20
11 24
14 27
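
Note that the greedy `.*` makes this report the first and the last occurrence, so on a line with three or more matches the ones in between are skipped. A minimal sketch to illustrate; the `first_last` wrapper and the sample line are made up for this example and are not from the thread:

```shell
# Hypothetical wrapper around the match() one-liner above.
first_last() {
    awk -v srch="$1" '{
        # The greedy ".*" stretches from the first occurrence to the last one.
        if (match($0, srch ".*" srch))
            print RSTART, RSTART + RLENGTH - length(srch)
    }'
}

# Three occurrences, at columns 1, 11 and 21; only the outer two are reported.
echo "alice bob alice bob alice" | first_last alice   # prints: 1 21
```

Keep in mind that `match()` treats the search string as a regular expression, so this only behaves as described for plain words.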

This User Gave Thanks to RudiC For This Post:
# 4  
Old 07-14-2015
You could also try something like:
Code:
#!/bin/ksh
printf '%s\n' 'bob alice robert alice' 'aaaa bbbb aaaa' | awk -v s="$1" '
{       d = ""
        for(i = 1; x = index(substr($0, i), s); i = i + x + length(s) - 1) {
                printf("%s%d", d, i + x - 1)
                d = ":"
        }
        print ""
}'

If you want to try this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk.

If you save this in a file named tester and make it executable, then various invocations show how it works:
Code:
$ ./tester alice
5:18

$ ./tester a
5:18
1:2:3:4:11:12:13:14
$ ./tester "b a"
3
9
$ ./tester b
1:3:13
6:7:8:9
$ ./tester aa

1:3:11:13
$
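
The `./tester aa` run shows that this loop reports non-overlapping matches only, because it skips `length(s)` characters after each hit. If overlapping matches are wanted instead, stepping one character at a time should do it. A sketch under that assumption; the `positions` function and its second argument are invented for illustration:

```shell
# Hypothetical helper: $1 = search string, $2 = "yes" to allow overlapping matches.
positions() {
    awk -v s="$1" -v overlap="$2" '{
        d = ""
        # Step past the whole match, or by a single character when overlaps are allowed.
        step = (overlap == "yes") ? 1 : length(s)
        for (i = 1; (x = index(substr($0, i), s)) > 0; i = i + x - 1 + step) {
            printf("%s%d", d, i + x - 1)
            d = ":"
        }
        print ""
    }'
}

echo "aaaa" | positions aa no     # prints: 1:3
echo "aaaa" | positions aa yes    # prints: 1:2:3
```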

This User Gave Thanks to Don Cragun For This Post:
# 5  
Old 07-14-2015
In case your awk version allows for string field separators, try
Code:
awk -vSRCH=alice '
BEGIN   {LNS=length(SRCH)
        }
        {n=split ($0, TMP, SRCH)
         m=0
         for (i=1; i<n; i++)    {m+=length(TMP[i])
                                 printf "%d ", (i-1)*LNS + m + 1
                                }
         printf "\n"
        }
' file

---------- Post updated at 11:59 ---------- Previous update was at 11:21 ----------

Of course, the above is stupid, as you could use FS and normal field splitting from the start:
Code:
awk -F"alice" '
BEGIN   {LNS=length(FS)
        }
        {m=0
         for (i=1; i<NF; i++)   {m+=length($i)
                                 printf "%d ", (i-1)*LNS + m + 1
                                }
         printf "\n"
        }
' file
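
One caveat with both splitting variants: a multi-character separator is interpreted as a regular expression, whereas `index()` compares literally. For a search string containing regex metacharacters the two can disagree. A small sketch of the difference; `by_fs` and `by_index` are hypothetical names, and `by_fs` assumes every match is exactly `length(FS)` characters long:

```shell
# Splitting approach: the search string becomes FS and is treated as a regex.
by_fs() {
    awk -F"$1" '
    BEGIN { lns = length(FS) }
    {
        m = 0; out = ""
        for (i = 1; i < NF; i++) {
            m += length($i)
            out = out (i > 1 ? " " : "") ((i - 1) * lns + m + 1)
        }
        print out
    }'
}

# index() approach: the search string is compared character by character.
by_index() {
    awk -v s="$1" '{
        out = ""; off = 0; rest = $0
        while ((n = index(rest, s)) > 0) {
            out = out (out == "" ? "" : " ") (off + n)
            off += n + length(s) - 1
            rest = substr(rest, n + length(s))
        }
        print out
    }'
}

echo "abc a.c" | by_fs "a.c"      # prints: 1 5  (the regex a.c also matches "abc")
echo "abc a.c" | by_index "a.c"   # prints: 5    (only the literal "a.c")
```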

This User Gave Thanks to RudiC For This Post:
# 6  
Old 07-14-2015
Try this if you don't want to reset the index position for each line...

Code:
awk '{ for(i=1;i<=NF;i++)
      { if(index($i,"alice")!=0) 
         printf ("%d ",index($i,"alice") + len);
         len+=length($i)+1
      }
 printf "\n"
}' file

And this if you want to reset the index position for each line...

Code:
awk '{ for(i=1;i<=NF;i++)
      { if(index($i,"alice")!=0) 
         printf ("%d ",index($i,"alice") + len);
         len+=length($i)+1
      }
 printf "\n";len=0
}' file
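
Be aware that the field-based approach counts one separator character per field, so it only yields true column numbers when fields are separated by exactly one space, and it reports at most one match per field. A sketch that makes this visible on a line from the test file above; `by_fields` is a hypothetical rewrite of the second snippet with the search string passed as a parameter:

```shell
# Hypothetical parameterised version of the per-line field loop.
by_fields() {
    awk -v s="$1" '{
        len = 0; out = ""
        for (i = 1; i <= NF; i++) {
            if ((p = index($i, s)) != 0)
                out = out (out == "" ? "" : " ") (p + len)
            len += length($i) + 1   # assumes a single separator character
        }
        print out
    }'
}

printf '%s\n' 'bob alice robert alice' 'bob     2 alice robert alice' | by_fields alice
# prints:
# 5 18
# 7 20   (the real columns on the second line are 11 and 24)
```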

This User Gave Thanks to malcomex999 For This Post:
# 7  
Old 07-14-2015
I wasn't expecting such high-quality answers; thanks for all of them. In the end I used a modified version of Akshay's answer:

Code:
[akshay@localhost tmp]$ cat position.awk
function usage()
{
     print "\n\tawk -f position.awk -vsearch='word_to_be_searched' infile\n"
     exit
}

BEGIN{ if(!search)usage(); }

match(tolower($0),search) {
    string=tolower($0); m=0;
    while((n=match(string,search))>0)
    {
       m+=n;
       printf ("%s:%s:%s\n", FNR, m, substr($0, m, RLENGTH))
       string=substr(string, n+1)
    }
}

I wanted to print the matched text as well and put every match on its own line:

Code:
1:5:alice
1:18:alice
