Converting grep to awk


 
# 1  
Old 05-22-2012

First, I am trying to search for certain string types within a very large file. Here is a sample record:

Code:
MyCountry|MyGroup|MyCust|Run-Date|N/A|SERVER|OS|USERID|897//123456//LNAME FNAME|STATE|LLOGON

In field 9 I am looking for invalid formats. A valid format would be 897/C/123456//LNAME FNAME. The leading 897 can be numeric or it can be character (897 or US, these are country codes).

I am trying to convert a grep regexp search into an awk search, with little success. With grep I have to read in each line, test the variable, and then print out the whole line. I figured awk could do the whole thing in one line and let me get rid of the "while read LINE;do" loop that makes my script extremely slow. Below are my two examples:

GREP:
Code:
cat SID-Removed.csv | cut -d"|" -f9 |grep -ih '[[:alnum:]]\{2,3\}"//"[[:alnum:]]\{6\}'   (this one finally worked)

Note: I had tried
Code:
grep -ih '[[:alnum:]]\{2,3\}//[[:alnum:]]\{6\}'

but it did not produce the desired results.

AWK:
Code:
awk -F'|' 'BEGIN { search_regex = "[[:alnum:]]\{2,3\}//[[:alnum:]]\{6\}" }tolower($9) ~ search_regex  {print $0}' SID-Removed.csv

I even tried this:
Code:
awk -F'|' 'BEGIN { search_regex = "[[:alnum:]]\{2,3\}\/\/[[:alnum:]]\{6\}"  }tolower($9) ~ search_regex  {print $0}' SID-Removed.csv

The awk statement finds 897/C/123456//LNAME FNAME, but it does not find the records where the second element of that string is blank (897//123456//LNAME FNAME). Can anyone help me figure out what I'm doing wrong?

# 2  
Old 05-22-2012
I'm not sure [[:alnum:]] is supported in awk. Try [0-9a-zA-Z].

I don't think you need to escape the { } with \.
# 3  
Old 05-22-2012
alnum is a POSIX character class, and gawk is OK with those. The { } interval syntax is what many awks do not support. It is a newer feature, I guess, and gawk didn't want to break existing scripts, so it is off unless you ask for it:

Code:
[mute@geek ~]$ echo foo | gawk '/^[[:alnum:]]{3}$/'
[mute@geek ~]$ echo foo | gawk --posix '/^[[:alnum:]]{3}$/'
foo

So to help more, please tell me which awk and/or OS you use.
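
For anyone unsure what their awk supports, here is a minimal sketch (not from the original posts) that reuses the probe above to pick a pattern. The variable names pattern and matches are made up for illustration, and the two records are the sample from post #1 in its blank and valid forms. The fallback spells the repetition out with plain bracket ranges, so it should not depend on interval or POSIX class support:

```shell
# Probe the default awk: does it honor POSIX classes plus { } intervals?
if [ "$(echo foo | awk '/^[[:alnum:]]{3}$/' 2>/dev/null)" = "foo" ]; then
    # Intervals work: use the compact form.
    pattern='^[[:alnum:]]{2,3}//[[:alnum:]]{6}'
else
    # No interval support: write out 2-3 and 6 repetitions by hand.
    c='[0-9A-Za-z]'
    pattern="^$c$c$c?//$c$c$c$c$c$c"
fi

# Illustrative input: the first record has a blank type, the second is valid.
matches=$(printf '%s\n' \
    'MyCountry|MyGroup|MyCust|Run-Date|N/A|SERVER|OS|USERID|897//123456//LNAME FNAME|STATE|LLOGON' \
    'MyCountry|MyGroup|MyCust|Run-Date|N/A|SERVER|OS|USERID|897/C/123456//LNAME FNAME|STATE|LLOGON' |
    awk -F'|' -v re="$pattern" '$9 ~ re')

# Only the record with the blank type should be printed.
printf '%s\n' "$matches"
```

Passing the pattern in with -v keeps the awk program itself identical in both cases; only the regex string changes.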

Last edited by neutronscott; 05-22-2012 at 01:30 PM..
# 4  
Old 05-22-2012
Code:
awk -F\| '$9 ~ "^[a-zA-Z0-9]{2,3}/[a-zA-Z0-9]?/[a-zA-Z0-9]{6}"' file

This User Gave Thanks to shamrock For This Post:
# 5  
Old 05-22-2012
Quote:
Originally Posted by Corona688
I'm not sure [[:alnum:]] is supported in awk. Try [0-9a-zA-Z].

I don't think you need to escape the { } with \.
I was trying to use [[:alnum:]]\{2,3\} because I want to match any alphanumeric character (0-9, a-z as you indicated), but only when there are 2 or 3 of them before the /:

897//123456
US//123456

If I switch to [0-9a-zA-Z], can I still use the count feature from grep where I tell it to only match 2 or 3 characters in that position?
# 6  
Old 05-22-2012
Quote:
Originally Posted by dagamier
If I switch to [0-9a-zA-Z], can I still use the count feature from grep where I tell it to only match 2 or 3 characters in that position?
Yes you can...
# 7  
Old 05-22-2012
Here is a full code snippet of what I'm trying to convert. Note that I am currently using grep in a while loop, and reading through a file with millions of records makes this take quite a long time to complete:

Code:
while read LINE
do
   REC13=`echo $LINE |cut -d"|" -f9 |grep -ih '[[:alnum:]]\{2,3\}"//"[[:alnum:]]\{6\}'`
   if [ -n "$REC13" ]
   then
        echo $LINE >> ./$PRVYR/$MONTH/mislabeled/$MONTH-mislabeled.csv
   fi
done < INFILE

This particular record looks for these strings: CCC//SSSSSS or CC//SSSSSS

My goal is to try and convert this into an awk command.
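
One possible way to replace the loop above with a single awk pass is sketched below. It is only an illustration under some assumptions: find_mislabeled and alnum are hypothetical names, the country code is assumed to start field 9, and the ID is assumed to be exactly 6 characters, which is slightly stricter than the unanchored grep. It splits field 9 on / and checks lengths, so it does not rely on { } support in awk:

```shell
# Hypothetical helper (not from the thread): print records whose field 9
# starts with CC//SSSSSS or CCC//SSSSSS, i.e. the type between the slashes is empty.
find_mislabeled() {
    awk -F'|' '
        # True when s is purely alphanumeric and lo <= length(s) <= hi.
        function alnum(s, lo, hi) {
            return (s ~ /^[0-9A-Za-z]+$/) && length(s) >= lo && length(s) <= hi
        }
        # "897//123456//LNAME FNAME" splits into 897, "", 123456, "", "LNAME FNAME".
        { n = split($9, part, "/") }
        # A pattern with no action prints the whole record.
        n >= 3 && alnum(part[1], 2, 3) && part[2] == "" && alnum(part[3], 6, 6)
    ' "$@"
}

# Illustrative run on the sample record, a US variant, and a valid record.
demo_out=$(printf '%s\n' \
    'MyCountry|MyGroup|MyCust|Run-Date|N/A|SERVER|OS|USERID|897//123456//LNAME FNAME|STATE|LLOGON' \
    'MyCountry|MyGroup|MyCust|Run-Date|N/A|SERVER|OS|USERID|US//123456//LNAME FNAME|STATE|LLOGON' \
    'MyCountry|MyGroup|MyCust|Run-Date|N/A|SERVER|OS|USERID|897/C/123456//LNAME FNAME|STATE|LLOGON' |
    find_mislabeled)

# The two records with a blank type are printed; the valid one is skipped.
printf '%s\n' "$demo_out"
```

In the real script the call would presumably become find_mislabeled INFILE >> ./$PRVYR/$MONTH/mislabeled/$MONTH-mislabeled.csv, which reads the file once instead of starting cut and grep for every record.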


Last edited by Corona688; 05-22-2012 at 01:48 PM..