passing a regex as variable to awk and using that as regular expression for search


 
# 1  
Old 02-16-2012

Hi All,

I have an sftp session log from transferring multiple files by issuing "mput abc*.dat". The contents of the log file are below:
#################################################
Code:
Connecting to 10.75.112.194...
Changing to: /home/dasd9x/testing1
sftp> mput abc*.dat
Uploading abc140212095613.dat to /home/dasd9x/testing1/abc140212095613.dat
abc140212095613.dat                                                                                                        100%   21     0.0KB/s   00:00
Uploading abc140212095639.dat to /home/dasd9x/testing1/abc140212095639.dat
abc140212095639.dat                                                                                                        100%   25     0.0KB/s   00:00
Uploading abc140212095648.dat to /home/dasd9x/testing1/abc140212095648.dat
abc140212095648.dat                                                                                                        100%   43     0.0KB/s   00:00
Uploading abc140212095658.dat to /home/dasd9x/testing1/abc140212095658.dat
abc140212095658.dat                                                                                                        100%   35     0.0KB/s   00:00
Uploading abc140212095710.dat to /home/dasd9x/testing1/abc140212095710.dat
abc140212095710.dat                                                                                                        100%   27     0.0KB/s   00:00
Uploading abc140212095719.dat to /home/dasd9x/testing1/abc140212095719.dat
abc140212095719.dat                                                                                                        100%   40     0.0KB/s   00:00
Uploading abc14022012.dat to /home/dasd9x/testing1/abc14022012.dat
abc14022012.dat                                                                                                            100%   52     0.0KB/s   00:00
sftp> ls -l
drwxr-xr-x    0 600598020 600598020     1024 Feb 16 14:35 .
drwx------    0 600598020 600598020     1024 Feb 16 14:34 ..
-rw-r--r--    0 600598020 600598020        0 Feb 16 14:32 a.dat
-rw-r--r--    0 600598020 600598020       21 Feb 16 14:35 abc140212095613.dat
-rw-r--r--    0 600598020 600598020       25 Feb 16 14:35 abc140212095639.dat
-rw-r--r--    0 600598020 600598020       43 Feb 16 14:35 abc140212095648.dat
-rw-r--r--    0 600598020 600598020       35 Feb 16 14:35 abc140212095658.dat
-rw-r--r--    0 600598020 600598020       27 Feb 16 14:35 abc140212095710.dat
-rw-r--r--    0 600598020 600598020       40 Feb 16 14:35 abc140212095719.dat
-rw-r--r--    0 600598020 600598020       52 Feb 16 14:35 abc14022012.dat
-rw-r--r--    0 600598020 600598020        0 Feb 16 14:32 b.dat
-rw-r--r--    0 600598020 600598020        0 Feb 16 14:32 c.dat
-rw-r--r--    0 600598020 600598020        0 Feb 16 14:32 d.dat
sftp> quit

#################################################

This log has been captured in a file called sftp_log. I am now out of the sftp session and have sftp_log for my reference. I want to check the log and find out whether all the files (matching abc*.dat) were transferred with the correct size. So I want to find the lines where the abc*.dat files were long-listed. I have abc*.dat stored in a variable named TRANSFERRING_FNAME, so I passed this variable to awk to search for the desired lines with the command below:
Code:
awk -v fname="$TRANSFERRING_FNAME" 'substr($1,1,1) == "-" && $9 ~ fname {print $9 "|" $5}' sftp_log

But it returns nothing. I need only the lines below from sftp_log:
###################################################
Code:
-rw-r--r--    0 600598020 600598020       21 Feb 16 14:35 abc140212095613.dat
-rw-r--r--    0 600598020 600598020       25 Feb 16 14:35 abc140212095639.dat
-rw-r--r--    0 600598020 600598020       43 Feb 16 14:35 abc140212095648.dat
-rw-r--r--    0 600598020 600598020       35 Feb 16 14:35 abc140212095658.dat
-rw-r--r--    0 600598020 600598020       27 Feb 16 14:35 abc140212095710.dat
-rw-r--r--    0 600598020 600598020       40 Feb 16 14:35 abc140212095719.dat
-rw-r--r--    0 600598020 600598020       52 Feb 16 14:35 abc14022012.dat

###################################################

From these lines I want to extract each file name and its size, like this:
############################################
Code:
abc140212095613.dat|21
abc140212095639.dat|25
abc140212095648.dat|43
abc140212095658.dat|35
abc140212095710.dat|27
abc140212095719.dat|40
abc14022012.dat|52

############################################

The value in $TRANSFERRING_FNAME can vary, so we can't rely on a hard-coded value like 'abc'. Could anyone please advise?

Thanks & Regards,
Bijitesh

Last edited by Franklin52; 02-17-2012 at 03:30 AM.. Reason: Please use code tags for code and data samples, thank you
# 2  
Old 02-16-2012
Just look for - at the beginning of the line and print the fields you want.

Code:
awk '/^-/ { print $9 "|" $5 }' datafile

# 3  
Old 02-16-2012
Thanks. But in that case I'll end up getting the lines -
##############################################
Code:
-rw-r--r--    0 600598020 600598020        0 Feb 16 14:32 a.dat
-rw-r--r--    0 600598020 600598020       21 Feb 16 14:35 abc140212095613.dat
-rw-r--r--    0 600598020 600598020       25 Feb 16 14:35 abc140212095639.dat
-rw-r--r--    0 600598020 600598020       43 Feb 16 14:35 abc140212095648.dat
-rw-r--r--    0 600598020 600598020       35 Feb 16 14:35 abc140212095658.dat
-rw-r--r--    0 600598020 600598020       27 Feb 16 14:35 abc140212095710.dat
-rw-r--r--    0 600598020 600598020       40 Feb 16 14:35 abc140212095719.dat
-rw-r--r--    0 600598020 600598020       52 Feb 16 14:35 abc14022012.dat
-rw-r--r--    0 600598020 600598020        0 Feb 16 14:32 b.dat
-rw-r--r--    0 600598020 600598020        0 Feb 16 14:32 c.dat
-rw-r--r--    0 600598020 600598020        0 Feb 16 14:32 d.dat

##############################################

and selecting their name and size will get -
###############################
Code:
a.dat|0
abc140212095613.dat|21
abc140212095639.dat|25
abc140212095648.dat|43
abc140212095658.dat|35
abc140212095710.dat|27
abc140212095719.dat|40
abc14022012.dat|52
b.dat|0
c.dat|0
d.dat|0

###############################

But I don't want the details of a.dat, b.dat, c.dat and d.dat.

Regards,
Bijitesh

Last edited by Franklin52; 02-17-2012 at 03:30 AM.. Reason: Please use code tags for code and data samples, thank you
# 4  
Old 02-16-2012
What are the exact contents of TRANSFERRING_FNAME? abc*.dat is not a regular expression, it's a glob -- interpret it as a regex and it will work wrong. As a regex it matches abc.dat, abcadat, abcccc.dat, abccccccadat, and so forth, since * means "zero or more of the previous character" and . means "any single character". Your file names have a whole run of digits between "abc" and "dat", so none of them fit that shape, which is why your command prints nothing.

For a regex I think you'd want '^abc.*\.dat'.
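
To see the difference concretely, here is a small sketch. The three sample names and the variable names are made up for illustration. It tests the same names against the glob read as a regex and against an anchored regex. It uses [.] rather than \. for the literal dot, on the assumption that your awk processes backslash escapes in -v assignments (POSIX awks do), which can silently eat the backslash.

```shell
# Sample names: two from the log plus one that only the misread glob matches.
names='abc140212095613.dat
a.dat
abXdat'

# abc*.dat read as a regex: "ab", zero or more "c", any one character, "dat".
as_regex=$(printf '%s\n' "$names" | awk -v re='abc*.dat' '$0 ~ re')

# An anchored regex that says what the glob was meant to say.
as_glob=$(printf '%s\n' "$names" | awk -v re='^abc.*[.]dat$' '$0 ~ re')

echo "glob read as regex matches: $as_regex"
echo "anchored regex matches:     $as_glob"
```

The first pattern picks up only the odd name abXdat and misses the real file; the second picks up only the real file.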
# 5  
Old 02-16-2012
Thanks Corona688 for correcting me. I am not good at regex and thought abc*.dat was one. The TRANSFERRING_FNAME variable contains exactly abc*.dat. It is modelled on how "ls -lrt abc*.dat" works, which lists all matching files: abc.dat, abca.dat, abcccc.dat, abcccccca.dat, abc140212095613.dat, abc140212095639.dat, abc14022012.dat etc. The script first does a multiple put in an sftp session as "mput $TRANSFERRING_FNAME". Then, using that same $TRANSFERRING_FNAME as the clue, I have to search sftp_log. Obviously I can't change the contents of $TRANSFERRING_FNAME to '^abc.*\.dat'. Please help.
# 6  
Old 02-16-2012
That's a slight fallacy, known as useless use of ls *. It's not ls that understands what * means -- it's the shell itself. That's why * works the same way with every command.
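
A quick way to convince yourself, sketched with throwaway file names in a scratch directory (mktemp -d is assumed to be available, as it is on most current systems): printf knows nothing about wildcards, yet it still receives the expanded names.

```shell
# Work in a scratch directory so the demo touches nothing else.
dir=$(mktemp -d)
cd "$dir" || exit 1
touch abc1.dat abc2.dat b.dat

# The shell expands abc*.dat before the command runs;
# printf never sees the *, only the resulting names.
expanded=$(printf '%s\n' abc*.dat)
echo "$expanded"

# Quoted, the pattern is passed through untouched.
literal=$(printf '%s\n' 'abc*.dat')
echo "$literal"

# Clean up the scratch directory.
cd / && rm -rf "$dir"
```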

mput is independent of the shell however, and does its own processing of *, but it works the same way as the shell.

You're going to need to modify your shell script I think. Perhaps you can tell it the prefix you want, instead of giving it the entire "abc*.dat", and the shell can replace that itself, and you can feed that into awk in the form you want as well.

To modify your shell script I'll need to actually see it of course.
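
Without seeing the script, a rough sketch of the prefix idea could look like this. PREFIX and GLOB are hypothetical names, the extension is assumed to always be .dat, and the here-document stands in for the real sftp_log. Using index() compares the prefix as a plain string, so nothing in it needs regex escaping.

```shell
PREFIX=abc             # hypothetical: what the user passes instead of abc*.dat
GLOB="${PREFIX}*.dat"  # what the script would hand to mput inside sftp
echo "would run: mput $GLOB"

# A few lines in the format of the "ls -l" part of the log.
result=$(awk -v pre="$PREFIX" '
    # Regular files only; the name must start with the prefix and end in .dat.
    /^-/ && index($9, pre) == 1 && $9 ~ /[.]dat$/ { print $9 "|" $5 }
' <<'EOF'
-rw-r--r--    0 600598020 600598020        0 Feb 16 14:32 a.dat
-rw-r--r--    0 600598020 600598020       21 Feb 16 14:35 abc140212095613.dat
-rw-r--r--    0 600598020 600598020       52 Feb 16 14:35 abc14022012.dat
-rw-r--r--    0 600598020 600598020        0 Feb 16 14:32 b.dat
EOF
)
echo "$result"
```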

---------- Post updated at 10:01 AM ---------- Previous update was at 09:55 AM ----------

This kludge might also work, but it's not pretty. It replaces all . by \., all * by .*, prepends ^, and appends $ to turn simple globs into regular expressions. If your awk doesn't have gsub, use nawk.

Code:
awk -v STR="*.dat" 'BEGIN { gsub(/[.]/, "\\.", STR); gsub(/[*]/, ".*", STR); STR="^" STR "$" }; /^-/ && ($9 ~ STR) { print $9 "|" $5 }'

# 7  
Old 02-17-2012
Thanks a lot Corona688 for this valuable advice. I'll check if this is possible. One thing: TRANSFERRING_FNAME is assigned whatever value the user passes as the argument to the shell script, so it can contain anything from an exact name such as abc140212095639.dat to abc*.dat or abc?.dat.
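
Since the argument may contain ? as well as *, the kludge from post #6 could be extended in the same spirit. This is only a sketch under assumptions, not a full glob translator: it handles just ., * and ?, ignores bracket expressions such as [a-c], and the function name glob_to_regex is made up. It writes the literal dot as [.] so that the result contains no backslash and survives being passed through awk -v.

```shell
# glob_to_regex GLOB: print an anchored regex for a simple glob.
glob_to_regex() {
    awk -v g="$1" 'BEGIN {
        gsub(/[.]/, "[.]", g)   # literal dot
        gsub(/[*]/, ".*", g)    # * = any run of characters
        gsub(/[?]/, ".", g)     # ? = exactly one character
        print "^" g "$"
    }'
}

re_star=$(glob_to_regex 'abc*.dat')
re_qmark=$(glob_to_regex 'abc?.dat')
echo "$re_star"
echo "$re_qmark"

# Apply the converted pattern to lines in the format of the log listing.
matches=$(awk -v re="$re_star" '/^-/ && $9 ~ re { print $9 "|" $5 }' <<'EOF'
-rw-r--r--    0 600598020 600598020        0 Feb 16 14:32 a.dat
-rw-r--r--    0 600598020 600598020       21 Feb 16 14:35 abc140212095613.dat
-rw-r--r--    0 600598020 600598020        0 Feb 16 14:32 b.dat
EOF
)
echo "$matches"
```

In the real script the first argument would be "$TRANSFERRING_FNAME" and the input would be sftp_log. An exact name such as abc140212095639.dat passes through the same function and simply becomes an anchored literal match.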