Extract values in a line using awk


 
# 8  
Old 06-07-2016
Try this small adaptation of zaxxon's file proposal:

Code:
sed -n '/HomeZone/ s/.*CustomerId="\([^"]\+\)".*s_PackageId=\([^,]\+\).*s_HZINumbers=\[\([^]]\+\)\].*/\1|\2|\3/p' file3
967666666666|"HomeZone"|"17103","10803","100000","200000","300000"

# 9  
Old 06-07-2016
An awk approach with slightly different logic from RavinderSingh13's (it also works on IDs longer or shorter than 12 digits):


Code:
awk '
match ($0, /s_PackageId="HomeZone"/)    {PI = substr ($0, RSTART+12, RLENGTH-12)
                                         match ($0, /CustomerId="[^"]*/)
                                         CI = substr ($0, RSTART+12, RLENGTH-12)
                                         match ($0, /HZINumbers=\[[^]]*/)
                                         HN = substr ($0, RSTART+12, RLENGTH-12)
                                         print CI, PI, HN
                                        }
' OFS="|" file
967666666666|"HomeZone"|"17103","10803","100000","200000","300000"
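
The fixed offset of 12 in the code above works because each of the three literal prefixes (s_PackageId=, CustomerId=" and HZINumbers=[) happens to be 12 characters long. A minimal sketch that derives the offset from the prefix instead, so that a field name of a different length could be added later. The names extract_homezone and grab are hypothetical and not from the thread; input lines are assumed to look like the samples posted here:

```shell
# Hypothetical helper: print CustomerId|s_PackageId|s_HZINumbers for HomeZone lines.
extract_homezone() {
    awk -v OFS='|' '
    # Return the text matched by regex re in str without its first pfxlen
    # characters (the literal prefix); empty string when re does not match.
    function grab(str, re, pfxlen) {
        if (!match(str, re)) return ""
        return substr(str, RSTART + pfxlen, RLENGTH - pfxlen)
    }
    /s_PackageId="HomeZone"/ {
        CI = grab($0, "CustomerId=\"[^\"]*",     length("CustomerId=\""))
        PI = grab($0, "s_PackageId=\"[^\"]*\"",  length("s_PackageId="))
        HN = grab($0, "s_HZINumbers=\\[[^]]*",   length("s_HZINumbers=["))
        print CI, PI, HN
    }' "$@"
}
```

Usage would be extract_homezone file, or piping lines into extract_homezone. OFS is set with -v here so the function also works when reading standard input.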

# 10  
Old 06-16-2016
Hello All again,

The code above worked properly, but I ran into another problem. Sometimes the HZINumbers field doesn't appear in the line, and in that case I want the default value to be "0". How can I do that?

---------- Post updated at 04:05 AM ---------- Previous update was at 03:37 AM ----------

The below is an example:

Code:
Read:RPP(CustomerId="961666666666",OfferProfileKey=1,Key=7,category=ONLINE,prefetchFilter=-1,s_ActivationEndTime=32532613199999,s_ActivationStartTime=1459112400000,s_CRMTitle="-",s_CanBeSharedByMultipleRops=FALSE,s_InsertedViaBatch=TRUE,s_PackageId="HomeZone",s_PeriodStartPoint=0,vValidFrom=2016-03-28 00:00:00,vInvalidFrom=4000-01-01 00:00:00,s_FreeOfChargeModifyTasks=0,s_HZINumbers=["17103","10803","100000","200000","300000"],s_PeriodicBonus={CreditLimit=0,CreditMax=1000000},s_Touched=TRUE,bCategory=ONLINE,bSeriesId=0,bValidFrom=2009-01-01 00:00:00,bInvalidFrom=MAX_DATEANDTIME,s_Active=TRUE,s_ExpireDate=3000-01-01,s_FreeOfChargePeriods=0,s_NextPeriodAct=32532613199999,s_OnTouchDate=32532613199999,s_StartDate=1970-01-01,s_Valid=TRUE);


Read:RPP(CustomerId="961666666662",OfferProfileKey=1,Key=7,category=ONLINE,prefetchFilter=-1,s_ActivationEndTime=32532613199999,s_ActivationStartTime=1459112400000,s_CRMTitle="-",s_CanBeSharedByMultipleRops=FALSE,s_InsertedViaBatch=TRUE,s_PackageId="HomeZone",s_PeriodStartPoint=0,vValidFrom=2016-03-28 00:00:00,vInvalidFrom=4000-01-01 00:00:00,s_FreeOfChargeModifyTasks=0,,s_PeriodicBonus={CreditLimit=0,CreditMax=1000000},s_Touched=TRUE,bCategory=ONLINE,bSeriesId=0,bValidFrom=2009-01-01 00:00:00,bInvalidFrom=MAX_DATEANDTIME,s_Active=TRUE,s_ExpireDate=3000-01-01,s_FreeOfChargePeriods=0,s_NextPeriodAct=32532613199999,s_OnTouchDate=32532613199999,s_StartDate=1970-01-01,s_Valid=TRUE);

As you can see, the HZINumbers field appears in the first line but not in the second, so I don't get any result for the second line even though CustomerId and s_PackageId are there.

and i want the output to be:

Code:
961666666666|"HomeZone"|"17103","10803","100000","200000","300000"
961666666662|"HomeZone"|0



Moderator's Comments:
Mod Comment Please use code tags as required by forum rules!

Last edited by RudiC; 06-16-2016 at 09:06 AM.. Reason: Added code tags.
# 11  
Old 06-16-2016
Hello nael_najib,

Could you please try the following and let me know how it goes?
Code:
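awk -v s1="CustomerId=" -v s2="PackageId" -v s3="s_HZINumbers" '
!/^$/ {
    # CustomerId: skip the name and the opening quote, keep the digits
    match($0, /CustomerId="[0-9]+/)
    Q = substr($0, RSTART + length(s1) + 1, RLENGTH - length(s1) - 1)

    # PackageId: the match ends at the opening quote, so the lengths below
    # only fit the 8-character value "HomeZone"
    match($0, /PackageId[^"]*"/)
    Q = substr($0, RSTART + length(s2) + 1, RLENGTH - 2) ? Q OFS substr($0, RSTART + length(s2) + 1, RLENGTH - 1) : Q OFS 0

    # s_HZINumbers: without a match substr() returns "" and 0 is printed instead
    match($0, /s_HZINumbers[^]]*/)
    Q = substr($0, RSTART + length(s3) + 2, RLENGTH - length(s3) - 2) ? Q OFS substr($0, RSTART + length(s3) + 2, RLENGTH - length(s3) - 2) : Q OFS 0
    print Q
}' OFS="|" Input_file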

Output will be as follows.
Code:
961666666666|"HomeZone"|"17103","10803","100000","200000","300000"
961666666662|"HomeZone"|0

Thanks,
R. Singh
# 12  
Old 06-16-2016
Thank you Ravinder, it worked. But what if I want to add another field, for example s_NextPeriodAct, and give it the value 0 when it is not present in the line? What would the code look like?

I appreciate your patience and good response
# 13  
Old 06-16-2016
Wouldn't it be nice if you posted some efforts by yourself to be discussed? Adapt the solution given before:

Code:
awk '
match ($0, /s_PackageId="HomeZone"/)    {PI = substr ($0, RSTART+12, RLENGTH-12)
                                         match ($0, /CustomerId="[^"]*/)
                                         CI = substr ($0, RSTART+12, RLENGTH-12)
                                         if (match ($0, /HZINumbers=\[[^]]*/))
                                                HN = substr ($0, RSTART+12, RLENGTH-12)
                                           else HN = 0
                                         print CI, PI, HN         
                                        }
' OFS="|" file
961666666666|"HomeZone"|"17103","10803","100000","200000","300000"
961666666662|"HomeZone"|0
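
One way the same if/else idea could be extended to the s_NextPeriodAct field asked about above is to move the "match or default" step into a small function, so that each extra field costs one line. This is only a sketch: extract_with_defaults and get are hypothetical names, and it assumes s_NextPeriodAct is always a plain run of digits, as in the sample lines:

```shell
# Hypothetical wrapper: CustomerId|s_PackageId|s_HZINumbers|s_NextPeriodAct,
# with 0 substituted for the last two fields when they are missing.
extract_with_defaults() {
    awk -v OFS='|' '
    # pre: regex for the literal prefix, val: regex for the value.
    # Returns the value part of the first match in $0, or dflt when
    # the field is absent or its value is empty.
    function get(pre, val, dflt,    s) {
        if (!match($0, pre val)) return dflt
        s = substr($0, RSTART, RLENGTH)
        sub("^" pre, "", s)              # strip the prefix, keep the value
        return (s == "") ? dflt : s
    }
    /s_PackageId="HomeZone"/ {
        CI = get("CustomerId=\"",      "[^\"]*",       0)
        PI = get("s_PackageId=",       "\"[^\"]*\"",   0)
        HN = get("s_HZINumbers=\\[",   "[^]]*",        0)
        NP = get("s_NextPeriodAct=",   "[0-9]+",       0)
        print CI, PI, HN, NP
    }' "$@"
}
```

Called as extract_with_defaults file, a line without s_HZINumbers or s_NextPeriodAct would get 0 in that column, and a further field needs only one more get() line plus an extra variable in the print statement.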

# 14  
Old 06-16-2016
Now the code is clearer and more fields can be added. Thank you all for your efforts.