File filtering using awk or sed


 
# 1  
Old 12-30-2015

Hello Members,

I have a file with the contents below:

Code:
<KEYVALUE>$4,0,1,4,966505098999--&gt;RemoteSPC: 13 SSN: 146</KEYVALUE>
<KEYVALUE>$4,123,1,4,966505050198--&gt;RemoteSPC: 1002 SSN: 222,Sec:RemoteSPC: 1004 SSN: 222</KEYVALUE>
<KEYVALUE>$4,123,1,4,966505050598--&gt;RemoteSPC: 1002 SSN: 222,Sec:RemoteSPC: 1004 SSN: 222</KEYVALUE>
<KEYVALUE>$4,123,1,4,966505050998--&gt;RemoteSPC: 1008 SSN: 222,Sec:RemoteSPC: 1009 SSN: 222</KEYVALUE>
<KEYVALUE>GA12$4,0,1,4,966505070212--&gt;GA12 GA12 RemoteSPC: 8020 SSN: 146</KEYVALUE>
<KEYVALUE>GD13$4,0,1,4,966505030313--&gt;GD13 GD13 RemoteSPC: 3020 SSN: 146</KEYVALUE>

Using awk or sed, I would like to get output like this:

Code:
966505098999,13
966505050198,1002
966505050598,1002
966505050998,1008
966505070212,8020
966505030313,3020

Please help me and let me know how I can get the above output.

Regards,
Umar

Last edited by umarsatti; 12-30-2015 at 08:29 AM..
# 2  
Old 12-30-2015
Please use code tags, not icode tags, for multiline code samples.

What have you tried so far?

---------- Post updated at 13:17 ---------- Previous update was at 13:06 ----------

Anyway, to get you started:
Code:
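while IFS= read -r CONT; do
	num="${CONT%%--*}"      # drop everything from the first "--" onward
	num="${num##*,}"        # keep only the text after the last comma
	spc="${CONT#*SPC: }"    # drop everything through the first "SPC: "
	spc="${spc%% *}"        # keep the number, up to the next space
	echo "${num},${spc}"
done <input_file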

hth
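
Since the question also mentions sed, here is a minimal sketch of the same extraction with a single substitution. It assumes every line has the shape shown in post #1: no "-" before the number, and no ":" between the "--" and the first "RemoteSPC:". The function name extract_spc_sed is made up for illustration.

```shell
#!/bin/sh
# Sketch only: extract_spc_sed is a hypothetical helper name, not from the thread.
# Prints "<number before -->,<first RemoteSPC value>" for each matching line.
extract_spc_sed() {
  # [^-]*,   : everything up to the last comma before the first "--"
  # [^:]*    : cannot cross a colon, so the FIRST "RemoteSPC:" is the one matched
  sed -n 's/^[^-]*,\([0-9][0-9]*\)--[^:]*RemoteSPC: \([0-9][0-9]*\).*/\1,\2/p' "$@"
}

# Usage with two of the sample lines from post #1
# (quoted heredoc so that $4 is not expanded by the shell).
extract_spc_sed <<'EOF'
<KEYVALUE>$4,0,1,4,966505098999--&gt;RemoteSPC: 13 SSN: 146</KEYVALUE>
<KEYVALUE>$4,123,1,4,966505050198--&gt;RemoteSPC: 1002 SSN: 222,Sec:RemoteSPC: 1004 SSN: 222</KEYVALUE>
EOF
```

Lines that do not match the pattern are dropped because of -n and the p flag. Pass a file name as an argument to read from a file instead of standard input.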
# 3  
Old 12-30-2015
Hello umarsatti,

The following may help.
Code:
awk --re-interval '{match($0,/[0-9]{12}/);B=substr($0,RSTART,RLENGTH);match($0,/RemoteSPC: [0-9]+/);print B OFS substr($0,RSTART+11,RLENGTH-11)}' OFS=", "   Input_file

Output will be as follows.
Code:
966505098999, 13
966505050198, 1002
966505050598, 1002
966505050998, 1008
966505070212, 8020
966505030313, 3020

EDIT: Adding a multi-line form of the same command as well.
Code:
awk --re-interval '{
    match($0,/[0-9]{12}/);
    B=substr($0,RSTART,RLENGTH);
    match($0,/RemoteSPC: [0-9]+/);
    print B OFS substr($0,RSTART+11,RLENGTH-11)
}' OFS=", "   Input_file
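
To spell out what the command above does, here is a commented sketch of the same match()/substr() idea. It is a variant, not the exact command: it avoids interval expressions so that --re-interval is not needed, anchors the number between the last comma and "--", and prints a bare comma to match the output requested in post #1. The name extract_spc_awk is made up for illustration.

```shell
#!/bin/sh
# Sketch only: extract_spc_awk is a hypothetical helper name, not from the thread.
extract_spc_awk() {
  awk '
  {
    # match() sets RSTART (1-based start) and RLENGTH (length) of the
    # leftmost match; it returns 0 when nothing matches.
    if (!match($0, /,[0-9]+--/)) next
    # Skip the leading "," and drop the trailing "--" (3 characters in total).
    num = substr($0, RSTART + 1, RLENGTH - 3)

    # The leftmost match is the first "RemoteSPC: <digits>" on the line.
    if (!match($0, /RemoteSPC: [0-9]+/)) next
    # "RemoteSPC: " is 11 characters long, so skip it and keep the digits.
    spc = substr($0, RSTART + 11, RLENGTH - 11)

    print num "," spc
  }' "$@"
}

# Usage with one sample line from post #1 (quoted heredoc keeps $4 literal).
extract_spc_awk <<'EOF'
<KEYVALUE>$4,123,1,4,966505050998--&gt;RemoteSPC: 1008 SSN: 222,Sec:RemoteSPC: 1009 SSN: 222</KEYVALUE>
EOF
```

Because match() always reports the leftmost match, the secondary "Sec:RemoteSPC" value later on the line is ignored.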

Thanks,
R. Singh

Last edited by RavinderSingh13; 12-30-2015 at 08:40 AM.. Reason: Added a non-one liner form for solution now.
# 4  
Old 12-30-2015
Thanks, I have edited my post in proper format.

Your code is not working for the attached file.

Can you please take a look at it and suggest?

Please explain your code as well.

Thanks!!
# 5  
Old 12-30-2015
The format of your file does not resemble the sample you posted in post #1 at all!


Moderator's Comments:
Mod Comment Please take proper care when posting data samples so that forum members that are trying to help you do not waste their valuable time...
# 6  
Old 12-30-2015
Hello umarsatti,

It seems the Input_file in your attachment has different input values compared to the sample shown in your very first post. The following may help with it.
Code:
awk --re-interval '{match($0,/[0-9]+/);B=substr($0,RSTART,RLENGTH);match($0,/SSN: [0-9]+/);print B OFS substr($0,RSTART+5,RLENGTH-5)}'  Input_file

Output will be as follows (this is only a sample, not the complete output).
Code:
966505097999 146
966505098999 146
966505050198 222
966505050598 222
966505050998 222
966505070212 146
966505030313 146
966505022317 146
966505020214 146
966505020915 146
966505022414 146
966505022315 146
966505022316 146
966505011427 146
966505011828 146
966505023108 146
966505060208 146
966505022318 146

Thanks,
R. Singh
# 7  
Old 12-30-2015
Thanks R.Singh and hth, both of your solutions worked perfectly.

And apologies, I mixed up the input file.