Filtering Issues Using sed and awk


 
# 1  
Old 11-24-2010

Hi,

I am currently using sed and awk to filter a file that has multiple sets of data in different columns. An example of part of the file I am filtering is as follows:

Code:
Sat Oct  2 07:42:45 2010    01:33:46 R1_CAR_12.34
Sun Oct  3 13:09:53 2010    00:02:34 R2_BUS_56.78
Sun Oct  3 21:11:39 2010    00:43:21 R3_TRAIN_COACH_90.12
Mon Oct  4 06:07:10 2010    00:01:50 R4_TRAIN_CARRAIGE_34.56X

When I filter the file I get the following result:

Code:
Sat,Oct,2,2010,01:33:46,CAR,
Sun,Oct,3,2010,00:02:34,BUS,
Sun,Oct,3,2010,00:43:21,TRAIN,
Mon,Oct,4,2010,00:01:50,TRAIN,X


The sed and awk commands I am using are as follows:

Code:
sed 's/[^ \t][^ \t]*[ \t]//4;s/[^ \t_]*_//;s/_.*\(.\)$/ \1/;s/[^X]$//' | awk '{print $1","$2","$3","$4","$5","$6","$7}'

I am trying to figure out how to filter the data so that, for example, instead of getting:


Code:
Sat,Oct,2,2010,01:33:46,CAR,
Sun,Oct,3,2010,00:02:34,BUS,
Sun,Oct,3,2010,00:43:21,TRAIN,
Mon,Oct,4,2010,00:01:50,TRAIN,X

I would like to get:

Code:
Sat,Oct,2,2010,01:33:46,CAR,
Sun,Oct,3,2010,00:02:34,BUS,
Sun,Oct,3,2010,00:43:21,COACH,
Mon,Oct,4,2010,00:01:50,CARRAIGE,X


Could I use the sed command twice, so that I would get

Code:
Sat Oct  2 07:42:45 2010    01:33:46 CAR
Sun Oct  3 13:09:53 2010    00:02:34 BUS
Sun Oct  3 21:11:39 2010    00:43:21 TRAIN_COACH
Mon Oct  4 06:07:10 2010    00:01:50 TRAIN_CARRAIGE X

first, and then use sed again to remove the "TRAIN_" part to get:

Code:
Sat Oct  2 07:42:45 2010    01:33:46 CAR
Sun Oct  3 13:09:53 2010    00:02:34 BUS
Sun Oct  3 21:11:39 2010    00:43:21 COACH
Mon Oct  4 06:07:10 2010    00:01:50 CARRAIGE X

This is only a suggestion; there is probably a much better method.

Unfortunately I am new to Unix, so I am only just getting used to all the commands.

If I have made anything unclear, please let me know and I will try to explain the problem better.

Any help would be greatly appreciated.

Thanks in advance
# 2  
Old 11-24-2010
Code:
nawk '{n=split($NF,a,"[_.]");print $1,$2,$3,$5,$6,a[n-2],(/[A-Za-z]$/)?substr($0,length):""}' OFS=, myFile
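
For readers new to awk, the one-liner above can be unpacked roughly as follows. This is only an annotated sketch of the same logic, not a different method. It assumes plain awk is available and uses -v OFS=, where the original uses nawk with a trailing OFS=, and filter_log is a hypothetical function name chosen for illustration.

```shell
# filter_log is a hypothetical wrapper; it reads the named files or standard input.
filter_log() {
    awk -v OFS=, '{
        # Split the last field on "_" or "."; n is the number of pieces.
        n = split($NF, a, "[_.]")
        # The name is the third piece from the end.
        # This assumes the trailing number contains exactly one ".".
        name = a[n-2]
        # Keep the final character of the line only when it is a letter.
        flag = (/[A-Za-z]$/) ? substr($0, length($0)) : ""
        # $4 (the time of day) is skipped on purpose.
        print $1, $2, $3, $5, $6, name, flag
    }' "$@"
}

# Sample lines copied from post #1.
filter_log <<'EOF'
Sat Oct  2 07:42:45 2010    01:33:46 R1_CAR_12.34
Sun Oct  3 13:09:53 2010    00:02:34 R2_BUS_56.78
Sun Oct  3 21:11:39 2010    00:43:21 R3_TRAIN_COACH_90.12
Mon Oct  4 06:07:10 2010    00:01:50 R4_TRAIN_CARRAIGE_34.56X
EOF
```

Counting from the end of the split array is what lets the same command return CAR for R1_CAR_12.34 and COACH for R3_TRAIN_COACH_90.12.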

# 3  
Old 11-24-2010
Hi vgersh,

The nawk command is working perfectly. Is there any way to add a comma as a delimiter between the different sets of data, i.e. instead of

Code:
SatOct2201000:30:21CAR
SatOct2201000:30:24BUS
SatOct2201000:33:14COACH
SatOct2201000:41:51CARRAIGEX

that I am getting, I would be able to get

Code:
Sat,Oct,2,2010,00:30:21,CAR,
Sat,Oct,2,2010,00:30:24,BUS,
Sat,Oct,2,2010,00:33:14,COACH,
Sat,Oct,2,2010,00:41:51,CARRAIGE,X

instead?

Thanks in advance
# 4  
Old 11-24-2010
Based on the sample file you provided, this is the output I get:
Code:
Sat,Oct,2,2010,01:33:46,CAR,
Sun,Oct,3,2010,00:02:34,BUS,
Sun,Oct,3,2010,00:43:21,COACH,
Mon,Oct,4,2010,00:01:50,CARRAIGE,X

don't forget the OFS=, in the code I've posted!
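
To illustrate why OFS matters here: the commas inside an awk print statement are replaced by the output field separator, which is a single space by default. The small sketch below uses a hypothetical helper, show_ofs, and -v OFS=... in place of the trailing OFS=, form. An empty OFS gives run-together output like that shown in post #3, although that is only one possible explanation for what happened there.

```shell
# show_ofs is a hypothetical helper: $1 is the value given to OFS,
# and the input line is read from standard input.
show_ofs() {
    awk -v OFS="$1" '{ print $1, $2, $3 }'
}

echo 'Sat Oct 2' | show_ofs ' '   # prints: Sat Oct 2  (same as the default)
echo 'Sat Oct 2' | show_ofs ','   # prints: Sat,Oct,2
echo 'Sat Oct 2' | show_ofs ''    # prints: SatOct2
```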
# 5  
Old 11-25-2010
Hi,

A problem has cropped up. Whenever the program tries to filter the following line:

Code:
Mon Oct 11 15:07:16 2010    00:01:30 R3_TRAIN_COACH_12.1.2X

I get the following output:

Code:
Mon,Oct,11,2010,00:01:30,12,X

Is there any way to alter the code so that "COACH" is extracted instead of the "12"?

The command that I am using is as follows:

Code:
nawk '{n=split($NF,a,"[_.]");print $1,$2,$3,$5,$6,a[n-2],(/[A-Za-z]$/)?substr($0,length):""}' OFS=, $FileName

Any help would be greatly appreciated

Thanks in advance

Last edited by crunchie; 11-25-2010 at 11:28 AM..
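
No reply to this last question appears in the thread. The cause is that the last field is split on both "_" and ".", so R3_TRAIN_COACH_12.1.2X yields six pieces and a[n-2] lands on 12. One possible fix, offered only as a sketch, is to split on "_" alone and take the piece just before the last one. It assumes the name never contains "_" after the part that is wanted and that the trailing number never contains "_". The function name extract_name is hypothetical, and plain awk with -v OFS=, stands in for nawk with a trailing OFS=,.

```shell
# extract_name is a hypothetical wrapper; it reads the named files or standard input.
extract_name() {
    awk -v OFS=, '{
        # Split on "_" only, so any number of dots in the last piece is harmless.
        n = split($NF, a, "_")
        # The piece just before the last one is the name, e.g. COACH.
        name = a[n-1]
        # Keep the final character of the line only when it is a letter.
        flag = (/[A-Za-z]$/) ? substr($0, length($0)) : ""
        print $1, $2, $3, $5, $6, name, flag
    }' "$@"
}

extract_name <<'EOF'
Mon Oct 11 15:07:16 2010    00:01:30 R3_TRAIN_COACH_12.1.2X
Sat Oct  2 07:42:45 2010    01:33:46 R1_CAR_12.34
EOF
# prints:
# Mon,Oct,11,2010,00:01:30,COACH,X
# Sat,Oct,2,2010,01:33:46,CAR,
```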