Filtering Issues Using sed and awk


 
# 1  
Old 11-24-2010

Hi,

I am currently using the sed and awk commands to filter a file that has multiple sets of data in different columns. An example of part of the file I am filtering is as follows:

Code:
Sat Oct  2 07:42:45 2010    01:33:46 R1_CAR_12.34
Sun Oct  3 13:09:53 2010    00:02:34 R2_BUS_56.78
Sun Oct  3 21:11:39 2010    00:43:21 R3_TRAIN_COACH_90.12
Mon Oct  4 06:07:10 2010    00:01:50 R4_TRAIN_CARRAIGE_34.56X

When I filter the file, I get the following result:

Code:
Sat,Oct,2,2010,01:33:46,CAR,
Sun,Oct,3,2010,00:02:34,BUS,
Sun,Oct,3,2010,00:43:21,TRAIN,
Mon,Oct,4,2010,00:01:50,TRAIN,X


The sed and awk commands I am using are as follows:

Code:
sed 's/[^ \t][^ \t]*[ \t]//4;s/[^ \t_]*_//;s/_.*\(.\)$/ \1/;s/[^X]$//' | awk '{print $1","$2","$3","$4","$5","$6","$7}'

I am trying to figure out how to filter the data so that, for example, instead of getting:


Code:
Sat,Oct,2,2010,01:33:46,CAR,
Sun,Oct,3,2010,00:02:34,BUS,
Sun,Oct,3,2010,00:43:21,TRAIN,
Mon,Oct,4,2010,00:01:50,TRAIN,X

I would like to get:

Code:
Sat,Oct,2,2010,01:33:46,CAR,
Sun,Oct,3,2010,00:02:34,BUS,
Sun,Oct,3,2010,00:43:21,COACH,
Mon,Oct,4,2010,00:01:50,CARRAIGE,X


Could I use the sed command twice, so that I would get:

Code:
Sat Oct  2 07:42:45 2010    01:33:46 CAR
Sun Oct  3 13:09:53 2010    00:02:34 BUS
Sun Oct  3 21:11:39 2010    00:43:21 TRAIN_COACH
Mon Oct  4 06:07:10 2010    00:01:50 TRAIN_CARRAIGE X

first, and then use the sed command again to remove the "TRAIN_" part to get:

Code:
Sat Oct  2 07:42:45 2010    01:33:46 CAR
Sun Oct  3 13:09:53 2010    00:02:34 BUS
Sun Oct  3 21:11:39 2010    00:43:21 COACH
Mon Oct  4 06:07:10 2010    00:01:50 CARRAIGE X

This is only a suggestion; there is probably a much better method.

Unfortunately, I am new to Unix, so I am only just getting used to all the commands.

If I have made anything unclear, please let me know and I will try to explain the problem better.

Any help would be greatly appreciated.

Thanks in advance
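
The two-step idea described above (first drop the "R1_" prefix and the numeric suffix, then drop a leading "TRAIN_") could be sketched with sed alone, roughly as below. This is only a sketch, not something posted in the thread: the function name strip_codes is made up for illustration, and the patterns assume upper-case names and a numeric suffix with an optional single trailing letter, as in the sample lines.

```shell
# Hypothetical helper (not from the thread). Each -e expression is one step:
#   1. drop the leading code such as "R1_"
#   2. drop a purely numeric suffix such as "_12.34"
#   3. turn a suffix such as "_34.56X" into " X"
#   4. drop one remaining leading word such as "TRAIN_"
strip_codes() {
    sed -e 's/ R[0-9]*_/ /' \
        -e 's/_[0-9][0-9.]*$//' \
        -e 's/_[0-9][0-9.]*\([A-Za-z]\)$/ \1/' \
        -e 's/ [A-Z]*_\([A-Z]*\)/ \1/' "$@"
}

# Example: read sample lines from standard input.
printf '%s\n' \
    'Sun Oct  3 21:11:39 2010    00:43:21 R3_TRAIN_COACH_90.12' \
    'Mon Oct  4 06:07:10 2010    00:01:50 R4_TRAIN_CARRAIGE_34.56X' |
    strip_codes
```

The result still contains the clock time in the fourth column, so it would need a further step (such as the awk print already in use) to pick the wanted fields and add the commas.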
# 2  
Old 11-24-2010
Code:
nawk '{n=split($NF,a,"[_.]");print $1,$2,$3,$5,$6,a[n-2],(/[A-Za-z]$/)?substr($0,length):""}' OFS=, myFile
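
For readers new to awk, the one-liner above can be read as the expanded sketch below. The wrapper function filter_log and the variable names name and flag are made up for illustration. Plain awk is assumed to behave like nawk for these features, and -v OFS=, is used in place of the trailing OFS=, assignment, which should have the same effect here.

```shell
# Expanded, commented form of the one-liner (hypothetical wrapper name).
filter_log() {
    awk -v OFS=, '
    {
        # Split the last field, e.g. R3_TRAIN_COACH_90.12, on "_" or ".".
        n = split($NF, a, "[_.]")
        # The last two pieces are the numeric parts (90 and 12),
        # so the wanted name sits at position n-2.
        name = a[n - 2]
        # If the line ends in a letter, keep that final character as a flag.
        flag = (/[A-Za-z]$/) ? substr($0, length) : ""
        # Fields 1-3 are weekday, month, day; field 5 is the year;
        # field 6 is the duration. Field 4 (clock time) is skipped.
        print $1, $2, $3, $5, $6, name, flag
    }' "$@"
}

# Example: filter two of the sample lines from standard input.
printf '%s\n' \
    'Sun Oct  3 21:11:39 2010    00:43:21 R3_TRAIN_COACH_90.12' \
    'Mon Oct  4 06:07:10 2010    00:01:50 R4_TRAIN_CARRAIGE_34.56X' |
    filter_log
```

Note that counting back from the end (n-2) relies on the numeric part containing exactly one dot.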

# 3  
Old 11-24-2010
Hi vgersh,

The nawk command is working perfectly. Is there any way to add a comma as a delimiter between the different fields? That is, instead of

Code:
SatOct2201000:30:21CAR
SatOct2201000:30:24BUS
SatOct2201000:33:14COACH
SatOct2201000:41:51CARRAIGEX

which is what I am getting, could I get

Code:
Sat,Oct,2,2010,00:30:21,CAR,
Sat,Oct,2,2010,00:30:24,BUS,
Sat,Oct,2,2010,00:33:14,COACH,
Sat,Oct,2,2010,00:41:51,CARRAIGE,X

instead?

Thanks in advance
# 4  
Old 11-24-2010
Based on the sample file you provided, this is the output I get:
Code:
Sat,Oct,2,2010,01:33:46,CAR,
Sun,Oct,3,2010,00:02:34,BUS,
Sun,Oct,3,2010,00:43:21,COACH,
Mon,Oct,4,2010,00:01:50,CARRAIGE,X

Don't forget the OFS=, in the code I've posted!
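
To illustrate why OFS matters: the commas between the arguments of print are replaced by the output field separator, which is a single space unless it is changed. The short sketch below uses -v OFS=, rather than the trailing OFS=, assignment (the two should be equivalent for this purpose); the variable name line is only for illustration.

```shell
line='Sat Oct  2 07:42:45 2010    01:33:46 R1_CAR_12.34'

# Default OFS is a single space, so this prints: Sat Oct 2
printf '%s\n' "$line" | awk '{print $1, $2, $3}'

# With OFS set to a comma, this prints: Sat,Oct,2
printf '%s\n' "$line" | awk -v OFS=, '{print $1, $2, $3}'
```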
# 5  
Old 11-25-2010
Hi,

A problem has cropped up. Whenever the program tries to filter the following line:

Code:
Mon Oct 11 15:07:16 2010    00:01:30 R3_TRAIN_COACH_12.1.2X

I get the following output:

Code:
Mon,Oct,11,2010,00:01:30,12,X

Is there any way to alter the code so that COACH is extracted instead of the number 12?

The command that I am using is as follows:

Code:
nawk '{n=split($NF,a,"[_.]");print $1,$2,$3,$5,$6,a[n-2],(/[A-Za-z]$/)?substr($0,length):""}' OFS=, $FileName

Any help would be greatly appreciated.

Thanks in advance
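
The likely cause is that the command splits the last field on both "_" and "." and then counts back two pieces from the end, which only works when the numeric part contains exactly one dot. With 12.1.2X there are two dots, so a[n-2] lands on 12. One possible adjustment, which is not from the thread and assumes the name is always the piece just before the last underscore, is to split on "_" only and take a[n-1]. The function name filter_log_v2 is made up for illustration, and plain awk is assumed in place of nawk.

```shell
# Hypothetical variant of the posted command (not from the thread).
filter_log_v2() {
    awk -v OFS=, '
    {
        # Split on "_" only, so extra dots in the trailing part
        # (for example 12.1.2X) no longer shift the index.
        n = split($NF, a, "_")
        # Keep the final character as a flag if the line ends in a letter.
        flag = (/[A-Za-z]$/) ? substr($0, length) : ""
        # a[n] is the numeric part, so a[n-1] is the wanted name.
        print $1, $2, $3, $5, $6, a[n - 1], flag
    }' "$@"
}

# Example with the problem line; a file name can be passed instead.
printf '%s\n' 'Mon Oct 11 15:07:16 2010    00:01:30 R3_TRAIN_COACH_12.1.2X' |
    filter_log_v2
```

The earlier sample lines should still give the same output as before, since their last underscore also sits directly before the numeric part.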

Last edited by crunchie; 11-25-2010 at 11:28 AM..
