awk usage


 
# 1  
Old 02-19-2016
awk usage

Hi

I have a file like this

Code:
user="joe"
phone="4280919"
duration=128
login=12/02/2016
ip-address=10.230.120.35

user="mary"
phone="6352728"
duration=23
login=16/02/2016
ip-address=10.123.231.54

I need to convert this to CSV format as follows (ignoring the login field):
Code:
user,phone,duration,ip-address
joe,4280919,128,10.230.120.35
mary,6352728,23,10.123.231.54

My current code is:
Code:
$AWK 'BEGIN {
  RS="" ; FS="\n"
  print "user,phone,duration,ip-address";
}
{delete usr; delete ph; delete dur; delete ip;
 for (i=1; i<= NF; i++) {
   if ($i ~"user") {split ($i,usr,"=");gsub("\"","",usr[2]); };
   if ($i ~"phone") {split ($i,ph,"=");gsub("\"","",ph[2]); };
   if ($i ~"duration") {split ($i,dur,"=") ; };
   if ($i ~"ip-address") {split ($i,ip,"="); };
}
printf("%s,%s,%s,%s\n",usr[2],ph[2],dur[2],ip[2]);
}' infile

This works, but the input file is about 2 GB in size, so it takes too long to run. Can someone suggest a faster solution?

# 2  
Old 02-19-2016
Hello Balav,

Please use code tags, as per the forum rules, for all commands, code, and sample input in your posts. The following may help you:
Code:
awk -F"=" '/^user/ && A{A=A ORS;B=""} {B=B?B OFS $1:$1;gsub(/\"/,X,$NF);A=A?A OFS $NF:$NF;} END{sub(/\n\,/,"\n",A);print B ORS A}' OFS=,   Input_file

Output will be as follows.
Code:
user,phone,duration,login,ip-address
joe,4280919,128,12/02/2016,10.230.120.35
mary,6352728,23,16/02/2016,10.123.231.54

Thanks,
R. Singh
# 3  
Old 02-19-2016
Ignore a field

Thanks, but how do I ignore a field that I don't need? I want to leave out the "login" date.

Regards,
Bala
# 4  
Old 02-19-2016
Hello Bala,

Apologies for missing that part. Could you please try the following and let me know if it helps?
Code:
awk -F"=" '/^login/{next} /^user/ && A{A=A ORS;B=""}{B=B?B OFS $1:$1;gsub(/\"/,X,$NF);A=A?A OFS $NF:$NF;} END{sub(/\n\,/,"\n",A);print B ORS A}' OFS=,   Input_file

Output will be as follows.
Code:
user,phone,duration,ip-address
joe,4280919,128,10.230.120.35
mary,6352728,23,10.123.231.54

Thanks,
R. Singh
# 5  
Old 02-19-2016
Thanks very much for your help

Best Regards,
Bala
# 6  
Old 02-19-2016
Hi Balav,
Can we break this into two steps, using sed and awk?
Code:
sed -i 's/"//g' file

Code:
awk 'BEGIN{print "user,phone,duration,ip-address";RS="";OFS=",";FS="="}  { print $2, $4, $6, $10}' file

---------- Post updated at 04:09 PM ---------- Previous update was at 04:01 PM ----------

Hello Mr Ravinder,

Could you please explain how your code improves performance? Also, which part of Balav's code was taking so long?
Thanks,

@Balavs

Hello Mr Balav,
Could you please share the throughput you get now?

Thanks in advance
# 7  
Old 02-19-2016
If the records are separated by two newlines with no spaces and the fields are always in the same order, one could also try:
Code:
awk 'FNR==1{print $1,$3,$5,$9}{print $2,$4,$6,$10}' FS='[="\n]*' OFS=, RS=

This is somewhat similar to what looney proposes, and it also uses RS="" (as the OP did).

output:
Code:
user,phone,duration,ip-address
joe,4280919,128,10.230.120.35
mary,6352728,23,10.123.231.54
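
To see why the odd-numbered fields hold the names and the even-numbered fields hold the values, it can help to dump the fields of a single record. A small sketch with an inline sample record; note that it uses `+` rather than `*` in the separator regex, which is an assumption made here so that the separator can never match an empty string:

```shell
# One sample record in the layout from post #1.
record='user="joe"
phone="4280919"
duration=128
login=12/02/2016
ip-address=10.230.120.35'

# RS= switches on paragraph mode; FS treats any run of =, " or newline
# as a single separator. Each field is printed with its index.
fields=$(printf '%s\n' "$record" |
  awk '{ for (i = 1; i <= NF; i++) print i ": " $i }' FS='[="\n]+' RS=)

printf '%s\n' "$fields"
```

Fields 7 and 8 are the login name and value, which is why the one-liner above skips from 5 to 9 and from 6 to 10.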

---
I did a test with 2,000,000 lines:
Code:
Original solution post #1                      : 10 seconds
This suggestion                                : 2.5 seconds
looney's suggestion with sed and awk combined  : 3 seconds
Ravinder's suggestion in post #4               : > 240 seconds before I broke it off (it also used a lot of memory and produced extra commas at the start or the end)


so I guess this is down to the extra splitting and gsubbing...
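
The memory use of the post #4 approach comes from collecting every output line in one string that is only printed in END. For comparison, here is a line-oriented sketch that prints each record as soon as the next one starts, so only one row is held at a time. It has not been timed against the solutions above. It assumes every record begins with a `user` line and that values contain no `=`; the variable names `row` and `v` and the temporary sample file are only for illustration:

```shell
# Sample input in the layout from post #1.
sample=$(mktemp)
cat > "$sample" <<'EOF'
user="joe"
phone="4280919"
duration=128
login=12/02/2016
ip-address=10.230.120.35

user="mary"
phone="6352728"
duration=23
login=16/02/2016
ip-address=10.123.231.54
EOF

streamed=$(awk -F'=' '
  BEGIN     { print "user,phone,duration,ip-address" }
  /^login=/ { next }                           # drop the unwanted field
  /^user=/  { if (row != "") print row         # new record: flush the previous row
              row = "" }
  NF >= 2   { v = $2; gsub(/"/, "", v)         # strip quotes from the value only
              row = (row == "") ? v : (row "," v) }
  END       { if (row != "") print row }       # flush the last record
' "$sample")

printf '%s\n' "$streamed"
rm -f "$sample"
```

Blank separator lines have no fields and are skipped by the `NF >= 2` test, so they cannot add stray commas.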
