Gawk --- produce the output in pattern space instead of END space


 
# 1  
Old 09-24-2018

Hi,

I'm trying to count the requests that each IP address makes to our Apache server. The standard format of the input is

Code:
HOST IP DATE/TIME - - "GET/POST request" "User Agent"
HOST IP DATE/TIME - - "GET/POST request" "User Agent"
HOST IP DATE/TIME - - "GET/POST request" "User Agent"
HOST IP DATE/TIME - - "GET/POST request" "User Agent"
HOST IP DATE/TIME - - "GET/POST request" "User Agent"


I'm using the gawk code given below to do this, that is, to accumulate all requests for each IP in a given input file.

Code:
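gawk --re-interval -F\"  '
 # $1 is the text before the first double quote; its second word is the IP.
 /./  { split($1,IP," ")
        IPPP[IP[2]]++                         # request count per IP
        LINE[IP[2]]=LINE[IP[2]]"<br>"$2 }     # every request string, kept in memory
END  { for(i in LINE) printf("\n\n%s\t%s",i,LINE[i]) }' other_vhosts_access.log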


The problem:

The input file is actually around 47 GiB in size. Because the LINE array is only printed in the END space of gawk, every request has to be held in memory until then. The process consumes all the available memory of the system, and the system starts running out of memory for all other processes.

Question:

Can I print the output in the pattern space (the main rule) rather than in the END space, so that the requests for every matched IP are returned right away instead of being added to an array and displayed at the end?

------ Post updated at 02:16 PM ------

BTW, this code works fine for smaller files (for example, when I split the file into smaller chunks), but that doesn't satisfy the requirement: the whole file must be scanned at once so that I get the complete list of IPs.
# 2  
Old 09-24-2018
You could try sorting the input by IP first. Then, whenever the IP changes, print out the results for the IP that has just finished, so only one IP's data is held at a time. Sorting a file that big may be a challenge, too, but sort offers some options to deal with large files.
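
A minimal sketch of that sort-then-stream idea, not a tested solution for the 47 GiB file. It assumes the IP is the second field and that fields are separated by single spaces, as in the sample format above; the function name group_by_ip is made up for illustration. Plain awk is enough here and gawk should behave the same. The awk part remembers only the previous IP, so memory use no longer grows with the input. With GNU sort, -T (temporary directory) and -S (buffer size) are the kind of options that help with large files.

```shell
#!/bin/sh
# group_by_ip LOGFILE  (hypothetical helper name)
# Prints one line per IP:  IP<TAB><br>request1<br>request2...
group_by_ip() {
    # -t ' ' -k2,2 : sort on the second space-separated field (the IP)
    # -s           : stable sort, so requests of one IP keep their original order
    LC_ALL=C sort -s -t ' ' -k2,2 "$1" |
    awk -F'"' '
        {
            split($1, f, " ")              # f[2] is the IP
            if (NR > 1 && f[2] != prev)    # IP changed: finish the previous line
                printf "\n"
            if (NR == 1 || f[2] != prev)   # first line of a new IP: print the IP
                printf "%s\t", f[2]
            printf "<br>%s", $2            # print the request now, store nothing
            prev = f[2]
        }
        END { if (NR) printf "\n" }        # terminate the last line
    '
}
```

It would be called as group_by_ip other_vhosts_access.log > by_ip.txt (the output file name is just an example). The sort still has to process the whole file, but it spills to temporary files on disk instead of keeping everything in RAM.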
# 3  
Old 09-27-2018
Apart from what RudiC suggested, you can print each line's output from the pattern space to an individual file named after its IP, and after gawk finishes you can concatenate them all into a single output file...
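
A rough sketch of the per-IP-file idea, under the same assumption that the IP is the second space-separated field. The names split_by_ip and merge_ip_files are invented for this example, and the output directory is expected to be empty beforehand because the files are opened in append mode. Each file is closed right after the write so that awk does not run out of file descriptors when there are many distinct IPs; that costs speed, so treat it as a starting point rather than a tuned solution.

```shell
#!/bin/sh
# split_by_ip LOGFILE OUTDIR  (hypothetical helper name)
# Appends every request to OUTDIR/<IP>; nothing accumulates in memory.
split_by_ip() {
    mkdir -p "$2" &&
    awk -F'"' -v dir="$2" '
        {
            split($1, f, " ")          # f[2] is the IP
            out = dir "/" f[2]
            printf "<br>%s", $2 >> out # append, so reopening does not truncate
            close(out)                 # keep the number of open files at one
        }
    ' "$1"
}

# merge_ip_files OUTDIR  (hypothetical helper name)
# Prints one line per IP file:  IP<TAB><br>request1<br>request2...
merge_ip_files() {
    for f in "$1"/*; do
        [ -f "$f" ] || continue        # skip the unexpanded glob if OUTDIR is empty
        printf '%s\t' "${f##*/}"       # the file name is the IP
        cat "$f"
        printf '\n'
    done
}
```

Usage would look like split_by_ip other_vhosts_access.log ip_out followed by merge_ip_files ip_out > by_ip.txt, where ip_out and by_ip.txt are example names. Keep in mind that a log with very many distinct IPs produces that many small files in one directory.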