Full Discussion: awk eating too much memory?
Post 302561535 by Corona688 on Tuesday 4th of October 2011 12:06:09 PM
Quote:
Originally Posted by anishkumarv
ya its working fine, dude but my problem was only the file size,
First you said it's memory, then CPU time, now file size -- which is your goal here?
Quote:
The file contain these kind of data's from that it using awk its sorting only uniqe domain names alone. so even i used your code(Corona688) also its taking time and load,
Of course it takes time and load. 8 gigabytes of data isn't going to be sorted in a nanosecond.

I asked questions which could be used to further improve the code. Is BIZFILE actually needed for anything, now that you don't need to recalculate the database count? If not, leaving out { print $0 > BIZFILE } will avoid a lot of disk-writing and give some more boost.

I'm not quite following the logic in this awk script:
Code:
/^[^ ]+ IN NS/ && !_[$1]++{print $1; tot++}

Absolutely nothing in that domain file snippet of yours contains 'IN NS', so that ought to never match. It doesn't look like the first field is what you're actually interested in anyway. How does this work?
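
For reference, here is a small sketch of what that pattern does on input that does contain 'IN NS'. The sample lines are invented for illustration and are not taken from your file, and the END block is added only so the count is visible:

```shell
# Hypothetical sample input; these lines are made up for illustration
# and are not from the original domain file.
sample='EXAMPLE IN NS NS1.HOST.TEST.
EXAMPLE IN NS NS2.HOST.TEST.
OTHER IN NS NS1.HOST.TEST.
NS1.HOST.TEST. IN A 192.0.2.1
EXAMPLE IN NS NS3.HOST.TEST.'

# /^[^ ]+ IN NS/  keeps only lines shaped like "<name> IN NS ...".
# !_[$1]++        is true only the first time a given $1 is seen
#                 (the array is literally named "_"; the ++ marks it as seen).
# tot             counts how many unique first fields were printed.
result=$(printf '%s\n' "$sample" |
    awk '/^[^ ]+ IN NS/ && !_[$1]++ { print $1; tot++ } END { print "total: " tot }')

printf '%s\n' "$result"
```

Each distinct first field is printed once, and the 'IN A' line never matches, so whatever sits in $1 on the 'IN NS' lines is what ends up in the output.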

---------- Post updated at 10:06 AM ---------- Previous update was at 09:25 AM ----------

I've been trying to think of an awkless way for you, but so far I'm stumped.

Building it in pure C means needing an associative array, i.e. I'd end up just building a hardcoded implementation of awk. It'd have to be a really good associative array to get the necessary speed -- I bet awk's would be faster.

Building it with other shell commands means piping it through grep and cut before feeding it into a sort -u, and then afterwards, reprocessing the output again to get the record count -- either that, or doing a tee and wc -l. That's a 5-long pipe chain for 8GB of data -- in effect processing 40 gigs of data, not 8... That's not going to be more efficient.
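
A rough sketch of that pipe chain, assuming space-separated records and the same 'IN NS' filter as the awk script. The sample lines and the temporary file are invented for illustration:

```shell
# Hypothetical sample input, made up for illustration.
sample='EXAMPLE IN NS NS1.HOST.TEST.
EXAMPLE IN NS NS2.HOST.TEST.
OTHER IN NS NS1.HOST.TEST.
NS1.HOST.TEST. IN A 192.0.2.1'

out=$(mktemp)    # scratch file that receives the unique names

# grep filters, cut extracts field 1, sort -u deduplicates,
# tee saves the list, wc -l counts it: five processes in a row,
# each one reading everything the previous one wrote.
count=$(printf '%s\n' "$sample" |
    grep -E '^[^ ]+ IN NS' |
    cut -d ' ' -f 1 |
    sort -u |
    tee "$out" |
    wc -l)

count=$((count))     # normalize: some wc implementations pad with spaces
unique=$(cat "$out")
rm -f "$out"

printf '%s\n' "$unique"
echo "total: $count"
```

It gives the same list and count as the awk version, but every stage is a separate process copying data through a pipe, which is where the extra load comes from.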

I could build a C program that does the grep | cut for you, which would let you pipe it directly into sort -u | tee | wc -l. That's only a 4-long pipe chain... Unless you've got 4 cores, that's probably still not better than the script you have now.

awk's flexible enough to do everything in one shot, which is pretty tough to beat.
 
