Full Discussion: awk eating too much memory?
Post 302561535 by Corona688 on Tuesday 4th of October 2011 12:06:09 PM
Quote:
Originally Posted by anishkumarv
Yeah, it's working fine, dude, but my problem was only the file size,
First you said it's memory, then CPU time, now file size -- which is your goal here?
Quote:
The file contains this kind of data, and awk is used to pick out only the unique domain names. Even with your code (Corona688) it's still taking time and load,
Of course it takes time and load. 8 gigabytes of data isn't going to be sorted in a nanosecond.

I asked questions which could be used to further improve the code. Is BIZFILE actually needed for anything, now that you don't need to recalculate the database count? If not, leaving out { print $0 > BIZFILE } will avoid a lot of disk-writing and give some more boost.

I'm not quite following the logic in this awk script:
Code:
/^[^ ]+ IN NS/ && !_[$1]++{print $1; tot++}

Absolutely nothing in that domain file snippet of yours contains 'IN NS', so that ought to never match. It doesn't look like the first field is what you're actually interested in anyway. How does this work?
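For reference, here is how I read that one-liner, run against a few made-up records. The sample lines and the array name seen are my own assumptions (the original uses _ as the array name); your real file may be laid out differently, which is exactly why I'm asking.

```shell
# Made-up sample records; the real file's layout is an assumption.
sample='example.biz IN NS ns1.example.net
example.biz IN NS ns2.example.net
other.biz IN NS ns1.other.net
other.biz IN A 192.0.2.1'

# /^[^ ]+ IN NS/  keeps only lines shaped like "<name> IN NS ...".
# !seen[$1]++     is true only the first time a given first field appears,
#                 so each name is printed once.
unique=$(printf '%s\n' "$sample" |
    awk '/^[^ ]+ IN NS/ && !seen[$1]++ { print $1 }')

printf '%s\n' "$unique"
```

If a line doesn't start with a name followed by ' IN NS', the pattern rejects it and nothing is printed for it.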

---------- Post updated at 10:06 AM ---------- Previous update was at 09:25 AM ----------

I've been trying to think of an awkless way for you, but so far I'm stumped.

Building it in pure C means needing an associative array, i.e. I'm ending up just building a hardcoded implementation of awk. It'd have to be a really good associative array to get the necessary speed -- I bet awk's would be faster.

Building it with other shell commands means piping it through grep and cut before feeding it into a sort -u, and then afterwards, reprocessing the output again to get the record count -- either that, or doing a tee and wc -l. That's a 5-long pipe chain for 8GB of data -- in effect processing 40 gigs of data, not 8... That's not going to be more efficient.
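A rough sketch of that pipe chain, so you can see what I mean. The function name, the file arguments and the 'IN NS' pattern are assumptions for illustration only, borrowed from the awk one-liner above rather than from your real data.

```shell
# Hypothetical helper: $1 = input file, $2 = file to receive the unique names.
# Prints the number of unique names on stdout.
count_unique_pipeline() {
    grep -E '^[^ ]+ IN NS' "$1" |   # keep only the matching records
        cut -d ' ' -f 1 |           # first space-separated field
        sort -u |                   # sort and drop duplicates
        tee "$2" |                  # save the unique names...
        wc -l                       # ...and count them
}

# Usage (paths are placeholders):
#   count_unique_pipeline domains.txt unique.txt
```

Five processes, and every byte that survives one stage has to be copied through a pipe into the next.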

I could build a C program that does the grep | cut for you, which would let you pipe it directly into sort -u | tee | wc -l. That's only a 4-long pipe chain... Unless you've got 4 cores, that's probably still not better than the script you have now.

awk's flexible enough to do everything in one shot, which is pretty tough to beat.
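Something along these lines is what I mean by one shot: a single pass that writes the unique names and keeps the count, with no BIZFILE copy. It's only a sketch; the function name, the out variable and the pattern are my assumptions, not your actual script.

```shell
# Hypothetical helper: $1 = input file, $2 = file to receive the unique names.
# Prints the number of unique names on stdout.
one_shot() {
    awk -v out="$2" '
        # First sighting of a name on a matching line: save it and count it.
        /^[^ ]+ IN NS/ && !seen[$1]++ { print $1 > out; tot++ }
        # tot + 0 prints 0 rather than an empty line when nothing matched.
        END { print tot + 0 }
    ' "$1"
}

# Usage (paths are placeholders):
#   one_shot domains.txt unique.txt
```

The names come out in first-seen order rather than sorted, and the memory cost is the seen array, which holds one entry per unique name.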
 
