Need Optimization shell/awk script to aggregate (sum) for all the columns of Huge data file Post: 303025071

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Shell script to check the unique numbers in huge data

Friends, I have to write a shell script; the description is: I have to check the uniqueness of the numbers in a file. The file contains 200 thousand tickets, and each ticket has 15 numbers in ascending order. There is also a strip that holds 6 tickets, that is, 90 numbers. I...

2. Shell Programming and Scripting

Find percent between sum of 2 columns awk help

Hi, I'm new to this forum and I'm a beginner when it comes to shell and awk programming. But I have the following problem: I have 5 csv files (data1.csv, data2.csv, etc.) and need to calculate the average between the total sum of the 1st and 7th columns. csv example:...

3. Shell Programming and Scripting

awk sum columns

Can anyone help me add up columns using awk when they are separated by the character @? For example, I have 3@4 2@9 5@1 and the result should be 10 14. I tried { sum+= $1 } END { print sum } but it just gives the result 10. Can anyone help me with this one? Thank you and best regards
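A minimal sketch of one way to get both sums, assuming each pair sits on its own line and `@` is the field separator. The function name `sum_at_columns` is made up for illustration. With the default whitespace separator, `$1` is the whole string `3@4`, which awk converts to 3 by reading its leading digits, so the original attempt only ever summed the first column.

```shell
# Hypothetical helper: sum both "@"-separated columns of the input.
sum_at_columns() {
    awk -F'@' '
        { left += $1; right += $2 }          # accumulate each column separately
        END { print left + 0, right + 0 }    # "+ 0" prints 0 instead of an empty string on empty input
    ' "$@"
}

printf '3@4\n2@9\n5@1\n' | sum_at_columns
```

Called with a file name instead of a pipe (`sum_at_columns data.txt`, where `data.txt` is a placeholder), it reads that file.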

4. UNIX for Advanced & Expert Users

A variable and sum of its value in a huge data.

Hi Experts, I got a question.. In the following output of `ps -elf | grep DataFlow` I get:- 242001 A mqsiadm 2076676 1691742 0 60 20 26ad4f400 130164 * May 09 - 3:02 DataFlowEngine EAIDVBR1_BROKER 5e453de8-2001-0000-0080-fd142b9ce8cb VIPS_INQ1 0 242001 A mqsiadm...

5. Shell Programming and Scripting

AWK/Shell script for formatting data in a file

Hi All, I need urgent help converting a Unix file into a particular format: **source file:** 1111111 2d2f2h2 3dfgsd3 ........... 1111111 <-- repeats on every nth line; all remaining lines will be different 123ss41 432ff45 ........... 1111111 <-- repetition qwe1234 123weq3...

6. Shell Programming and Scripting

Sum up values of columns in 4 files using shell script

I am new to shell scripting. I have records like the ones below in 4 different files, which have about 10000 records each; all records are unique and sorted on column 2. 1 2 3 4 5 6 --------------------------- SR|1010478|000044590|1|0|0| SR|1014759|000105790|1|0|0| SR|1016609|000108901|1|0|0|...
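The post is cut off before the expected output, so the following is only a sketch under assumptions: that column 2 is the key, that columns 4 to 6 are the values to add up across the files, and that lines with fewer than six fields (such as a header or ruler line) can be skipped. The function name `sum_by_key` is hypothetical.

```shell
# Hypothetical helper: per key in column 2, add up columns 4-6 across all given files.
sum_by_key() {
    awk -F'|' -v OFS='|' '
        NF < 6 { next }                      # skip headers, rulers and blank lines
        {
            key = $2
            if (!(key in seen)) {            # remember first-seen order of keys
                seen[key] = 1
                order[++n] = key
            }
            c4[key] += $4; c5[key] += $5; c6[key] += $6
        }
        END {
            for (i = 1; i <= n; i++) {
                k = order[i]
                print k, c4[k], c5[k], c6[k]
            }
        }
    ' "$@"
}
```

A call such as `sum_by_key file1 file2 file3 file4` (placeholder file names) reads every file once and holds one set of totals per key in memory, which should be fine for about 10000 records per file.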

7. Shell Programming and Scripting

Awk based script to find the median of all individual columns in a data file

Hi All, I have some data like below. Step1,Param1,Param2,Param3 1,2,3,4 2,3,4,5 2,4,5,6 3,0,1,2 3,0,0,0 3,2,1,3 ........ so on Where I need to find the median(arithmetic) of each column from Param1...to..Param3 for each set of Step1 values. (Sort each specific column, if the...

8. Shell Programming and Scripting

awk based script to find the average of all the columns in a data file

Hi All, I need a modification of the code below (found in another post, https://www.unix.com/shell-programming-scripting/27161-script-generate-average-values.html) to find the average values for all the columns (but only for specific rows) and print the averages side by side. I have...

9. Shell Programming and Scripting

awk does not work well with huge data?

Dear all, I found that when working with thousands of lines of data, awk does not work perfectly. It cuts off hundreds of lines (the others are deleted) and works only on the remaining data. I used this command: awk '$1==1{$1="Si"}{print>FILENAME}' coba.xyz to change value of first column whose value is 1...
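A likely cause, offered as a guess because the post is cut off: `print > FILENAME` truncates coba.xyz while awk is still reading it, so any input not yet buffered is lost. A minimal sketch that writes to a temporary file first and only then overwrites the original follows. The function name `replace_first_column` is hypothetical, and `mktemp` is assumed to be available.

```shell
# Hypothetical helper: rewrite a file through a temporary copy instead of
# redirecting awk output onto the file it is still reading.
replace_first_column() {
    file=$1
    tmp=$(mktemp) || return 1
    # Same rule as in the post: a first column equal to 1 becomes "Si".
    if awk '$1 == 1 { $1 = "Si" } { print }' "$file" > "$tmp"; then
        cat "$tmp" > "$file"    # overwrite only after awk has finished reading
        rm -f "$tmp"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

It would be called as `replace_first_column coba.xyz`. Note that assigning to `$1` makes awk rebuild the line with single spaces between fields, as the original command does.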

10. Shell Programming and Scripting

Sum of columns using awk

Hello everyone I am a beginner in Shell scripting. Need your help to achieve desired result. I have a file (sample format below) 001g8aX0007jxLz xxxxxxxxxxxxxxx 9213974926411 CO-COMM-133 CO-L001-DLY 7769995578239 44938 1 1 ...

LEARN ABOUT DEBIAN

amplot

AMPLOT(8)						  System Administration Commands						 AMPLOT(8)

NAME

       amplot - visualize the behavior of Amanda

SYNOPSIS

       amplot [-b] [-c] [-e] [-g] [-l] [-p] [-t T] amdump_files

DESCRIPTION

       Amplot reads an amdump output file that Amanda generates each run (e.g. amdump.1) and translates the information into a picture format
       that may be used to determine how your installation is doing and if any parameters need to be changed.  Amplot also prints out amdump lines
       that it either does not understand or knows to be warning or error lines and a summary of the start, end and total time for each backup
       image.

       Amplot is a shell script that executes an awk program (amplot.awk) to scan the amdump output file. It then executes a gnuplot program
       (amplot.g) to generate the graph. The awk program is written in an enhanced version of awk, such as GNU awk (gawk(1) version 2.15 or later)
       or nawk(1).

       During execution, amplot generates a few temporary files that gnuplot uses. These files are deleted at the end of execution.

       See the amanda(8) man page for more details about Amanda.

OPTIONS

       -b
	   Generate b/w postscript file (needs -p).

       -c
	   Compress amdump_files after plotting.

       -e
	   Extend the X (time) axis if needed.

       -g
	   Direct gnuplot output directly to the X11 display (default).

       -p
	   Direct postscript output to file YYYYMMDD.ps (opposite of -g).

       -l
	   Generate landscape oriented output (needs -p).

       -t T
	   Set the right edge of the plot to be T hours.

       The amdump_files may be in various compressed formats (compress, gzip, pact, compact).

INTERPRETATION

       The figure is divided into a number of regions. There are titles on the top that show important statistical information about the
       configuration and from this execution of amdump. In the figure, the X axis is time, with 0 being the moment amdump was started. The Y axis
       is divided into 5 regions:

       QUEUES: How many backups have not been started, how many are waiting on space in the holding disk and how many have been transferred
       successfully to tape.

       %BANDWIDTH: Percentage of allowed network bandwidth in use.

       HOLDING DISK: The higher line depicts space allocated on the holding disk to backups in progress and completed backups waiting to be
       written to tape. The lower line depicts the fraction of the holding disk containing completed backups waiting to be written to tape
       including the file currently being written to tape. The scale is percentage of the holding disk.

       TAPE: Tape drive usage.

       %DUMPERS: Percentage of active dumpers.

       The idle period at the left of the graph is the time amdump spends asking the machines how much data they are going to dump. This process
       can take a while if hosts are down or if it takes them a long time to generate estimates.

BUGS

       Reports lines it does not recognize, mainly error cases but some are legitimate lines the program needs to be taught about.

SEE ALSO

       amanda(8), amdump(8), gnuplot(1), compress(1), gzip(1)

       The Amanda Wiki: http://wiki.zmanda.com/

AUTHORS

       Olafur Gudmundsson <ogud@tis.com>
	   Trusted Information Systems

       Stefan G. Weichinger <sgw@amanda.org>

Amanda 3.3.1							    02/21/2012								 AMPLOT(8)