Aggregation of huge data


 
# 8  
Old 04-07-2014
Hi Akshay,

I even removed the blank and tried it - still facing the same issue. I also copied a few records (say 5 lines) to a new file, and even then it is occurring!

Data:

Code:
21000000
-3000
3000
-670500
2963700

Command used:

Code:
 awk 'BEGIN { print "Z = 0;" } { sub(/-/, ""); print "Z += ",$1,";" } END { print "Z;" }' test.txt

Output:

Code:
Z = 0;
Z +=  21000000 ;
Z +=  3000 ;
Z +=  3000 ;
Z +=  670500 ;
Z +=  2963700 ;
Z;

From the above output, shouldn't the value of Z have been incremented?


Kindly advise me on the same.

Regards,
Ravichander

Last edited by Ravichander; 04-07-2014 at 02:56 AM..
# 9  
Old 04-07-2014
Three weeks ago I suggested the code:
Code:
awk -F'|' -v dqANDms='["-]' '
BEGIN {	f=156
	printf("s=0\n")
}
NR > 2 {gsub(dqANDms, "", $f)
	printf("s+=%s\n",  $f)
}
END {	printf("s\n")
}' file | bc

in another thread (Aggregation of Huge files) where you wanted to process the 156th field instead of the 1st, strip out double-quote characters if any were present, and skip the two header lines in your input. You said that when your input file contained 7 million records my code didn't work, but you weren't able to show any input that caused it to produce a wrong result. Instead of answering requests for sample input that made the suggested scripts fail, you started this new thread.
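
To make the moving parts concrete, here is what that script does on a small made-up pipe-delimited sample (purely hypothetical data, with the amount moved to field 3 and f=3 instead of f=156 just so the lines fit on screen; the first two lines stand in for your header records):
Code:
# hypothetical sample.txt -- two header lines, then three records with
# the amount in field 3 (field 156 in your real file):
#   HDR1|a|b
#   HDR2|c|d
#   rec1|x|"21000"
#   rec2|y|-3500
#   rec3|z|700

awk -F'|' -v dqANDms='["-]' '
BEGIN {	f=3
	printf("s=0\n")
}
NR > 2 {gsub(dqANDms, "", $f)
	printf("s+=%s\n", $f)
}
END {	printf("s\n")
}' sample.txt | bc

# Without the "| bc" the awk part just prints a little bc program:
#   s=0
#   s+=21000
#   s+=3500
#   s+=700
#   s
# bc then evaluates that program and prints the total: 25200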

Simplifying that code for the data you've presented here yields:
Code:
awk '
BEGIN {	printf("s=0\n")
}
{	sub(/-/, "")
	printf("s+=%s\n", $1)
}
END {	printf("s\n")
}' test.txt | bc

which, with the sample input you provided in message #8 in this thread, produces the output:
Code:
24640200

which still looks like the correct result to me. If this isn't the result you wanted, what were you expecting?
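
Checking by hand: 21000000 + 3000 + 3000 + 670500 + 2963700 = 24640200, so that figure matches the sum of the absolute values in your sample.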

If it matters, the output from awk that the above script feeds into bc is:
Code:
s=0
s+=21000000
s+=3000
s+=3000
s+=670500
s+=2963700
s
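
For completeness, the same total can also be computed inside awk without bc. A minimal alternative sketch (same test.txt); the trade-off to keep in mind is that awk does its arithmetic in double-precision floating point, so a running total that grows past roughly 2^53 (about 9.0e15) can silently lose exactness, whereas the bc pipeline above works with arbitrary precision:
Code:
# sum the absolute values directly in awk (no bc); fine for this sample,
# but subject to double-precision limits for very large totals
awk '{ sub(/-/, ""); s += $1 } END { printf("%.0f\n", s) }' test.txt

which prints 24640200 for the five sample values.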
