Shell script count lines and sum numbers from multiple files


 
# 1  
Old 11-17-2016

I want to count the number of lines (the result must be a single number) and sum the last numeric column. I have managed to do each of these one at a time, but I need to run this from crontab, so it has to be a script. Here are my commands:

This counts the number of lines:

Code:
egrep -i String file_name_201611* | egrep -i "cdr,20161115" | awk -F"," '{print $4}' | sort | uniq | wc -l

This sums the numbers in the last column:

Code:
egrep -i String file_name_201611* | egrep -i ".cdr,20161115"| awk -F"," '{print $8}' | paste -s -d"+" |bc

The lines look like this:

Code:
COMGPRS,CGSCO05,COMGPRS_CGSCO05_400594.dat,processed_cdr_20161117100941_00627727.cdr,20161117095940,20161117,18,46521

The expected output:

Code:
CGSCO05,sum_#_lines, Sum_$8

Code:
CGSCO05, 225, 1500


Any idea?
# 2  
Old 11-17-2016
That request is not too clear, and the missing sample input data doesn't help either. Just a guess based on some assumptions (untested!):
Code:
awk -F, '/String/ && /\.cdr,20161115/ {CNT4++; SUM8 += $8} END {print $2, CNT4, SUM8} ' OFS=, file_name_201611*

Here, the uniq effect is not accounted for, nor is the case-insensitive matching of the search strings. The field 2 printed is the one from the last line in the input stream.
If you can't live with any of these shortcomings, be way clearer in your description.
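
A minimal sketch of how the two shortcomings mentioned above might be handled in a single awk pass. It assumes that "unique" means distinct values of field 4 and that only the search string (not the date) needs case-insensitive matching. `count_and_sum` is a hypothetical helper name, and the search string and date are illustrative choices applied to sample lines posted later in the thread.

```shell
#!/bin/sh
# count_and_sum PATTERN DATE FILE...   (hypothetical helper, not from the thread)
# Prints "distinct field-4 values,sum of field 8" for lines that contain
# PATTERN (case-insensitive) and the literal text ".cdr,DATE".
count_and_sum() {
    pat=$1
    day=$2
    shift 2
    awk -F, -v pat="$pat" -v day="$day" '
        BEGIN { pat = tolower(pat); OFS = "," }
        index(tolower($0), pat) && index($0, ".cdr," day) {
            if (!seen[$4]++) files++    # count each field-4 value once
            sum += $8
        }
        END { print files + 0, sum + 0 }
    ' "$@"
}

# Illustrative input: three sample lines in the format shown in the thread.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
COMGPRS,CGSCO05,COMGPRS_CGSCO05_400108.dat,processed_cdr_20161117060701_00627241.cdr,20161117060109,20161117,18,46050
COMGPRS,CGSCO05,COMGPRS_CGSCO05_400109.dat,processed_cdr_20161117060757_00627242.cdr,20161117060110,20161117,18,45848
COMGPRS,CGHW12,COMGPRS_CGHW12_610616.dat,processed_cdr_20161117061631_01680860.cdr,20161117060112,20161117,225,43037
EOF

result=$(count_and_sum cgsco05 20161117 "$tmp")
echo "$result"
rm -f "$tmp"
```

Using `index()` instead of a regular expression keeps the dot in `.cdr` literal, which the `egrep` pattern in post #1 does only in one of the two pipelines.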
# 3  
Old 11-17-2016
Hi RudiC, I am sorry, my mistake; I was not clear. Let me try again:

I have a directory of .CSV files with names like ESTCOL_GPRS_201611*. There are a lot of these files; the * stands for the hour, minute and seconds. The files contain lines like this:

Code:
COMGPRS,CGHW12,COMGPRS_CGHW12_610617.dat,processed_cdr_20161117061743_01680861.cdr,20161117060116,20161117,225,42832

I want to count the unique occurrences for column 2 and total them, and also sum the values of column 8, to get output like this:
Files Records
2,433 , 119,930,636


I have been trying something like this, but I have not got it working yet:

Code:
#!/bin/awk -f
BEGIN {
        FS=",";
}
{
        if (($1 == "COMGPRS") && ($2 == "ALK_01P")) {
                if (substr($5,1,8) == "20161115") {
                    sum+=$8;
                }
        }
}
END {
                print "Registros," $2 "," $sum;
}

I have not even got as far as counting the number of files (unique lines).
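
For reference, a hedged sketch of how a script along these lines could be made to run. Two things in the attempt above stand out: in awk, `$sum` means "the field whose number is the value of sum", not the variable `sum`, and `$2` in the `END` block refers to whatever record was read last. The node name `ALK_02P` and the date `20161117` below are illustrative values chosen to match the sample lines, and counting distinct field-4 values is an assumption about what "number of files" means.

```shell
#!/bin/sh
# Sample lines in the format shown in this thread; node and date are
# illustrative choices, not confirmed selection criteria.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
COMGPRS,ALK_02P,COMGPRS_ALK_02P_030792.dat,processed_cdr_20161117054847_00032422.cdr,20161117060206,20161117,225,48781
COMGPRS,ALK_02P,COMGPRS_ALK_02P_030793.dat,processed_cdr_20161117054941_00032423.cdr,20161117060206,20161117,225,47695
COMGPRS,CGVEN08,COMGPRS_CGVEN08_770418.dat,processed_cdr_20161117061228_02136487.cdr,20161117060207,20161117,225,42512
COMGPRS,ALK_02P,COMGPRS_ALK_02P_030794.dat,processed_cdr_20161117055035_00032424.cdr,20161117060207,20161117,225,48761
EOF

report=$(awk -F, -v node="ALK_02P" -v day="20161117" '
    $1 == "COMGPRS" && $2 == node && substr($5, 1, 8) == day {
        if (!seen[$4]++) files++   # distinct field-4 values (assumed: files)
        sum += $8                  # plain variable, no "$" in front
    }
    END {
        # node comes from -v, so it does not depend on the last record read
        printf "Registros,%s,%d,%d\n", node, files, sum
    }
' "$tmp")
echo "$report"
rm -f "$tmp"
```

Passing the node and date with `-v` also makes it easier to call the same script from crontab with different values.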
# 4  
Old 11-17-2016
Given that the only line that you have shown us from your input file(s) is not matched by either of the egreps in either of your pipelines, it is hard to guess how to create test data that might be used to see if we are correctly interpreting your requirements.

Your 1st pipeline seems to be attempting to count a number of unique field #4 values. But your expected output shows sum_#_lines... What is being summed?

Your 2nd pipeline seems straightforward, but one wonders why the pattern matched by the 2nd egrep in those two pipelines is different???

And, of course, the search patterns used in the awk script shown in post #3 do not seem to have any relationship to what you showed us in post #1???

Please show us a small set of sample input lines and then show us the exact output that should be produced from that sample along with a clear explanation of the logic used to produce that output from your sample input.
# 5  
Old 11-17-2016
Hi Don Cragun, I am sorry for the confusion.

1. The number of unique occurrences, i.e. how many unique files there are:
Code:
egrep -i ALK_01P ESTCOL_GPRS_201611* | egrep -i "cdr,20161116" | awk -F"," '{print $4}' | sort | uniq | wc -l

The result:
Code:
2433

This result is the total number of files.

The second task is to sum the values of field #8. Here are some sample lines:

Code:
COMGPRS,ALK_01S,COMGPRS_ALK_01S_018555.dat,processed_cdr_20161117055325_00018556.cdr,20161117060108,20161117,18,45533
COMGPRS,MEG_03P,COMGPRS_MEG_03P_030770.dat,processed_cdr_20161117055016_00033056.cdr,20161117060109,20161117,225,49187
COMGPRS,CGSCO05,COMGPRS_CGSCO05_400108.dat,processed_cdr_20161117060701_00627241.cdr,20161117060109,20161117,18,46050
COMGPRS,CGSCO05,COMGPRS_CGSCO05_400109.dat,processed_cdr_20161117060757_00627242.cdr,20161117060110,20161117,18,45848
COMGPRS,ALK_01S,COMGPRS_ALK_01S_018556.dat,processed_cdr_20161117055449_00018557.cdr,20161117060111,20161117,18,45089
COMGPRS,MEG_03P,COMGPRS_MEG_03P_030771.dat,processed_cdr_20161117055108_00033057.cdr,20161117060112,20161117,225,48409
COMGPRS,CGHW12,COMGPRS_CGHW12_610616.dat,processed_cdr_20161117061631_01680860.cdr,20161117060112,20161117,225,43037
COMGPRS,MEG_03P,COMGPRS_MEG_03P_030772.dat,processed_cdr_20161117055201_00033058.cdr,20161117060112,20161117,225,49096
COMGPRS,CGSCO05,COMGPRS_CGSCO05_400110.dat,processed_cdr_20161117060852_00627243.cdr,20161117060113,20161117,18,45474
COMGPRS,MEG_03P,COMGPRS_MEG_03P_030773.dat,processed_cdr_20161117055253_00033059.cdr,20161117060113,20161117,225,48855
COMGPRS,CGSCO05,COMGPRS_CGSCO05_400111.dat,processed_cdr_20161117060947_00627244.cdr,20161117060114,20161117,18,45229
COMGPRS,CGHW12,COMGPRS_CGHW12_610617.dat,processed_cdr_20161117061743_01680861.cdr,20161117060116,20161117,225,42832
COMGPRS,CGHW12,COMGPRS_CGHW12_610618.dat,processed_cdr_20161117061852_01680862.cdr,20161117060120,20161117,225,43142
COMGPRS,ALK_02P,COMGPRS_ALK_02P_030792.dat,processed_cdr_20161117054847_00032422.cdr,20161117060206,20161117,225,48781
COMGPRS,ALK_02P,COMGPRS_ALK_02P_030793.dat,processed_cdr_20161117054941_00032423.cdr,20161117060206,20161117,225,47695
COMGPRS,CGVEN08,COMGPRS_CGVEN08_770418.dat,processed_cdr_20161117061228_02136487.cdr,20161117060207,20161117,225,42512
COMGPRS,ALK_02P,COMGPRS_ALK_02P_030794.dat,processed_cdr_20161117055035_00032424.cdr,20161117060207,20161117,225,48761
COMGPRS,ALK_02P,COMGPRS_ALK_02P_030795.dat,processed_cdr_20161117055129_00032425.cdr,20161117060208,20161117,225,48990
COMGPRS,ZCGHW4,COMGPRS_ZCGHW4_493748.dat,processed_cdr_20161117060216_03231049.cdr,20161117060208,20161117,225,42921
COMGPRS,ALK_02P,COMGPRS_ALK_02P_030796.dat,processed_cdr_20161117055221_00032426.cdr,20161117060209,20161117,225,48149
COMGPRS,CGVEN16,COMGPRS_CGVEN16_500074.dat,processed_cdr_20161117061325_01657026.cdr,20161117060209,20161117,225,42554
COMGPRS,CGVEN08,COMGPRS_CGVEN08_770419.dat,processed_cdr_20161117061315_02136488.cdr,20161117060211,20161117,225,42232
COMGPRS,ZCGHW4,COMGPRS_ZCGHW4_493749.dat,processed_cdr_20161117060359_03231050.cdr,20161117060213,20161117,225,42849
COMGPRS,CGVEN16,COMGPRS_CGVEN16_500075.dat,processed_cdr_20161117061452_01657027.cdr,20161117060213,20161117,225,42561

I am trying to do this in a shell script; that is the reason for my second post.
# 6  
Old 11-18-2016
You are quite economical with the facts. I'm afraid I can't be of any further help unless way more details are revealed.

There seems to be ONE SINGLE output line for the entire stream. Why, then, the uniq function?
And, please be more precise and unambiguous: which unique field is to be counted: $4 as in the pipe in post#1, $2 as commented ("for the column 2") in post#3, or the combination of $1 and $2 as in the code snippet in post#3?

What would be the result for your sample data lines in post#5? Applying the pipe from that post yields 0. Please show the expected output and the logic to be applied to achieve it, in plain English.
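
One possible reading of the request, offered only as a sketch: treat field 2 as the node name, count the distinct field-4 values per node, and sum field 8 per node, which would give one line per node in the `CGSCO05, 225, 1500` shape from post#1. This interpretation is an assumption and was not confirmed in the thread; the input is a subset of the sample lines from post#5.

```shell
#!/bin/sh
# Subset of the sample lines from post#5, used here as illustrative input.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
COMGPRS,ALK_01S,COMGPRS_ALK_01S_018555.dat,processed_cdr_20161117055325_00018556.cdr,20161117060108,20161117,18,45533
COMGPRS,CGSCO05,COMGPRS_CGSCO05_400108.dat,processed_cdr_20161117060701_00627241.cdr,20161117060109,20161117,18,46050
COMGPRS,CGSCO05,COMGPRS_CGSCO05_400109.dat,processed_cdr_20161117060757_00627242.cdr,20161117060110,20161117,18,45848
COMGPRS,ALK_01S,COMGPRS_ALK_01S_018556.dat,processed_cdr_20161117055449_00018557.cdr,20161117060111,20161117,18,45089
COMGPRS,CGHW12,COMGPRS_CGHW12_610616.dat,processed_cdr_20161117061631_01680860.cdr,20161117060112,20161117,225,43037
EOF

# Assumed layout: field 2 = node, field 4 = file name, field 8 = record count.
summary=$(awk -F, '
    !seen[$2, $4]++ { files[$2]++ }    # each (node, file) pair counted once
    { sum[$2] += $8 }                  # running total of field 8 per node
    END { for (n in files) print n "," files[n] "," sum[n] }
' "$tmp" | LC_ALL=C sort)              # awk array order is unspecified
echo "$summary"
rm -f "$tmp"
```

If a single grand total is wanted instead, as in the "Files Records" example in post#3, the per-node arrays can be replaced by two plain counters.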