Manipulating Data Records for reporting


 
# 1  
Old 03-18-2016

Hello All,

I have Data Records (DRs) with the following format:

Code:
SEQ_NO|REQ_TIMESTAMP|RES_TIMESTAMP|DURATION|EVENT_TYPE|RECEIVER_MSISDN|PAYER_MSISDN|RESPONSE_CODE|INTERNAL_ERR_CODE|RESPONSE_CODE_DESCRIPTION|TRX_ID|FLOWNAME|MESSAGE_ID|ORIGINAL_MESSAGE|NODE_REQUEST|NODE_RESPONSE|NODE_IDENTIFIER|NODE_OPERATION|

Thousands of DRs are produced every day, and each day we use the previous day's records for reporting. Many files like the following are generated daily, and each of them holds hundreds of rows in the format above:

Code:
-rw-r--r--   1 sscm     sscm     8556408 Mar 15 00:29 log.SSCM-ESB1.sscm_esb.14622.2016-03-15_00.1.gz
-rw-r--r--   1 sscm     sscm     8503748 Mar 15 00:49 log.SSCM-ESB1.sscm_esb.14622.2016-03-15_00.2.gz
-rw-r--r--   1 sscm     sscm     3883294 Mar 15 01:00 log.SSCM-ESB1.sscm_esb.14622.2016-03-15_00.3.gz
-rw-r--r--   1 sscm     sscm     8808936 Mar 15 01:28 log.SSCM-ESB1.sscm_esb.14622.2016-03-15_01.0.gz
-rw-r--r--   1 sscm     sscm     6367825 Mar 15 02:00 log.SSCM-ESB1.sscm_esb.14622.2016-03-15_01.1.gz
-rw-r--r--   1 sscm     sscm     8169001 Mar 15 03:00 log.SSCM-ESB1.sscm_esb.14622.2016-03-15_02.0.gz
-rw-r--r--   1 sscm     sscm     5149508 Mar 15 04:00 log.SSCM-ESB1.sscm_esb.14622.2016-03-15_03.0.gz
-rw-r--r--   1 sscm     sscm     2770578 Mar 15 05:00 log.SSCM-ESB1.sscm_esb.14622.2016-03-15_04.0.gz
-rw-r--r--   1 sscm     sscm     6679069 Mar 15 05:57 log.SSCM-ESB1.sscm_esb.14622.2016-03-15_05.0.gz
.
.
.
-rw-r--r--   1 sscm     sscm     7077509 Mar 15 22:00 log.SSCM-ESB1.sscm_esb.14622.2016-03-15_21.3.gz
-rw-r--r--   1 sscm     sscm     8943708 Mar 15 22:16 log.SSCM-ESB1.sscm_esb.14622.2016-03-15_22.0.gz
-rw-r--r--   1 sscm     sscm     8620239 Mar 15 22:33 log.SSCM-ESB1.sscm_esb.14622.2016-03-15_22.1.gz
-rw-r--r--   1 sscm     sscm     8936831 Mar 15 22:51 log.SSCM-ESB1.sscm_esb.14622.2016-03-15_22.2.gz
-rw-r--r--   1 sscm     sscm     4066354 Mar 15 23:00 log.SSCM-ESB1.sscm_esb.14622.2016-03-15_22.3.gz
-rw-r--r--   1 sscm     sscm     8818361 Mar 15 23:17 log.SSCM-ESB1.sscm_esb.14622.2016-03-15_23.0.gz
-rw-r--r--   1 sscm     sscm     8592770 Mar 15 23:34 log.SSCM-ESB1.sscm_esb.14622.2016-03-15_23.1.gz
-rw-r--r--   1 sscm     sscm     8749865 Mar 15 23:51 log.SSCM-ESB1.sscm_esb.14622.2016-03-15_23.2.gz
-rw-r--r--   1 sscm     sscm     4520274 Mar 16 00:00 log.SSCM-ESB1.sscm_esb.14622.2016-03-15_23.3.gz

We filter on a few specific fields to create our reports. I use the following code, but it does not give me exactly what I want:

Code:
gzcat log.SSCM-ESB1.sscm_esb.14622.2016-03-15*gz | gawk -F\| '/subscriberCreate|subscriberPaymentMethodChange|subscriberProfileUpdate|subscriberStatusChange|accountOwnershipChange/{if(($5==2001) && ($8 == 0)) print "SUCCESS " $18}{if(($5==2001) && ($8 != 0)) print "FAILURE REASON CODE = " $8,$18}' | sort | uniq -c

Result is like:
Code:
   1 FAILURE REASON CODE = -2 deactivateBBService
   2 FAILURE REASON CODE = -2 getOffers
  31 FAILURE REASON CODE = -2 refillVoucherless
 250 FAILURE REASON CODE = -2 retrieveEligibilityProfiles
   1 FAILURE REASON CODE = -2 setHLRSub
36348 FAILURE REASON CODE = -3 refillVoucherless
37203 FAILURE REASON CODE = -3 retrieveEligibilityProfiles
   1 FAILURE REASON CODE = -3 updateDataOfferSelection
   1 FAILURE REASON CODE = -3 updateDataOfferSelectionRetrieve
  52 FAILURE REASON CODE = -4 addAndActivateService
16517 FAILURE REASON CODE = -4 retrieveEligibilityProfiles
 645 FAILURE REASON CODE = -4 sendSmsMessage
  42 FAILURE REASON CODE = -4 setHLRSub
  13 FAILURE REASON CODE = 100 getBalanceAndDate
   1 FAILURE REASON CODE = 100 getFafList
  13 FAILURE REASON CODE = 100 getOffersByDA
  59 FAILURE REASON CODE = 100 retrieveEligibilityProfiles
  12 FAILURE REASON CODE = 126 retrieveEligibilityProfiles
1917 FAILURE REASON CODE = 3 retrieveEligibilityProfiles
 134 FAILURE REASON CODE = 3300000 retrieveEligibilityProfiles
   1 FAILURE REASON CODE = 3300001 retrieveEligibilityProfiles
  23 FAILURE REASON CODE = 4020100 subscriberProfileUpdate
  27 FAILURE REASON CODE = 4020100 subscriptionCreateReserve
1764 FAILURE REASON CODE = 4020102 subscriptionCreateReserve
  71 FAILURE REASON CODE = 4020110 subscriptionCreateReserve
12620 FAILURE REASON CODE = 4020115 subscriptionCreateReserve
   1 FAILURE REASON CODE = 4020117 queryOfferCampaignDetails
5780 FAILURE REASON CODE = 4020119 subscriptionCreateReserve
   1 SUCCESS subscriberPaymentMethodChange
  24 SUCCESS subscriberProfileUpdate
1107 SUCCESS subscriberStatusChange

I want an hourly breakdown of these records. How could I achieve it?

When I run

Code:
ls -ltr *19.*

it gives me the following output:

Code:
-rw-r--r--   1 sscm     sscm     8664875 Mar 15 19:15 log.SSCM-ESB1.sscm_esb.14622.2016-03-15_19.0.gz
-rw-r--r--   1 sscm     sscm     8665776 Mar 15 19:32 log.SSCM-ESB1.sscm_esb.14622.2016-03-15_19.1.gz
-rw-r--r--   1 sscm     sscm     8832978 Mar 15 19:48 log.SSCM-ESB1.sscm_esb.14622.2016-03-15_19.2.gz
-rw-r--r--   1 sscm     sscm     6035911 Mar 15 20:00 log.SSCM-ESB1.sscm_esb.14622.2016-03-15_19.3.gz

This also gives me the actual hours:

Code:
-bash-4.1$ ls -ltr | gawk '{split($NF,a,".");print a[5]}'
2016-03-15_00
2016-03-15_00
2016-03-15_00
2016-03-15_01
2016-03-15_01
2016-03-15_02
2016-03-15_03
.
.

So how could I merge the two commands to get output like the following?

Code:
   2016-03-19_00 report is:
   1 FAILURE REASON CODE = -2 deactivateBBService
   2 FAILURE REASON CODE = -2 getOffers
   12 FAILURE REASON CODE = 420409 checkActiveSubscriptionForFaF
   4 FAILURE REASON CODE = 420999 subscriptionCreateReserve
   1 FAILURE REASON CODE = 440299 subscriptionCreateCommit
   1 FAILURE REASON CODE = 440302 subscriptionCreateReserve
 373 FAILURE REASON CODE = 500100 Finish
 566 FAILURE REASON CODE = 99 updateDataOfferSelectionRetrieve
   1 SUCCESS subscriberPaymentMethodChange
  24 SUCCESS subscriberProfileUpdate
1107 SUCCESS subscriberStatusChange
   2016-03-20_00 report is:
   1 FAILURE REASON CODE = -2 deactivateBBService
   2 FAILURE REASON CODE = -2 getOffers
   12 FAILURE REASON CODE = 420409 checkActiveSubscriptionForFaF
   4 FAILURE REASON CODE = 420999 subscriptionCreateReserve
   1 FAILURE REASON CODE = 440299 subscriptionCreateCommit
   1 FAILURE REASON CODE = 440302 subscriptionCreateReserve
 373 FAILURE REASON CODE = 500100 Finish
 566 FAILURE REASON CODE = 99 updateDataOfferSelectionRetrieve
   1 SUCCESS subscriberPaymentMethodChange
  24 SUCCESS subscriberProfileUpdate
1107 SUCCESS subscriberStatusChange

# 2  
Old 03-18-2016
I'm guessing more than understanding here, but if you want to group files by the last number group (the hour) in the file name's timestamp, would this come close to what you want:
Code:
for i in {15..19}; do ls log.SSCM-ESB1.sscm_esb.14622.2016-03-15_$i.*; echo; done

In lieu of the ls, you could run your script on the results within the loop.
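Running the script inside the loop could look roughly like the sketch below. It is not tested against the real logs and rests on a few assumptions: RESPONSE_CODE 0 means success, the files are named as in the listing in post #1, and gzip -dc and plain awk stand in for gzcat and gawk. The function name hourly_report is made up for illustration.

```shell
#!/bin/sh
# Sketch only: one report per hour, reusing the filter from post #1.
# Assumptions (not confirmed in the thread):
#   - RESPONSE_CODE ($8) equal to 0 means success
#   - files are named log.<...>.<YYYY-MM-DD>_<HH>.<n>.gz as in the listing
#   - gzip -dc stands in for gzcat, awk for gawk

# hourly_report DIR DAY   e.g. hourly_report . 2016-03-15
hourly_report() (
    dir=$1
    day=$2
    h=0
    while [ "$h" -lt 24 ]; do
        hh=$(printf '%02d' "$h")
        h=$((h + 1))
        # Expand the glob for this hour; skip the hour if nothing matched.
        set -- "$dir"/log.*."${day}_${hh}".*.gz
        [ -e "$1" ] || continue
        echo "${day}_${hh} report is:"
        gzip -dc "$@" | awk -F'|' '
            $5 != 2001 { next }
            $8 == 0 && /subscriberCreate|subscriberPaymentMethodChange|subscriberProfileUpdate|subscriberStatusChange|accountOwnershipChange/ {
                print "SUCCESS " $18
            }
            $8 != 0 { print "FAILURE REASON CODE = " $8, $18 }
        ' | sort | uniq -c
    done
)
```

Called as hourly_report . 2016-03-15 in the log directory, it prints a "<day>_<hour> report is:" header followed by the usual sort | uniq -c counts for every hour that has at least one file.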
# 3  
Old 03-18-2016
Quote:
Originally Posted by RudiC
I'm guessing more than understanding here, but if you want to group files by the last number group (the hour) in the file name's timestamp, would this come close to what you want:
Code:
for i in {15..19}; do ls log.SSCM-ESB1.sscm_esb.14622.2016-03-15_$i.*; echo; done

In lieu of the ls, you could run your script on the results within the loop.
It would run from 2016-03-15_00 to 2016-03-15_23, and yes, I would like to group them hourly. However, I would like something generic. For example, when I run this command:

Code:
date=`TZ=GMT-2 date +%Y-%m-%d_%H`

it gives my current local time, down to the hour:

Code:
2016-03-18_16

and I would actually use this date like the following:

Code:
gzcat /data/edr/processed/*$date*gz .......

So how could I modify it? Maybe something like what you mentioned, but we also open an SSH connection before gzcat, and I wouldn't like to open a separate ssh connection for each "$i" in the for loop:

Code:
ssh -l root $Host1 gzcat....

Code:
for i in `echo "$date" | nawk -F"_" '{print $2}'`;do...
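One way to avoid a connection per hour might be to tag each record with the hour taken from its file name and let a single awk pass do the grouping, so the whole day travels through one pipe. The sketch below shows the idea locally. The names tag_by_hour and summarise are made up for illustration, RESPONSE_CODE 0 is assumed to mean success, and gzip -dc and plain awk stand in for gzcat and gawk/nawk.

```shell
#!/bin/sh
# Sketch only. Assumptions (not confirmed in the thread):
#   - RESPONSE_CODE equal to 0 means success
#   - the hour is the last dot-separated token before <n>.gz in the file name
#   - gzip -dc stands in for gzcat, awk for gawk/nawk

# Print every record of every file, prefixed with "<YYYY-MM-DD_HH>|".
tag_by_hour() {
    for f in "$@"; do
        hour=${f%.*.gz}      # drop the trailing ".<n>.gz"
        hour=${hour##*.}     # keep only the last remaining token
        gzip -dc "$f" | sed "s/^/$hour|/"
    done
}

# Count outcomes per hour. The prefix shifts every field by one, so
# EVENT_TYPE is $6, RESPONSE_CODE is $9 and NODE_OPERATION is $19.
summarise() {
    awk -F'|' '
        $6 != 2001 { next }
        $9 == 0 && /subscriberCreate|subscriberPaymentMethodChange|subscriberProfileUpdate|subscriberStatusChange|accountOwnershipChange/ {
            n[$1 "\tSUCCESS " $19]++
        }
        $9 != 0 { n[$1 "\tFAILURE REASON CODE = " $9 " " $19]++ }
        END { for (k in n) print k "\t" n[k] }
    ' | LC_ALL=C sort | awk -F'\t' '
        $1 != prev { print $1 " report is:"; prev = $1 }
        { printf "%7d %s\n", $3, $2 }
    '
}
```

Locally this runs as tag_by_hour /data/edr/processed/*2016-03-15_*gz | summarise. For the remote case, only the tagging loop needs to run on the other side, so a single connection along these lines might do, with summarise running locally (untested, and it assumes the remote login shell understands POSIX parameter expansion):

ssh -l root "$Host1" 'for f in /data/edr/processed/*2016-03-15_*gz; do h=${f%.*.gz}; h=${h##*.}; gzcat "$f" | sed "s/^/$h|/"; done' | summarise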
