PIG Latin : Record count


 
# 1  
Old 01-17-2014

Hello,

I have a sample file with the following contents:
Code:
Ipaddress , time
-------------------
ipaddress-1,10:58
ipaddress-1,11:50
ipaddress-1-10:58
ipaddress-2,11:50
ipaddress-2,10:58
ipaddress-2,10:58

The expected output is:
Code:
Ipaddress,time,count
----------------------------------
ipaddress-1,10:58,2
ipaddress-1,11:50,1
ipaddress-2,10:58,2
ipaddress-2,11:50,1

This output is needed to see how many times each IP address hit the server at a particular time.

I tried Hive and was able to get the report through Excel (ODBC), but I cannot work out the equivalent query in Pig.

Last edited by Franklin52; 01-17-2014 at 03:18 AM.. Reason: code tags
# 2  
Old 01-17-2014
Hi Rakesh,

The following may help you.


Code:
awk -F"," 'BEGIN{print "IPaddress""  ""time""  ""count"} NR==1 || NR==2 {next} {a[$1","$2]++;next} END{for(i in a) print i" "a[i]}' file_name | sed 's/\-/ /g;s/\,/ /g'

Output will be as follows.

Code:
IPaddress  time  count
ipaddress 2 10:58 2
ipaddress 1 11:50 1
ipaddress 1 10:58  1
ipaddress 1 10:58 1
ipaddress 2 11:50 1

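The first command treats ipaddress-1,10:58 and ipaddress-1-10:58 as different keys, because the third data row in the sample uses a hyphen instead of a comma before the time.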

2nd code:
Code:
awk ' BEGIN{print "IPaddress""  ""time""  ""count"} gsub(/\-/,X) gsub(/\,/,Y) {if(NR==1 || NR==2) {next}; a[$1" "$2]++;next} END{for(i in a) print i" "a[i]}' file_name



Output will be as follows.

Code:
IPaddress  time  count
ipaddress210:58  2
ipaddress110:58  2
ipaddress211:50  1
ipaddress111:50  1
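
For comparison, here is a minimal awk sketch that keeps the comma-separated layout of the expected output from the first post. It is not Pig Latin and not one of the commands above. The function name `count_hits` is made up for illustration, and the sketch rests on two assumptions: the first two lines are always the header and its dashed underline, and a row without a comma uses a hyphen before the HH:MM time instead.

```shell
#!/bin/sh
# count_hits (hypothetical name): read "ip,time" rows on stdin and
# print "ip,time,count" rows, sorted, under a one-line header.
# Assumptions: lines 1-2 are the header and its dashed underline;
# a row with no comma has a hyphen in place of the comma before HH:MM.
count_hits() {
    echo "Ipaddress,time,count"
    awk '
        NR <= 2 { next }                  # skip header and underline
        !/,/ {                            # no comma: repair the separator
            n = match($0, /-[0-9]+:[0-9]+$/)
            if (n) $0 = substr($0, 1, n - 1) "," substr($0, n + 1)
        }
        { hits[$0]++ }                    # key is the whole "ip,time" row
        END { for (key in hits) print key "," hits[key] }
    ' | LC_ALL=C sort
}

# Sample data copied from the first post, including the malformed row.
count_hits <<'EOF'
Ipaddress , time
-------------------
ipaddress-1,10:58
ipaddress-1,11:50
ipaddress-1-10:58
ipaddress-2,11:50
ipaddress-2,10:58
ipaddress-2,10:58
EOF
```

With the sample data this counts ipaddress-1,10:58 twice, because the malformed row is folded into the same key. The `sort` is there only because awk's `for (key in hits)` loop returns keys in no particular order.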

Thanks,
R. Singh

Last edited by RavinderSingh13; 01-17-2014 at 03:17 AM.. Reason: added a new code
# 3  
Old 01-17-2014
Thanks Ravinder, but I am looking for something in Pig Latin syntax. I am trying to process the data in Hadoop using Pig Latin.

I would appreciate it if you could suggest the same code in Pig Latin.