File processing - have to get the count of similar types


 
# 1  
Old 04-04-2012

Input File:
Code:
c_id=india
---some data--
c_id=US
--some data---
c_id=UK
--some data--
c_id=india
--some data--
c_id=india
--some data--
c_id=Russia
--some data--
c_id=UK
--some data--
c_id=US
--some data--
c_id=Africa
--some data

Now we need to group identical c_id values and generate a count for each.
For the example above:
Output:
Code:
india=3
US=2
UK=2
Russia=1
Africa=1

PS: The c_id values change from file to file, so grep "india" | wc -l will not work.

Thanks!
# 2  
Old 04-04-2012
What have you tried so far? Why does the grep not work:
Code:
$ grep -c "india" infile
3

# 3  
Old 04-04-2012
Code:
awk '
BEGIN {
  FS="="
}
$1=="c_id" {
  ++arr[$2]
}
END {
  for (i in arr) {
    printf("%s=%d\n", i, arr[i])
}' input-file

# 4  
Old 04-04-2012

Quote:
Originally Posted by Scrutinizer
What have you tried so far? Why does the grep not work:
Code:
$ grep -c "india" infile
3


As I said, the values change, so we can't know the value of c_id in advance. That's why it won't work.

---------- Post updated at 07:36 PM ---------- Previous update was at 07:31 PM ----------

Quote:
Originally Posted by chihung
Code:
awk '
BEGIN {
  FS="="
}
$1=="c_id" {
  ++arr[$2]
}
END {
  for (i in arr) {
    printf("%s=%d\n", i, arr[i])
}' input-file


Works perfectly, with just one correction: the printf line should read printf("%s=%d\n", i, arr[i])}
(the closing brace of the for loop is missing)
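
For reference, here is a sketch of the script from post #3 with the closing brace restored. The wrapper function name count_ids is hypothetical and only there so the script can be reused. It assumes every line of interest has the exact form c_id=value.

```shell
# count_ids FILE...: count each distinct value on lines of the form c_id=value
# (count_ids is a hypothetical name used only for this illustration)
count_ids() {
  awk '
  BEGIN {
    FS = "="                  # split each line at the equals sign
  }
  $1 == "c_id" {
    ++arr[$2]                 # one counter per distinct value
  }
  END {
    for (i in arr) {
      printf("%s=%d\n", i, arr[i])
    }                         # closes the for loop
  }' "$@"
}
```

awk does not define the order in which for (i in arr) visits the keys, so pipe the result through sort -t= -k2,2nr if the largest counts should come first.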
# 5  
Old 04-04-2012
Well, just grep for c_id then and count:

Code:
 
$ grep c_id= infile | sort | uniq -c
   1 c_id=Africa
   1 c_id=Russia
   2 c_id=UK
   2 c_id=US
   3 c_id=india
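
The same pipeline can be pushed a little further to produce the value=count format asked for in post #1. This is only a sketch: the function name count_by_pipeline is hypothetical, and it assumes the values contain no spaces and no further equals signs.

```shell
# count_by_pipeline FILE: print value=count, highest count first
# (count_by_pipeline is a hypothetical name used only for this illustration)
count_by_pipeline() {
  grep '^c_id=' "$1" |          # keep only the c_id lines
    cut -d= -f2 |               # drop the "c_id=" prefix
    sort |                      # uniq needs adjacent duplicates
    uniq -c |                   # prefix each value with its count
    sort -rn |                  # highest count first
    awk '{ print $2 "=" $1 }'   # turn "3 india" into "india=3"
}
```

Values with the same count may come out in any order relative to each other.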

# 6  
Old 04-04-2012
Quote:
Originally Posted by Scrutinizer
Well just grep for c_id then and count

Code:
 
$ grep c_id= infile | sort | uniq -c
   1 c_id=Africa
   1 c_id=Russia
   2 c_id=UK
   2 c_id=US
   3 c_id=india


Now a problem has come up. It worked perfectly for the example I gave, but not with my real file.
There is other data on the line before the c_id attribute,
like:
Code:
 
Activity date="2012-04-02" source="online" c_id="india" tdm_seq_num=23

As tdm_seq_num is different on every line, the lines are never identical, so uniq does not work either.

---------- Post updated at 08:14 PM ---------- Previous update was at 08:05 PM ----------

I found this solution:
Code:
 
grep "c_id" inputfile | cut -f6 -d"\"" | sort | uniq -c

Sample input:
Code:
 
<Activity date="2012-03-29" source="online" c_id="India" tdm_seq_num="23" time="19:58:50">

Is there a better solution than this, one that works more efficiently?

Thanks!
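
One possible alternative, as a sketch only: a single awk process can pull out the quoted c_id value and count it, which avoids both the sort and the dependence on c_id being the sixth quote-delimited field. The function name count_c_id_attr is hypothetical, and the sketch assumes at most one c_id="..." attribute per line, always in double quotes.

```shell
# count_c_id_attr FILE...: count the values of the c_id="..." attribute
# (count_c_id_attr is a hypothetical name used only for this illustration)
count_c_id_attr() {
  awk '
  match($0, /c_id="[^"]*"/) {
    # The match covers c_id=" (6 characters), the value, and the closing quote,
    # so the value starts at RSTART + 6 and is RLENGTH - 7 characters long.
    ++count[substr($0, RSTART + 6, RLENGTH - 7)]
  }
  END {
    for (id in count) {
      printf("%s=%d\n", id, count[id])
    }
  }' "$@"
}
```

Because the pattern is not anchored, it would also match a longer attribute name that ends in c_id, so tighten it if the real file has such attributes. The output order is unspecified; append | sort -t= -k2,2nr to list the largest counts first.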