Problem counting unique disks/slices


 
# 1  
Old 09-08-2011
I want to create a unique listing of slices/disks from a large list that will have duplicates. Here is a sample of the input file.

Code:
#array.txt
Disk4:\s93
Disk4:\s93
Disk4:\s94
Disk4:\s95\s96\s97
Disk4:\s93
Disk4:\s95\s96\s103
Disk4:\s93
Disk4:\s93
Disk4:\s95\s96\s105
Disk4:\s93
Disk4:\s95\s96\s105
Disk4:\s93
Disk4:\s93
Disk4:\s93
Disk4:\s93
Disk4:\s93
Disk4:\s93
Disk4:\s93
Disk4:\s93
Disk4:\s95\s96\s106

I think the following command should give me what I want: the unique disk/slice name in the first column and the number of times that disk/slice occurs in the list in the second column.

Code:
cat array.txt | awk 'count[$1]++ END {for (i in count) print i, count[i]}' > array.txt_unique

However, only some of the lines in the output file are correct. Sometimes it seems that it is not matching the disk name and it thinks it is different when in reality it is the same. See my sample output file below.

Code:
#array.txt_unique
Disk4:\s93
Disk4:\s93
Disk4:\s93
Disk4:\s93
Disk4:\s93
Disk4:\s95\s96\s105
Disk4:\s93
Disk4:\s93
Disk4:\s93
Disk4:\s93
Disk4:\s93
Disk4:\s93
Disk4:\s93
Disk4:\s93
Disk4:\s95\s96\s103 1
Disk4:\s95\s96\s105 2
Disk4:\s95\s96\s106 1
Disk4:\s95\s96\s97 1
Disk4:\s93 14
Disk4:\s94 1

I also tried removing the colons and backslashes prior to counting the unique slices, but I had the same result. Can someone help me with this?

Thanks in advance,
Jonathan
# 2  
Old 09-08-2011
Code:
sort yourfile | uniq

Is this what you want?

--ahamed
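
Note that plain `uniq` only removes duplicates; with the `-c` option it also prefixes each line with the number of times it occurred. A minimal sketch, assuming a shortened copy of the sample list written to a temporary file (the data below is illustrative, not the full array.txt). The count comes first in this output, not second as in the original request.

```shell
# Shortened, illustrative copy of the sample list in a temporary file.
list=$(mktemp)
printf '%s\n' 'Disk4:\s93' 'Disk4:\s93' 'Disk4:\s94' 'Disk4:\s95\s96\s97' 'Disk4:\s93' > "$list"

# sort groups identical lines together; uniq -c collapses each group
# and prefixes it with the number of occurrences.
counts=$(sort "$list" | uniq -c)
printf '%s\n' "$counts"
rm -f "$list"
```

The width of the leading count column differs between `uniq` implementations, so any later parsing should split on whitespace and not rely on fixed positions.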
# 3  
Old 09-08-2011
Code:
awk '{x[$0]++}END{for(i in x) print i,x[i]}' array.txt
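
The difference from the command in post #1 is the pair of braces. In awk, an expression outside braces is a pattern, and a pattern with no action prints the line whenever it is true. `count[$1]++` evaluates to the count before the increment, so it is false the first time a name is seen and true on every repeat. That would explain the uncounted lines at the top of array.txt_unique (13 repeats of Disk4:\s93 and 1 of Disk4:\s95\s96\s105). Inside braces the increment is only an action, and nothing is printed until END. A small sketch on a shortened, illustrative copy of the list:

```shell
# Shortened, illustrative copy of the list in a temporary file.
list=$(mktemp)
printf '%s\n' 'Disk4:\s93' 'Disk4:\s94' 'Disk4:\s93' 'Disk4:\s93' > "$list"

# Pattern with no action: prints each line whose name has been seen before.
repeats=$(awk 'count[$1]++' "$list")

# Action in braces: only counts; the END block prints the totals.
# The order of "for (i in count)" is unspecified, so the result is sorted.
totals=$(awk '{count[$1]++} END {for (i in count) print i, count[i]}' "$list" | sort)

printf 'repeats:\n%s\n' "$repeats"
printf 'totals:\n%s\n' "$totals"
rm -f "$list"
```

Here `repeats` holds the two repeated Disk4:\s93 lines, while `totals` holds one line per unique name with its count.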

This User Gave Thanks to shamrock For This Post:
# 4  
Old 09-09-2011
Shamrock and Ahamed,

Thank you both for your help!

Shamrock,
I think the code you provided is the same as what I originally tried, which only seems to work some of the time:
Code:
cat array.txt | awk 'count[$1]++ END {for (i in count) print i, count[i]}' > array.txt_unique

Ahamed, your snippet works, but I was having trouble fitting it into the larger task I was trying to accomplish.

Let me explain my problem in full. I have a trace file I will use as input that looks like the following:

Code:
#input.txt
53600.88  "Disk4:\s93" 129048320 16 0
53601.96  "Disk4:\s93" 100679424 8 0
53602.16  "Disk4:\s94" 14080 1 0
53603.97  "Disk4:\s95\s96\s97" 95010560 128 0
53614.06  "Disk4:\s93" 129052416 16 0
53616.24  "Disk4:\s95\s96\s103" 204544 128 0
53620.87  "Disk4:\s93" 100679424 8 0
53623.21  "Disk4:\s95\s96\s105" 11179776 128 0
53624.2  "Disk4:\s93" 100681472 8 0
53628.79  "Disk4:\s95\s96\s105" 11179776 128 0
53629.91  "Disk4:\s93" 100679424 8 0
53641.74  "Disk4:\s95\s96\s106" 20336384 8 0
53643.65  "Disk4:\s93" 100679424 8 0
53647.63  "Disk4:\s95\s96\s107" 124010240 64 0
53649.5  "Disk4:\s93" 100679424 8 0
53653.25  "Disk4:\s95\s96\s108" 60641024 8 0
53656.19  "Disk4:\s95\s96\s97" 95010560 8 0
69015.39  "Disk4:\s152\s153" 81643264 16 0
88588.57  "Disk4:\s172\s173" 72648448 16 1
103611.34  "Disk4:\s93" 129062656 16 0
103612.41  "Disk4:\s93" 100681472 8 0
103917.55  "Disk4:\s172\s173" 115363584 16 1
113755.24  "Disk4:\s252" 113782528 8 0
113755.76  "Disk4:\s253\$UsnJrnl:$J" 22150912 21 0

I would like to convert the disk information in the 2nd column to a device number like 0, 1, 2, etc. I don't care which device number gets assigned to which string. Each number just has to be unique and always map to the same original string.

So...

"Disk4:\s93" can simply become 0
"Disk4:\s94" can simply become 1
"Disk4:\s95\s96\s97" can simply become 2

For example, I want my completed output file to look like this:

Code:
#output.txt
53600.88  0 129048320 16 0
53601.96  0 100679424 8 0
53602.16  1 14080 1 0
53603.97  2 95010560 128 0
53614.06  0 129052416 16 0
53616.24  3 204544 128 0
53620.87  0 100679424 8 0
53623.21  4 11179776 128 0
53624.2  0 100681472 8 0
53628.79  4 11179776 128 0
53629.91  0 100679424 8 0
53641.74  5 20336384 8 0
53643.65  0 100679424 8 0
53647.63  6 124010240 64 0
53649.5  0 100679424 8 0
53653.25  7 60641024 8 0
53656.19  2 95010560 8 0
69015.39  8 81643264 16 0
88588.57  9 72648448 16 1
103611.34  0 129062656 16 0
103612.41  0 100681472 8 0
103917.55  9 115363584 16 1
113755.24  10 113782528 8 0
113755.76  11 22150912 21 0

Hopefully this better explains what I'm trying to do. Thank you!
# 5  
Old 09-09-2011
Code:
awk '{gsub(/\\|\$/,"#");if($2 in a){}else{a[$2]=i++;}gsub($2,a[$2])}1' infile

53600.88  0 129048320 16 0
53601.96  0 100679424 8 0
53602.16  1 14080 1 0
53603.97  2 95010560 128 0
53614.06  0 129052416 16 0
53616.24  3 204544 128 0
53620.87  0 100679424 8 0
53623.21  4 11179776 128 0
53624.2  0 100681472 8 0
53628.79  4 11179776 128 0
53629.91  0 100679424 8 0
53641.74  5 20336384 8 0
53643.65  0 100679424 8 0
53647.63  6 124010240 64 0
53649.5  0 100679424 8 0
53653.25  7 60641024 8 0
53656.19  2 95010560 8 0
69015.39  8 81643264 16 0
88588.57  9 72648448 16 1
103611.34  0 129062656 16 0
103612.41  0 100681472 8 0
103917.55  9 115363584 16 1
113755.24  10 113782528 8 0
113755.76  11 22150912 21 0

--ahamed
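
How the one-liner works, as far as can be read from the code: the first `gsub` replaces every backslash and dollar sign with `#`, presumably so that the second field is safe to use as a pattern. A field not seen before gets the next value of `i` (an unset awk variable counts up from 0). The final `gsub` swaps the field for its number, and the trailing `1` prints the line. A possible variant, offered only as a sketch, assigns to `$2` directly so no characters need replacing. The names `id` and `n` are made up for this example. It assumes the quoted name never contains spaces, and unlike the original it rebuilds each line with single spaces between fields.

```shell
# A few illustrative lines from the trace, including the one with "$" in it.
trace=$(mktemp)
cat > "$trace" <<'EOF'
53600.88  "Disk4:\s93" 129048320 16 0
53602.16  "Disk4:\s94" 14080 1 0
53614.06  "Disk4:\s93" 129052416 16 0
113755.76  "Disk4:\s253\$UsnJrnl:$J" 22150912 21 0
EOF

mapped=$(awk '
    !($2 in id) { id[$2] = n++ }   # first sighting: assign the next number
    { $2 = id[$2]; print }         # replace the field, print the rebuilt line
' "$trace")
printf '%s\n' "$mapped"
rm -f "$trace"
```

Because the field is used only as an array key and never as a regular expression, characters such as `\` and `$` need no special handling.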

Last edited by ahamed101; 09-09-2011 at 10:41 AM.. Reason: removed the extra gsub
# 6  
Old 09-09-2011
Ahamed,

This is awesome! It is exactly what I needed. Thank you so much! I noticed the weird special characters in the last line too. The file is an excerpt from a well-known trace file. That line is probably incorrect, but as long as I can uniquely identify it as a disk, it will still work for me.

Thank you!
Jonathan