Data mining a text file.


 
# 1  
Old 06-12-2008

I'm auditing UID consistency across our hosts and have created the following data file, which has four fields per line. I would like a count of each combination of the last two fields, i.e., how many instances there are of "root 0", how many of "uucp 5", and so on for every combination in the file. I know basic Perl and basic awk, but I can't get my head around how to do this.

Can anyone offer advice?

Thank you in advance,

akbar



UID: crfw root 0
UID: crfw daemon 1
UID: crfw bin 2
UID: crfw sys 3
UID: crfw adm 4
UID: crfw lp 71
UID: crfw uucp 5
UID: crfw nuucp 9
UID: crfw smmsp 25
UID: crfw listen 37
UID: crfw gdm 50
UID: crfw webservd 80
UID: crfw nobody 60001
UID: crfw noaccess 60002
UID: creb root 0
UID: creb daemon 1
UID: creb bin 2
UID: creb sys 3
UID: creb adm 4
UID: creb lp 71
UID: creb uucp 5
UID: creb nuucp 9
UID: creb smmsp 25
UID: creb listen 37
UID: creb gdm 50
UID: creb webservd 80
UID: creb nobody 60001
UID: creb noaccess 60003
# 2  
Old 06-12-2008
Code:
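awk '/(root|[^n]uucp)/ {
  # [^n]uucp matches " uucp" but skips "nuucp".
  # Key on both the name and the UID so that "root 0" and "uucp 5"
  # are counted as combinations, not just as names.
  total[$(NF-1) " " $NF]++
}
END {
  for (i in total) {
    print i, total[i]
  }
}' file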
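
The filter above reports only root and uucp. To count every name/UID pair in one pass, something like the following might work. It is a minimal sketch, not a tested audit tool: `count_pairs` is a made-up function name, and it assumes each line ends with the user name followed by the UID, as in the sample data.

```shell
# count_pairs is a hypothetical helper name (not from the thread).
# Assumed input layout: "UID: <host> <name> <uid>", one entry per line.
count_pairs() {
  # Count each "<name> <uid>" combination, then print "count name uid".
  awk 'NF >= 2 { total[$(NF-1) " " $NF]++ }
       END { for (k in total) print total[k], k }' "$1" |
    sort -k2,2 -k3,3n   # order by name, then numerically by UID
}
```

Called as `count_pairs file`, it should print one line per combination with the count first, then the name and the UID.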

# 3  
Old 06-13-2008
Hi.
If you want to enumerate all the like items:
Code:
#!/bin/bash -

# @(#) s1       Demonstrate counting occurrences.

echo
echo "(Versions displayed with local utility \"version\")"
version >/dev/null 2>&1 && version =o $(_eat $0 $1) cut sort uniq
echo

FILE=${1-data1}

cut -d" " -f3- "$FILE" |
sort |
uniq -c

exit 0

Producing:
Code:
$ ./s1

(Versions displayed with local utility "version")
SunOS 5.10
GNU bash 3.00.16
cut - no version provided for /usr/bin/cut.
sort - no version provided for /usr/xpg4/bin/sort.
uniq - no version provided for /usr/bin/uniq.

   2 adm 4
   2 bin 2
   2 daemon 1
   2 gdm 50
   2 listen 37
   2 lp 71
   1 noaccess 60002
   1 noaccess 60003
   2 nobody 60001
   2 nuucp 9
   2 root 0
   2 smmsp 25
   2 sys 3
   2 uucp 5
   2 webservd 80

See man pages for details ... cheers, drl
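
Since the goal is a consistency audit, the interesting lines in that output are the ones like noaccess, where the same name shows up with two different UIDs. A rough sketch for listing only those follows. The name `inconsistent_uids` is invented for illustration, and the code assumes exactly the four whitespace-separated fields of the sample file. On Solaris, the default /usr/bin/awk is the old awk and may not accept this syntax; nawk or /usr/xpg4/bin/awk is the usual substitute.

```shell
# inconsistent_uids is a hypothetical helper name (not from the thread).
# Assumed input layout: "UID: <host> <name> <uid>", one entry per line.
inconsistent_uids() {
  awk '
    {
      name = $3; uid = $4
      if (!((name, uid) in seen)) {   # first time this name/UID pair appears
        seen[name, uid] = 1
        distinct[name]++              # number of different UIDs for this name
      }
      hosts[name, uid] = hosts[name, uid] " " $2   # remember which hosts use it
    }
    END {
      for (key in hosts) {
        split(key, part, SUBSEP)      # part[1] = name, part[2] = uid
        if (distinct[part[1]] > 1)
          print part[1], part[2] ":" hosts[key]
      }
    }' "$1" | sort
}
```

Run against the sample data, this should report the two noaccess entries along with the host each one came from, and nothing else.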
# 4  
Old 06-13-2008
Perhaps a one-line command?

Code:
> cat uid_data | cut -d" " -f3-4 | sort | uniq -c
      2 adm 4
      2 bin 2
      2 daemon 1
      2 gdm 50
      2 listen 37
      2 lp 71
      1 noaccess 60002
      1 noaccess 60003
      2 nobody 60001
      2 nuucp 9
      2 root 0
      2 smmsp 25
      2 sys 3
      2 uucp 5
      2 webservd 80
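
A possible extension of the same pipeline: compare each count with the number of hosts and show only the pairs that are missing somewhere. This is a sketch under assumptions: `pairs_missing_somewhere` is a made-up name, fields are separated by single spaces (which `cut -d" "` needs), and each host lists a given pair at most once.

```shell
# pairs_missing_somewhere is a hypothetical helper name (not from the thread).
# Assumed input layout: "UID: <host> <name> <uid>", single-space separated.
pairs_missing_somewhere() {
  # Number of distinct hosts (field 2); tr strips any padding from wc.
  nhosts=$(cut -d" " -f2 "$1" | sort -u | wc -l | tr -d ' ')
  # Keep only name/UID pairs seen on fewer hosts than that.
  cut -d" " -f3-4 "$1" | sort | uniq -c |
    awk -v n="$nhosts" '$1 + 0 < n + 0 {
      print $2, $3, "(" $1 " of " n " hosts)"
    }'
}
```

With the two hosts in the sample file, `pairs_missing_somewhere uid_data` should list only the two noaccess lines.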
