The UNIX and Linux Forums  

#1
08-02-2004
Registered User
Join Date: Aug 2004
Posts: 11
counting lines and showing the output

First time poster -

I have a huge file and I want to sort it and condense it into something more readable.

Ex:
FUTNCA01-SL1 DMT8a4 5 3
FUTNCA01-SL1 DMT8a4 5 9
FUTNCA01-SL1 DMT8a4 5 21
FUTNCA01-SL1 DMT8a4 5 22
FUTNCA01-SL1 DMT8a4 5 23
FUTNCA01-SL1 DMT8a4 5 24
FUTNCA01-SL1 DMT8a4 6 2
FUTNCA01-SL1 DMT8a4 6 3
FUTNCA01-SL1 DMT8a4 6 7
ALBQNMMA-SL1 DMT8a4 3 1
ALBQNMMA-SL1 DMT8a4 3 2
ALBQNMMA-SL1 DMT8a4 3 3
ALBQNMMA-SL1 DMT8a4 3 4
ALBQNMMA-SL1 DMT8a4 12 7


Output 1 desired
FUTNCA01-SL1 DMT8a4 5 6 ---> count of 2nd to the last field
FUTNCA01-SL1 DMT8a4 6 2 --> count of last field
ALBQNMMA-SL1 DMT8a4 3 4
ALBQNMMA-SL1 DMT8a4 12 1

Output 2 desired
FUTNCA01-SL1 DMT8a4 8 --> total number lines
ALBQNMMA-SL1 DMT8a4 5

Last edited by jjoves; 08-02-2004 at 10:40 AM.

#2
08-02-2004
RTM (Hog Hunter)
Join Date: Apr 2002
Location: On my motorcycle
Posts: 3,039

See the man pages for the sort and grep (or egrep) commands.

When grepping for everything up to the 2nd to last column, put the pattern in quotes so the spaces are kept as part of it.

$ grep -c "FUTNCA01-SL1 DMT8a4 5" /your-file-name

Last edited by RTM; 08-02-2004 at 12:12 PM.
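
The sort-based route hinted at above can be sketched as a small pipeline that counts every group in one go instead of one grep per string. This is only a sketch: the function name count_by_prefix is made up for illustration, and it assumes the fields are separated by single spaces, as in the sample from the first post.

```shell
#!/bin/sh
# count_by_prefix FILE N  (hypothetical helper name)
# Count how many lines share the same first N space-separated fields.
count_by_prefix() {
    # Drop blank lines, keep only fields 1..N, then sort so that
    # identical keys are adjacent and uniq -c can count each group.
    grep -v '^[[:space:]]*$' "$1" | cut -d ' ' -f "1-$2" | sort | uniq -c
}
```

Called as count_by_prefix /your-file-name 3, it prints one line per distinct three-field key, with the count in front of the key (that is how uniq -c formats its output). Passing 2 instead of 3 would give the per-device totals asked for as "Output 2".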

#3
08-02-2004
Registered User
Join Date: Aug 2004
Posts: 11

I think I have to rephrase my question:

Input File
Field1 - Field2 - Field3 - Field4
A - B - C - 1
A - B - C - 2
A - B - C - 3
A - B - C - 4
A - B - C - 5
A - B - D - 1
A - B - D - 2
A - B - D - 3
A - B - D - 4
E - F - G - 1
E - F - G - 2

Output file
A - B - C -> 5 --- since "A - B - C" appeared 5 times
A - B - D -> 4
E - F - G -> 2

My sort depends on the field - each field changes.

Thanks a lot.

#4
08-02-2004
RTM (Hog Hunter)
Join Date: Apr 2002
Location: On my motorcycle
Posts: 3,039

That was posted as one example.

You would have to grep for each individual string:

grep -c "A - B - C" /your-file-name
grep -c "A - B - D" /your-file-name
grep -c "E - F - G" /your-file-name

#5
08-02-2004
...@...
Join Date: Feb 2004
Location: NM
Posts: 3,841

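Assuming I got what you wanted:
Code:
#!/bin/ksh
# Assumes the file in question, test.dat, is already sorted, so that
# lines with the same key sit next to each other.
# Adjust the 1-10 column range in both cut commands to cover your key.

old_compare=$(head -1 test.dat | cut -c 1-10)
let old_count=0
# IFS= and -r keep each line exactly as it is in the file
while IFS= read -r record
do
     # quote the line so its spacing (and the column positions) survive
     tmp=$(printf '%s\n' "$record" | cut -c 1-10)
     if [ "$old_compare" = "$tmp" ] ; then
            let old_count=old_count+1
     else
            # key changed: report the finished group and start a new one
            echo "$old_compare $old_count"
            let old_count=1
            old_compare="$tmp"
     fi
done < test.dat
# report the last group
echo "$old_compare $old_count"
exit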
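
The script above keys on character columns 1-10. If the key is really the first three whitespace-separated fields, as in the sample from the first post, the same "compare with the previous line" idea can be sketched by letting read split the fields. This is a sketch under that assumption, not a drop-in replacement: count_runs is a made-up name, and like the script above it expects lines of the same group to be adjacent.

```shell
#!/bin/sh
# count_runs FILE  (hypothetical helper name)
# Print "field1 field2 field3 count" for each run of adjacent lines
# that share the same first three fields.
count_runs() {
    prev=""
    count=0
    while read -r f1 f2 f3 rest
    do
        # skip blank lines
        if [ -z "$f1" ]; then
            continue
        fi
        key="$f1 $f2 $f3"
        if [ "$key" = "$prev" ]; then
            count=$((count + 1))
        else
            # key changed: report the finished run, if there was one
            if [ -n "$prev" ]; then
                echo "$prev $count"
            fi
            prev="$key"
            count=1
        fi
    done < "$1"
    # report the final run
    if [ -n "$prev" ]; then
        echo "$prev $count"
    fi
}
```

Because it only compares neighbouring lines, it keeps the groups in the order they appear in the file and needs no temporary file.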

#6
08-02-2004
Registered User
Join Date: Aug 2004
Posts: 11

Here is an example of my input file:
===============
PLALCA12-SL2 2 1
PLALCA12-SL2 2 2
PLALCA12-SL2 2 3
PLALCA12-SL2 2 4
PLALCA12-SL2 2 5
PLALCA12-SL2 2 6
PLALCA12-SL2 2 7
PLALCA12-SL2 2 8
PLALCA12-SL2 2 9
PLALCA12-SL2 2 18

BDBKNJBD-SL1 2 1
BDBKNJBD-SL1 2 2
BDBKNJBD-SL1 2 3
BDBKNJBD-SL1 2 4
BDBKNJBD-SL1 2 5
BDBKNJBD-SL1 2 6
BDBKNJBD-SL1 2 7
BDBKNJBD-SL1 2 8

KHVLPAKU-SL1 2 1
KHVLPAKU-SL1 2 2
KHVLPAKU-SL1 2 3

PRVDRIWA-SL2 8 13
PRVDRIWA-SL2 8 14
PRVDRIWA-SL2 8 15
PRVDRIWA-SL2 8 16
PRVDRIWA-SL2 9 7
PRVDRIWA-SL2 10 7
PRVDRIWA-SL2 10 8
PRVDRIWA-SL2 11 7
PRVDRIWA-SL2 11 8
PRVDRIWA-SL2 12 13
PRVDRIWA-SL2 12 14
================

And here is the output that I need

PLALCA12-SL2 2 18 --> since there are 18 lines
BDBKNJBD-SL1 2 8
KHVLPAKU-SL1 2 3
PRVDRIWA-SL2 8 4
PRVDRIWA-SL2 9 1
PRVDRIWA-SL2 10 2
PRVDRIWA-SL2 11 2
PRVDRIWA-SL2 12 2

THANKS for all the help...

#7
08-02-2004
zazzybob (Registered Geek)
Join Date: Dec 2003
Location: Melbourne, Australia
Posts: 2,100

Assuming that I fully understand the question,

Code:
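#!/bin/ksh

# The first two fields, space separated,
# form the unique identifier, correct?

TMPFILE=./tmpfile
# Write the identifiers to a fresh temp file, skipping blank lines
awk 'NF {print $1 " " $2}' infile > "$TMPFILE"

sort -u "$TMPFILE" | while read -r id
do
   # Anchor the pattern so that "X 1" does not also count "X 10" lines
   echo "$id $( grep -c "^$id " infile )"
done

rm "$TMPFILE"

exit 0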
is the script for you.

Cheers
ZB
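
For a huge file, running one grep per identifier means re-reading the file many times. A single awk pass is one possible alternative. The sketch below assumes, as the script above does, that the first two whitespace-separated fields form the identifier. The function name count_groups is made up for illustration.

```shell
#!/bin/sh
# count_groups FILE  (hypothetical helper name)
# Print "field1 field2 count" for every distinct pair of leading fields,
# in the order each pair first appears in the file.
count_groups() {
    awk '
    NF >= 2 {                        # ignore blank or one-field lines
        key = $1 " " $2
        if (!(key in count))         # first time this key is seen:
            order[++n] = key         # remember where it came in
        count[key]++
    }
    END {
        for (i = 1; i <= n; i++)
            print order[i], count[order[i]]
    }' "$1"
}
```

Since the keys are printed in order of first appearance rather than sorted, the PRVDRIWA-SL2 groups would come out as 8, 9, 10, 11, 12, the same order as in the wanted output, and the input does not need to be sorted first.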