UNIX for Dummies Questions & Answers: If you're not sure where to post a UNIX or Linux question, post it here. All UNIX and Linux newbies welcome!
#1
First time poster -
I have a huge file and I want to sort and compress it to something more readable.

Ex:
FUTNCA01-SL1 DMT8a4 5 3
FUTNCA01-SL1 DMT8a4 5 9
FUTNCA01-SL1 DMT8a4 5 21
FUTNCA01-SL1 DMT8a4 5 22
FUTNCA01-SL1 DMT8a4 5 23
FUTNCA01-SL1 DMT8a4 5 24
FUTNCA01-SL1 DMT8a4 6 2
FUTNCA01-SL1 DMT8a4 6 3
FUTNCA01-SL1 DMT8a4 6 7
ALBQNMMA-SL1 DMT8a4 3 1
ALBQNMMA-SL1 DMT8a4 3 2
ALBQNMMA-SL1 DMT8a4 3 3
ALBQNMMA-SL1 DMT8a4 3 4
ALBQNMMA-SL1 DMT8a4 12 7

Output 1 desired
FUTNCA01-SL1 DMT8a4 5 6 ---> count of 2nd to the last field
FUTNCA01-SL1 DMT8a4 6 2 --> count of last field
ALBQNMMA-SL1 DMT8a4 3 4
ALBQNMMA-SL1 DMT8a4 12 1

Output 2 desired
FUTNCA01-SL1 DMT8a4 8 --> total number lines
ALBQNMMA-SL1 DMT8a4 5

Last edited by jjoves; 08-02-2004 at 10:40 AM.
#2
See the man pages for the sort and grep (or egrep) commands.
When grepping for the 2nd to last column, use quotes.
$ grep -c "FUTNCA01-SL1 DMT8a4 5" /your-file-name
Last edited by RTM; 08-02-2004 at 12:12 PM.
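One thing to watch with a quoted grep like this: the pattern is not anchored, so "DMT8a4 5" would also match a line whose third field is 51. A minimal sketch of the difference, assuming single-space separators; the file name sample.txt and its contents are made up for illustration:

```shell
# Hypothetical sample file (name and contents are assumptions for illustration)
cat > sample.txt <<'EOF'
FUTNCA01-SL1 DMT8a4 5 3
FUTNCA01-SL1 DMT8a4 5 9
FUTNCA01-SL1 DMT8a4 51 1
FUTNCA01-SL1 DMT8a4 6 2
EOF

# Unanchored pattern: also counts the line whose third field is 51
loose=$(grep -c "FUTNCA01-SL1 DMT8a4 5" sample.txt)

# -F treats the pattern as a fixed string; the trailing space
# forces the third field to end right after the "5"
strict=$(grep -cF "FUTNCA01-SL1 DMT8a4 5 " sample.txt)

echo "loose=$loose strict=$strict"
```

The trailing space only helps when the fields are separated by exactly one space; with tabs or variable spacing a field-aware tool such as awk is probably the safer choice.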
#3
I think I have to rephrase my question:

Input File
Field1 - Field2 - Field3 - Field4
A - B - C - 1
A - B - C - 2
A - B - C - 3
A - B - C - 4
A - B - C - 5
A - B - D - 1
A - B - D - 2
A - B - D - 3
A - B - D - 4
E - F - G - 1
E - F - G - 2

Output file
A - B - C -> 5 --- Since the third field appeared 5 times
A - B - D -> 4
E - F - G -> 2

My sort depends on the field - each field changes.
THANKS A LOT
#4
That was posted as one example.
You would have to grep for each individual string:
grep -c "A - B - C"
grep -c "A - B - D"
grep -c "E - F - G"
#5
Assuming I got what you wanted ---
Code:
#!/bin/ksh
# Assumes the file in question, test.dat, is already sorted.
# Adjust the 1-10 column range in both cut commands to cover the key fields.
old_compare=`head -1 test.dat | cut -c 1-10`
let old_count=0
while IFS= read -r record
do
    # Quote $record so runs of spaces survive and the cut columns stay aligned
    tmp=`printf '%s\n' "$record" | cut -c 1-10`
    if [ "$old_compare" = "$tmp" ] ; then
        let old_count=old_count+1
    else
        # Key changed: report the finished group and start a new one
        echo "$old_compare $old_count"
        let old_count=1
        old_compare="$tmp"
    fi
done < test.dat
echo "$old_compare $old_count"
exit
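The same run-length counting can probably be had from standard tools without a loop. A rough sketch, assuming space-separated fields and a key made of the first three fields (so cut works on fields here rather than on the fixed columns used in the script); sorted.txt is a made-up sample:

```shell
# Hypothetical sample, already grouped by key (file name is an assumption)
cat > sorted.txt <<'EOF'
FUTNCA01-SL1 DMT8a4 5 3
FUTNCA01-SL1 DMT8a4 5 9
FUTNCA01-SL1 DMT8a4 5 21
FUTNCA01-SL1 DMT8a4 6 2
FUTNCA01-SL1 DMT8a4 6 3
ALBQNMMA-SL1 DMT8a4 12 7
EOF

# cut keeps fields 1-3, uniq -c counts runs of identical adjacent lines,
# awk moves the count from the front of each line to the end
summary=$(cut -d' ' -f1-3 sorted.txt | uniq -c | awk '{ print $2, $3, $4, $1 }')

echo "$summary"
```

Like the loop, uniq -c only merges adjacent lines, so the input has to be grouped by key first.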
#6
Here is an example of my input file:
===============
PLALCA12-SL2 2 1
PLALCA12-SL2 2 2
PLALCA12-SL2 2 3
PLALCA12-SL2 2 4
PLALCA12-SL2 2 5
PLALCA12-SL2 2 6
PLALCA12-SL2 2 7
PLALCA12-SL2 2 8
PLALCA12-SL2 2 9
PLALCA12-SL2 2 18
BDBKNJBD-SL1 2 1
BDBKNJBD-SL1 2 2
BDBKNJBD-SL1 2 3
BDBKNJBD-SL1 2 4
BDBKNJBD-SL1 2 5
BDBKNJBD-SL1 2 6
BDBKNJBD-SL1 2 7
BDBKNJBD-SL1 2 8
KHVLPAKU-SL1 2 1
KHVLPAKU-SL1 2 2
KHVLPAKU-SL1 2 3
PRVDRIWA-SL2 8 13
PRVDRIWA-SL2 8 14
PRVDRIWA-SL2 8 15
PRVDRIWA-SL2 8 16
PRVDRIWA-SL2 9 7
PRVDRIWA-SL2 10 7
PRVDRIWA-SL2 10 8
PRVDRIWA-SL2 11 7
PRVDRIWA-SL2 11 8
PRVDRIWA-SL2 12 13
PRVDRIWA-SL2 12 14
================

And here is the output that I need
PLALCA12-SL2 2 18 --> since there is 18 lines
BDBKNJBD-SL1 2 8
KHVLPAKU-SL1 2 3
PRVDRIWA-SL2 8 4
PRVDRIWA-SL2 9 1
PRVDRIWA-SL2 10 2
PRVDRIWA-SL2 11 2
PRVDRIWA-SL2 12 2

THANKS for all the help...
#7
Assuming that I fully understand the question,
Code:
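#!/bin/ksh
# The first two fields, space separated,
# form the unique identifier, correct?
TMPFILE=./tmpfile
# Use > rather than >> so a tmpfile left over from an earlier run is overwritten
awk -v t="$TMPFILE" '{print $1 " " $2 > t}' infile
sort -u "$TMPFILE" | while read -r id
do
    # Anchor the pattern (assumes single-space separators) so that
    # an id ending in "1" does not also count lines whose field is "10"
    echo "$id $( grep -c "^$id " infile )"
done
rm "$TMPFILE"
exit 0

Cheers ZB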
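For what it's worth, the temp file and the repeated grep passes could likely be avoided with a single awk pass. This is only a sketch under the same assumption that the first two whitespace-separated fields form the identifier; infile.txt and its contents are made up for illustration:

```shell
# Hypothetical sample input (file name and contents are assumptions)
cat > infile.txt <<'EOF'
PLALCA12-SL2 2 1
PLALCA12-SL2 2 2
PRVDRIWA-SL2 8 13
PRVDRIWA-SL2 10 7
PRVDRIWA-SL2 10 8
PRVDRIWA-SL2 1 4
EOF

counts=$(awk '
{
    key = $1 " " $2                      # first two fields form the identifier
    if (!(key in n)) order[++k] = key    # remember keys in first-seen order
    n[key]++                             # count lines per identifier
}
END {
    for (i = 1; i <= k; i++) print order[i], n[order[i]]
}' infile.txt)

echo "$counts"
```

Because whole fields are compared, "PRVDRIWA-SL2 1" and "PRVDRIWA-SL2 10" stay separate, the input does not need to be sorted, and the output keeps the order in which each identifier first appears.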