05-13-2014
Which awk were you using? There is sometimes quite a bit to be gained by choosing the right awk implementation. If you used gawk, that tends to be the slowest of the common ones; BSD awk should be faster, and mawk can be surprisingly fast at times, sometimes several times faster. Switching implementations does not make the algorithm itself any better, but it might be worth exploring.
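A rough way to check this on your own data is to run the same program under every awk you have installed and compare wall-clock times. The sketch below is only an illustration: the function name `bench_awk` is made up, the candidate list (gawk, mawk, nawk, awk) is an assumption about what might be on your PATH, and `date +%s` only has one-second resolution, so it needs a reasonably large input file to show a difference.

```shell
#!/bin/sh
# bench_awk PROGRAM FILE
# Run the same awk PROGRAM on FILE under each awk implementation found
# on PATH and print "name<TAB>elapsed seconds" for each one.
# The candidate names below are assumptions; adjust them for your system.
bench_awk() {
    prog=$1
    file=$2
    for impl in gawk mawk nawk awk; do
        # Skip implementations that are not installed.
        command -v "$impl" >/dev/null 2>&1 || continue
        start=$(date +%s)
        # Discard the program's output; only the run time is of interest.
        "$impl" "$prog" "$file" >/dev/null || continue
        end=$(date +%s)
        printf '%s\t%ss\n' "$impl" "$((end - start))"
    done
}

# Example call (bigfile.txt is a placeholder for your own input):
#   bench_awk '{ count[$1]++ } END { for (k in count) print k, count[k] }' bigfile.txt
```

Keep in mind that plain `awk` is often just a link to one of the others, so two lines of the output may be measuring the same binary.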
--
Is the order of the output important, by the way?
AMPLOT(8) System Administration Commands AMPLOT(8)
NAME
amplot - visualize the behavior of Amanda
SYNOPSIS
amplot [-b] [-c] [-e] [-g] [-l] [-p] [-t T] amdump_files
DESCRIPTION
Amplot reads an amdump output file that Amanda generates each run (e.g. amdump.1) and translates the information into a picture format
that may be used to determine how your installation is doing and if any parameters need to be changed. Amplot also prints out amdump lines
that it either does not understand or knows to be warning or error lines and a summary of the start, end and total time for each backup
image.
Amplot is a shell script that executes an awk program (amplot.awk) to scan the amdump output file. It then executes a gnuplot program
(amplot.g) to generate the graph. The awk program is written in an enhanced version of awk, such as GNU awk (gawk(1) version 2.15 or later)
or nawk(1).
During execution, amplot generates a few temporary files that gnuplot uses. These files are deleted at the end of execution.
See the amanda(8) man page for more details about Amanda.
OPTIONS
-b
Generate b/w postscript file (need -p).
-c
Compress amdump_files after plotting.
-e
Extend the X (time) axis if needed.
-g
Direct gnuplot output directly to the X11 display (default).
-p
Direct postscript output to file YYYYMMDD.ps (opposite of -g).
-l
Generate landscape oriented output (needs -p).
-t T
Set the right edge of the plot to be T hours.
The amdump_files may be in various compressed formats (compress, gzip, pact, compact).
INTERPRETATION
The figure is divided into a number of regions. There are titles on the top that show important statistical information about the
configuration and from this execution of amdump. In the figure, the X axis is time, with 0 being the moment amdump was started. The Y axis
is divided into 5 regions:
QUEUES: How many backups have not been started, how many are waiting on space in the holding disk and how many have been transferred
successfully to tape.
%BANDWIDTH: Percentage of allowed network bandwidth in use.
HOLDING DISK: The higher line depicts space allocated on the holding disk to backups in progress and completed backups waiting to be
written to tape. The lower line depicts the fraction of the holding disk containing completed backups waiting to be written to tape
including the file currently being written to tape. The scale is percentage of the holding disk.
TAPE: Tape drive usage.
%DUMPERS: Percentage of active dumpers.
The idle period at the left of the graph is time amdump is asking the machines how much data they are going to dump. This process can take
a while if hosts are down or it takes them a long time to generate estimates.
BUGS
Reports lines it does not recognize, mainly error cases but some are legitimate lines the program needs to be taught about.
SEE ALSO
amanda(8), amdump(8), gnuplot(1), compress(1), gzip(1)
The Amanda Wiki: http://wiki.zmanda.com/
AUTHORS
Olafur Gudmundsson <ogud@tis.com>
Trusted Information Systems
Stefan G. Weichinger <sgw@amanda.org>
Amanda 3.3.3 01/10/2013 AMPLOT(8)