Using grep -f works, but with two files as large as indicated it will take a (serious) amount of time and may eventually run out of memory. Try sorting and filtering with uniq instead, then use the resultant file in a similar way (uniq -u) to extract the unique values from either original file.
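A minimal sketch of the two approaches discussed here, not necessarily the commands originally posted. The helper names `common_lines`, `only_in_first`, and `only_in_first_grep` are hypothetical, whole lines are assumed to be compared exactly, and `LC_ALL=C` is an assumption used only to keep `sort` and `uniq` byte-wise consistent with each other.

```shell
#!/bin/sh
# Hypothetical helpers; names and file arguments are illustrative only.

# common_lines FILE1 FILE2
# Print the lines that occur in both files.
common_lines() {
    # sort -u drops duplicates inside each file, so any line that still
    # appears twice in the merged, sorted stream must be in both files.
    { LC_ALL=C sort -u "$1"; LC_ALL=C sort -u "$2"; } | LC_ALL=C sort | LC_ALL=C uniq -d
}

# only_in_first FILE COMMON
# Print the lines of FILE that are not listed in COMMON
# (COMMON is expected to be the output of common_lines).
only_in_first() {
    # Every line of COMMON also occurs in FILE, so it shows up twice and
    # uniq -u suppresses it; lines found only in FILE show up once.
    { LC_ALL=C sort -u "$1"; cat "$2"; } | LC_ALL=C sort | LC_ALL=C uniq -u
}

# only_in_first_grep FILE1 FILE2
# The grep -f counterpart, for comparison on small inputs.
only_in_first_grep() {
    # -F fixed strings, -x whole-line match, -v invert, -f patterns from file
    grep -F -x -v -f "$2" "$1"
}
```

Typical use would be `common_lines file1 file2 > common` followed by `only_in_first file1 common` and `only_in_first file2 common`. Note that the grep variant keeps the original order and any duplicate lines of its first file, while the sort/uniq variant emits sorted, de-duplicated output.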
Comparison of both approaches on ~20k files:
EDIT: Times spent for two files with roughly 4E6 entries each and about 1E6 overlapping lines (on a two-processor Linux host):
Hello
I want to write a script which will check for any errors, say "-error", in the log file and then send an email to the concerned person, who will then correct the error.
The next time the script runs, even though the error has been corrected, it will ... (1 Reply)
Hi all,
I'm having some trouble with a shell script that I have put together to search our web pages for links to PDFs.
The first thing I did was:
ls -R | grep .pdf > /tmp/dave_pdfs.out
Which generates a list of all of the PDFs on the server. For the sake of argument, say it looks like... (8 Replies)
I have a file that is 20 - 80+ MB in size that is a certain type of log file.
It logs one of our processes and this process is multi-threaded. Therefore the log file is kind of a mess. Here's an example:
The logfile looks like: "DATE TIME - THREAD ID - Details", and a new file is created... (4 Replies)
I need help with the following script. I want to grep the SQL errors, insert them into the error table, and exit the shell script if there is any error; otherwise keep running the scripts.
Here is my script
#!/bin/csh -f
source .orapass
set user = $USER
set pass = $PASS
cd /opt/data/scripts
echo... (2 Replies)
Hi guys - below is my script that is checking for current file, size and timestamp.
However I added a "grep" feature in it (line in red), but not getting the desired result.
I am trying to achieve in output:
1. Show me the file name, timestamp, size and grep'ed words
It would be a... (2 Replies)
Hi all,
I have a problem with searching hundreds of CSV files: the search takes too long (over 5 min).
The CSV files are "," delimited and have 30 fields on each line, but I always grep the same 4 fields, so is there a way to grep just those 4 fields to speed up the search?
Example:... (11 Replies)
Hello everybody,
I'm still slowly treading my way into bash scripting (without any prior programming experience) and hence my code is mostly what some might call "creative" if they meant well :D
I have created a script that serves its purpose but it does so very slowly, since it needs to work... (4 Replies)
This is my first experience writing unix script. I've created the following script. It does what I want it to do, but I need it to be a lot faster. Is there any way to speed it up?
cat 'Tax_Provision_Sample.dat' | sort | while read p; do fn=`echo $p|cut -d~ -f2,4,3,8,9`; echo $p >> "$fn.txt";... (20 Replies)
Hi,
I've written a ksh script that reads a file and parses/filters/formats each line. The script runs as expected, but it runs for 24+ hours for a file that has 2 million lines. And sometimes the input file has 10 million lines, which means it can be running for more than 2 days and still not finish.... (9 Replies)
Hello experts,
We have input files with 700K lines each (one generated for every hour), and we need to convert them as below and move them to another directory once.
Sample INPUT:-
# cat test1
1559205600000,8474,NormalizedPortInfo,PctDiscards,0.0,Interface,BG-CTA-AX1.test.com,Vl111... (7 Replies)
Discussion started by: prvnrk
uniq
uniq(1) User Commands uniq(1)
NAME
uniq - report or filter out repeated lines in a file
SYNOPSIS
uniq [-c | -d | -u] [-f fields] [-s chars] [input_file [output_file]]
uniq [-c | -d | -u] [-n] [+m] [input_file [output_file]]
DESCRIPTION
The uniq utility will read an input file comparing adjacent lines, and write one copy of each input line on the output. The second and
succeeding copies of repeated adjacent input lines will not be written.
Repeated lines in the input will not be detected if they are not adjacent.
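Because only adjacent duplicates are detected, unsorted input is usually passed through sort(1) first. The sketch below reuses the uniq.test data shown under EXAMPLES; the helper name `make_uniq_test` is hypothetical, and `LC_ALL=C` is an assumption used only to pin down the sort order.

```shell
#!/bin/sh
# Hypothetical helper that emits the same lines as the uniq.test sample file.
make_uniq_test() {
    printf '%s\n' 'This is a test.' 'This is a test.' 'TEST.' \
        'Computer.' 'TEST.' 'TEST.' 'Software.'
}

# Unsorted: the single TEST. on line 3 is not adjacent to the later pair,
# so it is reported as "not repeated".
make_uniq_test | uniq -u

# Sorted first: all three TEST. lines become adjacent, so TEST. is now
# treated as repeated and only Computer. and Software. are printed.
make_uniq_test | LC_ALL=C sort | uniq -u
```

The same reasoning applies to -d and -c: on unsorted input they describe runs of adjacent lines, not totals for the whole file.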
OPTIONS
The following options are supported:
-c Precedes each output line with a count of the number of times the line occurred in the input.
-d Suppresses the writing of lines that are not repeated in the input.
-f fields Ignores the first fields fields on each input line when doing comparisons, where fields is a positive decimal integer. A
field is the maximal string matched by the basic regular expression:
[[:blank:]]*[^[:blank:]]*
If fields specifies more fields than appear on an input line, a null string will be used for comparison.
-s chars Ignores the first chars characters when doing comparisons, where chars is a positive decimal integer. If specified in
conjunction with the -f option, the first chars characters after the first fields fields will be ignored. If chars specifies
more characters than remain on an input line, a null string will be used for comparison.
-u Suppresses the writing of lines that are repeated in the input.
-n Equivalent to -f fields with fields set to n.
+m Equivalent to -s chars with chars set to m.
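A short sketch of how -f and -s change what counts as a repeated line. The function names `uniq_skip_fields` and `uniq_skip_chars` and the sample data are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# Hypothetical wrappers around the -f and -s options.

# uniq_skip_fields N: compare lines while ignoring the first N fields.
uniq_skip_fields() {
    uniq -f "$1"
}

# uniq_skip_chars N: compare lines while ignoring the first N characters.
uniq_skip_chars() {
    uniq -s "$1"
}

# The leading sequence number differs on every line, but with -f 1 the
# first two lines compare equal, so only the first of them is written.
printf '%s\n' '1 apple' '2 apple' '3 pear' | uniq_skip_fields 1

# With -s 2 the two-character prefixes are ignored, so xxA and yyA
# compare equal.
printf '%s\n' 'xxA' 'yyA' 'zzB' | uniq_skip_chars 2
```

In both cases the line that is written is the first one of each group, including the part that was ignored for the comparison.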
OPERANDS
The following operands are supported:
input_file A path name of the input file. If input_file is not specified, or if the input_file is -, the standard input will be used.
output_file A path name of the output file. If output_file is not specified, the standard output will be used. The results are
unspecified if the file named by output_file is the file named by input_file.
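The two operands can be sketched as follows. The temporary directory, the file names in.txt and out.txt, and the variable `from_stdin` are assumptions made for this illustration.

```shell
#!/bin/sh
# Hypothetical scratch directory and sample input.
workdir=$(mktemp -d)
printf '%s\n' a a b > "$workdir/in.txt"

# Two operands: read in.txt and write the result to out.txt.
# The output file must not be the same file as the input file.
uniq "$workdir/in.txt" "$workdir/out.txt"

# A single "-" as input_file means standard input.
from_stdin=$(printf '%s\n' a a b | uniq -)
```

When the result should replace the input, writing to a separate file and renaming it afterwards avoids the unspecified same-file case.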
EXAMPLES
Example 1: Using the uniq command
The following example lists the contents of the uniq.test file and outputs a copy of the repeated lines.
example% cat uniq.test
This is a test.
This is a test.
TEST.
Computer.
TEST.
TEST.
Software.
example% uniq -d uniq.test
This is a test.
TEST.
example%
The next example outputs just those lines that are not repeated in the uniq.test file.
example% uniq -u uniq.test
TEST.
Computer.
Software.
example%
The last example outputs a report with each line preceded by a count of the number of times each line occurred in the file:
example% uniq -c uniq.test
2 This is a test.
1 TEST.
1 Computer.
2 TEST.
1 Software.
example%
ENVIRONMENT VARIABLES
See environ(5) for descriptions of the following environment variables that affect the execution of uniq: LANG, LC_ALL, LC_CTYPE,
LC_MESSAGES, and NLSPATH.
EXIT STATUS
The following exit values are returned:
0 Successful completion.
>0 An error occurred.
ATTRIBUTES
See attributes(5) for descriptions of the following attributes:
+-----------------------------+-----------------------------+
|       ATTRIBUTE TYPE        |       ATTRIBUTE VALUE       |
+-----------------------------+-----------------------------+
|Availability                 |SUNWesu                      |
+-----------------------------+-----------------------------+
|CSI                          |Enabled                      |
+-----------------------------+-----------------------------+
|Interface Stability          |Standard                     |
+-----------------------------+-----------------------------+
SEE ALSO
comm(1), pack(1), pcat(1), sort(1), uncompress(1), attributes(5), environ(5), standards(5)
SunOS 5.10 20 Dec 1996 uniq(1)