10-29-2014
It depends on how much input data you have.
The grep method is very fast as long as you have enough memory, but memory is its limit. grep has to hold every line of file1 as a pattern, so if file1 is too large it's liable to run out of memory and grind to a halt, or just plain crash. I wouldn't trust it with a file1 larger than a hundred or two megabytes. (file2 can be any size, though.) By the way, you should be running grep -v -F -f file1 file2: the -F makes sure the lines of file1 are all treated as fixed strings instead of being used as regular expressions.
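To see why -F matters, here is a minimal sketch. The file contents and the temporary directory are made up for illustration and are not from the original question. Without -F, the pattern a.c would be read as a regular expression and would also knock out the line abc.

```shell
# Hypothetical sample data: file1 holds the lines to exclude from file2.
tmp=$(mktemp -d)
printf '%s\n' 'a.c' 'foo' > "$tmp/file1"
printf '%s\n' 'abc' 'a.c' 'foo' 'bar' > "$tmp/file2"

# -v  print the lines of file2 that do NOT match
# -F  treat every pattern as a fixed string, not a regex
# -f  read the patterns from file1, one per line
result=$(grep -v -F -f "$tmp/file1" "$tmp/file2")
printf '%s\n' "$result"

rm -r "$tmp"
```

Note that -F still matches substrings. If you only want to drop lines that are identical to a line in file1, adding -x (whole-line match) is probably what you want, but that is my assumption about the requirement, not something stated in the thread.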
The sort method can reliably tolerate any size of input (though I would have used comm -1 -3 rather than diff, since that prints only the lines unique to the second file).
So all else being equal, I'd use the sort method and worry less.
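A rough sketch of the sort method with comm, again using made-up sample files. The .sorted file names and the LC_ALL=C setting are my own choices here: comm needs both inputs sorted in the same collating sequence, and pinning the locale is one simple way to make sort and comm agree.

```shell
# Hypothetical sample data, deliberately unsorted.
tmp=$(mktemp -d)
printf '%s\n' 'foo' 'a.c' > "$tmp/file1"
printf '%s\n' 'bar' 'abc' 'a.c' 'foo' > "$tmp/file2"

# Sort both inputs with the same collation that comm will use.
LC_ALL=C sort "$tmp/file1" > "$tmp/file1.sorted"
LC_ALL=C sort "$tmp/file2" > "$tmp/file2.sorted"

# -1 suppresses lines unique to file1, -3 suppresses lines found in both,
# which leaves only the lines unique to file2.
only_in_file2=$(LC_ALL=C comm -1 -3 "$tmp/file1.sorted" "$tmp/file2.sorted")
printf '%s\n' "$only_in_file2"

rm -r "$tmp"
```

One difference from the grep method: the output comes back in sorted order, not in the original order of file2. If the original order matters, this approach needs extra work.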
Last edited by Corona688; 10-29-2014 at 02:24 PM..
comm(1) User Commands comm(1)
NAME
comm - select or reject lines common to two files
SYNOPSIS
comm [-123] file1 file2
DESCRIPTION
The comm utility reads file1 and file2, which must be ordered in the current collating sequence, and produces three text columns as output:
lines only in file1; lines only in file2; and lines in both files.
If the input files were ordered according to the collating sequence of the current locale, the lines written will be in the collating
sequence of the original lines. If not, the results are unspecified.
OPTIONS
The following options are supported:
-1 Suppresses the output column of lines unique to file1.
-2 Suppresses the output column of lines unique to file2.
-3 Suppresses the output column of lines duplicated in file1 and file2.
OPERANDS
The following operands are supported:
file1 A path name of the first file to be compared. If file1 is -, the standard input is used.
file2 A path name of the second file to be compared. If file2 is -, the standard input is used.
USAGE
See largefile(5) for the description of the behavior of comm when encountering files greater than or equal to 2 Gbyte ( 2^31 bytes).
EXAMPLES
Example 1 Printing a list of utilities specified by files
If file1, file2, and file3 each contain a sorted list of utilities, the command
example% comm -23 file1 file2 | comm -23 - file3
prints a list of utilities in file1 not specified by either of the other files. The entry:
example% comm -12 file1 file2 | comm -12 - file3
prints a list of utilities specified by all three files. And the entry:
example% comm -12 file2 file3 | comm -23 - file1
prints a list of utilities specified by both file2 and file3, but not specified in file1.
ENVIRONMENT VARIABLES
See environ(5) for descriptions of the following environment variables that affect the execution of comm: LANG, LC_ALL, LC_COLLATE,
LC_CTYPE, LC_MESSAGES, and NLSPATH.
EXIT STATUS
The following exit values are returned:
0 All input files were successfully output as specified.
>0 An error occurred.
ATTRIBUTES
See attributes(5) for descriptions of the following attributes:
+-----------------------------+-----------------------------+
| ATTRIBUTE TYPE | ATTRIBUTE VALUE |
+-----------------------------+-----------------------------+
|Availability |SUNWesu |
+-----------------------------+-----------------------------+
|CSI |enabled |
+-----------------------------+-----------------------------+
|Interface Stability |Standard |
+-----------------------------+-----------------------------+
SEE ALSO
cmp(1), diff(1), sort(1), uniq(1), attributes(5), environ(5), largefile(5), standards(5)
SunOS 5.11 3 Mar 2004 comm(1)