Why check for duplicate files if you can avoid producing them in the first place? Try
This little script keeps a cyclic buffer LCNT lines deep (here: 10) of the lines encountered and, when the search pattern matches, prints the LCNT buffered lines, the matching line itself, and the next LCNT lines. Caveat: if the pattern is encountered again BEFORE all of those following lines have been printed, that trailing output stops and the cycle starts anew with printing the buffer. You may redirect the results, directly within awk itself, to individual output files, one per original file.
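The buffering idea described above could look roughly like the following. This is only a sketch under assumptions, not the script from the post: the function name context_grep and the variable names PAT and LCNT are made up for illustration, LCNT must be at least 1, and in this variant lines already printed as trailing context are not buffered again, so a second match close to the first does not repeat them.

```shell
# Sketch: print LCNT lines before and after every line matching PAT.
# context_grep, PAT and LCNT are illustrative names (assumptions).
context_grep() {
    pat=$1 lcnt=$2
    shift 2
    awk -v PAT="$pat" -v LCNT="$lcnt" '
        FNR == 1 { n = 0; after = 0 }        # reset the state for each new file
        $0 ~ PAT {
            # Flush the ring buffer, oldest line first.
            start = (n > LCNT) ? n - LCNT : 0
            for (i = start; i < n; i++) print buf[i % LCNT]
            print                            # the matching line itself
            n = 0                            # the buffer is empty again
            after = LCNT                     # arm the trailing-context counter
            next
        }
        after > 0 { print; after--; next }   # trailing context, not buffered
        { buf[n % LCNT] = $0; n++ }          # store the line in the ring buffer
    ' "$@"
}
```

Called as `context_grep 'pattern' 10 file1 file2`, it writes to stdout. To get one result file per input, the print statements could be changed to something like `print > (FILENAME ".ctx")`, where the ".ctx" suffix is again just an assumption.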
When a file is first encountered, its name, wrapped in beginning-of-line (^) and end-of-line ($) anchors, is recorded in a, say, "control file", so that file will never be processed again. Feel free to put the "control file" anywhere else. Little drawback: you have to touch the "control file" once before the first run to make sure it exists.
The list of files presented to awk is the ls output for the directory with the "already done files" removed by grep's -v option. The empty /dev/null file serves as a dummy argument that keeps awk from reading from the terminal / stdin when no new files exist because all the old files have been filtered out by this procedure.
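A minimal sketch of that control-file mechanism, under stated assumptions: process_new is a made-up name, the awk body only prints a placeholder message where the real per-file work would go, the control file path must be absolute and outside the processed directory, and file names must not contain whitespace. Unlike the description above, this version creates the control file itself if it is missing.

```shell
# Sketch: hand awk only those files that are not yet in the control file.
# process_new and its arguments are illustrative names (assumptions).
process_new() {
    dir=$1 ctl=$2                      # ctl: absolute path, outside of dir
    [ -e "$ctl" ] || : > "$ctl"        # make sure the control file exists
    (
        cd "$dir" || exit 1
        # The command substitution is unquoted on purpose so that every
        # file name becomes its own argument (no whitespace in names).
        # /dev/null keeps awk from reading stdin when the list is empty.
        awk -v CTL="$ctl" '
            FNR == 1 {
                # Record the anchored name; it is used as a regex later, so
                # a "." in the name matches any character (a known looseness).
                print "^" FILENAME "$" >> CTL
                print "processing " FILENAME     # placeholder for real work
            }
        ' $(ls | grep -v -f "$ctl") /dev/null
    )
}
```

Running `process_new /some/dir "$HOME/.done_files"` twice in a row should do the work only once; both paths here are hypothetical. Empty files never trigger FNR == 1, so in this sketch they are never recorded.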
zipgrep
ZIPGREP(1L) ZIPGREP(1L)
NAME
zipgrep - search files in a ZIP archive for lines matching a pattern
SYNOPSIS
zipgrep [egrep_options] pattern file[.zip] [file(s) ...] [-x xfile(s) ...]
DESCRIPTION
zipgrep will search files within a ZIP archive for lines matching the given string or pattern. zipgrep is a shell script and requires
egrep(1) and unzip(1L) to function. Its output is identical to that of egrep(1).
ARGUMENTS
pattern
The pattern to be located within a ZIP archive. Any string or regular expression accepted by egrep(1) may be used.
file[.zip]
Path of the ZIP archive. (Wildcard expressions for the ZIP archive name are not supported.) If the literal filename is not found, the
suffix .zip is appended. Note that self-extracting ZIP files are supported, as with any other ZIP archive; just specify the .exe
suffix (if any) explicitly.
[file(s)]
An optional list of archive members to be processed, separated by spaces. If no member files are specified, all members of the ZIP
archive are searched. Regular expressions (wildcards) may be used to match multiple members:
* matches a sequence of 0 or more characters
? matches exactly 1 character
[...] matches any single character found inside the brackets; ranges are specified by a beginning character, a hyphen, and an
ending character. If an exclamation point or a caret (`!' or `^') follows the left bracket, then the range of characters within
the brackets is complemented (that is, anything except the characters inside the brackets is considered a match).
(Be sure to quote any character that might otherwise be interpreted or modified by the operating system.)
[-x xfile(s)]
An optional list of archive members to be excluded from processing. Since wildcard characters match directory separators (`/'),
this option may be used to exclude any files that are in subdirectories. For example, ``zipgrep grumpy foo *.[ch] -x */*'' would
search for the string ``grumpy'' in all C source files in the main directory of the ``foo'' archive, but none in any subdirectories.
Without the -x option, all C source files in all directories within the zipfile would be searched.
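The member selection described above can be imitated with ordinary shell `case` patterns, where `*` also matches `/`. This is only an illustration of the matching rules, not how zipgrep or unzip implement them; member_selected is a hypothetical helper.

```shell
# Sketch: decide whether an archive member name would be searched, given
# one include pattern and one -x exclude pattern (both shell wildcards).
# member_selected is an illustrative name, not part of zipgrep or unzip.
member_selected() {
    name=$1 include=$2 exclude=$3
    case $name in
        $exclude) return 1 ;;     # matched by the -x pattern: skipped
    esac
    case $name in
        $include) return 0 ;;     # matched by the member pattern: searched
    esac
    return 1                      # matched by neither pattern: skipped
}

# Member names from a hypothetical archive "foo".
for m in main.c util.h src/deep.c README; do
    if member_selected "$m" '*.[ch]' '*/*'; then
        echo "search $m"
    else
        echo "skip   $m"
    fi
done
```

With an empty exclude pattern, src/deep.c is selected as well, which mirrors the remark that without -x all C source files in all directories are searched. On the real command line the patterns should be quoted, for example `zipgrep grumpy foo '*.[ch]' -x '*/*'`.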
OPTIONS
All options prior to the ZIP archive filename are passed to egrep(1).
SEE ALSO
egrep(1), unzip(1L), zip(1L), funzip(1L), zipcloak(1L), zipinfo(1L), zipnote(1L), zipsplit(1L)
URL
The Info-ZIP home page is currently at
http://www.info-zip.org/pub/infozip/
or
ftp://ftp.info-zip.org/pub/infozip/ .
AUTHORS
zipgrep was written by Jean-loup Gailly.
Info-ZIP 20 April 2009 ZIPGREP(1L)