remove all duplicate lines from all files in one folder

Hi,

Is it possible to remove all duplicate lines from all txt files in a specific folder?

This is too hard for me; maybe someone could help.

Let's say we have some number of text files, anywhere from 1 up to a maximum of 50.
Each text file contains lines of text.

I want all lines of all the text files, taken together, to be unique, but every line that is kept must remain in the txt file where it already is.

It does not matter in which txt file the duplicate lines are deleted, but one occurrence has to stay in at least one txt file. An even better solution would delete the duplicate occurrences first in text file 1, then in 2, then in 3, and so on, so that the deleted lines are spread across all the txt files.


Example with 4 text files (the number of files can vary, up to 50); we also do not know in advance how many lines each file has.

txt1:
aaaaaaa
bbbbbbb
ccccccc

txt2:
aaaaaaa
ccccccc
ddddddd

txt3:
ccccccc
ddddddd
eeeeeee

txt4:
ggggggg
hhhhhhh
kkkkkkkk

An acceptable result could, for example, look like this:

txt1:
aaaaaaa
bbbbbbb
ccccccc

txt2:
ddddddd

txt3:
eeeeeee

txt4:
ggggggg
hhhhhhh
kkkkkkkk
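
For a result like the one above, a minimal sketch along the following lines should work: whichever file awk happens to read first keeps each line, and later occurrences are dropped from the other files. The folder path, the *.txt pattern, and the temporary ".dedup" suffix are assumptions, and the script expects awk to keep one output file open per input file (fine for up to about 50 files with gawk or mawk).

Code:
#!/bin/sh
# Sketch only: keep the first occurrence of every line across all *.txt files
# and delete later occurrences from whichever files they appear in.
cd /path/to/folder || exit 1        # placeholder path

awk '
    FNR == 1    { out = FILENAME ".dedup"; printf "" > out }   # start a fresh copy of each file
    !seen[$0]++ { print > out }                                # first time this line is seen anywhere: keep it
' *.txt

# Replace each original with its deduplicated copy.
for f in *.txt; do
    [ -f "$f.dedup" ] && mv "$f.dedup" "$f"
done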

A perfect result (if possible) would look like this:

txt1:
aaaaaaa
bbbbbbb

txt2:
ccccccc
ddddddd

txt3:
eeeeeee

txt4:
ggggggg
hhhhhhh
kkkkkkkk
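
To get closer to the "perfect" spread-out result, the deletions have to be distributed instead of always keeping the first occurrence. A sketch of one possible approach: read the file list twice, record in the first pass which files contain each line, assign every unique line to whichever of its files has the fewest kept lines so far, and write the files in the second pass. The *.txt pattern and the ".dedup" suffix are again assumptions, and the greedy assignment is only a heuristic, so it will not necessarily reproduce the exact layout shown above.

Code:
#!/bin/sh
# Sketch only: spread the kept lines (and therefore the deletions) across the files.
cd /path/to/folder || exit 1        # placeholder path

awk '
    FNR == 1 {
        pass = (seenfile[FILENAME]++ ? 2 : 1)     # the file list is passed twice
        if (pass == 2 && !assigned) {
            # Between the passes: give every unique line to the file
            # (among those that contain it) with the fewest kept lines so far.
            for (line in filesof) {
                n = split(filesof[line], f, "\n")
                best = f[1]
                for (i = 2; i <= n; i++)
                    if (load[f[i]] < load[best]) best = f[i]
                keepin[line] = best
                load[best]++
            }
            assigned = 1
        }
        if (pass == 2) { out = FILENAME ".dedup"; printf "" > out }
    }
    pass == 1 {
        if (!(($0, FILENAME) in hasfile)) {       # remember which files contain this line
            hasfile[$0, FILENAME] = 1
            filesof[$0] = ($0 in filesof) ? filesof[$0] "\n" FILENAME : FILENAME
        }
        next
    }
    pass == 2 && keepin[$0] == FILENAME && !printed[$0]++ { print > out }
' *.txt *.txt

# Replace each original with its deduplicated copy.
for f in *.txt; do
    [ -f "$f.dedup" ] && mv "$f.dedup" "$f"
done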
 
