12-24-2010
OK, two complications: A. the key is not the whole line, and B. duplicates across files are bad. Reporting a duplicate requires a definition of the original, especially for non-key data.
- If the lines have identical keys but not identical payloads (fields that are not keys), will file name order and order within the file pick a winner?
- We need to survey all files for duplicate keys, then extract the unique records and winners to load, and the losers to report. Think of them as two important products, not a matter of picking favorites. While most days there may be no duplicates, if one day there are tons, you still want it to blast through.
- There are two approaches to duplicate filtering. You can save every key in an associative array (a magic box that recalls by value, but may not be robust in speed and stability at huge volume), or you can sort in key, priority order (more traditional, and quite robust if you have the disk space): store just the last key seen, process the first record of every key, and log the others. Worked great on tape in 1960 with 16K of RAM! :-)
- Tagging the duplicates by originating file means adding the file name to every record, which is possible but a bit of a luxury if not needed.
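The sort-in-key-and-priority-order approach can be sketched in shell. The field positions (key in field 1, a priority number in field 2 where lower wins) and the file names (all.txt, winners.txt, dupes.txt) are assumptions for illustration, not part of the original post:

```shell
# Sort by key, then by priority, so the preferred record for each key
# comes first; the first line seen per key wins, later copies are logged.
sort -k1,1 -k2,2n all.txt |
awk '
    $1 == prev { print > "dupes.txt"; next }  # repeat of a key: report it
    { print > "winners.txt"; prev = $1 }      # first of this key: load it
'
```

Only the previous key is held in memory, so this streams through any volume the sort itself can handle.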
10 More Discussions You Might Find Interesting
1. Shell Programming and Scripting
I am doing a KSH script to remove duplicate lines in a file. Let's say the file has the format below.
FileA
1253-6856
3101-4011
1827-1356
1822-1157
1822-1157
1000-1410
1000-1410
1822-1231
1822-1231
3101-4011
1822-1157
1822-1231
and I want to simplify it to a file with no duplicate lines... (5 Replies)
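One common answer to this kind of question (FileA is the sample file above) is the awk idiom that keeps the first occurrence of each line and preserves the original order:

```shell
# Print a line only the first time it is seen; seen[] is an associative
# array keyed by the whole line ($0).
awk '!seen[$0]++' FileA
```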
Discussion started by: Teh Tiack Ein
2. Shell Programming and Scripting
I have following file content (3 fields each line):
23 888 10.0.0.1
dfh 787 10.0.0.2
dssf dgfas 10.0.0.3
dsgas dg 10.0.0.4
df dasa 10.0.0.5
df dag 10.0.0.5
dfd dfdas 10.0.0.5
dfd dfd 10.0.0.6
daf nfd 10.0.0.6
...
as can be seen, the third field is an IP address and is sorted, but... (3 Replies)
Discussion started by: fredao
3. Shell Programming and Scripting
Hi,
is it possible to remove all duplicate lines from all txt files in a specific folder?
This is too hard for me; maybe someone could help.
Let's say we have a number of text files: 1 or 2 or 3 or... a maximum of 50.
Each text file has lines of text.
I want all lines of all text files... (8 Replies)
Discussion started by: lowmaster
4. Shell Programming and Scripting
Input:
hello hello
hello hello
monkey
donkey
hello hello
drink
dance
drink
Output should be:
hello hello
monkey
donkey
drink
dance (9 Replies)
Discussion started by: cola
5. Shell Programming and Scripting
Hi,
I came to know that using awk '!x++' removes duplicate lines. Can anyone please explain the above syntax? I want to understand how this awk syntax removes the duplicates.
Thanks in advance,
sudvishw :confused: (7 Replies)
Discussion started by: sudvishw
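The idiom is usually written with a subscript, awk '!x[$0]++'. Here x is an associative array keyed by the whole line ($0): the first time a line appears, x[$0] is 0, so !x[$0] is true and the default action (print) fires; the post-increment then bumps the count, so every later copy of the same line evaluates false and is skipped. A quick demonstration:

```shell
# The second 'a' is suppressed because x["a"] is already nonzero.
printf 'a\nb\na\n' | awk '!x[$0]++'
# prints:
# a
# b
```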
6. Shell Programming and Scripting
Hi, I have a huge file which is about 50GB. There are many lines. The file format looks like
21 rs885550 0 9887804 C C T C C C C C C C
21 rs210498 0 9928860 0 0 C C 0 0 0 0 0 0
21 rs303304 0 9941889 A A A A A A A A A A
22 rs303304 0 9941890 0 A A A A A A A A A
The question is that there are a few... (4 Replies)
Discussion started by: zhshqzyc
7. Shell Programming and Scripting
Hello again, I want to remove all duplicate blocks of XML code in a file. This is an example:
input:
<string-array name="threeItems">
<item>item1</item>
<item>item2</item>
<item>item3</item>
</string-array>
<string-array name="twoItems">
<item>item1</item>
<item>item2</item>... (19 Replies)
Discussion started by: raidzero
8. UNIX for Dummies Questions & Answers
Hi
I need this output. Thanks.
Input:
TAZ
YET
FOO
FOO
VAK
TAZ
BAR
Output:
YET
VAK
BAR (10 Replies)
Discussion started by: tara123
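Keeping only the lines that occur exactly once, in the original order, can be done by reading the same file twice with awk (input.txt is a placeholder name for the file above):

```shell
# Pass 1 (NR == FNR): count every line. Pass 2: print a line only if
# its total count is exactly 1, preserving input order.
awk 'NR == FNR { count[$0]++; next } count[$0] == 1' input.txt input.txt
```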
9. Windows & DOS: Issues & Discussions
So, I have two text files: one "fail.txt" and one "color.txt".
I now want to use a command line (DOS) to remove ANY line that is PRESENT IN BOTH from each text file.
Afterwards there shall be no duplicate lines. (1 Reply)
Discussion started by: pasc
10. Shell Programming and Scripting
Hi All,
I am storing the result in the variable result_text using the below code.
result_text=$(printf "$result_text\t\n$name")
The result_text contains the text below, which has duplicate lines.
file and time for the interval 03:30 - 03:45
file and time for the interval 03:30 - 03:45 ... (4 Replies)
Discussion started by: nalu
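One way to drop duplicate lines held in a shell variable (result_text is the variable from the post; the awk idiom keeps the first occurrence of each line) is to pipe the variable through awk and capture the result:

```shell
# Re-assign the variable with duplicate lines removed, order preserved.
result_text=$(printf '%s\n' "$result_text" | awk '!seen[$0]++')
```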
sort(1) General Commands Manual sort(1)
Name
sort - sort file data
Syntax
sort [options] [-k keydef] [+pos1[-pos2]] [file...]
Description
The sort command sorts lines of all the named files together and writes the result on the standard output. The name `-' means the standard
input. If no input files are named, the standard input is sorted.
Options
The default sort key is an entire line. Default ordering is lexicographic by bytes in machine collating sequence. The ordering is
affected globally by the following options, one or more of which may appear.
-b Ignores leading blanks (spaces and tabs) in field comparisons.
-d Sorts data according to dictionary ordering: letters, digits, and blanks only.
-f Folds uppercase to lowercase while sorting.
-i Ignores characters outside the ASCII range 040-0176 in nonnumeric comparisons.
-k keydef The keydef argument is a key field definition. The format is field_start[,field_end][type], where field_start and field_end
are the definition of the restricted search key, and type is a modifier from the option list [bdfinr]. These modifiers have the
functionality, for this key only, that their command-line counterparts have for the entire record.
-n Sorts fields with numbers numerically. An initial numeric string, consisting of optional blanks, optional minus sign, and zero
or more digits with optional decimal point, is sorted by arithmetic value. (Note that -0 is taken to be equal to 0.) Option n
implies option b.
-r Reverses the sense of comparisons.
-tx Uses specified character as field separator.
The notation +pos1 -pos2 restricts a sort key to a field beginning at pos1 and ending just before pos2. Pos1 and pos2 each have the form
m.n, optionally followed by one or more of the options bdfinr, where m tells a number of fields to skip from the beginning of the line and
n tells a number of characters to skip further. If any options are present they override all the global ordering options for this key. If
the b option is in effect n is counted from the first nonblank in the field; b is attached independently to pos2. A missing .n means .0; a
missing -pos2 means the end of the line. Under the -tx option, fields are strings separated by x; otherwise fields are nonempty nonblank
strings separated by blanks.
When there are multiple sort keys, later keys are compared only after all earlier keys compare equal. Lines that otherwise compare equal
are ordered with all bytes significant.
These are additional options:
-c Checks sorting order and displays output only if out of order.
-m Merges previously sorted data.
-o name Uses specified file as output file. This file may be the same as one of the inputs.
-T dir Uses specified directory to build temporary files.
-u Suppresses all duplicate entries. Ignored bytes and bytes outside keys do not participate in this comparison.
Examples
Print in alphabetical order all the unique spellings in a list of words. Capitalized words differ from uncapitalized.
sort -u +0f +0 list
Print the password file, sorted by user id number (the 3rd colon-separated field).
sort -t: +2n /etc/passwd
Print the first instance of each month in an already sorted file of (month day) entries. The options -um with just one input file make the
choice of a unique representative from a set of equal lines predictable.
sort -um +0 -1 dates
Restrictions
Very long lines are silently truncated.
Diagnostics
The sort command comments and exits with nonzero status for various trouble conditions and for disorder discovered under option -c.
Files
/usr/tmp/stm*, /tmp/* first and second tries for temporary files
See Also
comm(1), join(1), rev(1), uniq(1)