Full Discussion: Filtering duplicate lines
Post 14994 by thehoghunter on Friday 8th of February 2002 01:03:21 PM
What is the original order of the file? Does it have an order, or is it chaos?

If the file has an order (by date/time, by nodename, by some other field), then after removing the duplicates you can also sort by that field to get the order back (check the sort man page).
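As a rough illustration of the ordered case, here is a minimal Python sketch. It assumes whitespace-separated fields and that the ordering field (a date in this made-up sample) sorts correctly as plain text. The function name `dedupe_and_resort` and its parameters are hypothetical, chosen only for this example.

```python
def dedupe_and_resort(lines, field=0, sep=None):
    """Drop exact duplicate lines, then restore order by one field.

    Assumes every line has at least `field + 1` fields and that the
    chosen field sorts correctly as a string (e.g. ISO-style dates).
    """
    unique = set(lines)  # exact duplicates collapse here; order is lost
    # Sort by the chosen field; the full line breaks ties so the
    # result is deterministic.
    return sorted(unique, key=lambda line: (line.split(sep)[field], line))


sample = [
    "2002-02-08 nodeB up",
    "2002-02-07 nodeA down",
    "2002-02-08 nodeB up",
    "2002-02-06 nodeC up",
]
for line in dedupe_and_resort(sample):
    print(line)
```

This only works because the file's order can be rebuilt from a field. If the field does not sort as text (for example, month names), the key function would need to parse it first.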

If it is chaos, meaning no specific order (it just came that way!), then I believe you would need to write a script (Perl) or a program (in your preferred language) to get what you are trying to do.
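For the no-order case, one possible shape for such a program is sketched below in Python (any language would do). It assumes "duplicate" means an exact match of the whole line and that the set of distinct lines fits in memory. The name `unique_in_order` is made up for this example.

```python
def unique_in_order(lines):
    """Yield each line the first time it appears, keeping original order."""
    seen = set()
    for line in lines:
        if line not in seen:
            seen.add(line)
            yield line


sample = ["b", "a", "b", "c", "a"]
print(list(unique_in_order(sample)))
```

Because it is a generator, it can be fed an open file object directly and the results written out line by line, so only the distinct lines are held in memory, not the whole file.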