Post 302608529 by ajp7701 on Saturday 17th of March 2012 06:15:03 PM
Remove lines with duplicate first field

Trying to cut down the size of some log files. Now that I write this out, it looks more difficult than I thought it would be.

I need a bash script or command that goes sequentially through all lines of a file and does this:

If field1 (space-separated) is the number 2012, print the entire line. ALWAYS do this.

If field1 is not the number 2012, follow this rule:

If field1 of the current line is the same as field1 of the previous line, DON'T print the line; otherwise DO print the line.

Another way of stating the rule:
Print the entire line only if field1 of the current line is DIFFERENT from field1 of the previous line (except for 2012: always print lines with 2012 as field1).
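
One possible way to express the rule is a short awk filter wrapped in a shell function. This is only a sketch under stated assumptions: the function name dedupe_first_field is made up for illustration, "previous line" is taken literally as the line immediately before the current one (whether or not that line was printed), fields are compared as strings, and awk's default splitting on blanks stands in for "space separated".

```shell
# dedupe_first_field: hypothetical helper name, not from the original post.
# Reads the files given as arguments, or standard input if there are none.
dedupe_first_field() {
    awk '
        # Print when field 1 is exactly 2012, on the very first line, or when
        # field 1 differs from field 1 of the line immediately before it.
        $1 == "2012" || NR == 1 || $1 != prev { print }
        # Remember field 1 of every line, printed or not. Appending ""
        # stores it as a string, so the comparison above is textual.
        { prev = $1 "" }
    ' "$@"
}
```

It could be called as `dedupe_first_field app.log > app.trimmed.log` (both file names are placeholders). If "previous line" should instead mean the previous non-2012 line, the last rule would need to skip the update when field 1 is 2012.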
 

Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.