Post 302608529 by ajp7701 on Saturday 17th of March 2012 06:15:03 PM
Remove lines with duplicate first field

I'm trying to cut down the size of some log files. Now that I write this out, it looks more difficult than I thought it would be.

I need a bash script or command that goes sequentially through all lines of a file and does this:

If field1 (space-separated) is the number 2012, print the entire line. Do this ALWAYS.


If field1 is not the number 2012, follow this rule:

If field1 of the current line is the same as field1 of the previous line, DON'T print the line; otherwise DO print it.


Another way of saying the rule:
Print the entire line only if field1 of the current line is DIFFERENT from field1 of the previous line (except for 2012: always print lines with 2012 as field1).
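
One possible way to express the rule above is a short awk filter wrapped in a bash function. This is only a sketch under a few assumptions that the post does not state: the function name dedupe_first_field is made up for illustration, "previous line" is taken to mean the previous line of the input (whether or not it was printed), field1 is compared as text (so 2012 must appear exactly as "2012"), and the very first line is always printed because it has no predecessor.

```shell
#!/usr/bin/env bash

# dedupe_first_field is a hypothetical helper name, not from the original post.
# Reads the named files (or standard input when no file is given) and prints a
# line when any of these holds:
#   - it is the first line of the input (nothing to compare against), or
#   - field 1 is exactly the text 2012, or
#   - field 1 differs from field 1 of the previous input line.
dedupe_first_field() {
    awk '
        { key = $1 "" }                                    # force a string comparison
        NR == 1 || key == "2012" || key != prev { print }  # the rule from the post
        { prev = key }                                     # remember field 1 for the next line
    ' "$@"
}
```

It could then be run as `dedupe_first_field app.log > app.log.trimmed`, where both file names are placeholders. Because awk splits fields on runs of spaces and tabs by default, no -F option is needed for space-separated logs.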
 
