Duplicate Line Report per Section


 
# 1  
Old 02-02-2010
Duplicate Line Report per Section

I've been working on a script (/bin/sh) for which I have requested and received help here (for which I am very grateful!). The client has modified their requirements (a tad), so without messing up the script too much, I come once again for assistance.

Here are the file.dat contents:
Code:
ABC1 012345 header
ABC2 7890-000
ABC3 012345 Header Table  <= Need this line in report
ABC4
ABC5 593.0000 587.4800
ABC5 593.5000 587.6580    <=Duplicate to be put in file.out
ABC5 593.5000 587.6580
ABC5 594.0000 588.0971
ABC5 594.5000 588.5361
ABC1 67890 header
ABC2 1234-0001
ABC3 67890 Header Table   <= Need this line in report
ABC4
ABC5 594.5000 588.5361 
ABC5 601.0000 594.1603
ABC5 601.5000 594.6121
ABC5 602.0000 595.0642
ABC5 602.0000 595.0642   <=Duplicate to be put in file.out

My current code will find the section header (ABC1) and the duplicates (ABC5) in that section and output that information into another file.

New client requirement: I need to add the "ABC3" line to the report for each section.

Needed Output file (file.out):
Code:
ABC1 012345 header
ABC3 012345 Header Table  <= Need this line added per section
ABC5 593.5000 587.6580 
ABC1 67890 header
ABC3 67890 Header Table   <= Need this line added per section
ABC5 602.0000 595.0642


Here is my current code:
Code:
# This will find the start of a section ABC1 and print the 
# header and duplicate lines data into a file.out
awk 'NF==3 && /ABC1/; $0!=s{s=$0;next}1' file.dat > file.out


Any suggestions?
# 2  
Old 02-02-2010
Code:
awk '/ABC[13]/&&h=$2;_[h,$2,$3]++==1' file.dat

# 3  
Old 02-02-2010
That Worked!!!!!

Thank you "radoulov" for helping me out with this!

Could you explain this part?
Code:
"h=$2_[h,$2,$3]"

Thanks!
# 4  
Old 02-03-2010
Quote:
Originally Posted by petersf
[...]
Could you explain this part?
Code:
"h=$2_[h,$2,$3]"

Thanks!
Yes,
but it's:

Code:
h=$2;_[h,$2,$3]++==1

not:

Code:
h=$2_[h,$2,$3]++==1

The statement separator ; matters.

There are two expressions in the code,
the first one is:

Code:
/ABC[13]/ && h=$2

If the current record matches the pattern ABC[13] AND the value returned by assigning the second field to the variable h is true, the record is printed.
OK, I'm cheating here: it's a shortcut that will fail if the value of $2 is 0 (or empty). On the other hand, if that value can never be 0, there's less code to write. The fact is that we don't care about the return value of the assignment; we just need to print those records and save the value of $2.

So we print the ABC1 and ABC3 records and save the numeric part of the header in the variable h.
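
For comparison, here is a minimal sketch of the same logic written with an explicit action, so that printing the header lines no longer depends on $2 being non-zero. The file names sample.dat and explicit.out, the array name seen, and the second section numbered 0 are made up for illustration. It also assumes the ABC tag is always the first field, whereas /ABC[13]/ matches anywhere in the line.

```shell
# Hypothetical sample data modelled on file.dat from the first post;
# the second section is numbered 0 on purpose.
cat > sample.dat <<'EOF'
ABC1 012345 header
ABC2 7890-000
ABC3 012345 Header Table
ABC4
ABC5 593.0000 587.4800
ABC5 593.5000 587.6580
ABC5 593.5000 587.6580
ABC1 0 header
ABC2 1234-0001
ABC3 0 Header Table
ABC4
ABC5 602.0000 595.0642
ABC5 602.0000 595.0642
EOF

awk '
  # Header lines: remember the section number, print, skip the dup test.
  $1 == "ABC1" || $1 == "ABC3" { h = $2; print; next }
  # Everything else: print on the second sighting within this section.
  seen[h, $2, $3]++ == 1
' sample.dat > explicit.out

cat explicit.out
```

Because the print happens in an action rather than as the result of a pattern, the section numbered 0 still gets its ABC1 and ABC3 lines in the report.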

The second expression is:

Code:
_[h,$2,$3]++==1

We build an associative array keyed by the concatenated values of the previously saved header number, the second field and the third field. The values are auto-incremented integers.
When the value is 1, we print the record, because we are seeing it for the second time (the first time we see a key, the value is 0, because of the post-increment operator).
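
To see the counter at work, here is a small sketch on made-up input (the letters, the array name seen, the variable before and the file counts.out are all just for illustration). It shows the value the post-increment returns for each line and marks the lines that a ++==1 test would print.

```shell
# Hypothetical input: "b" occurs three times, "a" twice, "c" once.
printf '%s\n' a b a b c b |
awk '
  {
    before = seen[$0]++          # post-increment yields the OLD count
    mark = (before == 1) ? " <= printed by ++==1" : ""
    printf "%s old=%d new=%d%s\n", $0, before, seen[$0], mark
  }
' > counts.out

cat counts.out
```

Only the second occurrence of a key is marked, so a line that shows up three or more times is still reported once.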

Hope this helps.
# 5  
Old 02-03-2010
Yes - Thanks for the explanation. It does help to know what I'm putting in the script. Thanks again!