To remove duplicates from pipe delimited file
Post 302865857 by drl on Sunday 20th of October 2013 07:35:17 AM
Hi.

We once needed code that would run on a number of different systems yet produce consistent results. We found that the uniq utility did not behave consistently across those systems, so we introduced an option:
Code:
--last
allows over-writing, effectively keeping the most-recently
seen instance. Some versions of uniq on other *nix systems keep
the most recent (Solaris); the default is compatible with
GNU/Linux uniq, which keeps the first occurrence.

By substituting our own implementation of this idea for the system version of uniq, we were able to produce consistent results.

I think this problem can be approached with the sort idea of danmero, but with the stable option set, followed by a "final filter" that eliminates duplicates. Because the file is already sorted, the filter needs to hold only one line in storage: if the key fields of the incoming record differ from those of the saved line, write out the saved line and save the new line; if the fields are the same, replace the saved line with the new instance. At end of input, write out whatever line is still saved. Our code was in perl, but awk could be used just as easily.
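
For illustration only, here is a minimal sketch of that sort-plus-filter idea in awk rather than our perl. It is not the original code. It assumes the duplicate key is field 1 (the thread's real key fields may differ) and that your sort supports the -s (stable) flag, as GNU and BSD sort do. The function name dedup_last is made up for this example.

```shell
#!/bin/sh
# Sketch: keep the last-seen line for each key in a pipe-delimited file.
# Assumption: the key is field 1; adjust -k and the awk key test to suit.
# dedup_last is a hypothetical name, not part of any existing tool.

dedup_last() {
  # sort -s is a stable sort: lines with equal keys keep their input order.
  # LC_ALL=C makes sort compare keys bytewise, the same way awk's != does.
  LC_ALL=C sort -s -t '|' -k1,1 "$@" |
  awk -F '|' '
    NR > 1 && $1 != key { print saved }   # key changed: write out the saved line
    { key = $1; saved = $0 }              # always remember the newest line
    END { if (NR > 0) print saved }       # flush the line still held at end of input
  '
}
```

It could be called as dedup_last input.txt > output.txt, where input.txt stands for your own file; with no file argument it reads standard input. To keep the first occurrence instead (the GNU/Linux uniq behaviour), the filter would save a line only when the key changes.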

Best wishes ... cheers, drl

Last edited by drl; 10-20-2013 at 08:44 AM.
 

PIPE(2)                      System Calls Manual                      PIPE(2)

NAME
     pipe - create an interprocess communication channel

SYNOPSIS
     #include <unistd.h>

     int pipe(int fildes[2])

DESCRIPTION
     The pipe system call creates an I/O mechanism called a pipe. The file
     descriptors returned can be used in read and write operations. When the
     pipe is written using the descriptor fildes[1] up to PIPE_MAX bytes of
     data are buffered before the writing process is suspended. A read using
     the descriptor fildes[0] will pick up the data. PIPE_MAX equals 7168
     under Minix, but note that most systems use 4096.

     It is assumed that after the pipe has been set up, two (or more)
     cooperating processes (created by subsequent fork calls) will pass data
     through the pipe with read and write calls.

     The shell has a syntax to set up a linear array of processes connected
     by pipes.

     Read calls on an empty pipe (no buffered data) with only one end (all
     write file descriptors closed) returns an end-of-file.

     The signal SIGPIPE is generated if a write on a pipe with only one end
     is attempted.

RETURN VALUE
     The function value zero is returned if the pipe was created; -1 if an
     error occurred.

ERRORS
     The pipe call will fail if:

     [EMFILE]  Too many descriptors are active.

     [ENFILE]  The system file table is full.

     [ENOSPC]  The pipe file system (usually the root file system) has no
               free inodes.

     [EFAULT]  The fildes buffer is in an invalid area of the process's
               address space.

SEE ALSO
     sh(1), read(2), write(2), fork(2).

NOTES
     Writes may return ENOSPC errors if no pipe data can be buffered, because
     the pipe file system is full.

BUGS
     Should more than PIPE_MAX bytes be necessary in any pipe among a loop of
     processes, deadlock will occur.

4th Berkeley Distribution        August 26, 1985                      PIPE(2)