Top Forums Shell Programming and Scripting Script for identifying and deleting dupes in a line Post 302608076 by pludi on Friday 16th of March 2012 05:06:08 AM
This should do it:
Code:
#!/usr/bin/perl

use strict;
use warnings;

while ( my $line = <> ) {
    chomp $line;

    # Split on the first '=' into the key and its comma-separated value list
    my ( $key, $value ) = ( $line =~ /^(.*?)=(.*)$/ );

    # A hash keyed on each value drops the duplicates
    my %seen = map { $_ => 1 } split( /,/, $value );

    print $key, '=', join( ',', sort keys %seen ), "\n";
}

Run as /path/to/script synonym.in > synonym.out
10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

identifying duplicates line & reporting their line number

I need to find duplicate lines in a document and then print the line numbers of the duplicates. The files contain multiple lines with about 100 numbers on each line. I need something that will output the line numbers where duplicates were found, i.e. 1=5=7, 2=34=76. Any suggestions would be... (5 Replies)
Discussion started by: stresslog
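A rough awk sketch of the line-number report described in that thread (the file name data.txt and its sample lines are invented here, not taken from the thread):

```shell
# Lines 1, 3 and 6 are duplicates of each other, as are 2 and 5
printf 'a\nb\na\nc\nb\na\n' > data.txt

# Record the line numbers of each distinct line; print only groups seen twice+
awk '{ idx[$0] = (idx[$0] ? idx[$0] "=" NR : NR); cnt[$0]++ }
     END { for (l in cnt) if (cnt[l] > 1) print idx[l] }' data.txt | sort
```

This prints 1=3=6 and 2=5, matching the requested format; the trailing sort is only there because awk's for-in order is unspecified.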

2. Shell Programming and Scripting

Shell Script for deleting the first line in a file

Hi, could anyone please post a shell script for deleting the first line in a file? (3 Replies)
Discussion started by: badrimohanty
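For reference, two standard ways to drop a file's first line (file.txt and its contents are made up for the demo):

```shell
printf 'header\nline2\nline3\n' > file.txt

# Print everything except line 1
sed '1d' file.txt > no_header.txt
tail -n +2 file.txt            # same result, starting from line 2

# Overwrite the original in place (GNU sed; on BSD/macOS use: sed -i '' '1d' file.txt)
sed -i '1d' file.txt
```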

3. Shell Programming and Scripting

Deleting a line from a flatfile using Shell Script

Hi All, Can anyone please tell me how I can delete a line from a file? I am reading the file line by line using a while loop and validating each line. Suppose in the middle I find that a particular line is invalid; I need to delete that particular line. Can anyone please help? Thanks in advance,... (14 Replies)
Discussion started by: dinesh1985
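The usual pattern for that situation is not to delete mid-loop at all, but to filter the bad lines into a temp file and swap it in afterwards. A minimal sketch (flat.txt, the BAD marker, and the validity test are all invented examples):

```shell
# Hypothetical flat file where 'BAD' marks an invalid record
printf 'good1\nBAD\ngood2\n' > flat.txt

# Write only the valid lines to a temp file, then replace the original
grep -v '^BAD$' flat.txt > flat.tmp && mv flat.tmp flat.txt
```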

4. Shell Programming and Scripting

Using an awk script to identify dupes in two files

Hello, I have two files. File1, or the master file, contains two columns separated by a delimiter: a=b b=d e=f g=h. File 2, which is the file to be processed, has only a single column: a h c b. What I need is an awk script to identify unique names from file 2 which are not found in the... (6 Replies)
Discussion started by: gimley
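That kind of two-file comparison is a classic NR==FNR job in awk. A sketch using the sample data from the thread (file names file1/file2 are assumed):

```shell
printf 'a=b\nb=d\ne=f\ng=h\n' > file1
printf 'a\nh\nc\nb\n' > file2

# Pass 1 (NR==FNR) records both halves of every a=b pair from file1;
# pass 2 prints file2 entries not present on either side
awk -F'=' 'NR==FNR { seen[$1]; seen[$2]; next } !($0 in seen)' file1 file2
```

With the sample data this prints only c, the one name in file2 that appears nowhere in the master file.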

5. Shell Programming and Scripting

deleting dupes in a row

Hello, I have a large database in which name homonyms are arranged in a row. Since the database is large and generated by hand, very often dupes creep in. I want to remove the dupes using either an awk or a perl script. An input is given below. The expected output is given below: As can be... (2 Replies)
Discussion started by: gimley
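A per-row variant of the dedup idea, keeping first occurrence rather than sorting (the sample row is invented; the thread's actual input is not shown):

```shell
# One row of '='-separated name variants with duplicates
printf 'John=Jon=John=Johann=Jon\n' |
awk -F'=' '{ split("", seen); out = ""
             for (i = 1; i <= NF; i++)
                 if (!seen[$i]++) out = (out == "" ? $i : out "=" $i)
             print out }'
```

split("", seen) clears the array per row, so the script also works on multi-row input.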

6. UNIX for Dummies Questions & Answers

Deleting a pattern in UNIX without deleting the entire line

Hi I have a file: r58778.3|SOURCES={KEY=f665931a...,fw,221-705}|ERRORS={16_1:T,30_1:T,56_1:C,57_1:T,59_1:A,101_1:A,115:-,158_1:C,186_1:A,204:-,271_1:T,305:-,350_1:C,368_1:G,442_1:C,472_1:G,477_1:A}|SOURCE_1="Contig_1092402550638"(f665931a359e36cea0976db191ff60ff09cc816e) I want to retain... (15 Replies)
Discussion started by: Alyaa
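The key point in that thread: sed's s/// removes only the matched text, not the whole line. A sketch on a shortened, made-up version of the data shown above:

```shell
# Delete just the |ERRORS={...} field, keeping the rest of the line
printf 'r58778.3|ERRORS={16_1:T,30_1:T}|SOURCE_1="Contig_X"\n' |
sed 's/|ERRORS={[^}]*}//'
```

The [^}]* keeps the match from running past the first closing brace.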

7. Shell Programming and Scripting

Identifying dupes within a database and creating unique sub-sets

Hello, I have a database of name variants with the following structure: variant=variant=variant The number of variants can be as many as thirty to forty. Since the database is quite large (at present around 60,000 lines) duplicate sets of variants creep in. Thus John=Johann=Jon and... (2 Replies)
Discussion started by: gimley

8. Shell Programming and Scripting

Help with Perl script for identifying dupes in column1

Dear all, I have a large dictionary database which has the following structure: source word=target word, e.g. book=livre. Since the database is very large, in spite of all the care taken it so happens that at times the source word is repeated, e.g. book=livre book=tome. Since I want to... (7 Replies)
Discussion started by: gimley
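The classic awk idiom for first-field dupes, shown on the thread's own book=livre example (the filename dict.txt is assumed, and this keeps the first entry per source word, which may or may not be what the asker wanted):

```shell
printf 'book=livre\nbook=tome\npen=stylo\n' > dict.txt

# !seen[$1]++ is true only the first time a given column-1 value appears
awk -F'=' '!seen[$1]++' dict.txt
```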

9. Shell Programming and Scripting

Modify script to remove dupes with two delimiters

Hello, I have a script which removes duplicates in a database with a single delimiter = The script is given below: # script to remove dupes from a row with structure word=word BEGIN{FS="="} {for(i=1;i<=NF;i++){a++;}for(i in a){b=b"="i}{sub("=","",b);$0=b;b="";delete a}}1 How do I modify... (6 Replies)
Discussion started by: gimley

10. Shell Programming and Scripting

sed command within script wrongly deleting the last line

Hi, I have a shell script with a for loop that scans a list of files and does find-and-replace on a few variables using sed. While doing this, it deletes the last line of every input file, which is wrong. How can I fix this? Please suggest. When I add an empty line to all my input files,... (5 Replies)
Discussion started by: rbalaj16
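The symptom described there (and the "empty line fixes it" workaround) usually means the input files lack a trailing newline on the last line. A sketch of a portable fix, with an invented in.txt:

```shell
# A file whose last line has no trailing newline; some sed implementations
# and shell 'while read' loops will mishandle or skip that final line
printf 'a\nb\nc' > in.txt

# Append a newline only if the last byte is not already one
[ -n "$(tail -c1 in.txt)" ] && echo >> in.txt

sed 's/a/A/' in.txt
```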