Discarding records with duplicate fields Post: 303043656

LEARN ABOUT DEBIAN

jcode

Jcode(3pm)						User Contributed Perl Documentation						Jcode(3pm)

NAME

       Jcode - Japanese Charset Handler

SYNOPSIS

	use Jcode;
	#
	# traditional
	Jcode::convert(\$str, $ocode, $icode, "z");
	# or OOP!
	print Jcode->new($str)->h2z->tr($from, $to)->utf8;

DESCRIPTION

       <Japanese document is now available as Jcode::Nihongo. >

       Jcode.pm supports both the object and the traditional approach.  With the object approach, you can write:

	 $iso_2022_jp = Jcode->new($str)->h2z->jis;

       Which is more elegant than:

	 $iso_2022_jp = $str;
	 &jcode::convert(\$iso_2022_jp, 'jis', &jcode::getcode(\$str), "z");

       For those unfamiliar with objects, Jcode.pm still supports "getcode()" and "convert()."

       If the perl version is 5.8.1 or later, Jcode acts as a wrapper to Encode, the standard charset handler module for Perl 5.8 or later.
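       Since Jcode delegates to Encode on perl 5.8.1 or later, the following is a minimal sketch of the kind of conversion involved. It uses only the core Encode module rather than Jcode itself; the sample string and variable names are illustrative assumptions, not part of Jcode.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# "日本語" as raw UTF-8 bytes (sample input chosen for illustration).
my $utf8_bytes = "\xe6\x97\xa5\xe6\x9c\xac\xe8\xaa\x9e";

# Decode the bytes into a Perl character string, then encode to other charsets.
my $chars = decode('UTF-8', $utf8_bytes);
my $euc   = encode('euc-jp', $chars);
my $jis   = encode('iso-2022-jp', $chars);

# Decoding the EUC-JP bytes again recovers the original characters.
my $roundtrip = decode('euc-jp', $euc);

printf "chars=%d euc_bytes=%d roundtrip_ok=%s\n",
    length($chars), length($euc), ($roundtrip eq $chars ? 'yes' : 'no');
```

       With Jcode, the corresponding calls described in this page would be along the lines of Jcode->new($str)->euc and Jcode->new($str)->jis.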

Methods
       Methods mentioned here all return Jcode object unless otherwise mentioned.

       Constructors

       $j = Jcode->new($str [, $icode])
	 Creates Jcode object $j from $str.  The input code is checked automatically unless you explicitly set $icode.  For available charsets, see
	 getcode below.

	 For perl 5.8.1 or better, $icode can be any encoding name that Encode understands.

	   $j = Jcode->new($european, 'iso-latin1');

	 When the object is stringified, it returns the EUC-converted string so you can <print $j> instead of <print $j->euc>.

	 Passing Reference
	   Instead of a scalar value, you can pass a reference:

	   Jcode->new(\$str);

	   This saves a little time, at the cost of $str itself being converted.  (In a way, $str is now "tied" to the Jcode object.)

       $j->set($str [, $icode])
	 Sets $j's internal string to $str.  Handy when you use a Jcode object repeatedly (saves the time and memory of creating a new object).

	  # converts mailbox to SJIS format
	  my $jconv = new Jcode;
	  $/ = 00;
	  while(<>){
	      print $jconv->set($_)->mime_decode->sjis;
	  }

       $j->append($str [, $icode]);
	 Appends $str to $j's internal string.

       $j = jcode($str [, $icode]);
	 Shortcut for Jcode->new(), so you can write, for example, jcode($str)->sjis.

       Encoded Strings

       In general, you can retrieve encoded string as $j->encoded.

       $sjis = jcode($str)->sjis
       $euc = $j->euc
       $jis = $j->jis
       $sjis = $j->sjis
       $ucs2 = $j->ucs2
       $utf8 = $j->utf8
	 What you code is what you get :)

       $iso_2022_jp = $j->iso_2022_jp
	 Same as "$j->h2z->jis".  Hankaku Kanas are forcibly converted to Zenkaku.

	 For perl 5.8.1 and better, you can also use any encoding names and aliases that Encode supports.  For example:

	   $european = $j->iso_latin1; # replace '-' with '_' for names.

	 FYI: Encode::Encoder uses similar trick.

	 $j->fallback($fallback)
	   For perl 5.8.1 or better, Jcode stores the internal string in UTF-8.  Any character that does not map to ->encoding is replaced
	   with a '?', which is the Encode standard.

	     my $unistr = "\x{262f}"; # YIN YANG
	     my $j = jcode($unistr);  # $j->euc is '?'

	   You can change this behavior by specifying a fallback as in Encode.  The values are the same as in Encode.  "Jcode::FB_PERLQQ",
	   "Jcode::FB_XMLCREF", and "Jcode::FB_HTMLCREF" are aliased to those of Encode for convenience.

	     print $j->fallback(Jcode::FB_PERLQQ)->euc;   # '\x{262f}'
	     print $j->fallback(Jcode::FB_XMLCREF)->euc;  # '&#x262f;'
	     print $j->fallback(Jcode::FB_HTMLCREF)->euc; # '&#9775;'

	   The global variable $Jcode::FALLBACK stores the default fallback so you can override that by assigning the value.

	     $Jcode::FALLBACK = Jcode::FB_PERLQQ; # set default fallback scheme
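	   The text above says the Jcode fallback constants are aliases of Encode's. As a hedged sketch of what those fallbacks do, the following calls Encode directly (not Jcode) on the same YIN YANG character; variable names are illustrative.

```perl
use strict;
use warnings;
use Encode qw(encode);

my $unistr = "\x{262f}";    # YIN YANG, taken from the example above

# With no CHECK argument, Encode substitutes a replacement character.
my $default = encode('euc-jp', $unistr);

# Each fallback constant selects a different escape notation for the
# unmappable character; these FB_* constants leave $unistr untouched.
my $perlqq   = encode('euc-jp', $unistr, Encode::FB_PERLQQ());
my $xmlcref  = encode('euc-jp', $unistr, Encode::FB_XMLCREF());
my $htmlcref = encode('euc-jp', $unistr, Encode::FB_HTMLCREF());

print "$_\n" for $perlqq, $xmlcref, $htmlcref;
```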

       [@lines =] $jcode->jfold([$width, $newline_str, $kref])
	 Folds lines in the Jcode string every $width (default: 72), where $width is counted in "halfwidth" characters.  Fullwidth characters are
	 counted as two.

	 Lines are folded with the newline string specified by $newline_str (default: "\n").

	 Rudimentary kinsoku support is now available for Perl 5.8.1 and better.

       $length = $jcode->jlength();
	 returns character length properly, rather than byte length.
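	 To illustrate the difference between byte length and character length, here is a small sketch using core Encode rather than jlength() itself; the sample EUC-JP bytes are an assumption chosen for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode);

# "日本語" encoded as EUC-JP: three characters, two bytes each.
my $euc_bytes = "\xc6\xfc\xcb\xdc\xb8\xec";

# length() on the raw byte string counts bytes ...
my $byte_len = length $euc_bytes;

# ... while length() on the decoded string counts characters.
my $char_len = length decode('euc-jp', $euc_bytes);

print "bytes=$byte_len chars=$char_len\n";
```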

       Methods that use MIME::Base64

       To use methods below, you need MIME::Base64.  To install, simply

	  perl -MCPAN -e 'CPAN::Shell->install("MIME::Base64")'

       If your perl is 5.6 or better, there is no need since MIME::Base64 is bundled.

       $mime_header = $j->mime_encode([$lf, $bpl])
	 Converts $str to a MIME header as documented in RFC1522.  When $lf is specified, it uses $lf to fold lines (default: "\n").  When $bpl is
	 specified, it uses $bpl as the number of bytes per line (default: 76; this number must not exceed 76).

	 For Perl 5.8.1 or better, you can also encode MIME Header as:

	   $mime_header = $j->MIME_Header;

	 In that case the resulting $mime_header is MIME-B-encoded UTF-8, whereas "$j->mime_encode()" returns MIME-B-encoded ISO-2022-JP.  Most
	 modern MUAs support both.
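	 As a rough sketch of the MIME-B-encoded UTF-8 form mentioned above, the following uses Encode's 'MIME-Header' encoding directly instead of going through Jcode; the subject text and variable names are illustrative assumptions.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Sample header text: "日本語" decoded from its UTF-8 bytes.
my $subject = decode('UTF-8', "\xe6\x97\xa5\xe6\x9c\xac\xe8\xaa\x9e");

# Encode as a MIME header (Base64 "B" encoding of the UTF-8 bytes).
my $header = encode('MIME-Header', $subject);

# Decoding the header returns the original character string.
my $decoded = decode('MIME-Header', $header);

print "$header\n";
```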

       $j->mime_decode;
	 Decodes MIME-Header in Jcode object.  For perl 5.8.1 or better, you can also do the same as:

	   Jcode->new($str, 'MIME-Header')

       Hankaku vs. Zenkaku

       $j->h2z([$keep_dakuten])
	 Converts X201 kana (Hankaku) to X208 kana (Zenkaku).  When $keep_dakuten is set, it leaves dakuten as is (that is, "ka + dakuten" is left
	 as is instead of being converted to "ga").

	 You can retrieve the number of matches via $j->nmatch;

       $j->z2h
	 Converts X208 kana (Zenkaku) to X201 kana (Hankaku).

	 You can retrieve the number of matches via $j->nmatch;
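	 To make the "ka + dakuten" versus "ga" distinction concrete, here is a sketch in Unicode terms. It is not how Jcode implements h2z: Unicode NFKC normalization (core module Unicode::Normalize) is swapped in purely for illustration, and it also changes characters other than kana.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFKC);

# U+FF76 HALFWIDTH KATAKANA LETTER KA followed by
# U+FF9E HALFWIDTH KATAKANA VOICED SOUND MARK (dakuten).
my $hankaku = "\x{FF76}\x{FF9E}";

# NFKC maps both to their fullwidth forms and composes them into
# the single character U+30AC KATAKANA LETTER GA.
my $zenkaku = NFKC($hankaku);

printf "U+%04X\n", ord $_ for split //, $zenkaku;
```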

       Regexp emulators

       To use "->m()" and "->s()", you need perl 5.8.1 or better.

       $j->tr($from, $to, $opt);
	 Applies "tr/$from/$to/" on Jcode object where $from and $to are EUC-JP strings.  On perl 5.8.1 or better, $from and $to can also be
	 flagged UTF-8 strings.

	 If $opt is set, "tr/$from/$to/$opt" is applied.  $opt must be 'c', 'd' or the combination thereof.

	 You can retrieve the number of matches via $j->nmatch;

	 The following methods are available only for perl 5.8.1 or better.

       $j->s($pattern, $replace, $opt);
	 Applies "s/$pattern/$replace/$opt".  $pattern and $replace must be in EUC-JP or flagged UTF-8.  $opt is the same as the regexp options.  See
	 perlre for regexp options.

	 Like "$j->tr()", "$j->s()" returns the object itself, so you can chain the operations as follows:

	   $j->tr("a-z", "A-Z")->s("foo", "bar");

       [@match = ] $j->m($pattern, $opt);
	 Applies "m/$pattern/$opt".  Note that this method DOES NOT RETURN AN OBJECT, so you can't chain it like "$j->s()".

       Instance Variables

       If you need to access instance variables of a Jcode object, use the access methods below instead of accessing them directly (that's what OOP
       is all about).

       FYI, Jcode uses a ref to an array instead of a ref to a hash (the common way) to optimize speed.  You don't need to know this as long as you
       use the access methods; once again, that's OOP.

       $j->r_str
	 Reference to the EUC-coded String.

       $j->icode
	 Input character code of the most recent operation.

       $j->nmatch
	 Number of matches (Used in $j->tr, etc.)

Subroutines
       ($code, [$nmatch]) = getcode($str)
	 Returns the char code of $str.  Return codes are as follows:

	  ascii   Ascii (Contains no Japanese Code)
	  binary  Binary (Not Text File)
	  euc	  EUC-JP
	  sjis	  SHIFT_JIS
	  jis	  JIS (ISO-2022-JP)
	  ucs2	  UCS2 (Raw Unicode)
	  utf8	  UTF8

	 When called in array context instead of scalar, it also returns how many character codes were found.  As mentioned above, you can pass \$str
	 instead of $str.

	 jcode.pl Users:  This function is 100% upper-compatible with jcode::getcode() -- well, almost:

	  * When its return value is an array, the order is the opposite;
	    jcode::getcode() returns $nmatch first.

	  * jcode::getcode() returns 'undef' when the number of EUC characters
	    is equal to that of SJIS.  Jcode::getcode() returns EUC.  For
	    Jcode.pm there are no in-betweens.

       Jcode::convert($str, [$ocode, $icode, $opt])
	 Converts $str to the char code specified by $ocode.  When $icode is also specified, it assumes $icode for the input string instead of the one
	 detected by getcode().  As mentioned above, you can pass \$str instead of $str.

	 jcode.pl Users:  This function is 100% upper-compatible with jcode::convert()!

BUGS

       For perl 5.8.1 or later, Jcode acts as a wrapper to Encode, meaning Jcode is subject to any bugs therein.

ACKNOWLEDGEMENTS

       This package owes a lot in motivation, design, and code, to the jcode.pl for Perl4 by Kazumasa Utashiro <utashiro@iij.ad.jp>.

       Hiroki Ohzaki <ohzaki@iod.ricoh.co.jp> has helped me polish regexp from the very first stage of development.

       JEncode by makamaka@donzoko.net has inspired me to integrate Encode to Jcode.  He has also contributed Japanese POD.

       And folks at Jcode Mailing list <jcode5@ring.gr.jp>.  Without them, I couldn't have coded this far.

SEE ALSO

       Encode

       Jcode::Nihongo

       <http://www.iana.org/assignments/character-sets>

COPYRIGHT

       Copyright 1999-2005 Dan Kogai <dankogai@dan.co.jp>

       This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

perl v5.8.8							    2005-02-19								Jcode(3pm)
