Sponsored Content
Top Forums Shell Programming and Scripting Remove the partial duplicates by checking the length of a field Post 302558311 by bartus11 on Friday 23rd of September 2011 09:24:09 AM
Old 09-23-2011
Try:
Code:
awk -F"|" 'length($1)>l[$3"|"$4]{l[$3"|"$4]=length($1);a[$3"|"$4]=$0}END{for (i in a) print a[i]}' file

This User Gave Thanks to bartus11 For This Post:
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Checking file for duplicates

Hi all, I am due to start receiving a weekly csv containing around 6 million rows. I need to do some processing on this file and then send it on elsewhere. My problem is that after week 1 the files that I will receive are likely to contain data already received in previous files and I need... (8 Replies)
Discussion started by: pxy2d1
8 Replies

2. Shell Programming and Scripting

AWK - Print partial line/partial field

Hello, this is probably a simple request but I've been toying with it for a while. I have a large list of devices and commands that were run with a script, now I have lines such as: a-router-hostname-C#show ver I want to print everything up to (and excluding) the # and everything after it... (3 Replies)
Discussion started by: ippy98
3 Replies

3. Shell Programming and Scripting

CSV with commas in field values, remove duplicates, cut columns

Hi Description of input file I have: ------------------------- 1) CSV with double quotes for string fields. 2) Some string fields have Comma as part of field value. 3) Have Duplicate lines 4) Have 200 columns/fields 5) File size is more than 10GB Description of output file I need:... (4 Replies)
Discussion started by: krishnix
4 Replies

4. UNIX for Dummies Questions & Answers

remove duplicates based on a field and criteria

Hi, I have a file with fields like below: A;XYZ;102345;222 B;XYZ;123243;333 C;ABC;234234;444 D;MNO;103345;222 E;DEF;124243;333 desired output: C;ABC;234234;444 D;MNO;103345;222 E;DEF;124243;333 ie, if the 4rth field is a duplicate.. i need only those records where... (5 Replies)
Discussion started by: wanderingmind16
5 Replies

5. Shell Programming and Scripting

Flat file-make field length equal to header length

Hello Everyone, I am stuck with one issue while working on abstract flat file which i have to use as input and load data to table. Input Data- ------ ------------------------ ---- ----------------- WFI001 Xxxxxx Control Work Item A Number of Records ------ ------------------------... (5 Replies)
Discussion started by: sonali.s.more
5 Replies

6. Shell Programming and Scripting

Remove duplicates based on a field's value

Hi All, I have a text file with three columns. I would like a simple script that removes lines in which column 1 has duplicate entries, but use the largest value in column 3 to decide which one to keep. For example: Input file: 12345a rerere.rerere len=23 11111c fsdfdf.dfsdfdsf len=33 ... (3 Replies)
Discussion started by: anniecarv
3 Replies

7. Shell Programming and Scripting

Replace a field with a character as per the field length

Hi all, I have a requirement to replace a field with a character as per the length of the field. Suppose i have a file where second field is of 20 character length. I want to replace second field with 20 stars (*). like ******************** As the field is not a fixed one, i want to do the... (2 Replies)
Discussion started by: gani_85
2 Replies

8. Shell Programming and Scripting

Trying to remove duplicates based on field and row

I am trying to see if I can use awk to remove duplicates from a file. This is the file: -==> Listvol <== deleting /vol/eng_rmd_0941 deleting /vol/eng_rmd_0943 deleting /vol/eng_rmd_0943 deleting /vol/eng_rmd_1006 deleting /vol/eng_rmd_1012 rearrange /vol/eng_rmd_0943 ... (6 Replies)
Discussion started by: newbie2010
6 Replies

9. Shell Programming and Scripting

Script to compare partial filenames in two folders and delete duplicates

Background: I use a TV tuner card to capture OTA video files (.mpeg) and then my Plex Media Server automatically optimizes the files (transcodes for better playback) and places them in a new directory. I have another Plex Library pointing to the new location for the optimized .mp4 files. This... (2 Replies)
Discussion started by: shaky
2 Replies

10. UNIX for Beginners Questions & Answers

How can I remove partial duplicates and manipulate text?

Hello, How can I remove partial duplicates and manipulate text in bash using either awk, grep or sed? Thanks. Input: ted,"foo,bar,zoo" john-son,"foot,ben,zoo" bob,"bar,foot" Expected Output: foo,ted bar,ted zoo,ted foot,john-son ben,john-son (4 Replies)
Discussion started by: tara123
4 Replies
Locale::Script(3perl)					 Perl Programmers Reference Guide				     Locale::Script(3perl)

NAME
Locale::Script - standard codes for script identification SYNOPSIS
use Locale::Script; $script = code2script('phnx'); # 'Phoenician' $code = script2code('Phoenician'); # 'Phnx' $code = script2code('Phoenician', LOCALE_CODE_NUMERIC); # 115 @codes = all_script_codes(); @scripts = all_script_names(); DESCRIPTION
The "Locale::Script" module provides access to standards codes used for identifying scripts, such as those defined in ISO 15924. Most of the routines take an optional additional argument which specifies the code set to use. If not specified, the default ISO 15924 four-letter codes will be used. SUPPORTED CODE SETS
There are several different code sets you can use for identifying scripts. The ones currently supported are: alpha This is a set of four-letter (capitalized) codes from ISO 15924 such as 'Phnx' for Phoenician. This code set is identified with the symbol "LOCALE_SCRIPT_ALPHA". The Zxxx, Zyyy, and Zzzz codes are not used. This is the default code set. numeric This is a set of three-digit numeric codes from ISO 15924 such as 115 for Phoenician. This code set is identified with the symbol "LOCALE_SCRIPT_NUMERIC". ROUTINES
code2script ( CODE [,CODESET] ) script2code ( NAME [,CODESET] ) script_code2code ( CODE ,CODESET ,CODESET2 ) all_script_codes ( [CODESET] ) all_script_names ( [CODESET] ) Locale::Script::rename_script ( CODE ,NEW_NAME [,CODESET] ) Locale::Script::add_script ( CODE ,NAME [,CODESET] ) Locale::Script::delete_script ( CODE [,CODESET] ) Locale::Script::add_script_alias ( NAME ,NEW_NAME ) Locale::Script::delete_script_alias ( NAME ) Locale::Script::rename_script_code ( CODE ,NEW_CODE [,CODESET] ) Locale::Script::add_script_code_alias ( CODE ,NEW_CODE [,CODESET] ) Locale::Script::delete_script_code_alias ( CODE [,CODESET] ) These routines are all documented in the Locale::Codes man page. SEE ALSO
Locale::Codes Locale::Constants http://www.unicode.org/iso15924/ Home page for ISO 15924. AUTHOR
See Locale::Codes for full author history. Currently maintained by Sullivan Beck (sbeck@cpan.org). COPYRIGHT
Copyright (c) 1997-2001 Canon Research Centre Europe (CRE). Copyright (c) 2001-2010 Neil Bowers Copyright (c) 2010-2011 Sullivan Beck This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself. perl v5.14.2 2011-09-26 Locale::Script(3perl)
All times are GMT -4. The time now is 07:30 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy