Remove dupes in a large file Post: 303024638

LEARN ABOUT MOJAVE

locale::codes::langfam5.18

Locale::Codes::LangFam(3pm)				 Perl Programmers Reference Guide			       Locale::Codes::LangFam(3pm)

NAME

       Locale::Codes::LangFam - standard codes for language family identification

SYNOPSIS

	  use Locale::Codes::LangFam;

	  $lext = code2langfam('apa');		       # $lext gets 'Apache languages'
	  $code = langfam2code('Apache languages');    # $code gets 'apa'

	  @codes   = all_langfam_codes();
	  @names   = all_langfam_names();

DESCRIPTION

       The "Locale::Codes::LangFam" module provides access to standard codes used for identifying language families, such as those defined in
       ISO 639-5.

       Most of the routines take an optional additional argument which specifies the code set to use. If not specified, the default ISO 639-5
       language family codes will be used.

SUPPORTED CODE SETS

       There are several different code sets you can use for identifying language families. A code set may be specified using either a name, or a
       constant that is automatically exported by this module.

       For example, the following two calls are equivalent:

	  $lext = code2langfam('apa','alpha');
	  $lext = code2langfam('apa',LOCALE_LANGFAM_ALPHA);

       The code sets currently supported are:

       alpha
	   This is the set of three-letter (lowercase) codes from ISO 639-5 such as 'apa' for Apache languages.

	   This is the default code set.

ROUTINES

       code2langfam ( CODE [,CODESET] )
       langfam2code ( NAME [,CODESET] )
       langfam_code2code ( CODE ,CODESET ,CODESET2 )
       all_langfam_codes ( [CODESET] )
       all_langfam_names ( [CODESET] )
       Locale::Codes::LangFam::rename_langfam  ( CODE ,NEW_NAME [,CODESET] )
       Locale::Codes::LangFam::add_langfam  ( CODE ,NAME [,CODESET] )
       Locale::Codes::LangFam::delete_langfam  ( CODE [,CODESET] )
       Locale::Codes::LangFam::add_langfam_alias  ( NAME ,NEW_NAME )
       Locale::Codes::LangFam::delete_langfam_alias  ( NAME )
       Locale::Codes::LangFam::rename_langfam_code  ( CODE ,NEW_CODE [,CODESET] )
       Locale::Codes::LangFam::add_langfam_code_alias  ( CODE ,NEW_CODE [,CODESET] )
       Locale::Codes::LangFam::delete_langfam_code_alias  ( CODE [,CODESET] )
	   These routines are all documented in the Locale::Codes::API man page.
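
       As a rough illustration of how the exported routines fit together, the sketch below looks up a name, guards against an unknown
       code, and round-trips a name back to its code. It assumes the module is installed and that an unrecognised code yields undef (as
       described in Locale::Codes::API). The helper describe_langfam and the code 'zzzz' are hypothetical, chosen only for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Locale::Codes::LangFam;

# Hypothetical helper (not part of the module): format a code with its
# family name, falling back to a placeholder when the lookup fails.
sub describe_langfam {
    my ($code) = @_;
    my $name = code2langfam($code);    # assumed to be undef for an unknown code
    return defined $name ? "$code => $name" : "$code => (unknown)";
}

print describe_langfam('apa'),  "\n";  # 'apa' is the example code from the SYNOPSIS
print describe_langfam('zzzz'), "\n";  # 'zzzz' is an assumed-invalid code

# Round trip: code -> name -> code, using the default (alpha) code set.
my $name = code2langfam('apa');
my $code = defined $name ? langfam2code($name) : undef;
print "round trip ok\n" if defined $code && $code eq 'apa';

# List how many codes the default code set contains.
my @codes = all_langfam_codes();
printf "%d language family codes\n", scalar @codes;
```

       Checking for undef before using a result keeps the script quiet under "use warnings" when a code or name is not recognised.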

SEE ALSO

       Locale::Codes
	   The Locale-Codes distribution.

       Locale::Codes::API
	   The list of functions supported by this module.

       http://www.loc.gov/standards/iso639-5/id.php
	   ISO 639-5.

AUTHOR

       See Locale::Codes for full author history.

       Currently maintained by Sullivan Beck (sbeck@cpan.org).

COPYRIGHT

	  Copyright (c) 2011-2013 Sullivan Beck

       This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

perl v5.18.2							    2013-11-04					       Locale::Codes::LangFam(3pm)