UNIX for Advanced & Expert Users: post 303023277 by kartikirans on Thursday 13th of September 2018, 05:03:14 PM
Need optimized awk/perl/shell to give the statistics for the Large delimited file

I have a file of around 24 GB with 14 columns, delimited by "|".

My requirement: can anyone suggest the fastest and best way to get the results below?

Number of records of the file
First column and second column: unique counts

Thanks for your time
Karti

------ Post updated at 04:03 PM ------

Correction -

Number of records of the file
First column and second column: distinct column values, not the counts.
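
One possible approach, offered as a sketch rather than a benchmarked answer: a single awk pass can count the records and write each distinct value of columns 1 and 2 the first time it is seen. The function name `file_stats`, the output prefix argument, and the `.col1`/`.col2` suffixes are assumptions made for illustration; nothing in the thread prescribes them. The sketch also assumes the distinct values fit in memory, since awk keeps one array entry per distinct value.

```shell
#!/bin/sh
# Hypothetical helper (name and arguments are illustrative assumptions).
#   $1 = pipe-delimited input file
#   $2 = prefix for the two output files: <prefix>.col1 and <prefix>.col2
# Prints the total number of records on standard output.
file_stats() {
    awk -F'|' -v out="$2" '
        # First time a column-1 value appears: remember it and write it out.
        !($1 in c1) { c1[$1]; print $1 > (out ".col1") }
        # Same for column 2.
        !($2 in c2) { c2[$2]; print $2 > (out ".col2") }
        # NR holds the number of records read once the input is exhausted.
        END { print NR }
    ' "$1"
}
```

For example, `file_stats bigfile.txt /tmp/stats` (both names hypothetical) would print the record count and leave the distinct values, in first-seen order, in `/tmp/stats.col1` and `/tmp/stats.col2`. If the columns have very many distinct values, extracting each column with `cut -d'|'` and piping it through `sort -u` may be the safer choice, because `sort` can spill to disk.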
 

9 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Could someone give me an example of awk accessing array defined in Korn Shell?

As per the title, and much appreciated! (2 Replies)
Discussion started by: biglau
2 Replies

2. UNIX for Dummies Questions & Answers

Trim String in 3rd Column in Tab Delimited File...SED/PERL/AWK?

Hey Everybody, I am having much trouble figuring this out, as I am not really a programmer. Datafile.txt Column0 Column1 Column2 ABC DEF xxxGHI I am running using WGET on a cronjob to grab a datafile, but I need to cut the first three characters from... (6 Replies)
Discussion started by: rickdini
6 Replies

3. Shell Programming and Scripting

Large pipe delimited file that I need to add CR/LF every n fields

I have a large flat file with variable length fields that are pipe delimited. The file has no new line or CR/LF characters to indicate a new record. I need to parse the file and after some number of fields, I need to insert a CR/LF to start the next record. Input file ... (2 Replies)
Discussion started by: clintrpeterson
2 Replies

4. Shell Programming and Scripting

Extracting a portion of data from a very large tab delimited text file

Hi All I wanted to know how to effectively delete some columns in a large tab delimited file. I have a file that contains 5 columns and almost 100,000 rows 3456 f g t t 3456 g h 456 f h 4567 f g h z 345 f g 567 h j k l This is a very large data file and tab delimited. I need... (2 Replies)
Discussion started by: Lucky Ali
2 Replies

5. Shell Programming and Scripting

Script Optimization - large delimited file, for loop with many greps

Since there are approximately 75K gsfiles and hundreds of stfiles per gsfile, this script can take hours. How can I rewrite this script, so that it's much faster? I'm not as familiar with perl but I'm open to all suggestions. ls file.list>$split for gsfile in `cat $split`; do csplit... (17 Replies)
Discussion started by: verge
17 Replies

6. Shell Programming and Scripting

Awk getting statistics of a grid file,

Hi , I have the following file which is basically a grid (has more than 100000 rows) LLL1 PPP1 LLL1 PPP2 LLL1 PPP3 ............... LLL1 5500 ..... LLL2 PPP1 LLL2 PPP2 LLL2 PPP3 ............... LLL1 5500 ..... L100 PPP1 L100 PPP2 L100 PPP3 ............... 2100 5500... (6 Replies)
Discussion started by: alex2005
6 Replies

7. Shell Programming and Scripting

awk read one delimited file, search another delimited file

Hello folks, I have another doozy. I have two files. The first file has four fields in it. These four fields map to different locations in my second file. What I want to do is read the master file (file 2 - 23 fields) and compare each line against each record in file 1. If I get a match in all four... (4 Replies)
Discussion started by: dagamier
4 Replies

8. Shell Programming and Scripting

Removing dupes within 2 delimited areas in a large dictionary file

Hello, I have a very large dictionary file which is in text format and which contains a large number of sub-sections. Each sub-section starts with the following header : #DATA #VALID 1 and ends with a footer as shown below #END The data between the Header and the Footer consists of... (6 Replies)
Discussion started by: gimley
6 Replies

9. Shell Programming and Scripting

Perl script give answers by file

Hi, I am new in perl. I am running a perl installation script, its asking for paths and so many inputs. Can we provide that info by any file. so i can avoid the interactive installation. (2 Replies)
Discussion started by: Priy
2 Replies
Alzabo::Create::Index(3pm)				User Contributed Perl Documentation				Alzabo::Create::Index(3pm)

NAME
       Alzabo::Create::Index - Index objects for schema creation

SYNOPSIS
       use Alzabo::Create::Index;

DESCRIPTION
       This object represents an index on a table. Indexes consist of columns and optional prefixes for each column. The prefix specifies how many characters of the column should be indexed (the first X chars). Some RDBMSs do not have a concept of index prefixes. Not all column types are likely to allow prefixes, though this depends on the RDBMS. The order of the columns is significant.

INHERITS FROM
       "Alzabo::Index"

       Note: all relevant documentation from the superclass has been merged into this document.

METHODS
   new
       The constructor takes the following parameters:

       * table => "Alzabo::Create::Table" object
           The table that this index is indexing.

       * columns => [ "Alzabo::Create::Column" object, .. ]
       * columns => [ { column => "Alzabo::Create::Column" object, prefix => $prefix }, repeat as needed ... ]
           This parameter indicates which columns are being indexed. It can either be an array reference of column objects, or an array reference of hash references, each with a key called column and one called prefix. The prefix key is optional.

       * unique => $boolean
           Indicates whether or not this is a unique index.

       * fulltext => $boolean
           Indicates whether or not this is a fulltext index.

       * function => $string
           This can be used to create a function index where supported. The value of this parameter should be the full function, with column names, such as "LCASE( username )". The "columns" parameter should include all the columns used in the function.

       Returns a new "Alzabo::Create::Index" object.

       Throws: "Alzabo::Exception::Params", "Alzabo::Exception::RDBMSRules"

   table
       Returns the "Alzabo::Create::Table" object to which the index belongs.

   columns
       Returns an ordered list of the "Alzabo::Create::Column" objects that are being indexed.

   add_column
       Adds a column to the index.

       This method takes the following parameters:

       * column => "Alzabo::Create::Column" object
       * prefix => $prefix (optional)

       Throws: "Alzabo::Exception::Params", "Alzabo::Exception::RDBMSRules"

   delete_column ("Alzabo::Create::Column" object)
       Deletes the given column from the index.

       Throws: "Alzabo::Exception::Params", "Alzabo::Exception::RDBMSRules"

   prefix ("Alzabo::Create::Column" object)
       A column prefix is, to the best of my knowledge, a MySQL specific concept, and as such cannot be set when using an RDBMSRules module for a different RDBMS. However, it is important enough for MySQL to have the functionality be present. It allows you to specify that the index should only look at a certain portion of a field (the first N characters). This prefix is required to index any sort of BLOB column in MySQL.

       This method returns the prefix for the column in the index. If there is no prefix for this column in the index, then it returns undef.

   set_prefix
       This method takes the following parameters:

       * column => "Alzabo::Create::Column" object
       * prefix => $prefix

       Throws: "Alzabo::Exception::Params", "Alzabo::Exception::RDBMSRules"

   unique
       Returns a boolean value indicating whether the index is a unique index.

   set_unique ($boolean)
       Sets whether or not the index is a unique index.

   fulltext
       Returns a boolean value indicating whether the index is a fulltext index.

   set_fulltext ($boolean)
       Sets whether or not the index is a fulltext index.

       Throws: "Alzabo::Exception::Params", "Alzabo::Exception::RDBMSRules"

   register_column_name_change
       This method takes the following parameters:

       * column => "Alzabo::Create::Column" object
           The column (with the new name already set).

       * old_name => $old_name

       This method is called by the table object which owns the index when a column name changes. You should never need to call this yourself.

       Throws: "Alzabo::Exception::Params"

   id
       The id is generated from the table, column and prefix information for the index. This is useful as a canonical name for a hash key, for example.

       Returns a string that is the id which uniquely identifies the index in this schema.

AUTHOR
       Dave Rolsky, <autarch@urth.org>

perl v5.8.8				2007-12-23				Alzabo::Create::Index(3pm)