File comparisons for huge data files (around 60G) - Need the best, most optimized way to do this Post: 303025150

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

search and grab data from a huge file

Folks, in my working directory there are multiple large files, each of which contains only one line. The line is too long to use "grep", so any help? For example, if I want to find out whether these files contain a string like "93849", what command should I use? Also, there is oder_id number...

2. Shell Programming and Scripting

How to extract data from a huge file?

Hi, I have a huge file of bibliographic records in some standard format. I need a script to do a repeatable task as follows: 1. Create folders named after the strings starting with "item_*" in the input file 2. Create a file "contents" in each folder having "license.txt(tab...

3. Shell Programming and Scripting

insert a header in a huge data file without using an intermediate file

I have a file with extracted data and need to insert a header with a constant string, say: H|PayerDataExtract If I use sed, I have to redirect the output to a separate file, like sed ' sed commands' ExtractDataFile.dat > ExtractDataFileWithHeader.dat The same is true for awk and...

4. Shell Programming and Scripting

Split a huge data into few different files?!

Input file data contents: >seq_1 MSNQSPPQSQRPGHSHSHSHSHAGLASSTSSHSNPSANASYNLNGPRTGGDQRYRASVDA >seq_2 AGAAGRGWGRDVTAAASPNPRNGGGRPASDLLSVGNAGGQASFASPETIDRWFEDLQHYE >seq_3 ATLEEMAAASLDANFKEELSAIEQWFRVLSEAERTAALYSLLQSSTQVQMRFFVTVLQQM ARADPITALLSPANPGQASMEAQMDAKLAAMGLKSPASPAVRQYARQSLSGDTYLSPHSA...

5. Shell Programming and Scripting

Splitting the Huge file into several files...

Hi I have to write a script to split the huge file into several pieces. The file columns is | pipe delimited. The data sample is as: 6625060|1420215|07308806|N|20100120|5572477081|+0002.79|+0000.00|0004|0001|.........

6. Shell Programming and Scripting

Problem running Perl Script with huge data files

Hello Everyone, I have a perl script that reads two types of data files (txt and XML). These data files are huge and large in number. I am using something like this : foreach my $t (@text) { open TEXT, $t or die "Cannot open $t for reading: $!\n"; while(my $line=<TEXT>){ ...

7. Shell Programming and Scripting

Three Difference File Huge Data Comparison Problem.

I have three different files: Part of File 1 ARTPHDFGAA . . Part of File 2 ARTGHHYESA . . Part of File 3 ARTPOLYWEA . .

8. Shell Programming and Scripting

Help- counting delimiter in a huge file and split data into 2 files

I’m new to Linux script and not sure how to filter out bad records from huge flat files (over 1.3GB each). The delimiter is a semi colon “;” Here is the sample of 5 lines in the file: Name1;phone1;address1;city1;state1;zipcode1 Name2;phone2;address2;city2;state2;zipcode2;comment...

9. UNIX for Dummies Questions & Answers

File comparison of huge files

Hi all, I hope you are well. I am very happy to see your contribution. I am eager to become part of it. I have the following question. I have two huge files to compare (almost 3GB each). The files are simulation outputs. The format of the files is as below. For a clear picture, please see...

10. UNIX for Advanced & Expert Users

Need Optimization shell/awk script to aggregate (sum) for all the columns of Huge data file

Optimization shell/awk script to aggregate (sum) for all the columns of Huge data file File delimiter "|" Need to have Sum of all columns, with column number : aggregation (summation) for each column File not having the header Like below - Column 1 "Total Column 2 : "Total ... ......

LEARN ABOUT DEBIAN

file::queue

File::Queue(3pm)					User Contributed Perl Documentation					  File::Queue(3pm)

NAME

       File::Queue - Persistent FIFO queue implemented in pure perl!

SYNOPSIS

	   use strict; # always!
	   use File::Queue;

	   my $q = new File::Queue (File => '/var/spool/yourprog/queue');

	   $q->enq('some flat text1');
	   $q->enq('some flat text2');
	   $q->enq('some flat text3');

	   # Get up to first 10 elements
	   my $contents = $q->peek(10);

	   my $elem1 = $q->deq();
	   my $elem2 = $q->deq();

	   # empty the queue
	   $q->reset();

DESCRIPTION

       This module allows for the creation of persistent FIFO queue objects.

       File::Queue only handles scalars as queue elements.  If you want to work with references, serialize them first!
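
       A minimal sketch of the "serialize them first" advice, assuming JSON::PP is available (it ships with Perl 5.14 and later; Storable or another serializer would do as well). The helper names pack_elem and unpack_elem are hypothetical and not part of File::Queue. JSON::PP's default, non-pretty output escapes newlines inside strings, so each packed element stays on one line, which matters when the queue uses the default newline separator.

```perl
use strict;
use warnings;
use JSON::PP;

# canonical() sorts hash keys, so the same structure always packs to the same string.
my $json = JSON::PP->new->canonical;

# Hypothetical helper: turn a reference into a single-line string for enq().
sub pack_elem {
    my ($ref) = @_;
    return $json->encode($ref);
}

# Hypothetical helper: turn a string returned by deq() back into a reference.
sub unpack_elem {
    my ($str) = @_;
    return $json->decode($str);
}

my $job    = { id => 42, args => ['a', "two\nlines"] };
my $packed = pack_elem($job);        # the scalar you would hand to $q->enq(...)
my $copy   = unpack_elem($packed);   # what you would do with the result of $q->deq()
```

       With a queue object $q as in the SYNOPSIS, this would be used as $q->enq(pack_elem($job)) on the producing side and unpack_elem($q->deq()) on the consuming side.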

       The module was written with speed in mind, and it is very fast, but it should be used with care.  Please refer to the CAVEATS section.

Interface
       File::Queue implements an OO interface.  The object methods and parameters are described below.

   Methods
       File::Queue supports all of the queue-related functions a developer should expect.

       o   new()

	   Instantiates your File::Queue object.  Parameters are described in the next sub-section.

       o   enq()

	   Enqueues a string element to the queue.

       o   deq()

	   Dequeues a string element from the queue, and returns the element.  If the queue is empty, nothing is returned.

       o   peek(n)

	   Returns an arrayref containing the next n elements in the queue.  If the queue size is less than n, all elements are returned.  If the
	   queue is empty, an empty arrayref is returned.

       o   reset()

	   Empties the queue.

       o   close()

	   Closes the filehandles belonging to the queue object ('.dat' and '.idx').

       o   delete()

	   Deletes the files belonging to the queue object ('.dat' and '.idx').
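
       The deq() and peek() behaviour described above suggests a simple drain loop. The sketch below is an illustration under assumptions: DemoQueue is a hypothetical in-memory stand-in that only mimics the documented method names so the example runs without File::Queue installed, and drain_queue is a hypothetical helper, not part of the module.

```perl
use strict;
use warnings;

# Hypothetical in-memory stand-in with the same enq/deq/peek method names.
package DemoQueue;
sub new  { return bless { items => [] }, shift }
sub enq  { my ($self, $e) = @_; push @{ $self->{items} }, $e; return }
sub deq  { my ($self) = @_; return shift @{ $self->{items} } }
sub peek {
    my ($self, $n) = @_;
    my @items = @{ $self->{items} };
    my $last  = $n < @items ? $n - 1 : $#items;
    return [ @items[0 .. $last] ];   # arrayref, empty when nothing is queued
}

package main;

# Dequeue until deq() yields nothing, passing each element to a callback.
sub drain_queue {
    my ($q, $handler) = @_;
    my $count = 0;
    # defined() keeps elements such as '0' or '' from ending the loop early.
    while (defined(my $elem = $q->deq())) {
        $handler->($elem);
        $count++;
    }
    return $count;
}

my $q = DemoQueue->new;
$q->enq($_) for ('first', '0', 'last');

my $preview = $q->peek(10);   # fewer than 10 queued, so all elements come back
my @seen;
my $drained = drain_queue($q, sub { push @seen, $_[0] });
```

       The same drain_queue call should work with a real queue object created as in the SYNOPSIS, since it relies only on deq() returning nothing once the queue is empty.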

   Parameters
       There are a number of parameters that can be passed when constructing your File::Queue objects.	Parameters are case-insensitive.

       o   File (required)

	   File::Queue creates two files using this parameter as the base.  In the case of the example in the SYNOPSIS, the two files are
	   '/var/spool/yourprog/queue.dat' and '/var/spool/yourprog/queue.idx'.

	   The '.dat' file holds all of the data, the '.idx' file holds the byte index (pointer) of the starting point of the first element in the
	   queue.

       o   Mode (optional)

	   The file bit mode to be shared by both the '.dat' and '.idx' files.	Defaults to '0600'.

       o   Seperator (optional)

	   The character or byte sequence that is used to separate queue elements in the '.dat' file.  It should be something you can guarantee
	   will NEVER appear in your queue data.  Defaults to the newline character.

       o   BlockSize (optional)

	   This is the size of the byte chunks that are pulled at each iteration when checking for the end of a queued element.  Defaults to 64,
	   which will be fine for most cases, but can be tweaked or tuned for your specific case to squeeze out a few extra nanoseconds.
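
       Because the separator must never appear in the queued data, it can help to check elements before enqueueing them. The guard below is a hypothetical helper, not part of File::Queue, and the choice of a NUL byte as an alternative separator is only an assumption that suits plain-text data.

```perl
use strict;
use warnings;

# Hypothetical guard: true when $elem does not contain the separator $sep.
sub fits_separator {
    my ($elem, $sep) = @_;
    return index($elem, $sep) == -1;
}

my $newline = "\n";   # the documented default separator
my $nul     = "\0";   # an alternative, assuming the data never contains NUL bytes

my $multiline = "line one\nline two";

# False: the element contains the default separator and must not be enqueued as-is.
my $ok_default = fits_separator($multiline, $newline);

# True: safe to enqueue if the queue was constructed with the NUL separator.
my $ok_nul = fits_separator($multiline, $nul);
```

       If the check passes only for the alternative, that value would be passed to the constructor through the Seperator parameter, spelled as documented above.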

CAVEATS

       This module should never be used in situations where the queue is not expected to become empty.

       The '.dat' file is not truncated (emptied) until the queue is empty.

       Even the data you've already dequeued remains in the '.dat' file until the queue is empty.

       If you keep enqueueing elements and never FULLY dequeue everything, eventually your disk will fill up!
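
       The caveats follow from the data/index split described under the File parameter. The toy model below is not the module's implementation. It is a simplified in-memory sketch, under the assumption that dequeueing only advances a pointer, in which $dat stands in for the '.dat' file and $idx for the offset stored in the '.idx' file. The names toy_enq and toy_deq are hypothetical.

```perl
use strict;
use warnings;

# Toy model only: $dat plays the '.dat' file, $idx the byte offset kept in '.idx'.
my ($dat, $idx, $sep) = ('', 0, "\n");

# Append the element and its separator to the data.
sub toy_enq { my ($elem) = @_; $dat .= $elem . $sep; return }

sub toy_deq {
    return if $idx >= length $dat;            # queue is empty
    my $end  = index($dat, $sep, $idx);
    my $elem = substr($dat, $idx, $end - $idx);
    $idx = $end + length $sep;                # advance the pointer; the bytes stay in $dat
    if ($idx >= length $dat) {                # last element gone: only now is the data dropped
        ($dat, $idx) = ('', 0);
    }
    return $elem;
}

toy_enq("job$_") for 1 .. 3;
toy_deq();
toy_deq();
my $size_with_one_left = length $dat;   # still holds the bytes of all three elements
toy_deq();
my $size_when_empty = length $dat;      # the data is released once the queue is empty
```

       In this model the data never shrinks while at least one element remains queued, which is why a queue that is never fully drained keeps growing.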

SEE ALSO

       Tie::File

AUTHOR

       Jason Lavold <jlavold [ at ] gmail.com>

perl v5.10.0							    2008-12-22							  File::Queue(3pm)