UNIX for Dummies Questions & Answers: Comparing Huge Files - Performance is very bad
Post 302092551 by BOFH on Tuesday 10th of October 2006, 03:10:50 PM
I've noticed that for simple things, I can shell script something and it works fine. When things start getting complicated or there's a performance issue, I'll break out perl (or python if you like).

I'd take what you have and see if it could be done better in perl. I'm sure it'd be a lot faster and probably easier to write.
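
For instance, here's a minimal sketch of the usual hash-lookup compare in Perl (the file names are placeholders, and it assumes the smaller file's lines fit in memory as hash keys):

#!/usr/bin/perl
use strict;
use warnings;

# One pass over the smaller file: remember every line as a hash key.
my %seen;
open my $small, '<', 'fileA.txt' or die "fileA.txt: $!";
while (<$small>) {
    chomp;
    $seen{$_} = 1;
}
close $small;

# Stream the big file and print the lines that were not in the small one.
open my $big, '<', 'fileB.txt' or die "fileB.txt: $!";
while (<$big>) {
    chomp;
    print "$_\n" unless $seen{$_};
}
close $big;

One hash lookup per line beats re-scanning a file for every record, which is where shell loops usually lose.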

Carl
 

9 More Discussions You Might Find Interesting

1. AIX

Bad performance when log in with putty

Hello guys! I'm a n00b in AIX and I'm stuck on a problem (my English is not great, but I hope you can understand me :P). So... I'm trying to connect to an AIX machine with putty, and 'using username xxx' appears after 2 sec (OK), but 'xxx@ip's password' appears after 1:15 min. After... (6 Replies)
Discussion started by: combat2k
6 Replies

2. Shell Programming and Scripting

Comparing two huge files

Hi, I have two files, file A and file B. File A is an error file and file B is the source file. In the error file, the first line is the actual error and the second line gives information about the record (client ID) that threw the error. I need to compare the first field (which doesn't start with '//') of... (11 Replies)
Discussion started by: kmkbuddy_1983
11 Replies

3. HP-UX

Bad performance but Low CPU loading?

There might be some problem with my server, because every morning at 7 its performance becomes bad, with no extra DB deadlocks. But I just couldn't figure it out. Please give me some advice, thanks a lot... According to the CPU performance chart, daily CPU loading maximum: 42%, average: 36%. ... (8 Replies)
Discussion started by: GreenShery
8 Replies

4. Shell Programming and Scripting

Comparing two huge files on field basis.

Hi all, I have two large files and I want a field-by-field comparison for each record in them. All fields are tab separated.
file1:
Email SELVAKUMAR RAMACHANDRAN
Email SHILPA SAHU
Web NIYATI SONI
Web NIYATI SONI
Email VIINII DOSHI
Web RAJNISH KUMAR
Web ... (4 Replies)
Discussion started by: Suman Singh
4 Replies
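
A typical starting point for this kind of field-by-field compare is an awk sketch along these lines (file names are placeholders; it assumes both files hold the same records in the same order and that file1 fits in memory):

awk -F'\t' '
NR == FNR { for (i = 1; i <= NF; i++) a[FNR, i] = $i; next }   # slurp file1
{
    for (i = 1; i <= NF; i++)
        if ($i != a[FNR, i])
            printf "record %d, field %d: %s vs %s\n", FNR, i, a[FNR, i], $i
}
' file1 file2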

5. Shell Programming and Scripting

Comparing 2 huge text files

I have these 2 files:
k5login:
sanwar@systems.nyfix.com
jjamnik@systems.nyfix.com
nisha@SYSTEMS.NYFIX.COM
rdpena@SYSTEMS.NYFIX.COM
service/backups-ora@SYSTEMS.NYFIX.COM
ivanr@SYSTEMS.NYFIX.COM
nasapova@SYSTEMS.NYFIX.COM
tpulay@SYSTEMS.NYFIX.COM
rsueno@SYSTEMS.NYFIX.COM... (11 Replies)
Discussion started by: linuxgeek
11 Replies
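
For flat lists of principals like these, sort plus comm normally does the job (file names are placeholders; comm needs its inputs sorted):

sort k5login.a > a.sorted
sort k5login.b > b.sorted
comm -23 a.sorted b.sorted    # principals only in the first file
comm -13 a.sorted b.sorted    # principals only in the second file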

6. Solaris

Performance (iops) becomes bad, what is the reason?

I have written a virtual HBA driver named "xmp_vhba". A SCSI disk is attached to it, as shown below:
xmp_vhba, instance #0
    disk, instance #11
But performance became very bad when we read/write the SCSI disk using vdbench (a read/write I/O tool). What is the reason? ... (7 Replies)
Discussion started by: ForgetChen
7 Replies

7. HP-UX

Performance issue with 'grep' command for huge file size

I have 2 files; one file (say, details.txt) contains the details of employees and another file (say, emp.txt) has some selected employee names. I am extracting employee details from details.txt by using emp.txt and the corresponding code is:
while read line
do
    emp_name=`echo $line`
    grep -e... (7 Replies)
Discussion started by: arb_1984
7 Replies
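
The per-line grep in that loop forks a new process for every employee, which is what hurts on a huge details.txt; the usual fix is a single pass with grep's pattern-file mode. A sketch (assuming emp.txt holds one name per line; -F takes the names as fixed strings, -w avoids partial-name hits):

grep -Fw -f emp.txt details.txt > emp_details.txt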

8. Shell Programming and Scripting

Perl: Need help comparing huge files

What do I need to do to have the below Perl program load 205-million-record files into the hash? It currently works on smaller files, but not on huge files. Any idea what I need to modify to make it work with huge files:
#!/usr/bin/perl
$ot1=$ARGV[0];
$ot2=$ARGV[1];
open(mfileot1,... (12 Replies)
Discussion started by: mrn6430
12 Replies

9. UNIX for Advanced & Expert Users

Performance problem with removing duplicates in a huge file (50+ GB)

I'm trying to remove duplicate data from an unsorted input file of size >50GB and write the unique records to a new file. I have already tried a variety of options posted in similar threads/forums, but no luck so far. Any suggestions please? Thanks!! (9 Replies)
Discussion started by: Kannan K
9 Replies
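
For what it's worth, the two stock answers to this one are an order-preserving awk hash filter and an external sort; a sketch of both (file names are placeholders, and the sort options shown are GNU sort's):

# Keeps the first occurrence and preserves input order,
# but needs RAM roughly proportional to the unique records.
awk '!seen[$0]++' input.dat > unique.dat

# Changes the order, but spills to disk and copes with
# files far larger than RAM.
sort -u -S 4G -T /big/tmp input.dat > unique.dat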
PHILOSOPHY(1)						User Contributed Perl Documentation					     PHILOSOPHY(1)

NAME
       PDL::Philosophy -- what's behind PDL?

DESCRIPTION
       This is an attempt to summarize some of the common spirit between PDL
       developers in order to answer the question "Why PDL?". If you are a PDL
       developer and I haven't caught your favorite ideas about PDL, please
       let me know!

       An often-asked question is: why not settle for some of the existing
       systems like Matlab or IDL or GnuPlot or whatever?

   Major ideas
       The first tenet of our philosophy is the "free software" idea:
       software being free has several advantages (fewer bugs because more
       people see the code, you can have the source and port it to your own
       working environment with you, ... and of course, that you don't need
       to pay anything).

       The second idea is a pet peeve of many: many languages like Matlab are
       pretty well suited for their specific tasks, but for a different
       application you need to change to an entirely different tool and
       regear yourself mentally, not to speak of writing an application that
       does two things at once. Because we use Perl, we have the power and
       ease of Perl syntax, regular expressions, hash tables etc. at our
       fingertips at all times. By extending an existing language, we start
       from a much healthier base than languages like Matlab, which grew from
       a very small initial functionality and expanded little by little,
       making things look badly planned. We stand by the Perl sayings:
       "simple things should be simple but complicated things should be
       possible" and "There is more than one way to do it" (TIMTOWTDI).

       The third idea is interoperability: we want to be able to use PDL to
       drive as many tools as possible; we can connect to OpenGL or Mesa for
       graphics or whatever. There isn't anything out there that's really
       satisfactory as a tool and can do everything we want easily. And be
       portable.

       The fourth idea is related to PDL::PP and is Tuomas's personal
       favorite: code should specify as little redundant info as possible. If
       you find yourself writing very similar-looking code much of the time,
       all that code could probably be generated by a simple Perl script. The
       PDL C preprocessor takes this to an extreme.

   Minor goals and purposes
       We want speed. Optimally, it should ultimately (e.g. with the Perl
       compiler) be possible to compile PDL::PP subs to C and obtain the top
       vectorized speeds on supercomputers. Also, we want to be able to
       calculate things at near top speed from inside Perl, by using dataflow
       to avoid memory allocation and deallocation (the overhead should
       ultimately be only a little over one indirect function call plus a
       couple of ifs per function in the pipe).

       We want handy syntax. Want to do something and cannot do it easily?
       Tell us about it...

       We want lots of goodies. A good mathematical library etc.

AUTHOR
       Copyright (C) 1997 Tuomas J. Lukka (lukka@fas.harvard.edu).
       Redistribution in the same form is allowed, but reprinting requires
       permission from the author.

perl v5.8.0                        1999-12-09                    PHILOSOPHY(1)