comparing Huge Files - Performance is very bad
Post 302092588 by ghostdog74, Tuesday, 10 October 2006, 10:58 PM
A Python alternative that reads both files in parallel and writes the delta records:
Code:
#!/usr/bin/python
# Walk both files in step, compare each record's key and 4th field, write deltas.
deltafile = open("delta.txt", "a")              # output delta file
yfile = open("yester_file.txt")                 # yesterday's file
tfile = open("today_file.txt")                  # today's file

for i in xrange(2000000):                       # loop over 2 million records
    yesterline = yfile.readline().strip()       # strip trailing newline
    todayline = tfile.readline().strip()
    y_primary, y_2nd, y_3rd, y_4th = yesterline.split("|")
    t_primary, t_2nd, t_3rd, t_4th = todayline.split("|")
    if y_primary == t_primary:
        if y_4th != t_4th:                      # same key, 4th field changed
            print >> deltafile, "C|%s|%s|%s|%s" % (t_primary, t_2nd, "U", t_4th)
    else:                                       # keys differ: today's record added, yesterday's deleted
        print >> deltafile, "A|%s|%s|%s|%s" % (t_primary, t_2nd, t_3rd, t_4th)
        print >> deltafile, "D|%s|%s|%s|%s" % (y_primary, y_2nd, "D", y_4th)

yfile.close()
tfile.close()
deltafile.close()                               # close all files

Output:
/home > python test.py
C|aaa|xxxxxxxxxxxxxxxxxxxxxxxxx|U|vvvvvvvvvvvvvvvvvvv
C|bbb|xxxxxxxxxxxxxxxxxxxxxxxxx|U|kkkkkkkkkkkkkkkk
A|ddd|xxxxxxxxxxxxxxxxxxxxxxxxx|I|zzzzzzzzzzzzzzzzz
D|ccc|xxxxxxxxxxxxxxxxxxxxxxxxx|D|bbbbbbbbbbbbbbb
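
The loop above assumes both files carry the same records in the same order, so every line pairs up. If that ordering is not guaranteed, a dictionary keyed on the primary field removes the assumption at the cost of holding yesterday's 2 million keys in memory, which is still modest. Below is a minimal sketch, reusing the same hypothetical file names and pipe-delimited four-field layout, not the exact script above; it runs under Python 2 or 3.
Code:
#!/usr/bin/python
# Keyed comparison: no requirement that both files are in the same order.
# Assumes the first "|" field is a unique primary key per record.

yesterday = {}
yfile = open("yester_file.txt")
for line in yfile:
    fields = line.rstrip("\n").split("|")
    yesterday[fields[0]] = fields              # primary key -> full record
yfile.close()

deltafile = open("delta.txt", "a")
tfile = open("today_file.txt")
for line in tfile:
    t = line.rstrip("\n").split("|")
    y = yesterday.pop(t[0], None)              # consume matching key, if any
    if y is None:                              # key only in today's file: Added
        deltafile.write("A|%s\n" % "|".join(t))
    elif y[3] != t[3]:                         # same key, 4th field changed: Changed
        deltafile.write("C|%s|%s|U|%s\n" % (t[0], t[1], t[3]))
tfile.close()

for y in yesterday.values():                   # keys never matched: Deleted
    deltafile.write("D|%s|%s|D|%s\n" % (y[0], y[1], y[3]))
deltafile.close()

Because matched keys are popped as they are seen, whatever remains in the dictionary at the end is exactly the set of deleted records, so no second pass over yesterday's file is needed.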
 
