Combine a datafile with Master datafile, emergent! Post 302117301 by matrixmadhan on Saturday 12th of May 2007 04:52:06 AM
Quote:
Originally Posted by onthetopo
Thanks a lot ghostdog74. It works!
But it's really slow for large data files.
Join is surprisingly much faster in managing large files, only join couldn't work in this case.
Code:
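#! /opt/third-party/bin/perl
use strict;
use warnings;

# Load every line of "small" into a hash (the value records input order).
my %fileHash;
my $i = 0;

open(my $small, "<", "small") || die "Unable to open file small <$!>\n";

while (<$small>) {
  chomp;
  $fileHash{$_} = $i++;
}

close($small);

# Sort the keys once, rather than once per line of "index".
my @sorted = sort keys %fileHash;

open(my $index, "<", "index") || die "Unable to open file index <$!>\n";

while (my $key = <$index>) {
  chomp $key;
  my $set = 0;
  foreach my $v (@sorted) {
    # \Q...\E treats the key as literal text, not as a regex pattern.
    if ( $v =~ m/^\Q$key\E/ ) {
      print $v . "\n";
      $set = 1;
      last;
    }
  }
  # No line in "small" starts with this key: print it with empty fields.
  print "$key,,,\n" if ( $set == 0 );
}

close($index);

exit 0;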

This should be fast: small is read once into a hash, and the scan for each line of index stops at the first match.
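If every key in index is exactly the first comma-separated field of a line in small (an assumption, since the thread does not show the data), the inner loop can be replaced by a single hash lookup. The following is only a sketch under that assumption. `build_lookup` and `merge_line` are hypothetical helper names, and the sample lines are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: map the first comma-separated field of each
# line to the whole line. If several lines share a field, keep the
# one that sorts first, as the sorted scan in the script would.
sub build_lookup {
    my @lines = @_;
    my %by_key;
    for my $line (@lines) {
        my ($key) = split /,/, $line, 2;
        next unless defined $key;    # skip empty lines
        $by_key{$key} = $line
            if !exists $by_key{$key} || $line lt $by_key{$key};
    }
    return \%by_key;
}

# Hypothetical helper: return the matching line, or "key,,," if none.
sub merge_line {
    my ($lookup, $key) = @_;
    return exists $lookup->{$key} ? $lookup->{$key} : "$key,,,";
}

# Made-up sample data (assumption: not taken from the thread).
my $lookup = build_lookup("a1,x,y,z", "b2,p,q,r");
print merge_line($lookup, $_), "\n" for ("a1", "c3", "b2");
```

Unlike the prefix match in the script, this matches whole fields only, so a key such as a1 would not match a line starting with a10. Whether that is acceptable depends on the real data.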
 

largefile(5)          Standards, Environments, and Macros          largefile(5)

NAME
     largefile - large file status of utilities

DESCRIPTION
     A large file is a regular file whose size is greater than or equal to 2 Gbyte (2**31 bytes). A small file is a regular file whose size is less than 2 Gbyte.

   Large file aware utilities
     A utility is called large file aware if it can process large files in the same manner as it does small files. A utility that is large file aware is able to handle large files as input and generate as output large files that are being processed. The exception is where additional files are used as system configuration files or support files that can augment the processing. For example, the file utility supports the -m option for an alternative "magic" file and the -f option for a support file that can contain a list of file names. It is unspecified whether a utility that is large file aware will accept configuration or support files that are large files. If a large file aware utility does not accept configuration or support files that are large files, it will cause no data loss or corruption upon encountering such files and will return an appropriate error.

     The following /usr/bin utilities are large file aware:

          adb awk bdiff cat chgrp chmod chown cksum cmp compress cp csh csplit cut dd dircmp du egrep fgrep file find ftp getconf grep gzip head join jsh ksh ln ls mdb mkdir mkfifo more mv nawk page paste pathchck pg rcp remsh rksh rm rmdir rsh sed sh sort split sum tail tar tee test touch tr uncompress uudecode uuencode wc zcat

     The following /usr/xpg4/bin utilities are large file aware:

          awk cp chgrp chown du egrep fgrep file grep ln ls more mv rm sed sh sort tail tr

     The following /usr/xpg6/bin utilities are large file aware:

          getconf ls tr

     The following /usr/sbin utilities are large file aware:

          install mkfile mknod mvdir swap

     See the USAGE section of the swap(1M) manual page for limitations of swap on block devices greater than 2 Gbyte on a 32-bit operating system.

     The following /usr/ucb utilities are large file aware:

          chown from ln ls sed sum touch

     The /usr/bin/cpio and /usr/bin/pax utilities are large file aware, but cannot archive a file whose size exceeds 8 Gbyte - 1 byte.

     The /usr/bin/truss utility has been modified to read a dump file and display information relevant to large files, such as offsets.

   cachefs file systems
     The following /usr/bin utilities are large file aware for cachefs file systems:

          cachefspack cachefsstat

     The following /usr/sbin utilities are large file aware for cachefs file systems:

          cachefslog cachefswssize cfsadmin fsck mount umount

   nfs file systems
     The following utilities are large file aware for nfs file systems:

          /usr/lib/autofs/automountd /usr/sbin/mount /usr/lib/nfs/rquotad

   ufs file systems
     The following /usr/bin utility is large file aware for ufs file systems:

          df

     The following /usr/lib/nfs utility is large file aware for ufs file systems:

          rquotad

     The following /usr/xpg4/bin utility is large file aware for ufs file systems:

          df

     The following /usr/sbin utilities are large file aware for ufs file systems:

          clri dcopy edquota ff fsck fsdb fsirand fstyp labelit lockfs mkfs mount ncheck newfs quot quota quotacheck quotaoff quotaon repquota tunefs ufsdump ufsrestore umount

   Large file safe utilities
     A utility is called large file safe if it causes no data loss or corruption when it encounters a large file. A utility that is large file safe is unable to process properly a large file, but returns an appropriate error.

     The following /usr/bin utilities are large file safe:

          audioconvert audioplay audiorecord comm diff diff3 diffmk ed lp mail mailcompat mailstats mailx pack pcat red rmail sdiff unpack vi view

     The following /usr/xpg4/bin utilities are large file safe:

          ed vi view

     The following /usr/xpg6/bin utility is large file safe:

          ed

     The following /usr/sbin utilities are large file safe:

          lpfilter lpforms

     The following /usr/ucb utilities are large file safe:

          Mail lpr

     The following /usr/lib utility is large file safe:

          sendmail

SEE ALSO
     lf64(5), lfcompile(5), lfcompile64(5)

SunOS 5.10                         7 Nov 2003                      largefile(5)