The UNIX and Linux Forums  


  #1 (permalink)  
05-10-2007
onthetopo
Registered User
Join Date: May 2007
Posts: 24
Combine a datafile with a master datafile, urgent!

Hi guys, my supervisor has asked me to solve this problem within 7 days. I've spent 3 days thinking about it but couldn't come up with a solution.
Please give me some thoughts on the following problem.

I have index.database, which contains only the index dates:
1994
1995
1996
1997
1998
1999

I have small.database.csv that contains data for some of the indexed dates but not all of them:

1995, california, A3,B6
1999, vermont, A4,B9

I want to merge small.database.csv with index.database into combined.database.csv, so that the result looks like:

1994,,,
1995, california, A3,B6
1996,,,
1997,,,
1998,,,
1999, vermont, A4,B9

A shell script or Perl would both be fine.

Thanks a lot.
My supervisor is after me on this one.
  #2 (permalink)  
05-10-2007
Ygor (Forum Staff)
Moderator
Join Date: Oct 2003
Location: -31.96,115.84
Posts: 1,409
Try...
Code:
$ head file?
==> file1 <==
1994
1995
1996
1997
1998
1999

==> file2 <==
1995, california, A3,B6
1999, vermont, A4,B9
$ join -t , -a 1 -o 1.1,2.2,2.3,2.4 file1 file2
1994,,,
1995, california, A3,B6
1996,,,
1997,,,
1998,,,
1999, vermont, A4,B9
$
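
For anyone wondering what the join call above does: `-a 1` also prints lines from the first file that have no partner in the second, and `-o` fixes the output columns, so the missing fields come out empty. join expects both inputs to be sorted on the key and walks them in step. Below is a rough Python sketch of that kind of merge. The function name `left_merge_join`, the `n_fields` parameter, and the assumption that each key occurs at most once are illustrative choices, not anything taken from join itself.

```python
def left_merge_join(index_keys, data_rows, n_fields=3):
    """Left outer merge of two inputs that are already sorted on the key.

    index_keys: sorted list of keys, e.g. "1994".
    data_rows:  comma-separated lines, sorted by their first field.
    n_fields:   number of empty fields to pad with when a key has no row
                (assumed to be 3, matching the example files).
    """
    out = []
    j = 0
    for key in index_keys:
        # Skip data rows whose key sorts before the current index key.
        while j < len(data_rows) and data_rows[j].split(",", 1)[0] < key:
            j += 1
        if j < len(data_rows) and data_rows[j].split(",", 1)[0] == key:
            out.append(data_rows[j])
            j += 1
        else:
            # No partner in the data file: emit the key with empty fields.
            out.append(key + "," * n_fields)
    return out


if __name__ == "__main__":
    index_keys = ["1994", "1995", "1996", "1997", "1998", "1999"]
    data_rows = ["1995, california, A3,B6", "1999, vermont, A4,B9"]
    for line in left_merge_join(index_keys, data_rows):
        print(line)
```

Because both lists are only walked once, the work grows with the combined size of the two files, which is why the sort order of the inputs matters so much.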

  #3 (permalink)  
05-11-2007
onthetopo
Registered User
Join Date: May 2007
Posts: 24
It didn't work!
The join command requires both files to be sorted on the index field.
My index field is actually a date, for example:
07/08/1998
join can't figure it out on its own; all it sees is 07.

Please help.
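
One possible way around the sorting problem, offered only as a sketch: rewrite the date key as YYYY-MM-DD in both files first, so that plain text order and date order agree, then sort and join as before. The code below assumes the dates are MM/DD/YYYY (the post does not say whether 07/08/1998 is July 8 or August 7), and the helper names `iso_key` and `rekey_line` are made up for illustration.

```python
from datetime import datetime


def iso_key(date_text, fmt="%m/%d/%Y"):
    """Convert a date such as 07/08/1998 to 1998-07-08.

    fmt is an assumption (month first); use "%d/%m/%Y" for day-first data.
    """
    return datetime.strptime(date_text.strip(), fmt).strftime("%Y-%m-%d")


def rekey_line(line, fmt="%m/%d/%Y"):
    """Replace the first comma-separated field of a line with its ISO form."""
    key, sep, rest = line.rstrip("\n").partition(",")
    return iso_key(key, fmt) + sep + rest


if __name__ == "__main__":
    lines = ["07/08/1998, vermont, A4,B9", "12/31/1997, california, A3,B6"]
    # After re-keying, an ordinary text sort is also a chronological sort.
    for line in sorted(rekey_line(l) for l in lines):
        print(line)
```

The same conversion would have to be applied to the index file too, so that the keys in both files still match each other.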
  #4 (permalink)  
05-11-2007
ghostdog74 (Forum Advisor)
Registered User
Join Date: Sep 2006
Posts: 2,554
If you can use Python, here's an alternative:

Code:
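#!/usr/bin/python
# For each index key in file1, print the matching row from file2,
# or the key followed by empty fields when there is no match.
for line in open("file1"):
    line = line.strip()
    found = False
    # file2 is re-read for every index line (simple, but slow on large files)
    for line2 in open("file2"):
        if line2.split(",")[0] == line:
            print(line2.strip())
            found = True
    if not found:
        print("%s,,," % line)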

output:

Code:
# ./test.py
1994,,,
1995, california, A3,B6
1996,,,
1997,,,
1998,,,
1999, vermont, A4,B9

  #5 (permalink)  
Old 05-11-2007
onthetopo onthetopo is offline
Registered User
  
 

Join Date: May 2007
Posts: 24
Thanks a lot ghostdog74. It works!
But it's really slow for large data files.
join is surprisingly much faster at handling large files, but join doesn't work in this case.
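
The slowness most likely comes from re-reading the second file once for every line of the first, so the work grows with the product of the two file sizes. A common alternative is to load the data rows into a dictionary keyed on the first field and then make a single pass over the index. The sketch below assumes the data file fits in memory and that each key appears at most once in it (a later duplicate would overwrite an earlier one); the name `combine` and the `n_fields` parameter are illustrative.

```python
def combine(index_lines, data_lines, n_fields=3):
    """Return one output line per index key, in index order.

    index_lines: iterable of keys, one per line (e.g. an open file).
    data_lines:  iterable of comma-separated rows whose first field is the key.
    n_fields:    number of empty fields to emit for keys with no data row
                 (assumed to be 3, matching the example files).
    """
    rows = {}
    for line in data_lines:
        line = line.rstrip("\n")
        if line:
            # Key on the text before the first comma.
            rows[line.split(",", 1)[0].strip()] = line

    out = []
    for key in index_lines:
        key = key.strip()
        if key:
            # Dictionary lookup replaces the inner loop over the data file.
            out.append(rows.get(key, key + "," * n_fields))
    return out


if __name__ == "__main__":
    index_lines = ["1994\n", "1995\n", "1996\n", "1997\n", "1998\n", "1999\n"]
    data_lines = ["1995, california, A3,B6\n", "1999, vermont, A4,B9\n"]
    print("\n".join(combine(index_lines, data_lines)))
```

With real files this could be called as `combine(open("file1"), open("file2"))`. Neither file needs to be sorted, and keys such as 07/08/1998 are compared as plain text, so the date format does not matter as long as both files write it the same way.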
  #6 (permalink)  
05-12-2007
matrixmadhan (Forum Advisor)
Technorati Master
Join Date: Mar 2005
Location: leaf node in B+ tree
Posts: 2,958
Quote:
Originally Posted by onthetopo
Thanks a lot ghostdog74. It works!
But it's really slow for large data files.
join is surprisingly much faster at handling large files, but join doesn't work in this case.

Code:
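#! /opt/third-party/bin/perl

# Load every row of the small data file as a hash key.
open(FILE, "<", "small") || die "Unable to open file small <$!>\n";

while(<FILE>) {
  chomp;
  $fileHash{$_} = $i++;
}

close(FILE);

# Sort the rows once, outside the loop over the index file.
@rows = sort keys %fileHash;

open(FILE, "<", "index") || die "Unable to open file index <$!>\n";

while(<FILE>) {
  chomp;
  $set = 0;
  foreach my $v ( @rows ) {
    # \Q...\E quotes regex metacharacters in the key; the trailing comma
    # stops 1995 from matching a longer key such as 19950.
    if ( $v =~ m/^\Q$_\E,/ ) {
      print $v . "\n";
      $set = 1;
      last;
    }
  }
  print "$_,,,\n" if ( $set == 0 );
}

close(FILE);

exit 0;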

This should be fast!