Huge Files to be Joined on Ux instead of ORACLE: Post 302322074 by figaro on Tuesday 2nd of June 2009 05:53:18 PM
If performance is your concern (rather than data integrity), then a lot more factors are at play than just the platform you are using. For instance, the structure of your data might be a factor. For large datasets, databases are almost always better, but if you have a test machine, I am keen to find out which of the two methods you prefer and what your observations are.
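For the flat-file side of such a comparison, a minimal sketch might sort both files on the key and then join them. The file names, the `|` delimiter and the sample rows are assumptions made up for illustration; nothing in the thread specifies the real layout.

```shell
#!/bin/sh
# Use byte-wise collation so that sort and join agree on the ordering.
export LC_ALL=C

# Hypothetical sample data: pipe-delimited, join key in field 1, unsorted.
workdir=$(mktemp -d)
printf '3|carol\n1|alice\n2|bob\n' > "$workdir/names.txt"
printf '2|sales\n3|ops\n1|dev\n' > "$workdir/depts.txt"

# join requires both inputs to be sorted on the join field.
sort -t '|' -k 1,1 "$workdir/names.txt" > "$workdir/names.sorted"
sort -t '|' -k 1,1 "$workdir/depts.txt" > "$workdir/depts.sorted"

# Join on field 1 (the default), using the same delimiter as the sort.
joined=$(join -t '|' "$workdir/names.sorted" "$workdir/depts.sorted")
printf '%s\n' "$joined"
```

On a test machine, running the sort and join steps under a timer and comparing against the equivalent database query would give the kind of observation asked for above. On multi-gigabyte inputs the sort step will probably dominate the run time.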
 

JOIN(1)                      General Commands Manual                      JOIN(1)

NAME
       join - relational database operator

SYNOPSIS
       join [-an] [-e s] [-o list] [-tc] file1 file2

DESCRIPTION
       Join forms, on the standard output, a join of the two relations specified by the lines of file1 and file2. If file1 is `-', the standard input is used.

       File1 and file2 must be sorted in increasing ASCII collating sequence on the fields on which they are to be joined, normally the first in each line.

       There is one line in the output for each pair of lines in file1 and file2 that have identical join fields. The output line normally consists of the common field, then the rest of the line from file1, then the rest of the line from file2.

       Fields are normally separated by blank, tab or newline. In this case, multiple separators count as one, and leading separators are discarded.

       These options are recognized:

       -an     In addition to the normal output, produce a line for each unpairable line in file n, where n is 1 or 2.

       -e s    Replace empty output fields by string s.

       -o list Each output line comprises the fields specified in list, each element of which has the form n.m, where n is a file number and m is a field number.

       -tc     Use character c as a separator (tab character). Every appearance of c in a line is significant.

SEE ALSO
       sort(1), comm(1), awk(1).

BUGS
       With default field separation, the collating sequence is that of sort -b; with -t, the sequence is that of a plain sort.

       The conventions of join, sort, comm, uniq, look and awk(1) are wildly incongruous.

7th Edition                       April 29, 1985                       JOIN(1)
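The options described in the manual page can be tried on a small pair of files. This is only a sketch: the file names and rows are hypothetical, and it uses the spelling `-a 1` from current POSIX-style join implementations rather than the 7th Edition form `-a1`.

```shell
#!/bin/sh
# Byte-wise collation keeps the inputs in the order join expects.
export LC_ALL=C

# Hypothetical inputs, already sorted on field 1.
# Key 3 appears only in the left file, key 4 only in the right file.
workdir=$(mktemp -d)
printf '1 alice\n2 bob\n3 carol\n' > "$workdir/left.txt"
printf '1 dev\n2 sales\n4 ops\n' > "$workdir/right.txt"

# Default behaviour: one output line per pair of lines with identical join fields.
inner=$(join "$workdir/left.txt" "$workdir/right.txt")

# -a 1 also emits unpairable lines from file 1.
# -o selects the output fields; -e fills the ones that are empty.
left_outer=$(join -a 1 -e NONE -o 1.1,1.2,2.2 "$workdir/left.txt" "$workdir/right.txt")

printf '%s\n' "$inner"
printf '%s\n' "$left_outer"
```

Here the default join drops keys 3 and 4. The `-a 1` form keeps key 3 and fills in the missing right-hand field, which is roughly what a left outer join does in a database.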