Best Strategy to process Huge files


 
# 1  
Old 12-11-2009
Best Strategy to process Huge files

I have a file with 20 million records. I need to read each record and process it.
Which will be faster: Perl, shell, or awk?
And what is the best method to read huge files line by line?
# 2  
Old 12-11-2009
Must the records be processed strictly in order?
Does the output of processing record n depend on record n+1?

What operation are you trying to perform on each record?
# 3  
Old 12-11-2009
There is no dependency between the records, nor do they need serialized processing.
I will read each record and add it to a SQL query,

e.g.: select * from table1 where field in (.........)

I will read each field from the file and add it to this query. Also, every 500 fields I will start a new query.

The queries will be printed to a file.
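
For illustration, here is a minimal awk sketch of that batching, assuming one field value per line of input and using the table1/field names from the example query above (both are just placeholders):

Code:
# batch_queries.awk - a sketch; assumes one value per line, and the
# table1/field names come from the example query above.
BEGIN { batch = 500 }
{
    # append the current value to the IN list, quoted
    list = (count == 0) ? "'" $0 "'" : list ", '" $0 "'"
    count++
    if (count == batch) {        # batch full: emit one query
        print "select * from table1 where field in (" list ");"
        count = 0
        list = ""
    }
}
END {
    # emit the final partial batch, if any values remain
    if (count > 0)
        print "select * from table1 where field in (" list ");"
}

Run it as: awk -f batch_queries.awk records.txt > queries.sql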
# 4  
Old 12-11-2009
Quote:
Originally Posted by tene
I have a file with 20 million records. I need to read each record and process it.
Which will be faster: Perl, shell, or awk?
And what is the best method to read huge files line by line?
Forget the shell for huge files. Awk or Perl are both fine for processing huge files, but awk sometimes performs better than Perl in terms of speed. I personally prefer awk.
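
On reading huge files line by line: awk's implicit main loop already does that, one record at a time, so the file is never loaded into memory. A minimal skeleton (the file name process.awk is just illustrative):

Code:
# process.awk - awk runs this block once per input line, with the
# whole line in $0 and its fields in $1..$NF; only one record is
# held in memory at a time.
{
    # put the per-record work here, e.g. print the first field
    print $1
}

Run it as: awk -f process.awk hugefile.txt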