Hmm... that's an interesting idea!
I'd love to try that out... Actually, by the time I woke up it had processed some 1 TB, which took about 7 hours (so roughly 40 MB/s). I don't know how to write the whole thing down formally, but I'll try:
Total Data Size: 2.2 TB (currently handling around 1 TB, though)
Special Info: The data came in two slightly different formats. There were four data sets in total (let's call them DS); a minimal reader sketch follows the list:
DS1 & DS2: Format 1
- Size of DS1: 556G RAW
- Size of DS2: 105G Gzip-compressed
DS3 & DS4: Format 2
- Size of DS3: 157G RAW
- Size of DS4: 109G RAW
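Since DS2 is gzip-compressed while the rest are RAW, the parser doesn't need a separate extraction step if it just dispatches on the file extension. Roughly, what a format-agnostic reader could look like (a minimal Python sketch; the filename and per-line handling are placeholders, not my actual code):

    import gzip

    def open_dataset(path):
        # Transparently handle both raw and gzip-compressed files.
        if path.endswith(".gz"):
            return gzip.open(path, "rt")  # streams; never extracts to disk
        return open(path, "r")

    # Hypothetical usage: iterate records line by line, constant memory.
    with open_dataset("DS2.gz") as f:
        for line in f:
            pass  # parse one record here

The nice part is that gzip.open streams, so the compressed sets never have to hit the disk uncompressed.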
Further, two other tasks (extracting a 36G archive and copying some 10G worth of data) were running against the disk of this main computer, handled by a different processor (another computer, in fact).
Tasks running simultaneously:
- Parsing DS1, DS2, DS3, and DS4, and writing the results back onto disk
- Extracting an archive on the same disk from a different computer, on which the disk is mounted as a remote drive
- Copying the gzipped files back onto the main computer (if anyone has seen my other threads: yes, these are in fact the huge archives I was talking about converting into individual smaller archives)
As of now, I have finished parsing DS1, DS2, DS3, and DS4, but I am still left with extracting the huge archives and then parsing a fifth data set, DS5, which will be around 1.7 TB uncompressed. I will perhaps run the optimization then.
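For what it's worth, a quick back-of-envelope estimate for DS5, assuming the ~40 MB/s rate from the first run holds (it probably won't, with the extraction and copies competing for the same disk):

    # Rough ETA for DS5 at the throughput observed so far (~1 TB in ~7 h).
    size_bytes = 1.7e12   # ~1.7 TB uncompressed
    throughput = 40e6     # bytes/sec, assumed from the first run
    hours = size_bytes / throughput / 3600
    print(f"~{hours:.0f} hours")  # ~12 hours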
Thanks for the advice, and looking forward to a post from you.