Speed problems with tar'ing a 500GB directory on an eSATA drive


 
# 1  
Old 04-11-2012

I'm trying to compress a directory structure on an external hard drive connected by eSATA cable to my Linux (Ubuntu 10.04) desktop. The total volume is 500GB in about half a million files, ranging from kilobytes to megabytes in size. The drive is 2TB, with 0.8TB free before compression.

running "tar -pcf directory.tar directory" worked for a previous, entirely analogous, 400Gb set of data in about 10 hours.
This time, the command has been running for 7 days, and the tar file is now only growing at 2 Gb/hour - estimated another 50+ days for completion.

I've run it twice now (the cable fell out the first time after two days), and the slowdown is reproducible. Deleting some of the other data from the external drive made no difference.

I'm about to try installing a large RAID0 array in the Linux desktop (the current drive is almost full), doing a straight cp of the directory to it, and repeating the tar locally.
But if anyone has any ideas why this process might be so painfully slow it would be appreciated!

Thanks.
Simon
# 2  
Old 04-11-2012
I don't think that tar or cp are the right commands.

To make a straight copy to another mounted filesystem and preserve permissions:
Code:
cd /filesystem_to_copy
find . -xdev -print | cpio -pdumv /new_filesystem
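In that command, find's -xdev keeps the copy on one filesystem, and cpio runs in pass-through mode: -p copies directly rather than writing an archive, -d creates directories as needed, -u overwrites unconditionally, -m preserves modification times, and -v lists files as it goes.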


PS: I have never used tar to back up anything. It is sometimes useful for moving files to alien systems.


Quote:
tar -pcf directory.tar directory
There is no meaning to the -p switch to tar in this context: it affects permission handling on extraction, not when creating an archive.
# 3  
Old 04-11-2012
Quote:
Originally Posted by methyl
I don't think that tar or cp are the right commands.
Not being your preferred commands isn't what's making them slow, however. I sincerely doubt cpio is going to break the speed barrier here.

What bus speeds would you expect from your disks, omnisppot? Could you be having southbridge issues -- perhaps the bus is saturated?
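One quick way to find out is to benchmark the raw device and watch its utilisation while tar runs. A rough sketch (assuming the external drive shows up as /dev/sdb; check dmesg or df for yours):
Code:
# sequential read speed of the bare device
sudo hdparm -tT /dev/sdb

# extended per-device stats every 5 seconds while tar is running;
# %util near 100 with a low read rate suggests seek-bound I/O
iostat -xk 5 /dev/sdb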
# 4  
Old 04-11-2012
I agree with Corona688; probably a hardware problem.
However, I have seen a modern tar (i.e. one which can deal with files larger than 2GB) crawl when it demands more memory.
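If memory were the problem it would show in the process table; something like this (a sketch) lets you watch the tar process's resident size over time:
Code:
watch -n 60 'ps -o pid,rss,vsz,etime,args -C tar'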
# 5  
Old 04-13-2012
Quote:
Originally Posted by methyl
I don't think that tar or cp are the right commands.

To make a straight copy to another mounted filesystem and preserve permissions:
Code:
cd /filesystem_to_copy
find . -xdev -print | cpio -pdumv /new_filesystem

PS: I have never used tar to back up anything. It is sometimes useful for moving files to alien systems.


Thanks for the input, but the goal is to move the 500GB of data from the external drive to an offsite compute cluster. I believe the only way I can do this is FTP, and FTP only moves single files, not directory trees. GUIs like FileZilla don't work, as they prompt for a new password every time the token-generated one expires.
I don't think it's possible to mount the external hard drive from a cluster that's behind a firewall; I can only connect to the cluster, not from it.
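For what it's worth, a possible workaround (an untested sketch; the mount point and the 10GB chunk size are guesses) would be to stream the tar through split, giving FTP-able pieces without ever holding a single 500GB file:
Code:
cd /media/external
tar -cf - directory | split -b 10G - directory.tar.part_
# after ftp'ing the parts to the cluster, reassemble there with:
cat directory.tar.part_* | tar -xf -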

---------- Post updated at 05:25 AM ---------- Previous update was at 05:20 AM ----------

Quote:
Originally Posted by Corona688
What bus speeds would you expect from your disks, omnisppot? Could you be having southbridge issues -- perhaps the bus is saturated?
Sorry, I don't know how to answer that precisely! I do know that (given enough internal hard drive space) I can "cp -r" all the data down the SATA cable in a few hours without any issues. It certainly seems I/O on the external drive is the bottleneck with tar; hopefully that's at the external disk end and not the motherboard bus end. Hopefully (I'll find out next week) doing it on a 4-drive RAID0 will overcome that!

---------- Post updated at 05:28 AM ---------- Previous update was at 05:25 AM ----------

Quote:
Originally Posted by methyl
I agree with Corona688. Probably Hardware problem.
However I have seen a modern tar (i.e. one which can deal with files larger than 2Gb) crawl when it demands more memory.
My Linux box has 16GB RAM, but while this was running total system usage never exceeded 3GB (including the OS and everything else).
# 6  
Old 04-13-2012
Can you use walknet? i.e. take the external disc drive to the target computer.
# 7  
Old 04-13-2012
Would you consider:-
Code:
# cd source_directory
# tar -cvf - . | rsh target_server "cd target_directory ; tar -xvf -"

I'm assuming it's rsh not remsh for your OS.
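If rsh isn't set up, the same pipeline should work over ssh (server and directory names are placeholders):
Code:
# cd source_directory
# tar -cvf - . | ssh target_server "cd target_directory && tar -xvf -"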

If the server is remote or the network is the bottleneck, you could consider:-
Code:
# cd source_directory
# tar -cvf - . | compress | rsh target_server "cd target_directory ; uncompress | tar -xvf -"

Of course, this latter option costs CPU and is best on multi-processor servers, so that the tar and the compress are not competing.
I've shovelled 200GB between remote sites over a 2M link in a weekend with something like the above, although the syntax will need to be checked; I must have got pretty good compression, I suppose. I can't really test it at the moment.
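On Linux, gzip is usually available where compress is not; the same idea would look like this (a sketch; gzip -1 trades some compression ratio for less CPU):
Code:
# cd source_directory
# tar -cvf - . | gzip -1 | ssh target_server "cd target_directory && gunzip | tar -xvf -"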

You will need to ensure that the local server can remote shell to the target. An entry in /.rhosts should suffice, but if this seems a good plan but you can't get remote shell working, let us know.
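For reference, an /.rhosts entry is just a "host user" pair per line; on target_server (the file should be mode 600) it would look something like this, with source_server as a placeholder:
Code:
source_server root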


I hope that this helps
Robin
Liverpool/Blackburn
UK