Extract part of an archive to a different file


 
# 1  
Old 09-06-2015

I need to save part of a file to a different file. The start and end byte offsets are held in two counters of type long. If the range between them is large, how should I do this in Java without overflowing a buffer?
# 2  
Old 09-06-2015
Why use Java? This is a perfect problem for the dd utility.
This User Gave Thanks to Don Cragun For This Post:
# 3  
Old 09-06-2015
Because the software that detects the start and end offsets is written entirely in Java, and I want it to be portable.
# 4  
Old 09-06-2015
OK. Show us the Java code and show us where in the code you are running into buffering problems.
# 5  
Old 09-10-2015
I'm not sure I need to show internal code that has nothing to do with the problem. Basically, I think the issue comes from the maximum number of elements a byte array can hold. Say an example program already contains this code:

Code:
byte[] content = new byte[(int) entry.getSize()];

That would mean the maximum number of elements is the largest value an int can hold, which in Java is 2147483647, so that many bytes. That implies the piece to extract can be at most about 2 GB. What happens if I want to extract a piece of about 7 GB? Even in the 2 GB case, I have no idea whether the content is held in RAM, which would cause problems on low-spec computers.
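To make the concern above concrete, here is a minimal sketch of what the cast in that line does for a size beyond the int range. The class name `CastDemo`, the helper `truncatedSize`, and the choice of exactly 7 GiB are assumptions for illustration only. A Java array is allocated on the JVM heap, so any array that could be created this way would indeed sit in memory.

```java
// Sketch only: CastDemo and truncatedSize are hypothetical names.
class CastDemo {
    // Mirrors the (int) cast used in "new byte[(int) entry.getSize()]".
    static int truncatedSize(long size) {
        return (int) size; // keeps only the low 32 bits of the long
    }

    public static void main(String[] args) {
        long sevenGiB = 7L * 1024 * 1024 * 1024; // assumed size: 7516192768 bytes
        int size = truncatedSize(sevenGiB);
        System.out.println(size); // prints -1073741824

        try {
            byte[] content = new byte[size];
            System.out.println("allocated " + content.length + " bytes");
        } catch (NegativeArraySizeException e) {
            // A negative length is rejected before any memory is reserved.
            System.out.println("cannot allocate: " + e);
        }
    }
}
```

So the cast does not cap the size at 2 GB; it silently wraps, and here the allocation fails outright.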
# 6  
Old 09-10-2015
The code has everything to do with it.

Why do you believe that you have to copy everything into an array before you write any of your desired output?

Open your input file. Seek to the offset of the first byte you want to copy. Read data from your input file and write it to your output file until you have copied all of the bytes you want to extract. You can do this one byte at a time (no buffering issues, but relatively slow for large transfers), one block at a time (trivial buffering, relatively faster), or one block at a time tuned to input and output file disk block boundaries (more complex logic, possibly hardware/filesystem dependent, faster).
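The block-at-a-time variant described above can be sketched roughly as follows. This is one possible reading, not code from the original poster's program: the class `RangeCopy`, the method `copyRange`, the 1 MiB buffer size, and the choice to treat the end offset as exclusive are all assumptions.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.OutputStream;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Sketch only: RangeCopy and copyRange are hypothetical names.
class RangeCopy {
    // Copies bytes [start, end) of src into dst, one block at a time.
    // Treating end as exclusive is an assumption; the thread does not say.
    static void copyRange(Path src, Path dst, long start, long end) throws IOException {
        if (start < 0 || end < start) {
            throw new IllegalArgumentException("invalid range");
        }
        byte[] buffer = new byte[1 << 20]; // 1 MiB block; the size is an assumption
        try (RandomAccessFile in = new RandomAccessFile(src.toFile(), "r");
             OutputStream out = Files.newOutputStream(dst)) {
            in.seek(start); // position the read cursor on the first wanted byte
            long remaining = end - start; // stays a long, so ranges over 2 GB are fine
            while (remaining > 0) {
                int want = (int) Math.min(buffer.length, remaining);
                int got = in.read(buffer, 0, want); // may return fewer bytes than asked
                if (got < 0) {
                    throw new EOFException("input ended before the end offset");
                }
                out.write(buffer, 0, got);
                remaining -= got;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        copyRange(Paths.get(args[0]), Paths.get(args[1]),
                  Long.parseLong(args[2]), Long.parseLong(args[3]));
    }
}
```

Only the fixed-size buffer is ever held in memory, so the size of the extracted range is limited by disk space rather than by the array length limit or the heap.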
# 7  
Old 09-10-2015
So do you think RandomAccessFile will be the simplest way to achieve that? If I write byte by byte it may cause a lot of I/O overhead, which is especially bad for SSD drives. I think writing in blocks of 1 MB or so is much better. The seek method provides a way to position the cursor for both reading and writing bytes.
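RandomAccessFile is one reasonable option. As a possible alternative, the standard library's FileChannel.transferTo can copy a byte range without an explicit buffer in user code. The sketch below is an assumption about how that might look, not something proposed in the thread; `ChannelRangeCopy` and `copyRange` are hypothetical names, and the end offset is again treated as exclusive.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Sketch only: ChannelRangeCopy and copyRange are hypothetical names.
class ChannelRangeCopy {
    // Copies bytes [start, end) of src into dst using FileChannel.transferTo.
    static void copyRange(Path src, Path dst, long start, long end) throws IOException {
        if (start < 0 || end < start) {
            throw new IllegalArgumentException("invalid range");
        }
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst,
                     StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE,
                     StandardOpenOption.TRUNCATE_EXISTING)) {
            long position = start;
            while (position < end) {
                // transferTo may move fewer bytes than requested, so loop until done.
                long moved = in.transferTo(position, end - position, out);
                if (moved <= 0) {
                    throw new EOFException("input ended before the end offset");
                }
                position += moved;
            }
        }
    }
}
```

Because the output file is new and written sequentially, no seek is needed on the writing side in either approach; only the read position has to be set.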