automatic tar xf of file with unknown name


 
# 1  
Old 07-08-2007

Hi all,
With curl I can fetch a tar archive from a web server. The archive contains a file ending in .scf, which is the one I am interested in. Unfortunately the file name may vary, and the subdirectory inside the tar archive may change too. I can browse the directory structure manually, extract the file, and then rename it, following certain hard-coded rules, to a name that another program can read.
Here's my actual example:

curl "http://www.ncbi.nlm.nih.gov/Traces/trace.fcgi?cmd=retrieve&save=1&srcf=1&scfrcf=scf&file=trace&val=%EID%&ti=%EID%\" -s -o .tracecache/%EID%.tar

Where %EID% is for example 458767001

This tar archive contains the file
/2007_07_08_02h41m29s/BACILLUS_ANTHRACIS_STR._AMES_0581/TIGR/traces/BAEI181TF.scf
which I want to copy to the .tracecache directory with the filename 458767001.

The only fixed parameter is that the name of the file I'm interested in always ends in .scf, and there should always be exactly one file of this kind, in an arbitrary subdirectory. tar shouldn't create the corresponding directory; it should place all files in the same directory.

Is there any way to automate this?

Thanks in advance,

Thomas
# 2  
Old 07-08-2007
There are some things you could try, but I think they will all require two passes through the tar file; one to read the file names and locate the one you want, and another to actually extract it.

To find the one you want, something like "tar tvf $tarfile | grep '\.scf$'" should work. That gets the listing of archive members (the "t" command lists the table of contents of the archive rather than extracting it, and the "v" option is the usual "verbose" option, which here prints each member's details along with its name) and greps for one with the required extension. You might want to check that there is exactly one match.
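The "exactly one match" check could look roughly like this. It is only a sketch: it builds a throwaway archive so it can be tried without the download, and the names work, sample.tar and run_01 are made up for illustration. It also assumes plain "tar tf", which prints just the member names, one per line.

```shell
#!/bin/sh
# Build a throwaway archive (hypothetical names) so the check can be tried offline.
work=$(mktemp -d)
mkdir -p "$work/run_01/traces"
printf 'dummy trace\n' > "$work/run_01/traces/BAEI181TF.scf"
printf 'notes\n' > "$work/run_01/README"
tar cf "$work/sample.tar" -C "$work" run_01

# "tar tf" prints one member name per line; grep -c counts the .scf matches.
count=$(tar tf "$work/sample.tar" | grep -c '\.scf$')

if [ "$count" -eq 1 ]; then
    # Safe to treat the single matching line as the member name.
    member=$(tar tf "$work/sample.tar" | grep '\.scf$')
    echo "found: $member"
else
    echo "expected exactly one .scf member, found $count" >&2
fi
```

If the count is anything other than 1, it is probably safer to stop than to guess which file was meant.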

Given the file name of the archive member you want to extract, you have two options. Either extract it to a temporary location, move it where you want it, and rm -rf the temporary tree; or, at least with GNU tar, use the -O option to extract to standard output, so you can redirect the output to a convenient place.
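The first option (temporary location, move, clean up) might be sketched as follows. The archive, the cache directory and the member path are invented stand-ins, and the -C option is assumed to be available, as it is in GNU tar and bsdtar.

```shell
#!/bin/sh
# Hypothetical sample archive standing in for the downloaded one.
work=$(mktemp -d)
mkdir -p "$work/run_01/traces" "$work/cache"
printf 'dummy trace\n' > "$work/run_01/traces/BAEI181TF.scf"
tar cf "$work/sample.tar" -C "$work" run_01

member=run_01/traces/BAEI181TF.scf   # as found by listing the archive

# Extract only that member into a scratch directory; its path is recreated there.
scratch=$(mktemp -d)
tar xf "$work/sample.tar" -C "$scratch" "$member"

# Move the file to its final name, then discard the scratch tree.
mv "$scratch/$member" "$work/cache/458767001"
rm -rf "$scratch"
```

This leaves no stray subdirectories in the cache, at the cost of a little more bookkeeping than redirecting the output of -O.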

So to summarize, something like this perhaps.

#!/bin/sh

# TODO: check that the EID is passed in as the sole argument
EID=$1

curl "http://www.ncbi.nlm.nih.gov/Traces/trace.fcgi?cmd=retrieve&save=1&srcf=1&scfrcf=scf&file=trace&val=$EID&ti=$EID" -s -o .tracecache/$EID.tar
tar tvf .tracecache/$EID.tar | grep '\.scf$' | xargs tar xOf .tracecache/$EID.tar >.tracecache/$EID

I'm assuming that the backslash in the curl command line was a mistake, and that the DOS-style variable name %EID% is not a symptom of something more sinister.
era
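The TODO about checking the argument could be filled in along these lines. This is a sketch only: the helper name usage_ok is made up, and the rule that an EID consists of digits is an assumption based on the example ID 458767001.

```shell
#!/bin/sh
# usage_ok: succeed only when called with exactly one all-digit argument.
# (The all-digit rule is an assumption, not something NCBI documents here.)
usage_ok() {
    [ "$#" -eq 1 ] || return 1
    case $1 in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
    esac
    return 0
}

if usage_ok "$@"; then
    EID=$1
else
    # A real script would follow this message with "exit 1".
    echo "usage: $0 EID" >&2
fi
```

Rejecting anything but digits also keeps odd characters out of the URL and the output file name.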
# 3  
Old 07-09-2007

Thanks very much era!

The curl string was embedded in C source code and hard-coded into the compiled application (Hawkeye viewer for Amos, see http://amos.sourceforge.net). The author used the % signs to mark string segments that a string function replaces with variables. It has nothing to do with DOS, I think.
That is also why there was a backslash in front of the ": it escaped the quote inside the C string, and I forgot to delete it before I posted the thread.

Hawkeye has only space for a single command line, so I decided to put everything into an external script "fetchscf.sh" as you started it already. The command line in Hawkeye is now only

/usr/local/bin/fetchscf.sh %EID% %TRACECACHE%

fetchscf.sh contains:

<source>
#!/bin/sh

EID=$1
tracecache=$2

curl "http://www.ncbi.nlm.nih.gov/Traces/trace.fcgi?cmd=retrieve&save=1&srcf=1&scfrcf=scf&file=trace&val=$EID&ti=$EID" -s -o "$tracecache/$EID.tar"
# List the archive, keep the .scf line, cut away everything before the path name,
# then extract that member to standard output.
tar tvf "$tracecache/$EID.tar" | grep '\.scf$' | cut -d: -f2 | cut -b4- | xargs tar xOf "$tracecache/$EID.tar" >"$tracecache/$EID.scf"
</source>

I added the two cut steps because tar tvf returned the whole info line, including the date and so on:
-rw-rw-r-- 0/0 106207 2007-07-09 00:29 2007_07_09_01h29m40s/BACILLUS_ANTHRACIS_STR._AMES_0581/TIGR/traces/BAEAT42TR.scf

I had to cut the string from the left up to the first character of the path name.
I wasn't sure whether this length is always exactly constant, so I assumed that the time always contains a ":" and that the path name begins at the fourth byte after it. This explains the double cut in the pipe.
Maybe this is not perfectly elegant, but it works fine. It isn't worth polishing further, because NCBI changes the path names and URL parameters every three months or so.
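For comparison, dropping the "v" from the listing would probably make both cut steps unnecessary, since "tar tf" prints only the member names. A minimal sketch, assuming a tar that supports -O (GNU tar and bsdtar do); the function name extract_scf and the sample archive are made up for illustration and are not part of fetchscf.sh:

```shell
#!/bin/sh
# extract_scf ARCHIVE DEST: write the archive's .scf member to DEST.
# (Hypothetical helper; it expects exactly one .scf member in the archive.)
extract_scf() {
    archive=$1
    dest=$2
    # "tar tf" (no "v") lists bare member names, so nothing has to be cut away.
    member=$(tar tf "$archive" | grep '\.scf$') || return 1
    # -O sends the member's contents to stdout instead of creating directories.
    tar xOf "$archive" "$member" > "$dest"
}

# Try it on a made-up archive that imitates the NCBI layout.
work=$(mktemp -d)
mkdir -p "$work/2007_07_09_01h29m40s/TIGR/traces"
printf 'dummy trace\n' > "$work/2007_07_09_01h29m40s/TIGR/traces/BAEAT42TR.scf"
tar cf "$work/458767001.tar" -C "$work" 2007_07_09_01h29m40s
extract_scf "$work/458767001.tar" "$work/458767001.scf"
```

Passing the member name as a quoted argument instead of through xargs also copes with spaces in the path, should NCBI ever introduce them.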

Thanks again for your help!

Thomas
# 4  
Old 07-11-2007
Ah yes, sorry for missing the mangling of the output from tar; glad I could help.
era
 