OK. I have looked up the tar header format. The tar header contains lots of NUL bytes, so any attempt to process a tar archive using the shell, awk, sed, or any other Linux or UNIX text processing utility produces undefined results. The first 100 bytes in a tar header may contain the file's name (if it is <= 100 bytes long), may contain one or more directory names from the file's pathname (if they fit along with the file's name in 100 bytes), and may contain complete garbage left over from archiving a previous file. If the complete pathname does not fit in those 100 bytes, a ustar-format archive may store the leading directory names (up to 155 bytes) in bytes 345-499 (with the first byte numbered 0) and keep the rest of the pathname, including the final component, in the first 100 bytes. So your awk script seems to be looking for "02" and "07" at specific points in the middle of a pathname that ends with a newline character and that is somewhere between 86 and 100 bytes long. If these conditions are met in the first file archived in the tar file, you may get the results you want for that file; otherwise, all bets are off.
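If you want to see this for yourself, here is a minimal sketch that pulls raw fields out of the first 512-byte header block of an uncompressed archive and counts its NUL bytes. The function names header_field and header_nuls are made up for this illustration; the offsets are the ones mentioned above (bytes 0-99 and bytes 345-499). It assumes the field of interest does not itself contain a newline.

```shell
#!/bin/sh
# Sketch only: header_field and header_nuls are hypothetical helpers,
# not part of tar or any standard utility.

# Print one field of the first header block, cut off at the first NUL byte.
#   $1 = uncompressed archive, $2 = byte offset, $3 = field length
header_field() {
    archive=$1 offset=$2 length=$3
    dd if="$archive" bs=1 skip="$offset" count="$length" 2>/dev/null |
        tr '\000' '\n' |      # NUL bytes become line breaks
        head -n 1             # keep only the text before the first NUL
}

# Count the NUL bytes in the first 512-byte header block.
header_nuls() {
    dd if="$1" bs=512 count=1 2>/dev/null | tr -cd '\000' | wc -c
}
```

For example, `header_field archive.tar 0 100` shows what is stored in the first 100 bytes, and `header_nuls archive.tar` shows how much of the header is NUL padding, which is why line-oriented tools cannot be trusted on the raw archive.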
If you will show us what I asked for in my last message (or at least the first several lines of output from the tar command and the corresponding output you want to be produced for those lines), we may be able to help you parse the output of a tar archive listing command to get what you want. Otherwise, I don't see how we can help.
Hi Don,
As requested, the output from both zcat and the code below is as follows:
OUTPUT FROM ZCAT filename.tar.gz
Output from the code
Moderator's Comments:
You have repeatedly been asked to use CODE tags. Without CODE tags, spacing gets lost in the HTML output. Given the context, the data shown here is probably incorrect.
Last edited by Don Cragun; 07-16-2013 at 04:41 AM..
Reason: Add CODE tags
OK. This is not what I asked for, but it is informative.
I take back everything I said before. I made the wild assumption that your filename filename.tar.gz followed normal UNIX and Linux conventions (i.e., that it was a tar output file that had been compressed using gzip). But the output from zcat clearly shows that this is not a tar archive. So, exactly what command line was used to create filename.tar.gz?
And, no matter what created this file, the awk script you have been showing us would never produce the four lines of output you have shown above. Two of these lines seem to meet your criteria, although the text I marked in red (that you showed in bold) can't both be from input columns 84 and 85. (Although both lines do contain 07 in columns 84 and 85.) But, the other two lines don't contain the strings "02" or "07" anywhere that I can see.
So, forget about the awk code. Tell us in English what criteria you used to decide that the four lines of output shown above are the output that you want.
The command line used for creating filename.tar.gz is as follows:
OUTPUT FROM ZCAT filename.tar.gz
The above is the input.
For the required output, I check for lines that have 02 in the 26th field of the input line and 07 in the 84th field, each with a length of 2, and print only those lines.
When a line matches, I print it to a file, count the number of matches, and also record the name of the file in which the condition matched, i.e.,
if
has 10 files with file names, say, file1, file2... file10, then every line that matches the condition above should print something like this
Since the above matches the condition, the match counter is increased accordingly.
In the end I need the match and non-match counts for each file, and the output for matching lines should go to a.txt. The content of countfile should look something like this
Content of
should look as mentioned above
Since I have space constraints, the archive cannot be untarred.
Hope this clarifies.
Since you sent me private mail asking me to help you on this again, I take it that you ignored my previous messages in this thread. The archive files produced by tar contain lots of NUL bytes, so by definition tar archive files are binary files, not text files. The shell and awk utilities are built to work with text files, not binary files, so there is no way to do what you're trying to do with a standard awk. (Some implementations may provide extensions to awk enabling it to work on binary files, but I do not have access to any such implementation. You might also be able to write a perl program to do this, but I am not fluent enough in perl to help you try this.)
It would be easy to extract the files from the archive and walk through the regular files in the extracted file hierarchy to get what you want. But, you say you don't have the room to do that.
The output format produced by tar -t and tar -tv is not standardized (and varies from implementation to implementation). It may be possible for you to use tar -t or tar -tv to get a list of regular files stored in the archive and then use tar -xO pathname in a loop with pathname set to a different regular file in the archive each time through the loop so you can feed the contents of that file through your awk script without saving a copy of the file on disk.
That will require reading the archive n+1 times if there are n regular files in the archive, and even this only works if all of the regular files in the archive are text files. I encourage you to play with tar to see if you can make this work. (On some implementations, tar -tf archive will list directories in the archive with a trailing slash on the name and other files without a trailing slash. If the implementation of tar on your system does this, you can use the trailing slash to determine whether to skip that file or to extract it and feed it to your awk script.)
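A rough sketch of that loop might look like the following. Treat it as a starting point, not a tested solution for your data: scan_archive is a name made up for this example, it assumes a tar that accepts -z (gzip) and -O (extract to standard output) as GNU tar and bsdtar do, it assumes member names contain no newlines, and it guesses that "02 in the 26th field" and "07 in the 84th field" mean character positions 26-27 and 84-85 of each line.

```shell
#!/bin/sh
# Sketch only: scan_archive is a hypothetical helper, and the column
# positions (26 and 84) are an assumed reading of the stated criteria.

scan_archive() {
    archive=$1      # gzip-compressed tar archive to read
    matches=$2      # file that collects the matching lines (e.g. a.txt)
    counts=$3       # file that collects per-member counts (e.g. countfile)

    : > "$matches"
    : > "$counts"

    # First pass over the archive: list the member names.
    tar -tzf "$archive" | while IFS= read -r member
    do
        # Skip directories (listed with a trailing slash on many tars).
        case $member in
        */) continue ;;
        esac

        # One more pass per member: stream it to awk without writing it to disk.
        tar -xzOf "$archive" "$member" </dev/null |
        awk -v name="$member" -v out="$matches" '
            substr($0, 26, 2) == "02" && substr($0, 84, 2) == "07" {
                print name ": " $0 >> out     # matching line, tagged with its file
                hit++
                next
            }
            { miss++ }
            END { printf "%s match=%d nomatch=%d\n", name, hit, miss }
        ' >> "$counts"
    done
}
```

It would be called as `scan_archive filename.tar.gz a.txt countfile`. Only the two result files are written, so the disk space needed is limited to the matching lines and one count line per archived file.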
Hey Don,
Thanks for the input. When I am in need I don't ignore other remarks. I went through your earlier comments and was looking for ways to crack this on binary files, which is how I learned that the archive I am searching is in ustar format. Anyway, I am working on your comments and will get back to you in case any further help is required.