Extracting data from many compressed files
Post 302411831 by Boltzmann on Friday 9th of April 2010 01:55:55 PM
Old 04-09-2010
Quote:
Originally Posted by thegeek
Yes, z commands have power. You can do zgrep directly, instead of cat & grep.

Refer: The Power of Z Commands – Zcat, Zless, Zgrep, Zdiff Examples (link removed)


Thank you for the advice. I guess I don't have bzgrep installed. I tried using zgrep, but that is actually slower than bzcat piped to grep.

The latter is definitely much faster than my original loop, but still too slow.

I am quite new to all this, so sorry if I am sounding rather amateurish (I am indeed that).

In each of my files, the relevant lines are always the same ones. As an example, say I want to extract lines 5, 17, 38, ... from each file. Supposing I can identify the actual line numbers, would there be a better way to extract them than using grep? It seems inefficient to have grep search every file when it is already known exactly which lines are needed.

Thank you again and sorry if there is an obvious answer.
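
One possible approach, as a rough sketch rather than a definitive answer: print the wanted lines by number with sed and quit right after the last one, so nothing past that line is read. The line numbers 5, 17 and 38 are just the placeholders from the example above, the files are assumed to be named *.bz2 and to sit in the current directory, and extract_lines is a made-up helper name. The file still has to be decompressed up to the last wanted line, so the saving depends on how early in the file that line is.

```shell
#!/bin/sh
# Sketch: print fixed line numbers from bzip2-compressed files.
# Assumptions (not from the original post): files match ./*.bz2,
# and the wanted lines are 5, 17 and 38.

# extract_lines FILE
# Hypothetical helper: decompress FILE to a pipe and let sed print
# lines 5, 17 and 38, then quit so no further input is read.
extract_lines() {
    bzcat "$1" | sed -n '5p;17p;38{p;q;}'
}

# Run the helper on every matching file in the current directory.
for f in ./*.bz2; do
    [ -e "$f" ] || continue   # skip if the pattern matched nothing
    extract_lines "$f"
done
```

If the set of line numbers changes, only the sed script needs editing; the last number should be the one that carries the q command.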
 
