Sponsored Content
Top Forums UNIX for Dummies Questions & Answers Extracting data from many compressed files Post 302411785 by Boltzmann on Friday 9th of April 2010 12:18:30 PM
Old 04-09-2010
Extracting data from many compressed files

I have a large number (50,000) of pretty large compressed files and I need only certain lines of data from them (each relevant line contains a certain key word). Each file contains 300 such lines. The individual file names are indexed by file number (file_name.1, file_name.2, ... , file_name.50000).

So, I need to uncompress, pull out relevant lines and write to combined_file_name and compress again. I wrote the following bash script to do this:

Code:
i=1
num=50000
while [ $i -le $num ]
  do
  bunzip2 file_name.$i.bz2
  grep key_word file_name.$i >> combined_file_name
  bzip2 file_name.$i
  i=$[$i+1]
done

What I am wondering is whether there is a better/faster way to accomplish this.

Any advice would be much appreciated.

Thank you

---------- Post updated at 12:18 PM ---------- Previous update was at 11:50 AM ----------

While reading another post on here, I just discovered the bzcat command.

I am guessing something like this would be faster:

Code:
i=1
num=50000
while [ $i -le $num ]
  do
  bzcat file_name.$i.bz2 | grep key_word >> combined_file_name
  i=$[$i+1]
done

Is there something even better?

Thanks again in advance.
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Perl - extracting data from .csv files

PROJECT: Extracting data from an employee timesheet. The timesheets are done in excel (for user ease) and then converted to .csv files that look like this (see color code key below): ,,,,,,,,,,,,,,,,,,, 9/14/2003,<-- Week Ending,,,,,,,,,,,,,,,,,, Craig Brennan,,,,,,,,,,,,,,,,,,,... (3 Replies)
Discussion started by: kregh99
3 Replies

2. Shell Programming and Scripting

extracting data from files..

frnds, I m having prob woth doing some 2-3 task simultaneously... what I want is... I have lots ( lacs ) of files in a dir... I want.. these info from arround 2-3 months files filename convention is - abc20080403sdas.xyz ( for todays files ) I want 1. total no of files for 1 dec... (1 Reply)
Discussion started by: clx
1 Replies

3. Shell Programming and Scripting

Ucompress the compressed data

Hi, I have a file that has got compressed data. I would want to uncompress the packed decimal data(not the file). is there a way to do that in ksh? (6 Replies)
Discussion started by: ahmedwaseem2000
6 Replies

4. UNIX for Dummies Questions & Answers

Finding and Extracting uniq data in multiple files

Hi, I have several files that look like this: File1.txt Data1 Data2 Data20 File2.txt Data1 Data5 Data10 File3.txt Data1 Data2 Data17 File4.txt (6 Replies)
Discussion started by: Fahmida
6 Replies

5. Shell Programming and Scripting

awk - extracting data from a series of files

Hi, I am trying to extract data from multiple output files. I am able to extract the data from a single output file by using the following awk commands: awk '/ test-file*/{print;m=0}' out1.log > out1a.txt awk '/ test-string/{m=1;c=0}m&&++c==3{print $2 " " $3 " " $4 ;m=0}' out1.log >... (12 Replies)
Discussion started by: p_sun
12 Replies

6. UNIX for Dummies Questions & Answers

Extracting data from PDF files into CSV file

Hi, I have several hundreds of PDFfiles number 01.pdf, 02.pdf, 03.pdf, etc in one folder. These are vey long documentd with a lot of information (text, tables, figures, etc). I need to extract the information asociated with one disease in particular (Varicella). The information I need to... (5 Replies)
Discussion started by: Xterra
5 Replies

7. Shell Programming and Scripting

Extracting Delimiter 'TAG' Data From log files

Hi I am trying to extract data from within a log file and output format to a new file for further manipulation can someone provide script to do this? For example I have a file as below and just want to extract all delimited variances of tag 32=* up to the delimiter "|" and output to a new file... (2 Replies)
Discussion started by: Buddyluv
2 Replies

8. Programming

Python script for extracting data using two files

Hello, I have two files. File 1 is a list of interested IDs Ex1 Ex2 Ex3File 2 is the original file with over 8000 columns and 20 millions rows and is a compressed file .gz Ex1 xx xx xx xx .... Ex2 xx xx xx xx .... Ex2 xx xx xx xx ....Now I need to extract the information for all the IDs of... (4 Replies)
Discussion started by: nans
4 Replies

9. Shell Programming and Scripting

Extracting data from specific rows and columns from multiple csv files

I have a series of csv files in the following format eg file1 Experiment Name,XYZ_07/28/15, Specimen Name,Specimen_001, Tube Name, Control, Record Date,7/28/2015 14:50, $OP,XYZYZ, GUID,abc, Population,#Events,%Parent All Events,10500, P1,10071,95.9 Early Apoptosis,1113,11.1 Late... (6 Replies)
Discussion started by: pawannoel
6 Replies

10. Shell Programming and Scripting

Extracting part of data from files

Hi All, I have log files as below. log1.txt <table name="content_analyzer" primary-key="id"> <type="global" /> </table> <table name="content_analyzer2" primary-key="id"> <type="global" /> </table> Time taken: 1.008 seconds ID = gd54321bbvbvbcvb <table name="content_analyzer"... (7 Replies)
Discussion started by: ROCK_PLSQL
7 Replies
FINFO_FILE(3)								 1							     FINFO_FILE(3)

finfo_file - Return information about a file

       Procedural style

SYNOPSIS
string finfo_file NULL NULL (resource $finfo, string $file_name, [int $options = FILEINFO_NONE], [resource $context]) DESCRIPTION
Object oriented style string finfo::file NULL NULL (string $file_name, [int $options = FILEINFO_NONE], [resource $context]) This function is used to get information about a file. PARAMETERS
o $finfo - Fileinfo resource returned by finfo_open(3). o $file_name - Name of a file to be checked. o $options - One or disjunction of more Fileinfo constants. o $context - For a description of contexts, refer to "Stream Functions". RETURN VALUES
Returns a textual description of the contents of the $file_name argument, or FALSE if an error occurred. EXAMPLES
Example #1 A finfo_file(3) example <?php $finfo = finfo_open(FILEINFO_MIME_TYPE); // return mime type ala mimetype extension foreach (glob("*") as $filename) { echo finfo_file($finfo, $filename) . " "; } finfo_close($finfo); ?> The above example will output something similar to: text/html image/gif application/vnd.ms-excel SEE ALSO
finfo_buffer(3). PHP Documentation Group FINFO_FILE(3)
All times are GMT -4. The time now is 10:58 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy