I have a large number (50,000) of fairly large compressed files, and I need only certain lines of data from them (each relevant line contains a certain keyword). Each file contains 300 such lines. The individual file names are indexed by file number (file_name.1, file_name.2, ..., file_name.50000).
So I need to uncompress each file, pull out the relevant lines, write them to combined_file_name, and compress again. I wrote the following bash script to do this:
What I am wondering is whether there is a better/faster way to accomplish this.
Any advice would be much appreciated.
Thank you
---------- Post updated at 12:18 PM ---------- Previous update was at 11:50 AM ----------
While reading another post on here, I just discovered the bzcat command.
I am guessing something like this would be faster:
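A minimal sketch of the bzcat approach, assuming the files are bzip2-compressed and named file_name.1 through file_name.N as described above. The function name extract_lines, its argument order, and the .bz2 suffix on the combined output are hypothetical choices for illustration, not taken from the original script.

```shell
# Sketch only: extract_lines, its arguments, and the output suffix are
# assumptions made for illustration.
#
# Usage: extract_lines PREFIX COUNT KEYWORD OUTPUT
# Streams PREFIX.1 .. PREFIX.COUNT through bzcat and grep, then writes the
# matching lines to OUTPUT.bz2. No uncompressed copy ever touches the disk,
# and the input files are left compressed as they were.
extract_lines() (
    prefix=$1
    count=$2
    keyword=$3
    output=$4

    i=1
    while [ "$i" -le "$count" ]; do
        # bzcat writes the decompressed data to stdout.
        # grep -F treats the keyword as a fixed string, not a regex.
        # grep exits 1 when a file has no match; "|| true" keeps that
        # from stopping the loop if the caller runs with "set -e".
        bzcat < "${prefix}.${i}" | grep -F -- "$keyword" || true
        i=$((i + 1))
    done | bzip2 -c > "${output}.bz2"
)
```

It could be called as `extract_lines file_name 50000 KEYWORD combined_file_name`. The point of the pipeline is that each file is decompressed once, in memory, and the combined output is compressed a single time at the end rather than once per input file.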