I have a large number (50,000) of fairly large compressed files, and I need only certain lines of data from them (each relevant line contains a certain keyword). Each file contains 300 such lines. The file names are indexed by file number (file_name.1, file_name.2, ..., file_name.50000).
So I need to uncompress each file, pull out the relevant lines, write them to combined_file_name, and compress again. I wrote the following bash script to do this:
What I am wondering is whether there is a better/faster way to accomplish this.
Any advice would be much appreciated.
Thank you
---------- Post updated at 12:18 PM ---------- Previous update was at 11:50 AM ----------
While reading another post on here, I just discovered the bzcat command.
I am guessing something like this would be faster:
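A minimal sketch of a bzcat-based version, assuming the files are bzip2-compressed and carry a .bz2 suffix (file_name.1.bz2, file_name.2.bz2, ...). The suffix, the keyword, the output name, and the function name extract_lines are placeholders for illustration, not taken from the original script:

```shell
#!/usr/bin/env bash
# Sketch only: the ".bz2" suffix, the keyword, and the output name are assumptions.

# extract_lines KEYWORD OUTPUT FIRST LAST [PREFIX]
# Streams each compressed file through bzcat, keeps the lines that contain
# KEYWORD, and writes everything to one compressed output file.
# No uncompressed copy of any input file is written to disk.
extract_lines() {
    local keyword=$1 output=$2 first=$3 last=$4 prefix=${5:-file_name}
    local i
    for ((i = first; i <= last; i++)); do
        # -F treats the keyword as a fixed string rather than a regex;
        # "--" protects keywords that start with a dash.
        bzcat "${prefix}.${i}.bz2" | grep -F -- "$keyword"
    done | bzip2 -c > "$output"
}

# Example call for the 50,000 files described above (hypothetical keyword):
# extract_lines "KEYWORD" combined_file_name.bz2 1 50000
```

Because the loop counts from the first to the last index, the output keeps numeric file order, which a shell glob such as file_name.* would not (it sorts lexicographically). The whole loop feeds a single bzip2 process, so the combined file is compressed once rather than once per input file.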