Post 302996279 by Xterra on Sunday 23rd of April 2017 10:45:44 AM
Reducing input file size after pattern search

I have a very large file with millions of entries, each identified by @M. I am using the following script to extract the entries that match specific strings/patterns:
Code:
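#!/bin/bash
# Require an existing input file as the first argument.
if [[ -f "$1" ]]
then
	file=$1
else
	echo "Input_file passed as an argument ($1) is NOT found." >&2
	exit 1
fi
# Index 0 is a placeholder so that the search strings are numbered from 1.
MID=(NULL "string-1" "string-2" "string-3" "string-4")
# Total number of entries (lines starting with @).
tot=$(grep -c "^@" < "$file")
echo "Total " "$tot" > log.txt

for y in {1..4}
do
	# Extract every line whose second field matches the current string.
	awk -v search="${MID[$y]}" '$2 ~ search' "$file" > "MID-$y.txt"
	# Log the extracted entries as a percentage of the total (0 if the total is 0).
	awk -v Id="MID-$y" -v pct="$tot" '/^@M/ {count++} END { print Id "\t" (pct ? (count*100)/pct : 0) }' "MID-$y.txt" >> log.txt
done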

I believe it would be more cost-effective to shrink the input file by removing the entries that were already extracted in earlier iterations of the loop. That way, each later search would scan fewer lines, and by the time the last strings are searched the processing time would be significantly reduced. What would be the most efficient way to do this, considering that the input file is very large?
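
A minimal sketch of that idea, under some assumptions: the function name extract_and_shrink and the remaining.* temporary file names are made up for illustration, the original input is left untouched by working on a copy, and there must be enough disk space for that copy. Each pass writes the matching lines to MID-N.txt and everything else to a smaller file that becomes the input of the next pass. Note one difference from the script above: an entry that matches more than one string ends up only in the file of the first string it matches.

```shell
#!/bin/bash
# Sketch only: extract entries per search string, shrinking the working copy
# after every pass. Names below are illustrative, not from the original script.
extract_and_shrink() {
	local file=$1
	shift
	local work rest search y=0
	work=$(mktemp remaining.XXXXXX) || return 1
	rest="$work.next"
	cp -- "$file" "$work" || return 1   # never modify the original input

	for search in "$@"
	do
		y=$((y + 1))
		# Single pass per string: matching lines go to MID-$y.txt, all other
		# lines are kept in "$rest" for the following iteration.
		awk -v search="$search" -v hit="MID-$y.txt" -v rest="$rest" '
			BEGIN { printf "" > hit; printf "" > rest }  # create both files even if empty
			$2 ~ search { print > hit; next }
			{ print > rest }
		' "$work" || return 1
		mv -- "$rest" "$work"           # the shrunken file is the next input
	done
	rm -f -- "$work"
}
```

With the array from the script above, this could be called as extract_and_shrink "$file" "${MID[@]:1}", which skips the NULL placeholder at index 0. Whether the extra writes for the shrinking copies pay off compared with simply rereading the full file depends on how many entries each string removes.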
Thanks in advance!

Last edited by Xterra; 04-23-2017 at 04:12 PM.