Bash script search, improve performance with large files
Post 303033012 by SDohmen on Thursday 28th of March 2019 11:21:00 AM
Quote:
Originally Posted by RudiC
Would you mind to also time the proposal in post #3?

I actually did, but I edited it into the post afterwards.


Code:
awk  prijslijst_filter.csv lowercase_winnaar.csv  9,51s user 0,13s system 99% cpu 9,647 total

Since the difference between the grep and this newer awk is only a matter of seconds, I am not sure which one I am going to use. The awk one is preferred because it is a drop-in replacement for the current one, but the grep one is still quite a lot faster.


Grep also has the advantage that it handles the ignore-case part better. I never seem to get this working properly with the awk one, even with both files forced to lowercase.
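
For reference, a minimal sketch of a case-insensitive grep split, assuming the filter file holds one fixed search string per line. The grep command actually used earlier in the thread is not shown in this post, so the options, the sample data and the output names removed.csv and filtered.csv are illustrative assumptions only:

```shell
#!/bin/sh
# Sketch only: the work directory and sample data are made up for illustration.
work=$(mktemp -d) || exit 1

# Hypothetical search terms, one fixed string per line.
printf '%s\n' 'foo' 'bar' > "$work/prijslijst_filter.csv"
# Hypothetical data to be split.
printf '%s\n' 'Alpha;FOO;1' 'Beta;baz;2' 'Gamma;Bar;3' > "$work/lowercase_winnaar.csv"

# -i ignores case, -F treats the terms as fixed strings, -f reads them from a file.
grep -i -F -f "$work/prijslijst_filter.csv" "$work/lowercase_winnaar.csv" > "$work/removed.csv"

# -v inverts the match: every line that contains none of the terms.
grep -v -i -F -f "$work/prijslijst_filter.csv" "$work/lowercase_winnaar.csv" > "$work/filtered.csv"
```

With -i there is no need to lowercase either file first, which may be why the grep variant feels easier to get right here.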




I just tried your awk solution again, RudiC, and it seems something is wrong with it. I did not check the output the first time because I had to leave right after I tested it (the files got overwritten afterwards).


It seems the part you gave does not produce any output files for the rest of the script to continue with.


Code:
awk '
NR==FNR                 {SRCH=SRCH DL $0
                         DL = "|"
                         next
                        }
tolower($0) ~ SRCH      {print > "'"$PAD/removed_woord_blaat33.csv"'"
                         next
                        }

                        {print > "'"$PAD/filtered_winnaar_blaat33.csv"'"
                        }
' prijslijst_filter.csv lowercase_winnaar.csv


I tried with and without time to see if that caused the issue, but it did not change the outcome. Neither of the two new files is created.
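
A hedged sketch of a variant that may be easier to debug: it passes the output paths in with -v instead of splicing "$PAD" into the awk program text, and it lowercases the search terms while building the pattern. The temporary PAD directory, the sample data and the variable names removed, kept, srch and dl are assumptions for illustration, and this is not a confirmed fix for the missing files. One thing worth checking in the real setup: if the first file is empty, NR == FNR stays true for the second file, every line hits next, and no output file is ever created.

```shell
#!/bin/sh
# Sketch only: PAD and the sample data below are made up for illustration.
PAD=$(mktemp -d) || exit 1

# Hypothetical search terms, one per line.
printf '%s\n' 'Foo' 'bar' > "$PAD/prijslijst_filter.csv"
# Hypothetical data to be split.
printf '%s\n' 'Alpha;FOO;1' 'Beta;baz;2' 'Gamma;Bar;3' > "$PAD/lowercase_winnaar.csv"

awk -v removed="$PAD/removed_woord_blaat33.csv" \
    -v kept="$PAD/filtered_winnaar_blaat33.csv" '
NR == FNR {                       # first file: collect the search terms
    if ($0 != "") {               # skip blank lines, an empty alternative could match every line
        srch = srch dl tolower($0)
        dl = "|"
    }
    next
}
tolower($0) ~ srch { print > removed; next }   # line matches a search term
{ print > kept }                               # all other lines
' "$PAD/prijslijst_filter.csv" "$PAD/lowercase_winnaar.csv"
```

The search terms are still treated as regular expressions here, so characters such as . or ( in the filter file keep their regex meaning.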

Last edited by SDohmen; 03-28-2019 at 12:29 PM. Reason: new info
 
