Post 303032995 by SDohmen on Thursday 28th of March 2019 06:37:00 AM
Bash script search, improve performance with large files

Hello,


Several of our scripts use awk to search files for patterns that are read from other files. This works almost perfectly, except that it takes ages to run on larger files. Is there a way to speed up this process, or is there another tool that does the searching faster?


The part that I use is as follows:


Code:
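awk -F";" -v removed="$PAD/removed_woord.csv" -v kept="$PAD/filtered_winnaar_2.csv" '
# First file: store every line as a search pattern.
NR==FNR         {id[$0]
                 next
                }
# Second file: lowercase the line once, then test it against each pattern.
                {line = tolower($0)
                 for (SP in id) if (line ~ SP)   {print > removed
                                                  next
                                                 }
                }
# No pattern matched: keep the line.
                {print > kept
                }
' "$PAD/prijslijst_filter.csv" "$PAD/lowercase_winnaar.csv"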



I also got this piece of code from this forum, but I added the tolower part myself because it did not always seem to catch every match in the main file. One important requirement is that the lines removed by the filter are saved to a separate file. The filtered file then contains only the lines in which no pattern was found.
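
One possible way to cut the run time, offered as a sketch and not as a tested replacement: the awk loop tries every pattern against every line, so the work grows with (number of patterns) x (number of lines). If the entries in the filter file are plain words and not regular expressions (an assumption, the post does not say), grep can read all of them at once with -F -f and match them case-insensitively with -i. The function name split_by_patterns and its four arguments are made up for this illustration.

```shell
#!/bin/sh
# Sketch: split a data file into matching and non-matching lines with grep.
# Assumption: every line of the pattern file is a literal string, not a regex.
# split_by_patterns is a hypothetical helper name, not part of the original script.
split_by_patterns() {
    patterns=$1   # one search term per line
    input=$2      # file to filter
    removed=$3    # receives lines that contain at least one term
    kept=$4       # receives lines that contain none of the terms

    # -F: fixed strings, -i: ignore case, -f: read the terms from a file.
    # grep exits with status 1 when it selects nothing; only other failures count as errors.
    grep -i -F -f "$patterns" -- "$input" > "$removed" || [ $? -eq 1 ] || return 2

    # Second pass with -v inverts the selection to get the remaining lines.
    grep -v -i -F -f "$patterns" -- "$input" > "$kept" || [ $? -eq 1 ] || return 2
}
```

With the file names from the post, the call would be: split_by_patterns "$PAD/prijslijst_filter.csv" "$PAD/lowercase_winnaar.csv" "$PAD/removed_woord.csv" "$PAD/filtered_winnaar_2.csv". This reads the data file twice, but each pass checks all terms together, which tends to be much quicker than looping over them in awk. Note that an empty line in the filter file matches every input line, so the filter file should be checked for blank lines first.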

Last edited by joeyg; 04-04-2019 at 09:58 AM..
 
