Removing Lines if value exist in first file
Post 302348789 by Azhrei on Saturday 29th of August 2009 05:43:51 PM
Quote:
Originally Posted by Scrutinizer
While I agree that Perl is usually well suited for this type of application, I do not think this generalization is accurate. The shell scripts above are fine but there is room for some significant speed optimizations. If we use ksh (ksh93s+) instead of bash and a method that resembles the one in your Perl script, I think there would not be a real big difference in speed.
Hmm. Let's take a look at your script and its efficiency/performance and compare that to the Perl script, shall we?

First, the Perl script loses big time in terms of startup cost; initializing the interpreter and compiling the script are overhead that can never be reclaimed (although it can be amortized if the data files are large enough). The Perl script also loses (slightly) in that it's less readable to people unfamiliar with the language (although the OP was able to correctly determine how to change the field used for his particular case). The final lossage comes from the wordiness of my Perl example -- it could've been done more concisely, but I was at least partially concerned about the OP being able to understand its overall operation.

(I'm modifying your Korn shell script to add some performance and usage benefits, but it remains essentially the same.) Your Korn shell script does not have the startup cost, but as a true interpreter it has to reparse the loop body on every pass through the loop, so with a significant number of iterations it becomes a performance problem. There's also the problem of single and double quotes occurring in the input; the Korn shell's read will handle paired quotes correctly (as it interprets the quotes), while Perl will need help from a regular expression (or the Text::Balanced module) to do the work. The reason I mention this as a problem is that a single apostrophe will screw up the Korn script but have no impact on the Perl script (as the Perl script ignores the issue entirely!).
Quote:
filter.ksh93
Code:
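#!/usr/bin/ksh
# Usage: filter.ksh93 exclude-list data-file
typeset -A EXCLUDED
# Load each line of the exclude list as a key of the associative array.
while read -r excl; do
  EXCLUDED[$excl]=1
done < "$1"
IFS=","
# Split each data line on commas; print it unless field 4 (index 3) is excluded.
while read -r -A fields; do
  # :-0 keeps the arithmetic test valid when the key is not in EXCLUDED.
  if (( ${EXCLUDED[${fields[3]}]:-0} != 1 )); then
    print -r -- "${fields[*]}"
  fi
done < "$2"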
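For anyone who wants to try the same exclude-then-filter idea without ksh93 at hand, here is a minimal sketch in plain sh. It swaps awk in for the Korn shell associative array, so it is neither of the scripts from this thread. The function name filter_excluded, the file names and the sample records are all made up for illustration; field 4 is used only because the script above tests fields[3].

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): filter_excluded EXCLUDE_FILE DATA_FILE
# Prints the lines of DATA_FILE whose 4th comma-separated field does not
# appear as a whole line in EXCLUDE_FILE.
# Caveat: the NR == FNR idiom assumes EXCLUDE_FILE is not empty.
filter_excluded() {
  awk -F, '
    NR == FNR { excluded[$0] = 1; next }  # first file: remember each value
    !($4 in excluded)                     # second file: print unless field 4 is excluded
  ' "$1" "$2"
}

# Sample data, invented for this example.
tmpdir=$(mktemp -d)
printf '%s\n' 1002 1004 > "$tmpdir/exclude.txt"
printf '%s\n' 'a,b,c,1001,x' 'a,b,c,1002,x' 'a,b,c,1003,x' 'a,b,c,1004,x' > "$tmpdir/data.csv"

filter_excluded "$tmpdir/exclude.txt" "$tmpdir/data.csv"
# prints:
# a,b,c,1001,x
# a,b,c,1003,x

rm -rf "$tmpdir"
```

Like the ksh version, this loads the exclude list into memory once and then does one lookup per data line, so the cost per record stays flat as the exclude list grows.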

In any case, there is no comparison between the two languages when processing more than a few hundred lines of data. I wrote a Korn script to do some text processing for a client (similar to this task) that took 28+ minutes to process 300k records. The same task in Perl took a little over 2 minutes. That's roughly 10k records per minute for the shell script and 150k records per minute for the Perl script. I attribute the difference to the efficiencies of pseudo-compiling and the nature of the I/O between the two scripts (the Perl script was in "paragraph" mode, reading 10-20 lines at a time, while the shell script had to do one line at a time and maintain a FSM).
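The throughput figures can be checked with a little shell arithmetic. This sketch treats "28+ minutes" and "a little over 2 minutes" as exactly 28 and 2, which is an assumption, so the results are approximations; the variable names are mine.

```shell
#!/bin/sh
records=300000     # records processed in both runs
ksh_minutes=28     # "28+ minutes", rounded down (assumption)
perl_minutes=2     # "a little over 2 minutes", rounded down (assumption)

# Shell arithmetic is integer-only, so the results are truncated.
ksh_rate=$(( records / ksh_minutes ))
perl_rate=$(( records / perl_minutes ))
ratio=$(( ksh_minutes / perl_minutes ))

echo "ksh:  $ksh_rate records/min"    # ksh:  10714 records/min
echo "perl: $perl_rate records/min"   # perl: 150000 records/min
echo "perl is about ${ratio}x faster" # perl is about 14x faster
```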
 
