The UNIX and Linux Forums  

  #1 (permalink)  
Old 07-03-2002
Registered User
 

Join Date: Jul 2002
Posts: 1
String substitutions in ASCII files

We need to scramble data in a number of ASCII files. Some of these files are extremely large (1.2 GB). By scrambling, I mean that we need to substitute certain strings, which number around 400, with scrambled strings. An example is given below:

If "London" occurs in the file, then it needs to be substituted by "X1"

If "Frankfurt" occurs in the file, then it needs to be substituted by "X2".

We have written a Korn shell script, but there are huge performance problems as we need to check for 400 different strings. What is the best way of doing this?

The machine is HP-UX B.11.00 E 9000/800.

The solution suggested by Perderabo works like LIGHTNING.

Thanks a lot for the help.


Last edited by SanjivNagraj; 07-04-2002 at 03:52 AM.
  #2 (permalink)  
Old 07-03-2002
Perderabo's Avatar
Unix Daemon
 

Join Date: Aug 2001
Location: Washington DC Area
Posts: 8,240
The exact best approach would depend on the details of your particular system. It always amazes me when folks ask questions without revealing what version of unix, what computer, etc. Well, I'll give this a shot anyway.

The fastest way to do anything is to write a carefully designed assembly language program that will fully exploit the features available on your system. Following close behind would be writing the program in C.

As far as scripts go, the fastest way to perform the two transformations that you mentioned is this:
Code:
#! /usr/bin/sed -f
s/London/X1/g
s/Frankfurt/X2/g
You might call it "scramble" and run it like this:
./scramble < inputfile > outputfile
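
With around 400 strings, the s commands do not have to be typed by hand. Here is a minimal sketch that generates them, assuming the pairs are kept in a hypothetical two-column file map.txt (original string, then replacement, separated by whitespace). The file names map.txt and scramble.sed are made up for illustration, and the sketch assumes neither column contains whitespace, a slash, or characters special to sed:

```shell
#!/bin/sh
# Hypothetical mapping file: one "original replacement" pair per line.
cat > map.txt <<'EOF'
London X1
Frankfurt X2
EOF

# Turn each pair into a sed substitute command, e.g. s/London/X1/g
awk '{ printf "s/%s/%s/g\n", $1, $2 }' map.txt > scramble.sed

# Small sample input standing in for the real data file.
printf 'London to Frankfurt\n' > inputfile

# Apply every generated substitution in one pass over the input.
sed -f scramble.sed < inputfile > outputfile
cat outputfile
```

Keeping the pairs in one table also means the same file can later be used to check or reverse the scrambling.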

But you want to do 400 substitutions. sed will have some limit on the number of commands that it can handle. It is not likely that you can get all 400 in one script. You can probably get 100, but the exact limit depends on your version of unix. You could have 4 of these, like this:
./scramble1 < input | ./scramble2 | ./scramble3 | ./scramble4 > output
If your computer has at least 4 CPUs this might still be unbeatable by any other scripted solution.
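
Splitting the command list and wiring up the pipeline can itself be scripted. This is only a sketch: the file name all.sed, the part. prefix, and the CHUNK value are assumptions for illustration, and the real per-script limit has to be found by testing on your own system. It uses six commands and a chunk size of 2 so that it is easy to follow:

```shell
#!/bin/sh
# Hypothetical file of sed commands, one per line (6 here, about 400 in practice).
cat > all.sed <<'EOF'
s/London/X1/g
s/Frankfurt/X2/g
s/Paris/X3/g
s/Madrid/X4/g
s/Rome/X5/g
s/Oslo/X6/g
EOF

# CHUNK is an assumed per-script command limit; 2 keeps the demo small.
CHUNK=2
rm -f part.*
split -l "$CHUNK" all.sed part.    # creates part.aa, part.ab, part.ac

# Build "sed -f part.aa | sed -f part.ab | ..." from the pieces.
pipeline=""
for f in part.*; do
    pipeline="${pipeline:+$pipeline | }sed -f $f"
done

# Small sample input standing in for the real data file.
printf 'London Paris Oslo\n' > input

# The first sed reads the input file, the last one writes the output file.
eval "$pipeline" < input > output
cat output
```

Each sed in the pipeline runs as its own process, which is what lets several CPUs work on the data at once.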

The latest version of ksh, ksh93, has much of sed built-in. A carefully written ksh93 script that relies only on built-ins could probably beat the pipeline of sed scripts. But most folks only have ksh88 available.

Try the sed solution and see where that leaves you.

Last edited by Perderabo; 07-03-2002 at 06:22 AM.
  #3 (permalink)  
Old 07-03-2002
Registered User
 

Join Date: Nov 2001
Location: New Zealand
Posts: 333
I'm using Sun OS 5.6, and for me the limit for sed script file usage is 199. Not 200 but 199 substitutions. I had a similar exercise once, replacing a certain field with its encrypted value, but I had around 10,000 substitutions to complete.

I'm not sure of the limitations on the -e flag, i.e. I have no idea how many -e's you can have. This may be high (although I doubt it would be).

If you know Perl, you could do something similar with a single pass over the file, although it takes somewhat more effort to set up.
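
For illustration, here is a rough single-pass sketch of that idea. It uses awk rather than Perl so that it stays with the shell tools already in this thread; it is not the Perl solution referred to above. It loads every pair from a hypothetical map.txt into a lookup table, builds one alternation pattern from the keys, and rewrites each line once. The file names are made up, and the keys are assumed to be plain words with no regular expression metacharacters:

```shell
#!/bin/sh
# Hypothetical mapping file: one "original replacement" pair per line.
cat > map.txt <<'EOF'
London X1
Frankfurt X2
EOF

# Small sample input standing in for the real data file.
printf 'London to Frankfurt, then London\n' > inputfile

awk '
# First file (map.txt): fill the lookup table and build "London|Frankfurt".
NR == FNR {
    map[$1] = $2
    sep = (re == "") ? "" : "|"
    re = re sep $1
    next
}
# Second file: find each match in turn and swap in its replacement.
{
    out = ""
    rest = $0
    while (re != "" && match(rest, re)) {
        out = out substr(rest, 1, RSTART - 1) map[substr(rest, RSTART, RLENGTH)]
        rest = substr(rest, RSTART + RLENGTH)
    }
    print out rest
}' map.txt inputfile > outputfile
cat outputfile
```

Because the table holds all the pairs, there is no per-script command limit to work around, but whether this beats the sed pipeline on a given machine would need to be measured.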
__________________
Pete
All times are GMT -7.


Powered by: vBulletin, Copyright ©2000 - 2006, Jelsoft Enterprises Limited.
The UNIX and Linux Forums Content Copyright ©1993-2008 The CEP Blog All Rights Reserved
