Maximum size of sed file...
The sed -f option (reading sed commands from a file) seems to have a limit of 200 commands per file. I can't see anything in the man pages about this restriction.
I have a file with several thousand sed commands (substitutions) that I need to apply, and while I can split the file into chunks of 200 fairly easily, I wonder if there is a way to raise the sed file limit? It seems a very small number for such a function.
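A quick way to see whether a particular sed build has this kind of ceiling is to generate a throwaway script with a known number of commands and feed it to sed -f. The sketch below is only an illustration: the count of 1000, the key/val tokens and the temp file names are all made up, and the result depends on which sed is installed.

```shell
#!/bin/sh
# Probe: does the local sed accept a script file with n commands?
# n, the key/val tokens and the file names are hypothetical test values.
n=1000
tmp=$(mktemp -d)

# Generate n substitutions: s/key1;/val1;/g ... s/key1000;/val1000;/g
# The trailing ";" stops key1 from also matching the start of key1000.
awk -v n="$n" 'BEGIN { for (i = 1; i <= n; i++) printf "s/key%d;/val%d;/g\n", i, i }' > "$tmp/big.sed"

printf 'key1; key500; key1000;\n' > "$tmp/input.txt"

if sed -f "$tmp/big.sed" "$tmp/input.txt" > "$tmp/output.txt"; then
    echo "sed accepted $n commands"
else
    echo "sed rejected the script at $n commands"
fi
cat "$tmp/output.txt"
```

Changing n to 199 and 200 should show whether the cutoff described above applies to the sed being tested.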
__________________
Pete
Wow, several thousand sed commands in one file!? I was not aware of a limitation, but I will take a look and see if I find anything. If I can ask, what type of file are you doing this to? And did you use sed to create the sed file? That is some extreme sed programming you've got going on.
Yes - sed was used in creating the sed file.....
The purpose for this is to apply an encryption algorithm against a field within each record of a large file.
1. I first extract all of the values in a certain field from each record.
2. I then apply an encryption algorithm to each value in this new file, which creates another file.
3. I then 'paste' the two files together, which leaves me with 2 columns, tab delimited: the first column is raw data, the second column is encrypted data.
4. I use a series of sed commands to change each line to 's/old/new/g'.
5. I then use the output of this as a sed file, which I split into a number of smaller sed files (due to this limitation), and apply the substitutions in each file against the original file - effectively substituting all the raw data in each record for its encrypted equivalent.
6. I then do each of these steps (1-5) against a number of different files.
(There are a couple of spurious steps, such as removing header records, when creating the sed files - but they're not that relevant.)
The original files are flat (text) files, and when the big sed file with all the substitutions is created there are 1,000s of rows. The limitation seems to actually be 199 commands in a sed file (as 200 stops with the too many arguments error).
Extreme programming - I assure you it's far from it. I know there are other ways to do this, i.e. 1 record at a time, but that seems fairly inefficient.
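Steps 3 to 5 can be sketched on made-up data as follows. The file names and values are hypothetical, awk stands in for the sed formatting commands (which are not shown in the thread), and the sketch assumes the values contain no characters that are special to sed, such as /, &, . or *; real data with those characters would need escaping first.

```shell
#!/bin/sh
# Sketch of steps 3-5 on hypothetical sample data.
tmp=$(mktemp -d)
printf '1001\n1002\n' > "$tmp/raw.txt"                 # step 1 output: raw field values
printf 'qZx7\nLm3P\n' > "$tmp/crypt.txt"               # step 2 output: stand-in encrypted values
printf 'a,b,1001,c\nd,e,1002,f\n' > "$tmp/data.txt"    # the original records

# Step 3: paste the two columns together, tab-delimited.
paste "$tmp/raw.txt" "$tmp/crypt.txt" > "$tmp/pairs.txt"

# Step 4: turn each "old<TAB>new" line into s/old/new/g.
# Assumes neither column contains sed metacharacters or "/".
awk -F'\t' '{ printf "s/%s/%s/g\n", $1, $2 }' "$tmp/pairs.txt" > "$tmp/subst.sed"

# Step 5: apply the generated commands to the original file.
sed -f "$tmp/subst.sed" "$tmp/data.txt" > "$tmp/result.txt"

cat "$tmp/subst.sed" "$tmp/result.txt"
```

Note that s/old/new/g matches old anywhere on the line, so a raw value that happens to be a substring of another value or field would also be rewritten.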
__________________
Pete
Last edited by peter.herlihy; 03-19-2002 at 03:58 PM.
That's AWESOME..........
I suppose you are talking about splitting up the sed file and then calling the sed commands in a shell script, running each one against the previously changed file. Sounds like the quick fix. I don't have access to check now, but I'll post if I find anything.
Yeah - bang on.... I take the large sed file (1,000s of rows) and 'split' it into chunks of 199 rows. Then for each of these split files I use sed -f against the original file (renaming the output back to its original name after each sed file is applied).
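A minimal sketch of that split-and-apply loop, using generated test data. The file names (all.sed, data.txt, the chunk. prefix) and the 450 made-up substitutions are assumptions for illustration only.

```shell
#!/bin/sh
# Split a large sed script into chunks of 199 commands and apply each in turn.
tmp=$(mktemp -d)
cd "$tmp" || exit 1

# Hypothetical input: 450 substitutions, i.e. more than two chunks' worth.
awk 'BEGIN { for (i = 1; i <= 450; i++) printf "s/id%d;/enc%d;/g\n", i, i }' > all.sed
printf 'id1; id200; id450;\n' > data.txt

# Produces chunk.aa, chunk.ab, ... with at most 199 lines each.
split -l 199 all.sed chunk.

# Apply each chunk to the output of the previous one, renaming the
# result back to the original file name after every pass.
for f in chunk.*; do
    sed -f "$f" data.txt > data.tmp && mv data.tmp data.txt
done

cat data.txt
```

Each pass rereads the whole data file, so the cost grows with the number of chunks as well as the size of the file.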
__________________
Pete
Here's the script that I have used. The my_crypt.pl script performs the encryption of a single-column file only and creates a new single-column file (that cannot be changed).
All original input files are *.ABC (I've changed the names of the files for putting it up here for obvious reasons). There are header rows - so I remove them, as they are the only rows that have .ABC or .abc in them. Some fields are blank (a few only) - so I remove these from the substitutions to avoid the error when replacing nulls. Sedformat has the commands to format the large file into s/old/new/g.
    for y in *.ABC; do awk -F, '{print $3 }' $y | sort -u > $y.raw; done &&
    for z in *.raw; do grep -v "\.ABC" $z > temp; mv temp $z; done &&
    for z in *.raw; do grep -v "\.abc" $z > temp; mv temp $z; done &&
    for z in *.raw; do grep -v " " $z > temp; mv temp $z; done &&
    for x in *.raw; do my_crypt.pl e $x; paste $x crypt.out > $x.sedfile; done 2>/dev/null &&
    for w in *.sedfile; do sed -f sedformat $w > temp; mv temp $w; done &&
    for v in *.sedfile; do split -199 $v $v; done &&
    rm *sedfile &&
    for u in *.ABC; do for t in $u.raw.sedfile*; do sed -f $t $u > temp; mv temp $u; done; done &&
    rm *.raw* crypt.out
I know it's not the greatest code in the world - it does work however, which is fairly important! Any suggestions for enhancements would be great.
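One possible enhancement, offered only as a sketch: swap the generated sed files for a single awk pass with a lookup table, which avoids the per-file command limit altogether. This is a different technique from the script above, not a tidied version of it. It assumes that only field 3 needs replacing (the field the raw values were extracted from) and that a tab-delimited raw/encrypted pairs file already exists; the file names and sample values are hypothetical.

```shell
#!/bin/sh
# Single-pass alternative: look up field 3 in a raw -> encrypted table.
tmp=$(mktemp -d)
printf '1001\tqZx7\n1002\tLm3P\n' > "$tmp/pairs.txt"     # hypothetical paste output
printf 'NAME.ABC,header,row\na,b,1001,c\nd,e,,f\ng,h,1002,i\n' > "$tmp/data.ABC"

awk '
    BEGIN { FS = OFS = "," }
    # First file: load the tab-delimited pairs into the lookup table.
    NR == FNR { split($0, p, "\t"); map[p[1]] = p[2]; next }
    # Second file: replace field 3 only when it has an entry in the table,
    # so header rows and blank fields pass through unchanged.
    ($3 in map) { $3 = map[$3] }
    { print }
' "$tmp/pairs.txt" "$tmp/data.ABC" > "$tmp/data.out"

cat "$tmp/data.out"
```

Because the match is on the whole of field 3 rather than a substring anywhere on the line, the header and blank-value filtering steps would no longer be needed for the substitution itself.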
__________________
Pete