Large search replace using sed results in memory problem.
I have one big 9 GB file (big_file.txt). It contains sentences and paragraphs like any ordinary English document. I have another file, replace.sed, that holds the replacement commands for sed to use. Each entry sits on its own line and looks like this:
There are 6 million such replacement strings, one per line.
My objective is to find the multi-word strings in big_file.txt and replace the space between their words with an underscore character.
I am running the code using this command:
The code itself runs without any errors. I understand that a search and replace this large will be demanding in both time and space. What I cannot understand is why the memory usage of the above command keeps increasing over time until it takes up all of the primary memory available on the computer.
Then I used the split command to split the big file into smaller chunks of 500 MB each. Running the same sed one-liner on just one of these smaller chunks at a time also keeps eating up memory.
I even tried GNU parallel to speed things up, on both the large file and the smaller chunks:
The above command chokes the entire computer, resulting in disk thrashing. Any idea why the script takes up an ever-increasing amount of memory? I am using Bash on Slackware.
Running the processes in parallel won't help; it will only increase memory pressure, because every process allocates its own memory for the same operations.
Don't cat big_file.txt; have sed read the file directly, which might reduce the memory consumed by piping and buffering. Also, try splitting the replacement file into parts and running the passes in sequence, so that the output of the first part's pass is fed through each of the remaining parts in turn.
Like (untested)
Try this on smaller subsets of both data and script files.
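To make the suggestion concrete, here is a minimal sketch of the split-and-chain idea on toy data. The sample sentences, the `s/…/…/g` entries, the `part_` prefix, and the chunk size of 2 lines are all assumptions made up for illustration (the real replace.sed entries are not shown above), and it has not been tried at the 9 GB / 6 million line scale.

```shell
#!/bin/sh
# Hypothetical stand-ins for big_file.txt and replace.sed, created in a scratch dir.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' 'I like ice cream and hot dog stands.' 'New York is big.' > big_file.txt
printf '%s\n' 's/ice cream/ice_cream/g' 's/hot dog/hot_dog/g' 's/New York/New_York/g' > replace.sed

# Split the sed script into smaller pieces (2 lines each here; tune for real data).
split -l 2 replace.sed part_

# Run one pass per piece. sed reads the file directly (no cat), and the
# output of each pass becomes the input of the next one.
input=big_file.txt
for part in part_*; do
    sed -f "$part" "$input" > next.txt
    mv next.txt current.txt
    input=current.txt
done

result=$(cat current.txt)
printf '%s\n' "$result"
```

Each pass only has to hold one slice of the replacement script, at the cost of reading the data once per slice, so the chunk size is a trade-off between memory and total run time.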