How to keep each matching line intact in grep output?
I have about 80 files (some as large as 30 GB, with more than 1 billion lines!) to grep for a pattern, with all matches redirected to a single file. I have a 96-core machine, so I sent each grep job to the background to speed up the search:
It seems to me the problem comes from writing the output, as 80 grep jobs for 80 files are all writing to the same output file. By default grep prints whole matching lines, so I assumed each line would be written intact, but that was not the case.
What is wrong here?
Last edited by yifangt; 4 Weeks Ago at 02:20 PM..
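Buffering will make a mess of this: each grep collects its output into arbitrary blocks and writes each block in one go. Those blocks don't care where lines begin and end, so when several processes write to the same file, a block from one can land in the middle of another's line. Long enough lines could conceivably take more than one write!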
If you have GNU grep, --line-buffered may help, but will have a big performance cost.
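A rough sketch of the line-buffered approach, assuming GNU grep and a local filesystem. The file names, the sample data and the pattern 'match' are made up for the demo. Note the append redirection (>>): with a plain > each background job keeps its own file offset and they overwrite one another. Even with both in place, a very long line might still be split across writes, so treat this as "usually intact" rather than guaranteed.

```shell
# Hypothetical demo data: four small files stand in for the real inputs.
workdir=$(mktemp -d)
out="$workdir/matches.txt"
: > "$out"                      # start with an empty output file
for i in 1 2 3 4; do
    seq 1 2000 | sed "s/^/file$i match /" > "$workdir/in$i.txt"
done

# One background grep per input, all appending to the same file.
# --line-buffered makes grep flush after every output line instead of
# after every block, so each write normally carries one whole line.
for f in "$workdir"/in*.txt; do
    grep --line-buffered 'match' "$f" >> "$out" &
done
wait                            # block until every background grep is done
```

Lines from different inputs will still be interleaved with each other in the output; only the individual lines stay whole.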
You could also send the output to separate files and cat them together later.
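Something along these lines would do the separate-files approach. It is only a sketch: the input names, the sample data and the pattern 'foo' are placeholders, not the original poster's actual command.

```shell
# Hypothetical demo data: three small files stand in for the ~80 large inputs.
workdir=$(mktemp -d)
for i in 1 2 3; do
    printf 'foo line %s\nbar line %s\nfoo again %s\n' "$i" "$i" "$i" \
        > "$workdir/input$i.txt"
done

# One background grep per input, each writing to its own output file,
# so no two processes ever share a file and default buffering is harmless.
for f in "$workdir"/input*.txt; do
    grep 'foo' "$f" > "$f.out" &
done
wait    # block until every background grep has finished

# Merge the per-file results one after another.
cat "$workdir"/input*.txt.out > "$workdir/all_matches.txt"
```

This keeps grep's normal block buffering, so it should not pay the --line-buffered penalty, and the merged file has the matches grouped by input file in glob order.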