Since you're specifying 3-digit file sequence numbers, I assume you expect to produce more than a hundred files with this script. There is a good chance that awk will run out of file descriptors if you keep all of them open. You might want to consider something like:
If there are existing split.xxx.txt files when you start this script, do you really want to append data to them, or do you want to remove any data that was there before and keep only what you find in the current input file?
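The script that followed is not preserved in this copy of the thread. As a minimal sketch of the idea only, assuming 2 lines per output file, a hypothetical input file named input.txt, and output names of the form split.NNN.txt:

```shell
# Work in a scratch directory so the demo does not touch real files.
cd "$(mktemp -d)" || exit 1

# Hypothetical sample input: 5 lines.
printf '%s\n' line1 line2 line3 line4 line5 > input.txt

awk '
# On the first line of each group of 2, close the previous output file
# and build the next 3-digit file name.
NR % 2 == 1 {
    if (out != "") close(out)
    out = sprintf("split.%03d.txt", ++n)
}
# Append the current line to the current output file.
{ print >> out }
' input.txt
```

Because each file is closed before the next one is opened, only one output descriptor is in use at any time, no matter how many files are produced.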
If you want to append to existing files, the script above should work.
If you want to replace data instead of appending data, change the >> (append) redirection on the print statement to > (truncate).
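In awk, >> appends to a file and > truncates it when the file is first opened, so the replacing variant would look roughly like the sketch below. It assumes 2 lines per output file and the hypothetical names input.txt and split.NNN.txt. A stale split.001.txt is created first to show that its old contents are discarded, while lines written to the same file within one awk run still accumulate because the file stays open until close() is called:

```shell
# Work in a scratch directory so the demo does not touch real files.
cd "$(mktemp -d)" || exit 1

# Stale leftover from an earlier (imaginary) run.
printf '%s\n' old1 old2 old3 > split.001.txt

# Hypothetical sample input: 3 lines.
printf '%s\n' a b c > input.txt

awk '
NR % 2 == 1 {
    if (out != "") close(out)
    out = sprintf("split.%03d.txt", ++n)
}
# ">" truncates the file when awk first opens it; later prints to the
# still-open file are written after the earlier ones.
{ print > out }
' input.txt

cat split.001.txt   # prints a and b; old1..old3 are gone
```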
As always, if you want to try this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk, /usr/xpg6/bin/awk, or nawk.
I get files of 2 to 10 million records, and I have to split them into files of 100,000 records each. Assuming that I mostly get 3 mil records, I have to split the file into 300 files. How many open file descriptors can awk handle?
Also, how do I get a header (n records) and a trailer with the file number or some content in it?
Simple: slightly modify the code I gave you to put 100000 lines per output file instead of 2 lines per output file. That code already closes each file when it is done with it, so it only keeps one output file open at a time.
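As a sketch of that change, the group size can be passed in as an awk variable. The names lines and input.txt are hypothetical, and 3 is used here instead of 100000 only to keep the demo small. ulimit -n reports the per-process limit on open file descriptors for the current shell, which is the bound that would matter if the files were left open:

```shell
# Work in a scratch directory so the demo does not touch real files.
cd "$(mktemp -d)" || exit 1

# Hypothetical sample input: the numbers 1 through 7, one per line.
awk 'BEGIN { for (i = 1; i <= 7; i++) print i }' > input.txt

# Show the per-process limit on open file descriptors.
ulimit -n

# Pass lines=100000 for the real job; 3 keeps this demo small.
awk -v lines=3 '
# Start a new output file on the first line of each group.
(NR - 1) % lines == 0 {
    if (out != "") close(out)
    out = sprintf("split.%03d.txt", ++n)
}
# Use ">>" here instead if existing files should be appended to.
{ print > out }
' input.txt

ls split.*.txt
```

With 7 input lines and lines=3 this leaves three files: two with 3 lines each and a last one holding the remaining line.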
You're going to have to give us a lot more than "get header (n records) and trailer with file number or some content in it" to guess at what you want to put as headers and trailers in your files. Show us sample input and show us sample output! How is your script supposed to identify which lines are headers, which lines are trailers, and what data you want added to or removed from those headers as you copy parts of the input file to your hundreds of output files?