Split files with formatted numbers
Post 302908837 by jim mcnamara on Thursday 10th of July 2014 04:59:54 PM
I may have missed something, but generally I think it is better to use a command that already does what you want than to write a script. In this case csplit is a possible choice. Writing a script is educational, but for production work it is a better idea to use known good commands.

Code:
csplit  -f splitz -k  -n 3  csprap01.logscan 10000 {5}

Explanation: split csprap01.logscan into pieces of 10000 lines each, named splitz000, splitz001, and so on. For this 52092-line file that gives six files, splitz000..splitz005.

-f splitz: prefix for the numbered file names, giving splitz000 .. splitz999

-n: number of decimal digits in the number. -n 3 means use zero-filled, 3-digit numbers in the output file names.

10000 means start from where you are in the file (usually the beginning) and stop just before line 10000, so lines 1-9999 go into the first split and lines 10000-19999 into the second.

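{5} means repeat the previous operand five more times. {*} (Linux csplit) means keep on repeating until the input runs out. Be careful with {*}: -n 3 only provides the numbers 000..999, so if the input yields more pieces than that you may, depending on the csplit implementation, get an error or overwrite splitz000 (and others). Pick -n large enough for the number of pieces you expect.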

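The error message in the output below (line number out of range on repetition 5) means the last file came up short of lines: the sixth request for 10000 lines found fewer than 10000 left. With -k the files already written are kept when an error occurs, so no lines are lost in the splits.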
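
To avoid running out of suffix numbers with {*}, the digit count can be worked out from the input size before splitting. This is only a minimal sketch, assuming GNU csplit (needed for {*}). The names sample.log and piece and the 10-line chunk size are made-up stand-ins for the real log and prefix.

```shell
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

seq 1 52 > sample.log                 # hypothetical 52-line input

chunk=10
lines=$(wc -l < sample.log)

# Rough estimate of how many pieces the split will produce.
expected=$(( lines / chunk + 1 ))

# Use at least 3 digits, more if the estimate needs them.
digits=${#expected}
[ "$digits" -lt 3 ] && digits=3

# {*} (GNU extension) repeats the line-number operand until input runs out.
if csplit -s -k -f piece -n "$digits" sample.log "$chunk" '{*}'; then
    status=0
else
    status=$?
fi

pieces=$(ls piece* | wc -l)
echo "exit status: $status, pieces written: $pieces, digits used: $digits"
```

With a fixed count such as {5} the same digit check applies; only the repeat operand changes.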

Code:
csplit  -f splitz -k  -n 3  csprap01.logscan 10000 {5}
1293851
1305465
1306543
2458441
1785104
/usr/local/bin/csplit: `10000': line number out of range on repetition 5
258231
jmcnama>
jmcnama > ls -lrt splitz*
-rw-r--r--   1 jmcnama  other    1293851 Jul 10 14:39 splitz000
-rw-r--r--   1 jmcnama  other    1305465 Jul 10 14:39 splitz001
-rw-r--r--   1 jmcnama  other    1306543 Jul 10 14:39 splitz002
-rw-r--r--   1 jmcnama  other    2458441 Jul 10 14:39 splitz003
-rw-r--r--   1 jmcnama  other    1785104 Jul 10 14:39 splitz004
-rw-r--r--   1 jmcnama  other     258231 Jul 10 14:39 splitz005

Code:
 jmcnama > wc -l splitz*
    9999 splitz000
   10000 splitz001
   10000 splitz002
   10000 splitz003
   10000 splitz004
    2093 splitz005
   52092 total
jmcnama >  wc -l csprap01.logscan
   52092 csprap01.logscan
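
The same check can be repeated at small scale. The following is a sketch rather than the original session: it assumes GNU csplit, and sample.log and the piece prefix are invented names. It splits a 52-line file before every 10th line with more repetitions than the file can supply, then confirms that -k kept every line.

```shell
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

seq 1 52 > sample.log                 # 52 numbered lines stand in for the log

# Ask for one more split than the file can supply: csplit reports
# "line number out of range" and exits non-zero, but -k keeps the pieces.
# -s suppresses the per-file byte counts.
if csplit -s -k -f piece -n 3 sample.log 10 '{5}' 2>/dev/null; then
    status=0
else
    status=$?
fi

pieces=$(ls piece* | wc -l)           # number of files written
first=$(wc -l < piece000)             # the first piece stops before line 10
total=$(cat piece* | wc -l)           # lines across all pieces

echo "exit status: $status"
echo "pieces: $pieces, first piece: $first lines, total: $total lines"
```

If the total matches wc -l on the input, nothing was dropped even though csplit reported an error.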

 

CSPLIT(1)                 BSD General Commands Manual                CSPLIT(1)

NAME
     csplit -- split files based on context

SYNOPSIS
     csplit [-ks] [-f prefix] [-n number] file args ...

DESCRIPTION
     The csplit utility splits file into pieces using the patterns args. If file is a dash ('-'), csplit reads from standard input.

     Files are created with a prefix of ``xx'' and two decimal digits. The size of each file is written to standard output as it is created. If an error occurs whilst files are being created, or a HUP, INT, or TERM signal is received, all files previously written are removed.

     The options are as follows:

     -f prefix
             Create file names beginning with prefix, instead of ``xx''.

     -k      Do not remove previously created files if an error occurs or a HUP, INT, or TERM signal is received.

     -n number
             Create file names beginning with number of decimal digits after the prefix, instead of 2.

     -s      Do not write the size of each output file to standard output as it is created.

     The args operands may be a combination of the following patterns:

     /regexp/[[+|-]offset]
             Create a file containing the input from the current line to (but not including) the next line matching the given basic regular expression. An optional offset from the line that matched may be specified.

     %regexp%[[+|-]offset]
             Same as above but a file is not created for the output.

     line_no
             Create a file containing the input from the current line to (but not including) the specified line number.

     {num}   Repeat the previous pattern the specified number of times. If it follows a line number pattern, a new file will be created for each line_no lines, num times. The first line of the file is line number 1 for historic reasons.

     After all the patterns have been processed, the remaining input data (if there is any) will be written to a new file.

     Requesting to split at a line before the current line number or past the end of the file will result in an error.

     The csplit utility exits 0 on success, and >0 if an error occurs.

ENVIRONMENT
     The LANG, LC_ALL, LC_COLLATE, and LC_CTYPE environment variables affect the execution of csplit as described in environ(7).

EXAMPLES
     Split the mdoc(7) file foo.1 into one file for each section (up to 20):

           $ csplit -k foo.1 '%^.Sh%' '/^.Sh/' '{20}'

     Split standard input after the first 99 lines and every 100 lines thereafter:

           $ csplit -k - 100 '{19}'

SEE ALSO
     sed(1), split(1), re_format(7)

STANDARDS
     The csplit utility conforms to IEEE Std 1003.1-2004 (``POSIX.1'').

HISTORY
     A csplit command appeared in PWB UNIX.

BUGS
     Input lines are limited to LINE_MAX (2048) bytes in length.

BSD                            January 4, 2009                            BSD
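
The /regexp/ operand from the man page can be tried on a small file. This is a sketch only: notes.txt, the part prefix and the "== name ==" header format are invented for illustration. It uses a fixed repeat count rather than {*}, so it should not depend on GNU extensions.

```shell
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical input: a preamble line followed by three headed sections.
printf '%s\n' 'preamble' \
    '== alpha ==' 'a1' 'a2' \
    '== beta ==' 'b1' \
    '== gamma ==' 'g1' 'g2' > notes.txt

# /^== / ends the current piece just before the next header line;
# {2} repeats that operand twice more, and the remaining input
# (the last section) goes into one final piece.
if csplit -s -k -f part -n 2 notes.txt '/^== /' '{2}'; then
    status=0
else
    status=$?
fi

pieces=$(ls part* | wc -l)
second_header=$(head -n 1 part01)     # first line of the second piece
last_lines=$(wc -l < part03)          # lines in the final piece

echo "exit status: $status, pieces: $pieces"
echo "part01 starts with: $second_header"
```

Swapping /regexp/ for %regexp% skips the matched range instead of writing it to a file, as the man page describes.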