Post 302405827 by phil8258 on Saturday 20th of March 2010 07:59:10 AM
Script to split text files

Hi All,
I'm fairly new to scripting, so need a little help to get started with this problem.
I don't mind whether I go for an awk/bash/other approach, I don't really know which would be best suited to the problem...

Let's say I have a 10000-line text file and I would like to split it into a few smaller files. Something like:
10 lines, say the last 10 lines
100 lines, say the first 100 lines
1000 lines, say the last 1000 lines
5000 lines, say the middle 5000 lines

This I could probably manage with head & tail etc.
However, if my text file was only 1000 lines long, it would not work so well. I'd get the 10 and 100 lines OK, but the 3rd would give me the whole file I already have, and the 4th would not give me what I want, since it asks for more lines than exist. What I would actually want is more like:
1 line
10 lines
100 lines
500 lines

Similarly, for a text file much larger than 10000 lines, I'd want it to scale the same way in the other direction, e.g. a 100k-line file would give 100, 1000, 10000 and 50000 lines.

The number of lines does not need to be exact either. I would not mind doing the splits based on a percentage of the lines in the original file. Nor would I mind if lines in the original file were selected at random.
Basically, I just want a set of small, medium, large and larger files of whatever size, but proportional to the original. The files would not need to be unique either: line 1 in the small file and then lines 1-10 in the medium file is fine, though if it's easier I would not mind lines 2-11 in the second file.
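
To make the idea concrete, here is a rough sketch of the percentage approach using only wc, head and tail. The function name split_proportional, the output suffixes (.small, .medium, .large, .larger) and the fixed ratios of 0.1%, 1%, 10% and 50% are all my own assumptions, picked so that a 10000-line file gives 10/100/1000/5000 lines and a 1000-line file gives 1/10/100/500. It is one possible starting point, not a tested solution.

```shell
#!/bin/sh
# Hypothetical helper (name and output suffixes are assumptions):
#   split_proportional FILE
# Writes FILE.small, FILE.medium, FILE.large and FILE.larger holding roughly
# 0.1%, 1%, 10% and 50% of FILE's lines, with a minimum of 1 line each.
split_proportional() {
    file=$1

    # Count lines; the arithmetic expansion strips any padding wc may add.
    total=$(wc -l < "$file")
    total=$((total))

    # Integer division rounds down, so clamp each size to at least 1.
    small=$((total / 1000));  [ "$small" -ge 1 ]  || small=1
    medium=$((total / 100));  [ "$medium" -ge 1 ] || medium=1
    large=$((total / 10));    [ "$large" -ge 1 ]  || large=1
    larger=$((total / 2));    [ "$larger" -ge 1 ] || larger=1

    tail -n "$small"  "$file" > "$file.small"    # last 0.1% of the lines
    head -n "$medium" "$file" > "$file.medium"   # first 1% of the lines
    tail -n "$large"  "$file" > "$file.large"    # last 10% of the lines

    # Middle 50%: start after the first quarter, then take half the file.
    start=$(( (total - larger) / 2 + 1 ))
    tail -n +"$start" "$file" | head -n "$larger" > "$file.larger"
}
```

Calling split_proportional on a file named data.txt would leave data.txt.small, data.txt.medium, data.txt.large and data.txt.larger next to it. Because the sizes are derived from the line count, the same call scales down for short files and up for long ones.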

I hope I've not over-complicated this explanation...
Would somebody please give me a steer on where to start? What should I use for this: awk? Should I try to use percentages, or try to work out absolute numbers that work in every situation?

Many thanks!

Phil.
 

Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.