Splitting files
Post 302665921 by drl on Tuesday 3rd of July 2012 11:01:24 AM
Hi.

Here is a shell script that calls a perl script to split a file. The resulting parts do not all contain the same number of lines, but they seem reasonably balanced. It works on small datasets like the sample; whether it would work on a very large dataset is unknown (for example, it may be limited by memory). This is a quickly written solution. If you are curious, look over the documentation and code, and/or contact the author of the perl module Text::Parts:
Code:
#!/usr/bin/env bash

# @(#) s1	Demonstrate perl Text::Parts for splitting file.

pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }
db() { ( printf " db, ";for _i;do printf "%s" "$_i";done;printf "\n" ) >&2 ; }
db() { : ; }
edges() { head -n 3 "$1";pe "---";tail -n 3 "$1"; pe ; }
C=$HOME/bin/context && [ -f $C ] && $C

FILE=${1-data1}
pl " Edges of $(wc -l <"$FILE") lines in input data file $FILE:"
# head -5 "$FILE" ; pe "---" ; tail -5 "$FILE"
edges "$FILE"

pl " Sample perl script to split file:"
cat p1

# Remove debris from previous runs, try various PARTS (the last assignment wins).
rm -f file*.txt
PARTS=6
PARTS=3
PARTS=10
pl " Splitting $FILE into $PARTS parts:"
./p1 "$FILE" "$PARTS"

pl " Files created:"
wc -l file*.txt

pl " Sample result, edges of part 1:"
edges file1.txt
pl " Sample result, edges of part $PARTS:"
edges file$PARTS.txt

exit 0

producing:
Code:
% ./s1

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 2.6.26-2-amd64, x86_64
Distribution        : Debian GNU/Linux 5.0.8 (lenny) 
bash GNU bash 3.2.39

-----
 Edges of 80 lines in input data file data1:
1
2
3
---
78
79
80


-----
 Sample perl script to split file:
#!/usr/bin/env perl

# @(#) p1	Split file into n parts, Text::Parts.
# See:
# http://search.cpan.org/~ktat/Text-Parts-0.15/lib/Text/Parts.pm

use strict;
use warnings;
use Text::Parts;

my $file  = shift || die " Need a filename to split.\n";
my $parts = shift || "4";

my $splitter = Text::Parts->new( file => $file );

$splitter->write_files(
  'file%d.txt',
  num  => $parts,
  code => \&do_after_split
);

sub do_after_split {
  my ( $filename, $f );
  $filename = shift;    # e.g. 'path/to/name1.txt'
  open( $f, ">>", $filename ) || die "Cannot open $filename for append.\n";
  print $f "\n";
  close $f;
}

exit(0);

-----
 Splitting data1 into 10 parts:

-----
 Files created:
 11 file1.txt
  7 file10.txt
  8 file2.txt
  8 file3.txt
  8 file4.txt
  8 file5.txt
  8 file6.txt
  7 file7.txt
  8 file8.txt
  7 file9.txt
 80 total

-----
 Sample result, edges of part 1:
1
2
3
---
9
10
11


-----
 Sample result, edges of part 10:
74
75
76
---
78
79
80
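
If parts with nearly equal line counts are wanted, a different approach from the Text::Parts script above can be sketched with awk alone. This is only a minimal sketch, not what the module does: the function name split_even, its arguments, and the output naming (prefix followed by the part number and .txt) are assumptions made up for illustration. It counts the lines first, then gives the first few parts one extra line each when the total does not divide evenly.

```shell
#!/usr/bin/env bash

# Sketch only: split a file into N parts whose line counts differ by at most one.
# split_even, its arguments, and the output names are hypothetical.
split_even() {
  local file=$1 parts=$2 prefix=${3:-part}
  local total
  total=$(wc -l < "$file")    # lines in the input; assumes a trailing newline

  awk -v n="$parts" -v total="$total" -v prefix="$prefix" '
    BEGIN {
      base  = int(total / n)  # every part gets at least this many lines
      extra = total % n       # the first "extra" parts get one more line
      part  = 1
      limit = base + (extra > 0 ? 1 : 0)
    }
    {
      print > (prefix part ".txt")
      # Move on to the next part once the current one is full.
      if (++count == limit && part < n) {
        close(prefix part ".txt")
        part++
        count = 0
        limit = base + (part <= extra ? 1 : 0)
      }
    }
  ' "$file"
}
```

With the 80-line sample, split_even data1 10 part would write part1.txt through part10.txt with 8 lines each, and split_even data1 3 part would write parts of 27, 27, and 26 lines. Because awk reads the input one line at a time, memory use should not grow with the size of the file, although the input is read twice (once by wc, once by awk).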

Best wishes ... cheers, drl
 
