Script to break up file (write new files) in bash


 
# 1  
Old 04-22-2014

Hello experts, I need help splitting a data matrix into individual files, with a new file started every time there is a blank line:

From this
Code:
cat file.txt
col1   col2    col3
6661   7771    8881
6661   7771    8881
6661   7771    8881

col1   col2    col3
3451   2221    1221
3451   2221    1221
3451   2221    1221

col1   col2    col3
AAA1   BBB1    CCC1
AAA1   BBB1    CCC1
AAA1   BBB1    CCC1

to this:
Code:
file_new1.txt
col1   col2    col3
6661   7771    8881
6661   7771    8881
6661   7771    8881

file_new2.txt
col1   col2    col3
3451   2221    1221
3451   2221    1221
3451   2221    1221

file_new3.txt
col1   col2    col3
AAA1   BBB1    CCC1
AAA1   BBB1    CCC1
AAA1   BBB1    CCC1

Each file may split into 10-20 individual files, and each "segment" would be around 100-1000 lines; otherwise I could break them up manually. I can't seem to figure out a strategy to do it automatically.

Many thanks for the help.
# 2  
Old 04-22-2014
Try
Code:
awk 'BEGIN{FN=1} NF==0 {FN++;next} {print $0 > ("file_new" FN ".txt")}' file
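
Spelled out, the idea is to keep a counter that is bumped on every blank line and to redirect every other line to the file named after the current counter. A rough sketch of that idea as a shell function follows. The function name split_on_blank and the prefix argument are made up for illustration, and the close() call is an addition that is not in the post above: it is there on the assumption that some awk implementations allow only a small number of open files, which could matter for inputs with many blocks.

```shell
# Hypothetical helper: split_on_blank INPUT PREFIX
# Writes PREFIX1.txt, PREFIX2.txt, ... with one blank-line-separated block each.
split_on_blank() {
    awk -v prefix="$2" '
        BEGIN { n = 1 }          # number of the block being written
        NF == 0 {                # empty (or whitespace-only) line: block ends
            close(out)           # release the finished file (assumed precaution)
            n++
            next
        }
        {
            out = prefix n ".txt"
            print > out          # created on first use, then appended to while open
        }
    ' "$1"
}
```

It would be called as, for example, split_on_blank file.txt file_new. As with the one-liner, two blank lines in a row skip a number.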

These 2 Users Gave Thanks to RudiC For This Post:
# 3  
Old 04-22-2014
You could do something very descriptive like this:-
Code:
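#!/bin/ksh
# Copy each blank-line-separated block of input_file to file_new<N>.txt
count=1
while IFS= read -r line   # IFS= and -r keep whitespace and backslashes intact
do
   if [ -n "$line" ]
   then
      printf '%s\n' "$line" >> "file_new$count.txt"
   else
      # blank line: move on to the next output file
      ((count += 1))
   fi
done < input_file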

... which is messy and might have a logic error in it, and it is certainly expensive, as it opens an output file for appending once for every single line.
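
One way to cut down on the reopening, as a sketch only: open each output file once on a spare file descriptor and keep writing to that descriptor until the block ends. The function name split_blocks, its two arguments, and the use of descriptor 3 are assumptions made for this illustration. It also behaves slightly differently from the loop above: a run of several blank lines counts as one separator, and existing output files are overwritten rather than appended to.

```shell
# Hypothetical helper: split_blocks INPUT PREFIX
# Writes PREFIX1.txt, PREFIX2.txt, ... opening each file only once.
split_blocks() {
    input=$1
    prefix=$2
    count=0
    is_open=0
    # "|| [ -n "$line" ]" also picks up a last line that has no trailing newline
    while IFS= read -r line || [ -n "$line" ]; do
        if [ -n "$line" ]; then
            if [ "$is_open" -eq 0 ]; then
                # first line of a new block: open the next file on fd 3
                count=$((count + 1))
                exec 3> "${prefix}${count}.txt"
                is_open=1
            fi
            printf '%s\n' "$line" >&3
        elif [ "$is_open" -eq 1 ]; then
            # blank line after a block: close the current file
            exec 3>&-
            is_open=0
        fi
    done < "$input"
    if [ "$is_open" -eq 1 ]; then
        exec 3>&-
    fi
}
```

Usage would be along the lines of split_blocks input_file file_new.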


A better alternative may be to look at the csplit command. We do a similar thing with this:-
Code:
csplit -f output_prefix -n 5 -s input_file "/^$/" {*}

It will create output files based on your prefix, e.g. -f robin will produce files robin00000, robin00001, robin00002, etc. The -n 5 sets the number of digits in the counter.
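
One thing to be aware of: csplit starts each new piece at the matching line, so every file after the first begins with the blank separator line. Below is a sketch of a variant that drops the separators. It assumes GNU csplit from a reasonably recent coreutils: --suppress-matched and the {*} repeat count are, as far as I know, GNU features rather than POSIX ones, and -z removes any empty pieces. The function name split_csplit is made up for this example.

```shell
# Hypothetical helper: split_csplit INPUT PREFIX
# Produces PREFIX000, PREFIX001, ... without the blank separator lines.
split_csplit() {
    # -s: do not print the size of each piece
    # -z: elide empty output files
    # --suppress-matched: leave the matching (blank) lines out of the output
    # -n 3: three-digit counter
    csplit -s -z --suppress-matched -f "$2" -n 3 "$1" '/^$/' '{*}'
}
```

For example, split_csplit input_file robin would give robin000, robin001 and so on.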


I hope that this helps,
Robin
Liverpool/Blackburn
UK
This User Gave Thanks to rbatte1 For This Post:
# 4  
Old 04-22-2014
Code:
awk '{print $0 > ("file_new" NR ".txt")}' RS= file.txt
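
This relies on awk's paragraph mode: with RS set to the empty string, each group of lines separated by one or more blank lines is read as a single record, so NR counts blocks rather than lines. A sketch of the same technique wrapped in a function follows. The name split_paragraphs and the prefix argument are illustrative assumptions, and the close() after each block is an added precaution for awk versions with a low open-file limit.

```shell
# Hypothetical helper: split_paragraphs INPUT PREFIX
# Writes PREFIX1.txt, PREFIX2.txt, ... one per blank-line-separated block.
split_paragraphs() {
    awk -v prefix="$2" -v RS= '
        {
            out = prefix NR ".txt"   # NR is the block number in paragraph mode
            print > out              # $0 holds the whole block, inner newlines included
            close(out)               # each block is complete after a single print
        }
    ' "$1"
}
```

Called as split_paragraphs file.txt file_new, it numbers the files consecutively even when blocks are separated by more than one blank line.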

These 2 Users Gave Thanks to SriniShoo For This Post:
# 5  
Old 04-22-2014
Wow thanks so much, both awk options and the csplit command worked. I really appreciate the help.
# 6  
Old 04-23-2014
I'm sure we're all glad to have helped. It would be interesting to know which one performs better, for anyone else who finds this thread with a similar query.

Can you run a few tests and tell us what you think? If awk is faster for larger files, then I will be happy to learn too.


Thanks again,
Robin
# 7  
Old 04-23-2014
My apologies, here are some numbers. The file I'm using has 252 "pieces" and 49,134 lines in total.
I like the csplit option; it's clean and gives options:
Code:
time csplit -f test_ -n3 -s segments.test.txt "/^$/" {*}
real    0m2.112s
user    0m0.106s
sys     0m0.033s

awk command 1 (the redirect ">" must not be quoted, otherwise awk prints it as a literal string):
Code:
time awk 'BEGIN{FN=1} NF==0 {FN++;next} {print $0 > "file_new_TEST"FN}' segments.test.txt
real    0m0.362s
user    0m0.054s
sys     0m0.034s

awk command 2:
Code:
time awk -F "\t" '{print $0 > "file_new" NR ".txt"}' RS= segments.test.txt
real    0m0.322s
user    0m0.012s
sys     0m0.035s

Awk seems to perform better. I'm quite the computing newb, so it would be interesting to understand why this might be.

Thanks for all the help.
 