The UNIX and Linux Forums  

  #1 (permalink)  
05-22-2006
csplit not behaving

I have a large file in which the first two characters of each line determine the record type. Type 03 is a subheader, and each subheader is followed by multiple 04 records.

e.g.:
03,xxx,xxxx,xxxx
04,xxxxxxxxxxxxxxxxxxxxxxxxxxxx
04,xxxxxxxxxxxxxxxxxxxxxxxxxxxx
03,xxx,xxx,xxx
04,xxxxxxxxxxxxxxxxxxxxxxxxxxxx
04,xxxxxxxxxxxxxxxxxxxxxxxxxxxx

I am looking to get N files like
file n+1
03,xxxx,xxxx,xxxx
04,xxxxxxxxxxxxxxx

file n+2

03,xxxx,xxx,xx
04,xxxxxxxxxxxxx

I am using the script below, which according to the syntax in the csplit man page should work (this is on HP-UX, by the way):

#!/bin/ksh

set -x

#This gets the occurrences of the subheader I wish to split on
awk -F"," '$2 != prev && $1=="03" && NR !=1 { print NR; prev = $2 }' MyFile > data

#This then gets the data file and transposes the line numbers to 1 305 315 398 509 515

num=$(awk -F"," 'NR==1 { print NF }' data)
print $num

i=1
while (( $i <= $num ))
do
    newline=''
    for val in $(cut -d" " -f$i data)
    do
        newline=$newline$val" "
    done
    nline=`print ${newline%?}`
    print $nline >> tmpdata
    (( i = i + 1 ))
done
mv tmpdata data

# This then gets the rows we transposed and fires the below command
rows=`cat data`
csplit 'MyFile' ${rows}

# The resulting command looks like: csplit MyFile 305 315 398 509 515
# But csplit cuts the first file at line 152 instead of 305, and the subsequent splits are wrong as well.
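
One way to check where csplit actually cuts is to run the same steps on a tiny file and count the lines in each piece. The sketch below is only an illustration under stated assumptions: the six-line MyFile and its field values are made up, it works in a scratch directory created with mktemp, and it uses plain sh rather than ksh. With a POSIX csplit, a line-number argument N ends the current piece just before line N.

```shell
#!/bin/sh
# Work in a scratch directory so the xx* pieces do not clutter anything.
tmp=$(mktemp -d) && cd "$tmp" || exit 1

# Hypothetical six-line sample: subheaders on lines 1 and 4.
printf '%s\n' \
  '03,aaa,1,1' '04,data1' '04,data2' \
  '03,bbb,2,2' '04,data3' '04,data4' > MyFile

# Same awk filter as in the script above; it prints 4 for this sample.
rows=$(awk -F"," '$2 != prev && $1=="03" && NR != 1 { print NR; prev = $2 }' MyFile)
echo "split points: $rows"

# csplit copies everything up to, but not including, line 4 into xx00.
csplit -s MyFile $rows

# Count the lines in each piece to see where the cut landed.
wc -l xx00 xx01
head -n 1 xx01
```

If the first piece of the real file does not hold N-1 lines according to wc -l, then csplit and awk disagree about what counts as a line, and that is worth investigating before adjusting the numbers.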
  #2 (permalink)  
05-25-2006
I understand your problem is to split a file at every line starting with 03.

testfile:
03,xxx,xxxx,xxxx
04,xxxxxxxxxxxxxxxxxxxxxxxxxxxx
04,xxxxxxxxxxxxxxxxxxxxxxxxxxxx
03,xxx,xxx,xxx
04,xxxxxxxxxxxxxxxxxxxxxxxxxxxx
04,xxxxxxxxxxxxxxxxxxxxxxxxxxxx

I used csplit -z testfile /^03/ {*} with success:
-z prevents empty files
/^03/ splits at each line starting with 03
{*} repeats the pattern until EOF

This was with GNU csplit.
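
The command above can be tried end to end with a short sketch like the following. It assumes GNU csplit from coreutils, since -z and {*} are GNU extensions and may not exist on HP-UX. The part_ prefix and the sample record contents are made up for illustration.

```shell
#!/bin/sh
# Assumes GNU csplit (coreutils); -z and {*} are GNU extensions.
tmp=$(mktemp -d) && cd "$tmp" || exit 1

# Sample data in the same shape as the testfile above.
printf '%s\n' \
  '03,xxx,xxxx,xxxx' '04,aaaa' '04,bbbb' \
  '03,xxx,xxx,xxx' '04,cccc' '04,dddd' > testfile

# -s  suppress the per-piece byte counts
# -z  drop the empty piece that would precede the first 03 line
# -f  prefix for the piece names (hypothetical name "part_")
# The pattern and {*} are quoted so the shell passes them through untouched.
csplit -s -z -f part_ testfile '/^03/' '{*}'

# Show each piece; every one should start with an 03 subheader.
for f in part_*; do
  echo "== $f"
  cat "$f"
done
```

Quoting '{*}' matters in shells that would otherwise try to expand the braces or the asterisk.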
  #3 (permalink)  
05-25-2006
In the end I got this working:

#This gets the occurrences of the subheader I wish to split on
awk -F"," '$2 != prev && $1=="03" && NR !=1 { print (NR*2)-1; prev = $2 }' MyFile > data

#This then gets the data file and transposes the line numbers eg: 1 305 315 398 509 515

# HP-UX csplit seems to cut at just under half the requested line number, so the NR above is doubled (minus 1)
#num=$(awk -F"," 'NR==1 { print NF }' data)

num=$(awk -F"," 'NR==1 { print NF }' data)
print $num

i=1
while (( $i <= $num ))
do
    newline=''
    for val in $(cut -d" " -f$i data)
    do
        newline=$newline$val" "
    done
    nline=`print ${newline%?}`
    print $nline >> tmpdata
    (( i = i + 1 ))
done
mv tmpdata data

# This then gets the rows we transposed and fires the below command
rows=`cat data`
csplit 'MyFile' ${rows}

#Which looks like csplit MyFile 305 315 398 509 515
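
The transposing loop may not be needed at all. An unquoted command substitution is split on spaces, tabs and newlines under the default IFS, so the one-number-per-line file can be handed to csplit directly. The sketch below illustrates this under a few assumptions: the seven-line MyFile is made up, plain sh is used instead of ksh, and the (NR*2)-1 workaround is left out because it reflects behaviour observed on one HP-UX system.

```shell
#!/bin/sh
tmp=$(mktemp -d) && cd "$tmp" || exit 1

# Hypothetical sample: subheaders on lines 1, 3 and 6.
printf '%s\n' \
  '03,aaa,1,1' '04,data1' \
  '03,bbb,2,2' '04,data2' '04,data3' \
  '03,ccc,3,3' '04,data4' > MyFile

# One line number per output line, exactly as the awk filter prints them.
awk -F"," '$2 != prev && $1=="03" && NR != 1 { print NR; prev = $2 }' MyFile > data

# Unquoted $rows is split on whitespace, newlines included,
# so each number becomes its own csplit argument.
rows=$(cat data)
echo csplit MyFile $rows   # shows the command line that will run

csplit -s MyFile $rows
ls xx*
```

For this sample the echoed command is csplit MyFile 3 6, which gives three pieces, each starting with an 03 subheader.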