csplit not behaving


 
# 1  
Old 05-22-2006

I have a large file in which the first two characters of each line determine the record type. Type 03 is a subheader, and each one is followed by multiple 04 records.

e.g.:
03,xxx,xxxx,xxxx
04,xxxxxxxxxxxxxxxxxxxxxxxxxxxx
04,xxxxxxxxxxxxxxxxxxxxxxxxxxxx
03,xxx,xxx,xxx
04,xxxxxxxxxxxxxxxxxxxxxxxxxxxx
04,xxxxxxxxxxxxxxxxxxxxxxxxxxxx

I am looking to get N files like
file n+1
03,xxxx,xxxx,xxxx
04,xxxxxxxxxxxxxxx

file n+2

03,xxxx,xxx,xx
04,xxxxxxxxxxxxx

I am using the script below, which according to the syntax in the csplit man page should work (this is on HP-UX, by the way):

#!/bin/ksh

set -x

#This gets the occurrences of the subheader I wish to split on
awk -F"," '$2 != prev && $1=="03" && NR !=1 { print NR; prev = $2 }' MyFile > data

#This then gets the data file and transposes the line numbers to 1 305 315 398 509 515

num=$(awk -F"," 'NR==1 { print NF }' data)
print $num

i=1
while (( $i <= $num ))
do
newline=''
for val in $(cut -d" " -f$i data)
do
newline=$newline$val" "
done
nline=`print ${newline%?}`
print $nline >> tmpdata
(( i = i + 1 ))
done
mv tmpdata data

# This then gets the rows we transposed and fires the below command
rows=`cat data`
csplit 'MyFile' ${rows}

# This expands to: csplit MyFile 305 315 398 509 515
# But csplit seems to split the first file at line 152 rather than 305, and the subsequent splits are wrong too.
# 2  
Old 05-25-2006
I understand your problem is to split a file at every line starting with 03.

testfile:
03,xxx,xxxx,xxxx
04,xxxxxxxxxxxxxxxxxxxxxxxxxxxx
04,xxxxxxxxxxxxxxxxxxxxxxxxxxxx
03,xxx,xxx,xxx
04,xxxxxxxxxxxxxxxxxxxxxxxxxxxx
04,xxxxxxxxxxxxxxxxxxxxxxxxxxxx

I used csplit -z testfile /^03/ {*} with success.
-z prevents empty output files
/^03/ splits at each line starting with 03
{*} repeats the pattern until EOF

This was with GNU csplit.
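
A minimal sketch of the command above, assuming GNU csplit: -z and {*} are GNU extensions, and POSIX csplit only specifies a numeric repeat such as {99}, so this may not run unchanged on HP-UX. The sample data, the temporary directory and the part_ prefix are made up for illustration. Quoting the pattern and the repeat count keeps the shell from expanding them.

```shell
#!/bin/sh
# Sketch only: split a sample file at every line starting with "03".
# Assumes GNU csplit (for -z and {*}) and a mktemp that supports -d.
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1

# Hypothetical sample data modelled on the example in the thread.
cat > testfile <<'EOF'
03,aaa,aaaa,aaaa
04,xxxxxxxx
04,xxxxxxxx
03,bbb,bbb,bbb
04,yyyyyyyy
04,yyyyyyyy
EOF

# -s  suppress the byte counts csplit normally prints
# -z  drop the empty piece that precedes the first 03 line
# -f  output prefix ("part_" is an arbitrary name chosen for this sketch)
csplit -s -z -f part_ testfile '/^03/' '{*}'

# List the pieces that were written.
ls part_*
```

This should leave two pieces, each beginning with its 03 subheader and followed by its 04 records.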
# 3  
Old 05-25-2006
In the end I got this working:

#This gets the occurrences of the subheader I wish to split on
awk -F"," '$2 != prev && $1=="03" && NR !=1 { print (NR*2)-1; prev = $2 }' MyFile > data

#This then gets the data file and transposes the line numbers eg: 1 305 315 398 509 515

# HP-UX csplit seems to split at just under half the requested line number,
# so the awk command above prints (NR*2)-1 instead of NR.

num=$(awk -F"," 'NR==1 { print NF }' data)
print $num

i=1
while (( $i <= $num ))
do
newline=''
for val in $(cut -d" " -f$i data)
do
newline=$newline$val" "
done
nline=`print ${newline%?}`
print $nline >> tmpdata
(( i = i + 1 ))
done
mv tmpdata data

# This then gets the rows we transposed and fires the below command
rows=`cat data`
csplit 'MyFile' ${rows}

#Which looks like csplit MyFile 305 315 398 509 515
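
The transposition loop could probably be replaced by a single pipeline. The following is a sketch, assuming a POSIX paste is available. The sample MyFile and the temporary directory are invented for illustration, and the awk filter is the un-doubled one from the first post. The line-count comparison is only a guess at how one might start investigating the halving seen on HP-UX; the thread does not establish its cause.

```shell
#!/bin/sh
# Sketch only: build the csplit line-number arguments without the cut/while loop.
# Assumes a mktemp that supports -d; the data below is a made-up stand-in for MyFile.
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1

cat > MyFile <<'EOF'
03,A,xxxx
04,xxxx
04,xxxx
03,B,xxxx
04,xxxx
03,C,xxxx
04,xxxx
EOF

# Print the line number of each 03 record whose second field differs from the
# previously printed one, skipping line 1; paste -s joins the numbers with spaces.
rows=$(awk -F, '$1 == "03" && $2 != prev && NR != 1 { print NR; prev = $2 }' MyFile |
       paste -s -d ' ' -)
echo "split points: $rows"

# Possible sanity check: do awk and wc agree on how many lines the file has?
echo "awk lines: $(awk 'END { print NR }' MyFile), wc lines: $(wc -l < MyFile)"

# $rows is deliberately unquoted so each number becomes its own argument.
csplit -s MyFile $rows
```

With this sample the pieces are written to the default xx00, xx01, ... names. If the two line counts disagree on the real file, that would be a first place to look before doubling the numbers.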