File splitting, naming file according to internal field


 
# 1  
Old 09-15-2010

Hi All,

I have a rather strange set of requirements that I'm hoping someone here can help me with. We receive a file that is actually a concatenation of 4 files (I don't believe this will change, but ideally the solution would handle n files).

The super-file looks like:
Code:
FileHeader,Filename.csv,FileType,RowCount,Date,Time
ColumnHeaders....
Data.....
...
Data.....
FileHeader,Filename.csv,FileType2,RowCount2,Date2,Time2
ColumnHeaders2....
Data.....
...
Data.....
FileHeader,Filename.csv,FileType3,RowCount3,Date3,Time3
ColumnHeaders3....
Data.....
...
Data.....
FileHeader,Filename.csv,FileType4,RowCount4,Date4,Time4
ColumnHeaders4....
Data.....
...
Data.....

I would like to split that super-file into the 4 constituent files (a new file starts each time the constant "FileHeader" is seen at the start of a line), naming each file: Filename-FileType.csv. Row counts, dates and times can remain unchanged in their separate files.

Additionally, if possible, I would then like to update the Filename.csv in each sub-file to its newly allocated filename (Filename_FileType.csv).

Unfortunately my awking skills are extremely minimal. Can someone please help me with this?

Many many thanks in advance.

Lee

Last edited by Franklin52; 09-15-2010 at 02:24 PM.. Reason: Please use code tags
# 2  
Old 09-15-2010
Try the code below:

Code:
while read line; do

  if [ $line ~  /^FileHeader/ ]; then
    filename=`sed s/^FileHeader\(.*\).csv,\([^,]*\),.*/\1_\2.csv/`
  else
    echo $line >>$filename
  fi

done < superfilename


Last edited by Franklin52; 09-15-2010 at 02:25 PM.. Reason: Replaced quote tags with code tags and indented code
# 3  
Old 09-15-2010
Something like this?
Code:
awk '/^FileHeader/{fn=$2 "-" $3 ".csv"}{print > fn}' file

# 4  
Old 09-15-2010
Thanks guys for the very quick responses. Unfortunately, I can't get either solution to work.

116: I get the error below
Code:
 
split.sh[5]: /home/dlee: 0403-012 A test command parameter is not valid

I've tried to put it into a .sh script, so I hope that's not having any adverse effect:

Code:
 
#!/usr/bin/ksh
 
while read line; do
 
  if [ $line ~  /^FileHeader/ ]; then
    filename=`sed s/^FileHeader\(.*\).csv,\([^,]*\),.*/\1_\2.csv/`
  else
    echo $line >>$filename
  fi
 
done < $1

Franklin: I don't get any output from yours.

To clarify, here's a sample file:

<<ICON_NIL_Trans.csv>>
Code:
FileHeader,ICON_NIL_Trans.csv,Trades,2,20100818,09:50:00,,
PortfolioCode,SourceSystem,AssetCode
1,2,3
FileHeader,ICON_NIL_Trans.csv,Cash,2,20100818,09:50:00,,
PortfolioCode,SourceSystem,AccountCode
4,5,6
FileHeader,ICON_NIL_Trans.csv,Fx,2,20100818,09:50:00,,
PortfolioCode,SourceSystem,BuyAccSecCode
7,8,9
FileHeader,ICON_NIL_Trans.csv,Inc,2,20100818,09:50:00,,
PortfolioCode,SourceSystem,AssetCode
0,1,2

and I'd want 4 outputs:

<<ICON_NIL_Trans_Trades.csv>>
Code:
PortfolioCode,SourceSystem,AssetCode
1,2,3

<<ICON_NIL_Trans_Cash.csv>>
Code:
PortfolioCode,SourceSystem,AccountCode
4,5,6

<<ICON_NIL_Trans_Fx.csv>>
Code:
PortfolioCode,SourceSystem,BuyAccSecCode
7,8,9

<<ICON_NIL_Trans_Inc.csv>>
Code:
PortfolioCode,SourceSystem,AssetCode
0,1,2

Many thanks again,

Lee
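
A note on the error reported in this post: `[` (the `test` command) has no `~` operator, and an unquoted `~` is expanded by the shell to the home directory, which is most likely why `/home/dlee` shows up in the message. One portable way to check for the prefix is `case` pattern matching. The following is only a sketch; the function name `is_header` is made up for illustration, and the two sample lines come from the file shown above.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the line starts with "FileHeader,".
is_header() {
  case $1 in
    FileHeader,*) return 0 ;;
    *)            return 1 ;;
  esac
}

for line in 'FileHeader,ICON_NIL_Trans.csv,Trades,2,20100818,09:50:00,,' \
            'PortfolioCode,SourceSystem,AssetCode'; do
  if is_header "$line"; then
    echo "header: $line"
  else
    echo "data:   $line"
  fi
done
```

`case` uses shell glob patterns rather than regular expressions, so no anchoring or escaping is needed for a fixed prefix like this one.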
# 5  
Old 09-15-2010
Sorry, I presumed you were using bash. I guess it is ksh. In that case, try this:

Code:
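#!/usr/bin/ksh

# Split the super-file given as $1; each FileHeader line starts a new output file.
while IFS= read -r line; do

  if echo "$line" | grep '^FileHeader,' >/dev/null; then
    # Derive Filename_FileType.csv from the header line itself
    filename=`echo "$line" | sed 's/^FileHeader,\(.*\)\.csv,\([^,]*\),.*/\1_\2.csv/'`
  else
    # Appends, so remove old output files before re-running
    echo "$line" >>"$filename"
  fi

done < "$1"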



---------- Post updated at 01:32 PM ---------- Previous update was at 01:25 PM ----------

I guess Franklin's code might work with a small addition:

Code:
awk 'BEGIN{FS=","} /^FileHeader/{fn=$2 "-" $3 ".csv"}{print > fn}' file
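
To see why the field separator matters here: with awk's default whitespace splitting, the whole header line is `$1`, so `$2` and `$3` are empty and the output name collapses to `-.csv`. That may be why nothing appeared to be produced earlier. A small sketch that prints the computed names instead of writing files (the variable names are illustrative; the header line is the sample from post #4):

```shell
header='FileHeader,ICON_NIL_Trans.csv,Trades,2,20100818,09:50:00,,'

# Default FS (whitespace): the whole line is $1, so $2 and $3 are empty.
default_fn=$(printf '%s\n' "$header" | awk '/^FileHeader/{print $2 "-" $3 ".csv"}')

# FS set to a comma: $2 is the file name, $3 the file type.
comma_fn=$(printf '%s\n' "$header" | awk -F, '/^FileHeader/{print $2 "-" $3 ".csv"}')

echo "default FS: $default_fn"
echo "comma FS:   $comma_fn"
```

With the comma separator the name still carries the original `.csv` in the middle, so the extension has to be stripped from `$2` before the file type is appended.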


Last edited by Scott; 09-15-2010 at 03:41 PM.. Reason: Extra code tag
# 6  
Old 09-15-2010
Try this:
Code:
awk -F, '/^FileHeader/{s=$2;sub(/\.csv$/,"",s);fn=s "-" $3 ".csv"}{print > fn}' OFS="," file
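
If the output should match the sample in post #4 exactly (underscore in the name, no FileHeader line in the sub-files), a variation along these lines may help. It is only a sketch: the scratch directory from `mktemp -d` (assumed to be available on your system) and the shortened sample data are there to make it self-contained, and closing each file before starting the next is an addition so that a large number of sections does not run into the limit on open files.

```shell
# Work in a scratch directory so the sketch does not touch real files.
cd "$(mktemp -d)" || exit 1

cat > ICON_NIL_Trans.csv <<'EOF'
FileHeader,ICON_NIL_Trans.csv,Trades,2,20100818,09:50:00,,
PortfolioCode,SourceSystem,AssetCode
1,2,3
FileHeader,ICON_NIL_Trans.csv,Cash,2,20100818,09:50:00,,
PortfolioCode,SourceSystem,AccountCode
4,5,6
EOF

awk -F, '
  /^FileHeader,/ {
    if (fn != "") close(fn)        # finished with the previous section
    base = $2
    sub(/\.csv$/, "", base)        # ICON_NIL_Trans.csv -> ICON_NIL_Trans
    fn = base "_" $3 ".csv"        # e.g. ICON_NIL_Trans_Trades.csv
    next                           # do not copy the FileHeader line itself
  }
  fn != "" { print > fn }          # column headers and data rows
' ICON_NIL_Trans.csv

ls
```

Dropping the `next` keeps the FileHeader line as the first line of each sub-file, which is what the one-liner above does.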

# 7  
Old 09-15-2010
Code:
awk -F, '/^File/{f=$3"-"$2}{print>f}' in.file



---------- Post updated at 03:54 PM ---------- Previous update was at 03:49 PM ----------

or
Code:
awk -F, '/^File/{_=".";split($2,a,_);f=a[1]_ $3_ a[2]}{print>f}'  in.file
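
The second request in the first post, rewriting the Filename.csv field inside each header to the new name, is not covered by the answers above. A sketch, assuming the FileHeader line should be kept in each sub-file and that the name is always the second comma-separated field (the input name `super.csv` and the `mktemp -d` scratch directory are only for illustration):

```shell
# Work in a scratch directory so the sketch does not touch real files.
cd "$(mktemp -d)" || exit 1

cat > super.csv <<'EOF'
FileHeader,ICON_NIL_Trans.csv,Trades,2,20100818,09:50:00,,
PortfolioCode,SourceSystem,AssetCode
1,2,3
FileHeader,ICON_NIL_Trans.csv,Fx,2,20100818,09:50:00,,
PortfolioCode,SourceSystem,BuyAccSecCode
7,8,9
EOF

awk 'BEGIN { FS = OFS = "," }
  /^FileHeader,/ {
    if (fn != "") close(fn)  # finished with the previous section
    base = $2
    sub(/\.csv$/, "", base)  # strip the extension from the original name
    fn = base "_" $3 ".csv"  # new name, e.g. ICON_NIL_Trans_Trades.csv
    $2 = fn                  # assigning a field rebuilds $0 using OFS
  }
  fn != "" { print > fn }    # header (now updated) and all following rows
' super.csv

# Show the rewritten header of the first sub-file.
head -n 1 ICON_NIL_Trans_Trades.csv
```

Setting `OFS` to a comma matters in this version: when `$2` is assigned, awk rejoins the fields with `OFS`, so the default single space would turn the header into a space-separated line.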
