Splitting based on occurrence of a Character at fixed position | Unix Linux Forums | Shell Programming and Scripting

#1  07-21-2013  Neelkanth (Registered User)

I have a requirement to split a file based on the occurrence of a character at a fixed position. The details are as follows:
1. The file will contain more than 1 lakh (100,000) records.
2. Each line has a fixed length of 987 characters.
3. Position 28 of each line contains either 'C' or 'D'.
4. I need to split the file after every line that contains 'D'.
5. The names of the split files should share a common prefix, something like <Original File Name>_aa, <Original File Name>_ab, <Original File Name>_ac and so on.
Please find below an example of the file:


Code:
666617000338    INR        C           1800.0
655517000338    INR        C           1000.0
644417000338    INR        C           1800.0
655517000338    INR        C           1500.0
666617000338    INR        C           1200.0
699917000338    INR        C           1100.0
688817000338    INR        C           1500.0
644417000338    INR        D          10000.0
655517000338    INR        C           1800.0
677717000338    INR        C           1800.0
699917000338    INR        C           1800.0
622217000338    INR        D           3600.0

So the split files should look like this:
First file:

Code:
666617000338    INR        C           1800.0
655517000338    INR        C           1000.0
644417000338    INR        C           1800.0
655517000338    INR        C           1500.0
666617000338    INR        C           1200.0
699917000338    INR        C           1100.0
688817000338    INR        C           1500.0
644417000338    INR        D          10000.0

and the second file should look like this:

Code:
655517000338   INR         C            1800.0
677717000338   INR         C            1800.0
699917000338   INR         C            1800.0
622217000338   INR         D            3600.0

and so on.
Moderator's Comments:
I note that none of the input shown matches the description of the input files. (None of the input shown comes anywhere close to being a fixed length of 987 characters.) Not using CODE tags exacerbated the problem because, without them, HTML processing coalesces adjacent space characters.

Please use CODE tags when showing code, input, and output samples.

Last edited by Don Cragun; 07-21-2013 at 01:48 PM. Reason: CODE tags

#2  07-21-2013  bartus11 (Moderator)
Will the "C" or "D" character always be in the third column of the file?

#3  07-21-2013  Neelkanth (Registered User)
No, the column is not fixed; only the character position is fixed.
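
To illustrate the difference between a column (whitespace-separated field) and a fixed character position, here is a small sketch. The two sample records are made up for illustration: both carry the flag at character position 28, but the second one has a blank currency field, so the flag is no longer the third field.

```shell
# Build two made-up records whose flag sits at character position 28,
# but whose number of whitespace-separated fields before the flag differs.
tmp=$(mktemp -d)
{
  printf '%-27s%s%12s\n' '666617000338    INR' 'C' '1800.0'
  printf '%-27s%s%12s\n' '644417000338' 'D' '10000.0'   # currency field left blank
} > "$tmp/sample.txt"

# Field-based: $3 is the flag only when both earlier fields are present.
by_field=$(awk '{ print $3 }' "$tmp/sample.txt")

# Position-based: substr() reads character 28 regardless of blank fields.
by_pos=$(awk '{ print substr($0, 28, 1) }' "$tmp/sample.txt")

printf 'by field:\n%s\nby position:\n%s\n' "$by_field" "$by_pos"
```

With this sample, the field-based lookup returns the amount instead of the flag for the second record, while the position-based lookup returns C and D.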

#4  07-21-2013  bartus11 (Moderator)
So why is the "C"/"D" at position 18 in your example and not 28?

#5  07-21-2013  cfajohnson (Forum Advisor)

Code:
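position=18   # 1-based position of the flag in the sample as first displayed; the stated layout puts it at 28
char=D
file=$1       # assumed: input file passed as the first script argument

awk -v p="$position" -v c="$char" '
# Output files are named txt1, txt2, ... in the current directory.
BEGIN { basefile = "txt"; filename = basefile "" ++x }
{ print > filename }
# A record with c at position p ends the chunk: close it and start the next.
(substr($0, p, 1) == c) { close(filename); filename = basefile "" ++x }
' "$file"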
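
The snippet above numbers its output files (txt1, txt2, ...). As a rough sketch of one way to get the `_aa`, `_ab`, ... suffixes requested in post #1, the variant below derives a two-letter suffix from a chunk counter. It assumes the flag really is at position 28, uses a made-up sample and a hypothetical file name `input.txt`, and only covers up to 676 chunks (aa to zz); the `suffix` helper is an illustration, not part of the original answer.

```shell
tmp=$(mktemp -d)
file="$tmp/input.txt"     # hypothetical input file name

# Made-up sample: 3 records, then 2 records, each chunk ending with a D row.
for flag in C C D C D; do
  printf '%-27s%s%12s\n' '666617000338    INR' "$flag" '1800.0'
done > "$file"

awk -v p=28 -v c=D -v base="$file" '
# Map a 0-based chunk index to a two-letter suffix: 0 -> aa, 1 -> ab, ... 675 -> zz.
function suffix(i,    letters) {
    letters = "abcdefghijklmnopqrstuvwxyz"
    return substr(letters, int(i / 26) + 1, 1) substr(letters, i % 26 + 1, 1)
}
BEGIN { out = base "_" suffix(count++) }
{ print > out }                      # every record goes to the current chunk
substr($0, p, 1) == c {              # a flagged record closes the chunk
    close(out)
    out = base "_" suffix(count++)
}
' "$file"

ls "$tmp"
```

Closing each chunk before opening the next keeps the number of simultaneously open files at one, which matters when a large input produces many chunks.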

#6  07-21-2013  Neelkanth (Registered User)
Hi Bartus,

The position shows up as 18 because the multiple spaces after 338 are collapsed when posting on the forum.

#7  07-21-2013  bartus11 (Moderator)
Use code tags to keep the original spacing.
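
Since the displayed sample and the described layout disagree, it may help to check the real file before splitting. The sketch below counts how often each combination of line length and character at the assumed flag position occurs. The sample data and the file name `input.txt` are made up; on the real file one would expect a length of 987 and only C or D at position 28.

```shell
tmp=$(mktemp -d)
file="$tmp/input.txt"   # hypothetical input file name

# Made-up sample with the flag at position 28 (real records would be 987 long).
for flag in C C D C D; do
  printf '%-27s%s%12s\n' '666617000338    INR' "$flag" '1800.0'
done > "$file"

# Count how often each (line length, flag character) pair occurs.
report=$(awk -v p=28 '
{ seen[length($0) " " substr($0, p, 1)]++ }
END { for (k in seen) print k, seen[k] }
' "$file" | sort)

printf '%s\n' "$report"
```

Any unexpected length or any character other than C or D in the report suggests the position or the file layout is not what was assumed.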