UNIX for Dummies Questions & Answers
#1
Split command
Hi, I have a sequence that looks like this:
Code:
# PH01000000
PH01000000G0240 P.he_genemodel_v1.0 CDS 120721 121773 . - . ID=PH01000000G0240.CDS;Parent=PH01000000G0240
PH01000001G0190 P.he_genemodel_v1.0 mRA 136867 137309 . - . ID=PH01000001G0190.mRNA;Parent=PH01000001G0190
.............................................
PH01278028G0010 P.he_genemodel_v1.0 CDS 27 501.. . - . ID=PH01278028G0010;Description="oereed"
PH01278104G0010 P.he_genemodel_v1.0 CDS 34 171 . - . ID=PH01278104G0010.CDS;Parent=PH01278104G0010
I want to split the first column into two columns, with the first 10 characters as column 1 and the remainder as column 2, and keep the remaining columns as they are:
Code:
PH01000000 G0240 P.he_genemodel_v1.0 CDS 120721 121773 . - . ID=PH01000000G0240.CDS;Parent=PH01000000G0240
PH01000001 G0190 P.he_genemodel_v1.0 mRA 136867 137309 . - . ID=PH01000001G0190.mRNA;Parent=PH01000001G0190
.............................................
PH01278028 G0010 P.he_genemodel_v1.0 CDS 27 501.. . - . ID=PH01278028G0010;Description="oereed"
PH01278104 G0010 P.he_genemodel_v1.0 CDS 34 171 . - . ID=PH01278104G0010.CDS;Parent=PH01278104G0010
I am doing this because I want to modify the first column and then merge the two pieces again. Is it possible to first split the first column into two and then, after my modification, merge them again? What command can I use to split and merge them?
Last edited by Scott; 03-13-2013 at 12:23 PM. Reason: Please use code tags, and a meaningful thread title
#2
One way would be
Code:
sed 's:.:& :10' file
to split, and
Code:
sed 's: ::' file
to merge again.
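A quick way to check the two commands is to run them back to back on one record from post #1. This is only a sketch: the variable names are made up for illustration, the record is shortened to its first few columns, and it assumes the line has no space before the one being inserted.

```shell
# One record from post #1, shortened to its first few columns.
line='PH01000000G0240 P.he_genemodel_v1.0 CDS 120721 121773'

# Split: the numeric flag 10 makes sed replace only the 10th match of ".",
# so a space is appended after the 10th character of the line.
split_line=$(printf '%s\n' "$line" | sed 's:.:& :10')

# Merge: delete the first space on the line, which is the one just inserted.
merged_line=$(printf '%s\n' "$split_line" | sed 's: ::')

printf '%s\n%s\n' "$split_line" "$merged_line"
```

The merge step removes the first space it finds, so a line that already has a space within its first 10 characters (such as the `# PH01000000` header line in the sample) would not round-trip cleanly.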
#3
What makes you think you need to split the 1st column before modifying it?
Why not just modify the 1st 10 characters on the line instead of splitting, modifying the 1st 10 characters on the line, and merging?
#4
My required result is:
string0G0240 P.he_genemodel_v1.0 CDS 120721 121773 . - . ID=PH01000000G0240.CDS;Parent=PH01000000G0240
string1G0190 P.he_genemodel_v1.0 mRA 136867 137309 . - . ID=PH01000001G0190.mRNA;Parent=PH01000001G0190
.............................................
string278028G0010 P.he_genemodel_v1.0 CDS 27 501.. . - . ID=PH01278028G0010;Description="oereed"
string278104G0010 P.he_genemodel_v1.0 CDS 34 171 . - . ID=PH01278104G0010.CDS;Parent=PH01278104G0010
For this to happen, I need to replace the PH01 prefix in the first column with string. But if I replace it directly, PH01278028G0010 becomes string278028G0010, as I require, while PH01000000G0240 becomes string000000G0240, which I want as string0G0240. So I thought I would split after the first 10 characters and do a selective replacement on the first column only. Is my approach too roundabout? Thanks, by the way, for your feedback!!
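The problem described here is easy to reproduce. A minimal sketch, assuming a plain prefix replacement with sed (the post does not say which command was actually tried) and using two first-column IDs from the sample data:

```shell
# Two first-column IDs from the sample data, one per line.
ids='PH01278028G0010
PH01000000G0240'

# Replace only the literal PH01 prefix at the start of each line.
naive=$(printf '%s\n' "$ids" | sed 's/^PH01/string/')

printf '%s\n' "$naive"
# string278028G0010
# string000000G0240   <- leading zeros survive; string0G0240 was wanted
```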
#5
siya, Your description of what you are trying to do is not at all clear. Looking at the "required result" in message #4 in this thread, I'm guessing that you want to replace PH01 immediately followed by up to four zeros with string. If that is what you want, the following awk script will do that for you:
Code:
awk 'match($1, /^PH010{0,4}/) {
    $1 = "string" substr($1, RLENGTH+1)
}
1' input
If you are using a Solaris/SunOS system, use /usr/xpg4/bin/awk or nawk instead of awk. If the file input contains the data specified in message #1 in this thread, the output is:
Code:
string0G0240 P.he_genemodel_v1.0 CDS 120721 121773 . - . ID=PH01000000G0240.CDS;Parent=PH01000000G0240
string1G0190 P.he_genemodel_v1.0 mRA 136867 137309 . - . ID=PH01000001G0190.mRNA;Parent=PH01000001G0190
.............................................
string278028G0010 P.he_genemodel_v1.0 CDS 27 501.. . - . ID=PH01278028G0010;Description="oereed"
string278104G0010 P.he_genemodel_v1.0 CDS 34 171 . - . ID=PH01278104G0010.CDS;Parent=PH01278104G0010
which matches what you specified in message #4 in this thread.
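Counting by hand, six digits follow PH01 in these IDs, so a bound of four would seem to leave two zeros in PH01000000G0240 rather than one. The sketch below compares a bound of four with a bound of five. It uses sed with a basic regular expression instead of awk (a substitution made here only because some older awk builds do not handle interval expressions), so treat it as a cross-check to run alongside your own awk rather than as the script from this post.

```shell
id='PH01000000G0240'

# PH01 followed by up to four zeros, the same bound as the awk pattern.
upto4=$(printf '%s\n' "$id" | sed 's/^PH010\{0,4\}/string/')

# PH01 followed by up to five zeros (an assumed alternative bound).
upto5=$(printf '%s\n' "$id" | sed 's/^PH010\{0,5\}/string/')

printf '%s\n%s\n' "$upto4" "$upto5"
# string00G0240
# string0G0240
```

An ID with no zeros directly after the prefix, such as PH01278028G0010, comes out as string278028G0010 under either bound.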
#6
Hi,
Sorry for the confusion!! I basically want to convert ONLY the first column of my entire sequence, from
PH01000000G0240 to string0G0240
PH01000001G0190 to string1G0190
PH01000002G0120 to string2G0120
....
PH01270000G0010 to string270000G0010
PH01278028G0014 to string278028G0014
PH012781040010 to string278104G0010
With respect to the code, why does it have {0,4} in the initial part? I didn't understand this part of the code: awk 'match($1, /^PH010{0,4}/)
Please advise. Thanks
Last edited by siya@; 03-13-2013 at 05:50 PM.
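On the `{0,4}` question: in an extended regular expression, `{m,n}` is an interval that applies to the item just before it (here the final `0`) and allows between m and n repetitions of it. So `/^PH010{0,4}/` reads as `PH01` at the start of the field followed by zero to four `0` characters. A small sketch that prints only the part of each ID the pattern would match, using sed's basic-regex spelling `\{0,4\}` as a stand-in for the awk pattern:

```shell
# Three first-column IDs from the examples in this post.
ids='PH01278028G0014
PH01270000G0010
PH01000002G0120'

# Keep only the matched prefix: PH01 plus as many zeros as the interval allows.
matched=$(printf '%s\n' "$ids" | sed 's/^\(PH010\{0,4\}\).*/\1/')

printf '%s\n' "$matched"
# PH01
# PH01
# PH010000
```

In the awk script, `RLENGTH` holds the length of that matched prefix, and `substr($1, RLENGTH+1)` keeps everything after it.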
#7
Quote:
But, it will not insert the G shown in red in your new example. That G did not appear at all in the 1st string whether or not we would break it into an initial 10 character field and a 2nd field with the remaining characters, or left it as a single field.
PLEASE explain in English what you are trying to do instead of giving a small set of inconsistent examples!