Rearrangement of data content problem


 
# 1  
Old 09-26-2010

Input data:
Code:
>sample_1
WETYUPVLGK
DGGHHHWETY
QPERTTGGLO

>sample_2
WRRTTOOLLP
MKMKNJUTYE
DLGLTTOC
.
.

Desired output:
Code:
>sample_1
WETYUP
VLGKDG
GHHHWE
TYQPER
TTGGLO

>sample_2
WRRTTO
OLLPMK
MKNJUT
YEDLGL
TTOC
.
.

Each sequence line in my input data has 10 symbols. I want to rewrap every record to 6 symbols per line, keeping the ">sample" header lines unchanged; the last line of a record may be shorter than 6.
Thanks for any advice and suggestions.
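
For reference, the transformation described above can be sketched in plain awk. This is only a minimal sketch, not one of the answers posted in this thread: the helper name rewrap and the awk function emit are made up for illustration, and it assumes that header lines start with ">" and that blank lines separate records.

```shell
#!/bin/sh
# rewrap WIDTH: read FASTA-like records on stdin, write them re-wrapped.
# "rewrap" and "emit" are hypothetical names, not taken from the thread.
rewrap() {
  awk -v w="$1" '
    # Print the buffered sequence in chunks of w characters, then clear it.
    function emit(   i) {
      for (i = 1; i <= length(seq); i += w)
        print substr(seq, i, w)
      seq = ""
    }
    /^>/  { emit(); print; next }   # header: write any pending sequence first
    /^$/  { emit(); print; next }   # blank separator line: keep it as is
          { seq = seq $0 }          # sequence line: append to the buffer
    END   { emit() }                # last record, including a short tail
  '
}

# Demo on the sample data from the question.
rewrap 6 <<'EOF'
>sample_1
WETYUPVLGK
DGGHHHWETY
QPERTTGGLO

>sample_2
WRRTTOOLLP
MKMKNJUTYE
DLGLTTOC
EOF
```

Buffering each record's sequence and cutting it with substr keeps the short final chunk (TTOC in sample_2) and does not rely on any awk extension. For a real file, call it as: rewrap 6 < file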
# 2  
Old 09-26-2010
Code:
$ ruby -00 -ne 'f=$_.split("\n");f[1..-1]=f[1..-1].join.scan(/.{1,6}/);f[-1]<<"\n\n";print f.join("\n")' file
>sample_1
WETYUP
VLGKDG
GHHHWE
TYQPER
TTGGLO

>sample_2
WRRTTO
OLLPMK
MKNJUT
YEDLGL
TTOC

# 3  
Old 09-26-2010
Code:
awk '/>/||/^$/ {print;next} {printf "%s", $0}' infile |awk  'BEGIN{FS=OFS=""}! />/ {for (i=1;i<=int(NF/6);i++) $(i*6)=$(i*6) RS}1'

# 4  
Old 09-26-2010
Another approach:
Code:
awk '{printf("%s%s\n",r,$1);r=ORS;$1=""} {
  for (i=1;i<=length;i+=6) {
    print substr($0,i,6)
  }
}' RS= OFS= file

# 5  
Old 09-26-2010
And a Perl one-liner -

Code:
$
$
$ cat f22
WETYUPVLGK
DGGHHHWETY
QPERTTGGLO
WRRTTOOLLP
MKMKNJUTYE
DLGLTTOCXX
$
$ perl -lne '$x.=$_; while(length($x)>=6){print substr($x,0,6); $x=substr($x,6)} END{print $x if length $x}' f22
WETYUP
VLGKDG
GHHHWE
TYQPER
TTGGLO
WRRTTO
OLLPMK
MKNJUT
YEDLGL
TTOCXX
$
$

tyler_durden
# 6  
Old 09-26-2010
Code:
tr -d '\n' <file | sed 's/.\{6\}/&\
/g'

# 7  
Old 09-26-2010
Code:
sed -n '/>/{p;n;N;N;s/\n//g;s/....../&\n/gp}' infile
