Change a character based on its position number


 
# 1  
Old 10-08-2012

Hi, I have a text file in which I want to change some characters based on their position. The file contains multiple lines, and characters should be counted continuously from one line to the next. For example, I want to convert the 150th T to C. What can I do? Here is a portion of my file:
Code:
GATCACAGGTCTATCACCCTATTAACCACTCACGGGAGCTCTCCATGCAT
TTGGTATTTTCGTCTGGGGGGTGTGCACGCGATAGCATTGCGAGACGCTG
GAGCCGGAGCACCCTATGTCGCAGTATCTGTCTTTGATTCCTGCCTCATT
CTATTATTTATCGCACCTACGTTCAATATTACAGGCGAACATACCTACTA
AAGTGTGTTAATTAATTAATGCTTGTAGGACATAATAATAACAATTGAAT
GTCTGCACAGCCGCTTTCCACACAGACATCATAACAAAAAATTTCCACCA
AACCCCCCCCTCCCCCCGCTTCTGGCCACAGCACTTAAACACATCTCTGC
CAAACCCCAAAAACAAAGAACCCTAACACCAGCCTAACCAGATTTCAAAT
TTTATCTTTAGGCGGTATGCACTTTTAACAGTCACCCCCCAACTAACACA
TTATTTTCCCCTCCCACTCCCATACTACTAATCTCATCAATACAACCCCC
GCCCATCCTACCCAGCACACACACACCGCTGCTAACCCCATACCCCGAAC
CAACCAAACCCCAAAGACACCCCCCACAGTTTATGTAGCTTACCTCCTCA
AAGCAATACACTGAAAATGTTTAGACGGGCTCACATCACCCCATAAACAA
ATAGGTTTGGTCCTAGCCTTTCTATTAGCTCTTAGTAAGATTACACATGC
AAGCATCCCCGTTCCAGTGAGTTCACCCTCTAAATCACCACGATCAAAAG
GGACAAGCATCAAGCACGCAGCAATGCAGCTCAAAACGCTTAGCCTAGCC
ACACCCCCACGGGAAACAGCAGTGATTAACCTTTAGCAATAAACGAAAGT
TTAACTAAGCTATACTAACCCCAGGGTTGGTCAATTTCGTGCCAGCCACC
GCGGTCACACGATTAACCCAAGTCAATAGAAGCCGGCGTAAAGAGTGTTT


Last edited by Scott; 10-08-2012 at 02:17 PM.. Reason: Code tags, please...
# 2  
Old 10-08-2012
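Try this (it relies on an empty field separator splitting each line into single characters, which gawk does but POSIX leaves unspecified):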

Code:
awk -F "" -v OFS="" '{if(max<150 && (max+NF)>=150){i=150-max; if($i=="T"){$i="C"}} max+=NF}1' file

This User Gave Thanks to pamu For This Post:
# 3  
Old 10-08-2012
Code:
awk -v p=150 -v l="C" '{o=$0; for (i=1; i<=length($0); i++) if (++c==p) o=substr($0,1,i-1) l substr($0,i+1); print o}' infile

This User Gave Thanks to rdrtx1 For This Post:
# 4  
Old 10-08-2012
Here is a much longer alternative way to do this:
Code:
#!/bin/ksh
# Usage: tester [count [from [to]]]
#       Change the "count"th occurrence of the character specified by "from"
#       to the character specified by "to".  If not given on the command line,
#               "count" defaults to 150,
#               "from" defaults to "T", and
#               "to" defaults to "C".
awk -v cnt=${1:-150} -v from=${2:-T} -v to=${3:-C} 'BEGIN{tmpc = "\a"}
cnt>0 {
        # See if changing every "from" character on this line will go too far.
        if((n = gsub(from, from)) < cnt) {
                # No.  Reduce cnt by the number of "from" characters found and
                # print the unchanged line.
                cnt -= n
        } else {
                # The "from" character we need to change is on this line.
                # Change cnt - 1 "from" characters to "tmpc" characters.
                for(i = 1; i < cnt; i++) sub(from, tmpc)
                # Change the desired "from" character to the "to" character.
                sub(from, to)
                # Change the "tmpc" characters inserted above back to "from"
                # characters.
                gsub(tmpc, from)
                # Note that we are done looking.
                cnt = 0
        }
        # Fall through to next action to print the processed line.
}
1
END {   if(cnt) {
                printf("Still looking for %d %s characters when EOF found.\n",
                        cnt, from)
                exit 1
        }
        exit 0
}' in

This User Gave Thanks to Don Cragun For This Post:
# 5  
Old 10-09-2012
Hi Guys,
Thanks for the code you sent. However, the scripts don't work properly when I change the number of the letter. I've attached the file, which may make it a little easier for you to work on. Thank you again for putting time and effort into my problem.
# 6  
Old 10-09-2012
Which letter do you want to change?
Which letter do you want to replace it with?
Which occurrence of that letter do you want to change?

You said my script doesn't work properly. What does that mean? Did it produce a diagnostic message? Was some other character in the file changed? What system are you using?

When I try to change the 150th occurrence of T to C in the file you attached using the script I provided, the T in column 20 of line 13 does change from T to C just as you said you wanted.
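
One way to check where a given occurrence falls before changing anything is to print its line and column. This is only a minimal sketch: the function name find_nth and the variable names r and p are made up for illustration, and it counts exact (case-sensitive) matches of a single character.

```shell
# find_nth CHAR COUNT [FILE...]
# Hypothetical helper: report the line and column of the COUNT-th
# occurrence of CHAR, counting continuously across lines.
# Reads standard input when no FILE is given.
find_nth() {
    r=$1 p=$2
    shift 2
    awk -v r="$r" -v p="$p" '{
        for (i = 1; i <= length($0); i++)
            if (substr($0, i, 1) == r && ++n == p) {
                printf "line %d, column %d\n", NR, i
                exit
            }
    }' "$@"
}
```

For example, `find_nth T 150 infile` would report where the 150th uppercase T is. It prints nothing if the input contains fewer occurrences than requested.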

Looking more closely at your input file I see that there are 4998 As, 5095 Cs, 2108 Gs, 3997 Ts, and 332 newlines in a file that contains 17,235 bytes. Could the problem be that there are four sequences of lowercase letters in the big file you attached, but only uppercase letters in the sample that you said was representative of your entire file?
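
A per-character tally like the one above can be produced with a short awk loop. The following is a rough sketch, not necessarily how the counts in this post were obtained; count_chars is a made-up name, and newlines are not included in the tally (awk strips them from each record).

```shell
# count_chars [FILE...]
# Hypothetical helper: print each distinct character and how many times
# it occurs. Uppercase and lowercase letters are counted separately, so
# stray lowercase sequences show up as their own rows.
count_chars() {
    awk '{
        for (i = 1; i <= length($0); i++)
            cnt[substr($0, i, 1)]++
    }
    END {
        for (c in cnt)
            print c, cnt[c]
    }' "$@" | LC_ALL=C sort
}
```

Running `count_chars infile` and looking for rows such as a, c, g, or t would show whether the file contains lowercase letters that a search for an uppercase T skips.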

Last edited by Don Cragun; 10-09-2012 at 10:58 PM.. Reason: The sample data was not representative of the actual data???
# 7  
Old 10-09-2012
Code:
awk -v p=150 -v r="T" -v l="C" '{for (i=1; i<=length($0); i++) {s=substr($0,i,1); if (s==r && ++n==p) s=l; printf "%s", s} print ""}' infile

 