Removing a character at specific position in a column


 
# 1  
Old 10-13-2015

Hi,

I have a file like this (about 8 columns in total; what is shown here is the 2nd column):
Code:
gi_49482297_ref_YP_039521.1_
gi_49482297_ref_YP_039521.1_
gi_49482315_ref_YP_039539.1_
gi_49482315_ref_YP_039539.1_

I want to remove the _ at the end of each entry in this column.
At a later stage I may also want to replace the _ with another character.

How can I do this using awk or sed?

Any help would be highly appreciated.
# 2  
Old 10-13-2015
Hello Syeda,

The following may help. You haven't shown us the complete input or told us the field separator, so as a test I am assuming an Input_file like the one below, with a space as the field separator and 7 columns.
Input_file:
Code:
cat Input_file
Ravinder gi_49482297_ref_YP_039521.1_ TESTing test123 sixth_column_ seventh eight_column_test
TEST121 gi_49482297_ref_YP_039521.1_ TESTing test123 sixth_column_ seventh eight_column_test
TEST1211 gi_49482315_ref_YP_039539.1_ TESTing test123 sixth_column_ seventh eight_column_test
TEST12134 gi_49482315_ref_YP_039539.1_ TESTing test123 sixth_column_ seventh eight_column_test

Now the following code may help.
Code:
awk '{for(i=1;i<=NF;i++){if(i==2){sub(/_$/,"",$i)} else {sub(/_$/,"_new character",$i)}}} 1' Input_file

Output will be as follows.
Code:
Ravinder gi_49482297_ref_YP_039521.1 TESTing test123 sixth_column_new character seventh eight_column_test
TEST121 gi_49482297_ref_YP_039521.1 TESTing test123 sixth_column_new character seventh eight_column_test
TEST1211 gi_49482315_ref_YP_039539.1 TESTing test123 sixth_column_new character seventh eight_column_test
TEST12134 gi_49482315_ref_YP_039539.1 TESTing test123 sixth_column_new character seventh eight_column_test

Here I am replacing the trailing _ of the 2nd column with an empty string, and the trailing _ of any other column (only the 5th column in my example file) with the string "_new character", which you can change in the code to suit your requirement. Let us know if this helps you.
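Since the real field separator was not given, here is a minimal sketch for the case where the file turns out to be tab-separated (that is only an assumption, and the function name strip_col2_tab is made up for illustration). Setting both FS and OFS to a tab matters because awk rebuilds the line with OFS whenever a field is modified.

```shell
# Hypothetical helper: strip a trailing "_" from column 2 of tab-separated input.
# FS and OFS are both set to a tab so the rebuilt record keeps its tabs.
strip_col2_tab() {
    awk 'BEGIN { FS = OFS = "\t" } { sub(/_$/, "", $2) } 1' "$@"
}

# Example on one tab-separated line read from stdin.
printf 'TEST121\tgi_49482297_ref_YP_039521.1_\tsixth_column_\n' | strip_col2_tab
```

Only the 2nd column loses its trailing _ here; the other columns and the tabs between them are left alone. A file name can be passed as an argument instead of piping input.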


Thanks,
R. Singh

Last edited by RavinderSingh13; 10-13-2015 at 02:23 AM..
# 3  
Old 10-13-2015
Thanks, R. Singh, but I am not really getting it, possibly because I have very limited knowledge of awk commands.
What do I have to do if I only want to remove the _ from the 2nd column? I have tried using the first part of your code, but it's not working.
Code:
awk '{for(i=1;i<=NF;i++){if(i==2){sub(/\_$/,X,$i)}

What am I doing wrong?
# 4  
Old 10-13-2015
Hello Syeda,

If you only want to substitute the _ at the end of $2, then the following may help you. You mentioned in your first post that you may need to substitute the _ in other columns too, so I have reused the example from post #2. Please try the following and let me know if this helps you.
Input_file:
Code:
cat Input_file
Ravinder gi_49482297_ref_YP_039521.1_ TESTing test123 sixth_column_ seventh eight_column_test
TEST121 gi_49482297_ref_YP_039521.1_ TESTing test123 sixth_column_ seventh eight_column_test
TEST1211 gi_49482315_ref_YP_039539.1_ TESTing test123 sixth_column_ seventh eight_column_test
TEST12134 gi_49482315_ref_YP_039539.1_ TESTing test123 sixth_column_ seventh eight_column_test

Code:
awk '{sub(/_$/,"",$2);print}' Input_file

Output will be as follows.
Code:
Ravinder gi_49482297_ref_YP_039521.1 TESTing test123 sixth_column_ seventh eight_column_test
TEST121 gi_49482297_ref_YP_039521.1 TESTing test123 sixth_column_ seventh eight_column_test
TEST1211 gi_49482315_ref_YP_039539.1 TESTing test123 sixth_column_ seventh eight_column_test
TEST12134 gi_49482315_ref_YP_039539.1 TESTing test123 sixth_column_ seventh eight_column_test
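The original question also asked about sed, so here is a rough sed sketch of the same edit. It assumes columns are separated by exactly one space and that lines do not start with a space; the function name strip_col2_sed is made up for illustration, and -E (extended regular expressions) is assumed to be available, as it is in GNU and BSD sed.

```shell
# Hypothetical helper: remove a trailing "_" from the 2nd space-separated column.
# Group 1 captures "column1 column2" without the final "_";
# group 2 captures whatever follows it (a space, or nothing at end of line).
strip_col2_sed() {
    sed -E 's/^([^ ]+ [^ ]+)_( |$)/\1\2/' "$@"
}

# Example on one line read from stdin.
printf 'TEST121 gi_49482297_ref_YP_039521.1_ TESTing sixth_column_\n' | strip_col2_sed
# prints: TEST121 gi_49482297_ref_YP_039521.1 TESTing sixth_column_
```

Lines whose 2nd column does not end in _ do not match the pattern and are printed unchanged. For anything other than single-space separators, the awk version is probably the easier one to adapt.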

Thanks,
R. Singh
# 5  
Old 10-13-2015
Oh yes, I got it. Thanks.
Now I can change the code to
Code:
awk '{sub(/_$/,"anything",$2);print}' Input_file

to put anything I want at the end of column 2.

Thanks a lot.

One more thing: how can I specify the exact position at which I want to make the change? I mean, what if I want to change something that is not at the end of the column?
# 6  
Old 10-13-2015
Hello Syeda,

Here is an example: suppose you want to remove the 2nd occurrence of _ in $2; then the following may help you.
Input_file:
Code:
Ravinder gi_49482297_ref_YP_039521.1_ TESTing test123 sixth_column_ seventh eight_column_test
TEST121 gi_49482297_ref_YP_039521.1_ TESTing test123 sixth_column_ seventh eight_column_test
TEST1211 gi_49482315_ref_YP_039539.1_ TESTing test123 sixth_column_ seventh eight_column_test
TEST12134 gi_49482315_ref_YP_039539.1_ TESTing test123 sixth_column_ seventh eight_column_test

The code is as follows.
Code:
awk -v var=2 '{n=split($2,A,"_");q=A[1];for(i=2;i<=n;i++){q=q ((i-1)==var?"":"_") A[i]};$2=q} 1' Input_file

Output will be as follows.
Code:
Ravinder gi_49482297ref_YP_039521.1_ TESTing test123 sixth_column_ seventh eight_column_test
TEST121 gi_49482297ref_YP_039521.1_ TESTing test123 sixth_column_ seventh eight_column_test
TEST1211 gi_49482315ref_YP_039539.1_ TESTing test123 sixth_column_ seventh eight_column_test
TEST12134 gi_49482315ref_YP_039539.1_ TESTing test123 sixth_column_ seventh eight_column_test

Here I have passed a variable var=2 to the code because I wanted to change only the second occurrence of _ in $2.
You can change it as per your requirement. Hope this helps.
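If by "position" you mean a character position within the column rather than the nth occurrence of a character, substr() may be simpler. The following is only a sketch under that assumption; the function name edit_at_pos and its three arguments (column number, 1-based position, replacement text) are made up for illustration.

```shell
# Hypothetical helper: replace the character at position pos (1-based) of
# column col with rep; an empty rep deletes that character.
# Columns shorter than pos are left untouched.
edit_at_pos() {
    awk -v col="$1" -v pos="$2" -v rep="$3" '
        length($col) >= pos { $col = substr($col, 1, pos - 1) rep substr($col, pos + 1) }
        1'
}

# Delete the 3rd character of column 2 (the first "_" in this sample).
printf 'TEST121 gi_49482297_ref_YP_039521.1_ TESTing\n' | edit_at_pos 2 3 ""
# prints: TEST121 gi49482297_ref_YP_039521.1_ TESTing

# Replace the same character with "-" instead.
printf 'TEST121 gi_49482297_ref_YP_039521.1_ TESTing\n' | edit_at_pos 2 3 "-"
# prints: TEST121 gi-49482297_ref_YP_039521.1_ TESTing
```

Note that assigning to a field makes awk rebuild the line with single spaces between columns, so this assumes the default separator is acceptable in the output.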

Thanks,
R. Singh