How to append strings with whitespace?


 
# 1  
Old 09-03-2019

Hi,

I need help. This seems simple, but I have tried many things and failed to get what I wanted. Below is the input file:

Code:
Chr1	lnci	exon	83801516	83803251	.	-	.	gene_id"LINC01725";	transcript_id"LINC01725:44";	gene_alias_1"ENSG00000233008";	gene_alias_2"RP11-475O6.1";	gene_alias_3"ENSG00000233008.1";	gene_alias_4"OTTHUMG00000009930.1";	gene_alias_5"ENSG00000233008.5";	gene_alias_6"LINC01725";	gene_alias_7"LOC101927560";	transcript_alias_1"ENST00000457273";	transcript_alias_2"ENST00000457273.1";	transcript_alias_3"RP11-475O6.1-005";	transcript_alias_4"OTTHUMT00000027496.1";	transcript_alias_5"NONHSAT004171";	transcript_alias_6"NR_119374";	transcript_alias_7"ENST00000457273.5";	transcript_alias_8"NR_119374.1";
chr16	lnci	exon	83849907	83850022	.	-	.	gene_id"LINC01725";	transcript_id"LINC01725:44";	gene_alias_1"ENSG00000233008";	gene_alias_2"RP11-475O6.1";	gene_alias_3"ENSG00000233008.1";	gene_alias_4"OTTHUMG00000009930.1";

I need to modify each row by inserting a single space after the field name (that is, before the opening double quote) in every field from column 9 onwards. The output should look like this:

Code:
Chr1	lnci	exon	83801516	83803251	.	-	.	gene_id "LINC01725";	transcript_id "LINC01725:44";	gene_alias_1 "ENSG00000233008";	gene_alias_2 "RP11-475O6.1";	gene_alias_3 "ENSG00000233008.1";	gene_alias_4 "OTTHUMG00000009930.1";	gene_alias_5 "ENSG00000233008.5";	gene_alias_6 "LINC01725";	gene_alias_7 "LOC101927560";	transcript_alias_1 "ENST00000457273";	transcript_alias_2 "ENST00000457273.1";	transcript_alias_3 "RP11-475O6.1-005";	transcript_alias_4 "OTTHUMT00000027496.1";	transcript_alias_5 "NONHSAT004171";	transcript_alias_6 "NR_119374";	transcript_alias_7 "ENST00000457273.5";	transcript_alias_8 "NR_119374.1";
chr16	lnci	exon	83849907	83850022	.	-	.	gene_id "LINC01725";	transcript_id "LINC01725:44";	gene_alias_1 "ENSG00000233008";	gene_alias_2 "RP11-475O6.1";	gene_alias_3 "ENSG00000233008.1";	gene_alias_4 "OTTHUMG00000009930.1";

I would really appreciate your help. Thanks.
# 2  
Old 09-03-2019
Can you post what you tried?
# 3  
Old 09-03-2019
Also note that I do not see any difference between your input and output examples. It is hard to help without an idea of what you tried. It is also extremely helpful for good answers to include your OS and shell. Thank you.
# 4  
Old 09-03-2019
Quote:
Originally Posted by anbu23
Can you post what you tried?
One of the commands I tried:

Code:
sed -e 's/\.*id/& \ /' -e 's/\.*alias_./& \ /' inputfile

It worked only in certain columns.

--- Post updated at 02:33 PM ---



Quote:
Originally Posted by jim mcnamara
Also note that I do not see any difference between your input and output examples. It is hard to help without an idea of what you tried. It is also extremely helpful for good answers to include your OS and shell. Thank you.
The difference is the space before the opening quote. For instance, in column 9:

Code:
gene_id"LINC01725";  ---> gene_id "LINC01725";

I am using macOS.
# 5  
Old 09-03-2019
Code:
$ sed -e 's/[^\t]*id/& \ /g' -e 's/[^\t]*alias_./& \ /g' file
Chr1    lnci    exon    83801516        83803251        .       -       .       gene_id  "LINC01725";   transcript_id  "LINC01725:44";    gene_alias_1  "ENSG00000233008";        gene_alias_2  "RP11-475O6.1";   gene_alias_3  "ENSG00000233008.1";gene_alias_4  "OTTHUMG00000009930.1";   gene_alias_5  "ENSG00000233008.5";      gene_alias_6  "LINC01725";      gene_alias_7  "LOC101927560";     transcript_alias_1  "ENST00000457273";  transcript_alias_2  "ENST00000457273.1";        transcript_alias_3  "RP11-475O6.1-005";   transcript_alias_4  "OTTHUMT00000027496.1";     transcript_alias_5  "NONHSAT004171";    transcript_alias_6  "NR_119374";  transcript_alias_7  "ENST00000457273.5";        transcript_alias_8  "NR_119374.1";
chr16   lnci    exon    83849907        83850022        .       -       .       gene_id  "LINC01725";   transcript_id  "LINC01725:44";    gene_alias_1  "ENSG00000233008";        gene_alias_2  "RP11-475O6.1";   gene_alias_3  "ENSG00000233008.1";gene_alias_4  "OTTHUMG00000009930.1";

# 6  
Old 09-03-2019
Quote:
Originally Posted by anbu23
Code:
$ sed -e 's/[^\t]*id/& \ /g' -e 's/[^\t]*alias_./& \ /g' file

Sorry, it still did not work on my actual data. It worked for "transcript_alias_#", but it inserted two spaces where I need only one, and it did not work for "gene_alias_#" at all. It also inserted spaces in the wrong place. For instance:

Code:
gene_alias_1  0"LOC101928035";   it is supposed to be gene_alias_10 "LOC101928035";

thanks

Last edited by bunny_merah19; 09-03-2019 at 11:31 AM..
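
The misplaced space reported above can be reproduced on a single attribute. The sketch below is an illustration rather than part of the original replies: the variable names are made up, and the `[^\t]*` prefix from post #5 is left out because it does not affect this symptom. The pattern `alias_.` matches exactly one character after the underscore, so a two-digit suffix such as `10` is split, and the replacement `& \ ` appends two spaces (one literal, one escaped). A likely additional cause of trouble on macOS is that BSD sed, unlike GNU sed, is not expected to read `\t` inside a bracket expression as a tab.

```shell
# One attribute with a two-digit suffix, as in the failing case above.
line='gene_alias_10"LOC101928035";'

# 'alias_.' stops after a single character and '& \ ' appends two spaces.
broken=$(printf '%s\n' "$line" | sed -e 's/alias_./& \ /g')
printf '%s\n' "$broken"

# Matching every digit after 'alias_' and appending one space avoids both problems.
fixed=$(printf '%s\n' "$line" | sed -e 's/alias_[0-9][0-9]*/& /g')
printf '%s\n' "$fixed"
```

The first command prints the same broken form shown in the post; the second prints the intended `gene_alias_10 "LOC101928035";`.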
# 7  
Old 09-03-2019
It looks like you want to prefix every double-quoted string with a space. How far would

Code:
sed 's/"[^"]*"/ &/g' file

get you, provided the double quotes reliably appear in pairs?
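
As a quick check of this suggestion, here is a minimal sketch on a shortened sample line. The sample is a cut-down version of the input from post #1 with one two-digit alias added, and the variable names are illustrative. The command does not rely on `\t` or any GNU-only escapes, so it should behave the same on macOS. The awk variant at the end is an assumption and not from the thread: it only touches columns 9 onwards, as the original question asked, and skips fields that already contain a space before the quote, so running it twice is harmless.

```shell
# Build a sample line; printf expands \t into real tab characters.
sample=$(printf 'chr16\tlnci\texon\t83849907\t83850022\t.\t-\t.\tgene_id"LINC01725";\tgene_alias_10"LOC101928035";')

# Prefix every complete "..." string with a single space.
result=$(printf '%s\n' "$sample" | sed 's/"[^"]*"/ &/g')
printf '%s\n' "$result"

# Hypothetical alternative: handle only fields 9..NF of a tab-separated line,
# and leave a field alone if it already has a space before its opening quote.
add_space() {
    awk 'BEGIN { FS = OFS = "\t" }
         { for (i = 9; i <= NF; i++) if ($i !~ / "/) sub(/"/, " \"", $i); print }'
}
once=$(printf '%s\n' "$sample" | add_space)
rerun=$(printf '%s\n' "$result" | add_space)
printf '%s\n' "$once"
```

Note that the sed one-liner adds another space each time it is applied to the same file, so it should be run once on the original input.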