Adding a prefix to a column using awk/sed commands


 
# 1  
Old 12-29-2010

Hello,

I am new to Linux and am looking for a good way to modify one column in a text file.
Here is the file I want to modify. It has 8 columns (and thousands of rows). I want to change the first column by adding "chr" in front of the numbers. Some rows have a string in the first column, and I don't want to change those.
1 . miRNA 548816 548893 . + . ACC="MI0002023"; ID="dre-mir-155";
1 . miRNA 1651461 1651541 . + . ACC="MI0002180"; ID="dre-mir-459";
1 . miRNA 23269491 23269603 . - . ACC="MI0004786"; ID="dre-mir-740";
1 . miRNA 27656240 27656327 . + . ACC="MI0002052"; ID="dre-mir-218a-2";
1 . miRNA 34527751 34527843 . + . ACC="MI0004780"; ID="dre-mir-734";
1 . miRNA 40174414 40174523 . + . ACC="MI0010857"; ID="dre-mir-2197";
1 . miRNA 46862496 46862635 . - . ACC="MI0001895"; ID="dre-mir-16b";
1 . miRNA 46862739 46862822 . - . ACC="MI0001891"; ID="dre-mir-15a-1";
1 . miRNA 55355143 55355233 . - . ACC="MI0004765"; ID="dre-mir-722";
2 . miRNA 1085488 1085564 . + . ACC="MI0002181"; ID="dre-mir-460";
2 . miRNA 6031391 6031475 . + . ACC="MI0002000"; ID="dre-mir-137-1";
2 . miRNA 22105590 22105669 . - . ACC="MI0004782"; ID="dre-mir-736";
2 . miRNA 23568780 23568883 . - . ACC="MI0010841"; ID="dre-mir-2190";
2 . miRNA 25338635 25338716 . - . ACC="MI0001966"; ID="dre-mir-124-1";
2 . miRNA 31878456 31878533 . + . ACC="MI0001916"; ID="dre-mir-23a-3";
2 . miRNA 31880346 31880476 . + . ACC="MI0001928"; ID="dre-mir-27a";
2 . miRNA 34798348 34798457 . + . ACC="MI0010847"; ID="dre-mir-2198";
2 . miRNA 44164796 44164904 . - . ACC="MI0001366"; ID="dre-mir-181b-1";
2 . miRNA 57907954 57908073 . - . ACC="MI0001879"; ID="dre-mir-7a-3";


Is there a simple way to change the first column? Any help will be appreciated.

Thanks
# 2  
Old 12-29-2010
Input
Code:
$ cat file
1	.	miRNA	548816	548893	.	+	.	ACC="MI0002023"; ID="dre-mir-155";
1	.	miRNA	1651461	1651541	.	+	.	ACC="MI0002180"; ID="dre-mir-459";
1	.	miRNA	23269491	23269603	.	-	.	ACC="MI0004786"; ID="dre-mir-740";
1	.	miRNA	27656240	27656327	.	+	.	ACC="MI0002052"; ID="dre-mir-218a-2";
1	.	miRNA	34527751	34527843	.	+	.	ACC="MI0004780"; ID="dre-mir-734";
1	.	miRNA	40174414	40174523	.	+	.	ACC="MI0010857"; ID="dre-mir-2197";
1	.	miRNA	46862496	46862635	.	-	.	ACC="MI0001895"; ID="dre-mir-16b";
str .	miRNA	46862739	46862822	.	-	.	ACC="MI0001891"; ID="dre-mir-15a-1";
1	.	miRNA	55355143	55355233	.	-	.	ACC="MI0004765"; ID="dre-mir-722";
2	.	miRNA	1085488	1085564	.	+	.	ACC="MI0002181"; ID="dre-mir-460";
2	.	miRNA	6031391	6031475	.	+	.	ACC="MI0002000"; ID="dre-mir-137-1";
str .	miRNA	22105590	22105669	.	-	.	ACC="MI0004782"; ID="dre-mir-736";
2	.	miRNA	23568780	23568883	.	-	.	ACC="MI0010841"; ID="dre-mir-2190";
2	.	miRNA	25338635	25338716	.	-	.	ACC="MI0001966"; ID="dre-mir-124-1";
2	.	miRNA	31878456	31878533	.	+	.	ACC="MI0001916"; ID="dre-mir-23a-3";
2	.	miRNA	31880346	31880476	.	+	.	ACC="MI0001928"; ID="dre-mir-27a";
2	.	miRNA	34798348	34798457	.	+	.	ACC="MI0010847"; ID="dre-mir-2198";
2	.	miRNA	44164796	44164904	.	-	.	ACC="MI0001366"; ID="dre-mir-181b-1";
2	.	miRNA	57907954	57908073	.	-	.	ACC="MI0001879"; ID="dre-mir-7a-3";

Command
Code:
sed 's/^\([0-9].*\)/char \1/g' file

Output
Code:
char 1	.	miRNA	548816	548893	.	+	.	ACC="MI0002023"; ID="dre-mir-155";
char 1	.	miRNA	1651461	1651541	.	+	.	ACC="MI0002180"; ID="dre-mir-459";
char 1	.	miRNA	23269491	23269603	.	-	.	ACC="MI0004786"; ID="dre-mir-740";
char 1	.	miRNA	27656240	27656327	.	+	.	ACC="MI0002052"; ID="dre-mir-218a-2";
char 1	.	miRNA	34527751	34527843	.	+	.	ACC="MI0004780"; ID="dre-mir-734";
char 1	.	miRNA	40174414	40174523	.	+	.	ACC="MI0010857"; ID="dre-mir-2197";
char 1	.	miRNA	46862496	46862635	.	-	.	ACC="MI0001895"; ID="dre-mir-16b";
str .	miRNA	46862739	46862822	.	-	.	ACC="MI0001891"; ID="dre-mir-15a-1";
char 1	.	miRNA	55355143	55355233	.	-	.	ACC="MI0004765"; ID="dre-mir-722";
char 2	.	miRNA	1085488	1085564	.	+	.	ACC="MI0002181"; ID="dre-mir-460";
char 2	.	miRNA	6031391	6031475	.	+	.	ACC="MI0002000"; ID="dre-mir-137-1";
str .	miRNA	22105590	22105669	.	-	.	ACC="MI0004782"; ID="dre-mir-736";
char 2	.	miRNA	23568780	23568883	.	-	.	ACC="MI0010841"; ID="dre-mir-2190";
char 2	.	miRNA	25338635	25338716	.	-	.	ACC="MI0001966"; ID="dre-mir-124-1";
char 2	.	miRNA	31878456	31878533	.	+	.	ACC="MI0001916"; ID="dre-mir-23a-3";
char 2	.	miRNA	31880346	31880476	.	+	.	ACC="MI0001928"; ID="dre-mir-27a";
char 2	.	miRNA	34798348	34798457	.	+	.	ACC="MI0010847"; ID="dre-mir-2198";
char 2	.	miRNA	44164796	44164904	.	-	.	ACC="MI0001366"; ID="dre-mir-181b-1";
char 2	.	miRNA	57907954	57908073	.	-	.	ACC="MI0001879"; ID="dre-mir-7a-3";

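Note that in the output, the lines whose first column is the string 'str' are left unchanged. The command prepends 'char ' (with a trailing space); for the "chr" prefix asked for in the question, use chr\1 as the replacement instead.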
R0H0N
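
For readers who want the exact "chr" prefix from the question, here is a minimal sketch of the same idea in both sed and awk. The function names add_chr_sed and add_chr_awk are hypothetical, and the sample rows are shortened from the thread's data. The awk variant assumes the first column is separated from the rest by whitespace and changes a line only when that column consists entirely of digits; the sed variant changes any line that begins with a digit.

```shell
# Hypothetical helper names, not part of the original answer.

# sed variant: put "chr" (no space) in front of lines that start with a digit.
# "&" stands for the text matched by the pattern.
add_chr_sed() {
    sed 's/^[0-9]/chr&/'
}

# awk variant: prefix the line only when field 1 is made up of digits alone.
# Assigning to $0 keeps the original separators (tabs or spaces) intact.
add_chr_awk() {
    awk '$1 ~ /^[0-9]+$/ { $0 = "chr" $0 } { print }'
}

# Small demonstration on two rows shortened from the thread's sample.
printf '1\t.\tmiRNA\t548816\t548893\nstr\t.\tmiRNA\t46862739\t46862822\n' | add_chr_awk
```

To run either function on a real file, redirect it, for example: add_chr_awk < file > file.chr (both file names are placeholders).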
# 3  
Old 12-29-2010
Thank you, R0H0N.

It works perfectly.
# 4  
Old 01-03-2011
I have a question about extracting information from a csv file. I have a very large file with 7 columns and a few thousand rows. I would like to search on one or two of these columns and extract the matching rows into a text file.

For example, I want to search the "Name" column for mir-19b and extract all the columns of each matching row.

Here is the sample csv file.
Code:
Small RNA	Expression values	Length	Count	Name	Match type	Mismatches
TGTGCAAATCCATGCAAAACTGA	43,919	23	43,919	mir-19b	Mature	0
CAGTGCAATATTAAAAGGGCAT	42,583	22	42,583	mir-130c-1//mir-130c-2	Mature	0
GTGAAATGTTCAGGACCACTTG	28,357	22	28,357	mir-203b	Mature	0
TTCCCTTTGTCATCCTATGCCT	27,297	22	27,297	mir-204-1//mir-204-2	Mature	0
TAAAGTGCTTATAGTGCAGGTAG	25,594	23	25,594	mir-20a	Mature	1
CAGTGCAATAATGAAAGGGCAT	23,802	22	23,802	mir-130b	Mature	0
TCCTTCATTCCACCGGAGTCTG	17,791	22	17,791	mir-205	Mature	2
TGTGCAAATCTATGCAAAACTGA	17,501	23	17,501	mir-19a	Mature	0
TACCCTGTAGATCCGGATTTGT	17,431	22	17,431	mir-10c	Mature	0
CAGTGCAATAGTATTGTCATAGCAT	17,203	25	17,203	mir-301c	Precursor	0
TGGAATGTAAGGAAGTGTGTGG	16,786	22	16,786	mir-206-1//mir-206-2	Mature	0
GTGAAATGTTTAGGACCACTTG	16,657	22	16,657	mir-203a	Mature	0
TGTGCAAATCCATGCAAAACTCG	14,449	23	14,449	mir-19c	Mature	0

Any suggestions using Perl or Linux commands would be helpful.

Last edited by joeyg; 01-03-2011 at 04:07 PM.. Reason: break out the file
# 5  
Old 01-03-2011
Question should be a separate request

After re-reading your follow-up, this should be its own question. Also, you refer to this as a csv file, but your sample does not appear to be comma-separated. It looks like a tab-delimited file.
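
If the file really is tab-delimited, a lookup on the Name column could be sketched with awk as below. This is only an illustration under stated assumptions: Name is taken to be the fifth tab-separated field, the first line is taken to be a header, and the function name rows_by_name is hypothetical. Stray spaces around the field are trimmed before comparing, since the posted sample contains some padding.

```shell
# Hypothetical helper: print the header line plus every row whose Name
# column (assumed to be field 5 of a tab-delimited file) equals "$1".
rows_by_name() {
    awk -F'\t' -v name="$1" '
        NR == 1 { print; next }          # keep the header line
        {
            f = $5
            gsub(/^ +| +$/, "", f)       # trim padding spaces around the field
            if (f == name) print         # exact match, so mir-19b skips mir-19a
        }'
}

# Small demonstration on rows taken from the posted sample.
printf 'Small RNA\tExpression values\tLength\tCount\tName\tMatch type\tMismatches\nTGTGCAAATCCATGCAAAACTGA\t43,919\t23\t43,919\tmir-19b\tMature\t0\nTGTGCAAATCTATGCAAAACTGA\t17,501\t23\t17,501\tmir-19a\tMature\t0\n' | rows_by_name mir-19b
```

To write the matches to a text file, redirect the output, for example: rows_by_name mir-19b < input.txt > mir-19b.txt (both file names are placeholders).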