Replacing part of a delimited string


 
# 1  
Old 05-03-2014

Hello,

I have some tab-delimited text data that I am processing. The second column looks like this:
Code:
NAME;pyrimidine-2,4-diol;cpd;2;line;37

I need to clean this up to just the name,
Code:
pyrimidine-2,4-diol

All lines have the same format,
NAME;text;cpd;int;line;int followed by tab

I have tried something like,

Code:
sed 's/NAME;//g' | \
sed 's/;cpd;*;line;*\t/\t/g' | \

The intent is to replace NAME; with nothing and ;cpd;*;line;*\t with \t

The NAME; replacement works fine, but the second one doesn't.

Next I tried
Code:
sed 's/;cpd;[0-9]*;line;[0-9]*//g' | \

for the second part and that does seem to work.

Am I going about this the right way? It seems like I should just be able to use awk with ; as FS.

LMHmedchem

Last edited by Scrutinizer; 05-03-2014 at 03:02 AM.. Reason: ICODE -> CODE tags
# 2  
Old 05-03-2014
Code:
awk 'BEGIN{FS=";"}{print $2}' filename

hope this helps
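
A quick check of what this prints, as a sketch only: the first and third columns (`id1`, `rest`) are made-up placeholders, since the thread only shows the second column. With ; as the only separator, $2 is the name, but the other tab-separated columns are not printed, so this fits when only the name is wanted.

```shell
# Hypothetical three-column line; only the middle column comes from the thread.
line=$(printf 'id1\tNAME;pyrimidine-2,4-diol;cpd;2;line;37\trest')

# Splitting the whole line on ';' makes $1 = "id1<TAB>NAME" and $2 = the name.
name=$(printf '%s\n' "$line" | awk 'BEGIN{FS=";"}{print $2}')

printf '%s\n' "$name"
```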
This User Gave Thanks to newageBATMAN For This Post:
# 3  
Old 05-03-2014
Code:
awk -F "\t" '{split($2, a, ";"); $2 = a[2]}1' file

This User Gave Thanks to SriniShoo For This Post:
# 4  
Old 05-03-2014
@OP: Regarding your attempts with sed, you need to use .* instead of * (in a regular expression, .* means zero or more of any character). An asterisk applies to whatever precedes it, so when you write ;* it means zero or more semicolons.

What you did in the last example was [0-9]*, which means zero or more digits, so that is correct.
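
To see the difference on the sample from post #1, here is a small sketch. The surrounding columns (`id1`, `rest`) are invented for illustration; only the second column is from the thread.

```shell
# Hypothetical three-column line; only the middle column comes from the thread.
line=$(printf 'id1\tNAME;pyrimidine-2,4-diol;cpd;2;line;37\trest')

# ';*' is "zero or more semicolons", so after ';cpd' the pattern needs ';line'
# right away and cannot get past the '2': nothing is replaced.
broken=$(printf '%s\n' "$line" | sed 's/;cpd;*;line;*//')

# '[0-9]*' is "zero or more digits", which matches the integers in the data.
fixed=$(printf '%s\n' "$line" | sed 's/NAME;//; s/;cpd;[0-9]*;line;[0-9]*//')

printf '%s\n' "$broken"
printf '%s\n' "$fixed"
```

One caveat when switching to .* on lines with more columns: it is greedy, so a pattern ending in .*\t can run on to the last tab of the line and swallow later columns. The digit class avoids that.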

Last edited by Scrutinizer; 05-03-2014 at 03:15 AM..
This User Gave Thanks to Scrutinizer For This Post:
# 5  
Old 05-03-2014
sed solution
Code:
sed 's/\tNAME;\([^;]*\);[^\t]*\t/\t\1\t/' file
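
As far as I know, \t inside a sed expression is a GNU extension, so on other seds the command above may treat it as a plain t. A possible workaround, sketched here with a made-up sample line, is to put a literal tab in a shell variable (the name `tab` is just my choice):

```shell
# Hold a literal TAB character in a variable.
tab=$(printf '\t')

# Hypothetical three-column line; only the middle column comes from the thread.
line=$(printf 'id1\tNAME;pyrimidine-2,4-diol;cpd;2;line;37\trest')

# Same substitution as above, with every \t replaced by the literal tab.
# Double quotes let the shell expand ${tab}; \( \) and \1 pass through to sed.
result=$(printf '%s\n' "$line" |
    sed "s/${tab}NAME;\([^;]*\);[^${tab}]*${tab}/${tab}\1${tab}/")

printf '%s\n' "$result"
```

Like the original, this relies on the second column being followed by a tab.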

---------- Post updated at 09:03 AM ---------- Previous update was at 02:55 AM ----------

Perl solution
Code:
perl -F"\t" -lane 'BEGIN{$"="\t"}
  {@A = split(/;/, $F[1], 3); $F[1] = $A[1]; print "@F"}' file

This User Gave Thanks to SriniShoo For This Post:
# 6  
Old 05-06-2014
I am sorry for the very long delay; one of my servers has been acting up and I needed to get it fixed.

I always seem to forget that regex is not the same as using a wildcard in bash. I guess it doesn't help that there are so many flavors of regex. I spent a few days a while ago trying to decipher some regex in ruby.

Quote:
Originally Posted by SriniShoo
Code:
awk -F "\t" '{split($2, a, ";"); $2 = a[2]}1' file

Could you provide an interpretation of this? Is there a difference between -F "" and FS=""? It looks like you used -F "\t" to indicate that it is a tab delimited file, but then split on ; to split column 2. Is that right? That would be at least three different ways to specify a delimiter.

As indicated, this works properly:
Code:
sed 's/;cpd;[0-9]*;line;[0-9]*//g' | \

LMHmedchem
# 7  
Old 05-07-2014
awk's standard field splitting is done according to the FS variable. There are several ways to set that variable; one is the -F option. The split() function also uses FS by default, but that can be overridden with a third argument to the function, which is what is done here.
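
A small sketch of the options just described, on a made-up line (`a`, `b;c`, `d` are placeholders): the first three commands set FS to a tab in different ways and behave the same, and the fourth shows split() using its own separator.

```shell
# Made-up line: three tab-separated columns, the second one holding "b;c".
line=$(printf 'a\tb;c\td')

# Three equivalent ways to set the input field separator to a tab.
r1=$(printf '%s\n' "$line" | awk -F '\t' '{print $2}')
r2=$(printf '%s\n' "$line" | awk 'BEGIN{FS="\t"} {print $2}')
r3=$(printf '%s\n' "$line" | awk -v FS='\t' '{print $2}')

# split() takes its separator as the third argument, independent of FS,
# and returns the number of pieces it produced.
r4=$(printf '%s\n' "$line" | awk -F '\t' '{n = split($2, parts, ";"); print n, parts[2]}')

printf '%s\n' "$r1" "$r2" "$r3" "$r4"
```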

These are just two different approaches: you can use sed's regex to specify the pattern that needs to be substituted (this could also be done with regex in awk, for that matter), or you can use awk's field-splitting capabilities (which technically also use regex, BTW).
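
For the "regex in awk" variant, something along these lines should work. It is only a sketch: the sample columns `id1` and `rest` are invented, and it assumes every second column follows the NAME;text;cpd;int;line;int format from post #1.

```shell
# Hypothetical three-column line; only the middle column comes from the thread.
line=$(printf 'id1\tNAME;pyrimidine-2,4-diol;cpd;2;line;37\trest')

# Work on column 2 only: strip the NAME; prefix and the ;cpd;N;line;N suffix.
# Modifying $2 rebuilds the line, so OFS is set to a tab to keep the layout.
cleaned=$(printf '%s\n' "$line" | awk -F '\t' -v OFS='\t' '{
    sub(/^NAME;/, "", $2)
    sub(/;cpd;[0-9]*;line;[0-9]*$/, "", $2)
    print
}')

printf '%s\n' "$cleaned"
```

Anchoring with ^ and $ keeps the substitution from touching a name that happens to contain similar text in the middle.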

---
SriniShoo's solution would also need to set the OFS variable to '\t' for it to work properly:
Code:
awk -F "\t" '{split($2, a, ";"); $2 = a[2]}1' OFS='\t' file

Otherwise the end result would be space-separated rather than TAB-separated, because assigning to $2 makes awk rebuild the line using OFS, which defaults to a single space.
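
The effect is easy to check side by side. This is a sketch with a made-up sample line (`id1` and `rest` are placeholders), reading from a pipe, so `-` stands in for the file name:

```shell
# Hypothetical three-column line; only the middle column comes from the thread.
line=$(printf 'id1\tNAME;pyrimidine-2,4-diol;cpd;2;line;37\trest')

# Without OFS: the rebuilt line is joined with single spaces.
no_ofs=$(printf '%s\n' "$line" |
    awk -F '\t' '{split($2, a, ";"); $2 = a[2]}1')

# With OFS set to a tab: the rebuilt line stays tab-separated.
with_ofs=$(printf '%s\n' "$line" |
    awk -F '\t' '{split($2, a, ";"); $2 = a[2]}1' OFS='\t' -)

printf '%s\n' "$no_ofs"
printf '%s\n' "$with_ofs"
```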

Last edited by Scrutinizer; 05-07-2014 at 04:07 AM..