How to replace and remove few junk characters from a specific field?


 
# 8  
Old 09-30-2014
Maybe one of these two will work:
Code:
awk -v OFS="\t" '{$1=$1;sub(/%.*\)/, "", $4)}1' filename

perl -wnla -e '$F[3] =~ s/%.*\)// if $#F > 2; print join "\t", @F;' filename

If not, please post the unmodified output of the following command:
Code:
head filename | od -c


Last edited by Aia; 09-30-2014 at 08:49 PM.. Reason: grammar correction
# 9  
Old 09-30-2014
Aia,
Thank you so much - both of your latest solutions work.
As I am in the learning phase, please help me interpret some of your magical code.
I would appreciate it if you could say a few words about these pieces of code:
1)
'{$1=$1;sub(/%.*\)/, "", $4)}1'
2)
perl -wnla -e '$F[3] =~ s/%.*\)// if $#F > 2; print join "\t", @F;'
# 10  
Old 10-01-2014
As I suspected, you do not use tabs between fields; you use multiple spaces imitating a tab.
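For anyone wondering how od -c tells the two apart, here is a minimal sketch. The two one-line samples are made up for illustration (they are not the poster's data); a real tab shows up as \t in the dump, while spaces show up as blank columns.

```shell
# Hypothetical samples: one line with a real tab, one with three spaces
with_tab=$(printf 'a\tb\n' | od -c)
with_spaces=$(printf 'a   b\n' | od -c)

# Print both dumps; only the first one contains the \t escape
printf '%s\n\n%s\n' "$with_tab" "$with_spaces"
```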

awk
Code:
awk -v OFS="\t" '{$1=$1;sub(/%.*\)/, "", $4)}1' filename

-v OFS="\t" # sets the built-in Output Field Separator (OFS) to a tab instead of the default single space; awk uses it whenever it rebuilds the record (which happens in the next part)

$1=$1 # assigning to any field, even to itself, makes awk rebuild $0 (the whole record, which by default is a line); the fields are rejoined with OFS, so every run of spaces between fields becomes a single tab

sub(/%.*\)/, "", $4) # a built-in awk function that takes three arguments: 1st, the regular expression to match; 2nd, the replacement string; 3rd, the field or variable to search, in this case the 4th field. Only the first match is replaced; because .* is greedy, that match runs from the first % through the last ) in the field.

1 # a pattern that always evaluates to true; with no action attached, awk performs the default action, which is to print $0
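To see the pieces working together, here is a small sketch on a made-up line. The sample text and the variable names sample and result are assumptions for illustration only: fields are separated by runs of spaces, and the 4th field carries junk from % through ).

```shell
# Hypothetical sample line: runs of spaces between fields, junk in field 4
sample='a1   b2    c3   val%junk)end   e5'

# Same awk program as above, fed from the sample instead of a file
result=$(printf '%s\n' "$sample" |
  awk -v OFS="\t" '{$1=$1;sub(/%.*\)/, "", $4)}1')

# Dump the result with od -c to confirm the separators are now tabs
printf '%s\n' "$result" | od -c
```

The 4th field should come out as valend, and the dump should show \t between the fields instead of spaces.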

Perl
Code:
perl -wnla -e '$F[3] =~ s/%.*\)// if $#F > 2; print join "\t", @F;' filename

-wnla -e # -w enables warnings; -n loops over the input line by line without printing automatically; -l strips the trailing newline (the input record separator) from each line as it is read and makes print add it back; -a splits each line on whitespace into an array named @F; -e tells Perl that what comes next is Perl code to execute.

$F[3] =~ s/%.*\)// # take the 4th field, stored at index 3 of array @F, and replace the first match of the regex (between the first two slashes) with the empty string (nothing between the last two slashes)

if $#F > 2 # do the previous only if the last index of @F is greater than 2, that is, the line has at least 4 fields (we are looking for the 4th)

print join "\t", @F # join the elements of @F with a tab between them, then print the result
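A quick way to try the Perl version, again on made-up input (the two sample lines and the variable name result are assumptions, not the original data). The second line has only two fields, so the if $#F > 2 guard skips the substitution for it and the line is just re-joined with tabs.

```shell
# Hypothetical input: the second line has fewer than 4 fields
result=$(printf '%s\n' 'a1   b2    c3   val%junk)end   e5' 'x   y' |
  perl -wnla -e '$F[3] =~ s/%.*\)// if $#F > 2; print join "\t", @F;')

# Show the cleaned, tab-separated lines
printf '%s\n' "$result"
```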
# 11  
Old 10-01-2014
Aia,
Simply awesome! The way you explained it enlightened me. Your support is much appreciated, as I am always trying to learn.
Thanks again.
 