Stripping unwanted characters in field


 
# 1  
07-08-2015

I wrote a small shell script to clean up a file I have issues with. Specifically, it strips a fully qualified host/domain name in the second field down to just the hostname. The script works, but it is slow, and I will be working with large data sets.

Here is a sample dataset:
Code:
Field1|hostname|field3|field4.....
Field1|hostname.f.q.d.n|field3|field4......

My code is below:
Code:
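while read -r LINE
do
        # Split each line on "|" and strip the domain part from field 2
        CUST=`echo "$LINE" | cut -d\| -f1`
        SERVER=`echo "$LINE" | cut -d\| -f2 | sed 's/\..[^.]*//g'`
        REST=`echo "$LINE" | cut -d\| -f3-`
        echo "$CUST|$SERVER|$REST"
done < "$1" > tmp1
# Replace the original only after the whole file has been processed
mv tmp1 "$1"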

As you can see, it's not an elegant solution, but it produces the wanted output (the FQDN stripped from field 2). My awk is a bit rusty and my perl is basic. If someone has a faster, cleaner way of doing this, I'm all ears.
# 2  
07-08-2015
Code:
while IFS='|' read -r f1 f2 rest; do
    echo "$f1|${f2%%.*}|$rest"
done < "$1"

Code:
prog.sh dataset.txt > saved_result.txt

Once saved_result.txt contains what you want, you can rename it over the original file.
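
The loop above relies on the `${f2%%.*}` parameter expansion. A minimal sketch of what it does on its own, assuming a POSIX shell; the function name `strip_domain` is made up for illustration and does not appear in the thread:

```shell
#!/bin/sh
# strip_domain (hypothetical helper): print the short hostname.
# ${1%%.*} removes the LONGEST suffix matching ".*",
# i.e. everything from the first dot to the end of the string.
strip_domain() {
    printf '%s\n' "${1%%.*}"
}

strip_domain "hostname.f.q.d.n"   # prints: hostname
strip_domain "hostname"           # prints: hostname (no dot, so unchanged)

# A single % removes only the SHORTEST matching suffix instead:
fqdn="hostname.f.q.d.n"
printf '%s\n' "${fqdn%.*}"        # prints: hostname.f.q.d
```

Because the expansion happens inside the shell, no `cut` or `sed` process is started per line, which is most of the speed-up over the original loop.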
This User Gave Thanks to Aia For This Post:
# 3  
07-08-2015
Try also
Code:
awk '{sub(/\..*$/,"",$2)}1' FS=\| OFS=\| file
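
The one-liner prints to standard output and leaves the input file alone. One possible way to wrap it so the file is updated in place, sketched under the assumption that `mktemp` is available; the function name `strip_fqdn_file` is hypothetical:

```shell
#!/bin/sh
# strip_fqdn_file (hypothetical helper): rewrite FILE so that field 2 of each
# pipe-delimited line keeps only the text before its first dot.
strip_fqdn_file() {
    file=$1
    tmp=$(mktemp) || return 1
    status=1
    # -F sets the input separator. OFS must be set as well, because changing
    # $2 makes awk rebuild the record with the output separator.
    if awk -F'|' 'BEGIN { OFS = "|" } { sub(/\..*$/, "", $2) } 1' "$file" > "$tmp"
    then
        # Copy the result back over the original, keeping its permissions.
        cat "$tmp" > "$file" && status=0
    fi
    rm -f "$tmp"
    return "$status"
}
```

Usage would be `strip_fqdn_file dataset.txt`. The original is only overwritten after awk has finished successfully, so a failed run leaves the data untouched.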

This User Gave Thanks to RudiC For This Post:
# 4  
07-08-2015
Both solutions worked, but yours, RudiC, gave me the "just a few seconds" type of result I was looking for.

Thanks to you as well, Aia: you taught me a way to deal with "while read" loops that is cleaner than my old way.
# 5  
07-08-2015
Code:
perl -pe 's/^([^|]*\|[^|.]*)\.[^|]*/$1/' dataset.txt
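
A sed take on the same idea, for comparison: anchor at the start of the line, skip field 1, keep field 2 up to its first dot, and drop the rest of that field. This is only a sketch; the function name `strip_fqdn_sed` is hypothetical, and it assumes basic regular expressions, where an unescaped `|` is a literal character:

```shell
#!/bin/sh
# strip_fqdn_sed (hypothetical helper): filter stdin or the named files.
#   ^\([^|]*|[^|.]*\)  captures field 1, the separator, and field 2 up to its first dot
#   \.[^|]*            matches the dot and the remainder of field 2, which is dropped
strip_fqdn_sed() {
    sed 's/^\([^|]*|[^|.]*\)\.[^|]*/\1/' "$@"
}

printf '%s\n' 'Field1|hostname.f.q.d.n|field3|field4' | strip_fqdn_sed
# prints: Field1|hostname|field3|field4
```

Lines whose second field has no dot are passed through unchanged, and dots in other fields are not touched because the pattern is anchored to the start of the line.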
