Script in Perl or awk to remove multiple hyphens


 
# 1  
Old 12-04-2016

Dear all,
I have a database of compound words. I want to keep only the strings that contain a single hyphen and to identify the strings that contain more than one. An example of the input is given below:
Code:
test-test
test-test-test
test-test-test-test-test
good-for-nothing

The regex/script should remove every string that has more than one hyphen, so that only strings joined by a single hyphen are retained. For the example above, the output would be:
Code:
test-test

A Perl regex, or an awk or Perl script, would be of great help.
Thanks in advance.
Moderator's Comments:
Please use CODE tags for sample input, sample output, and code segments (not HTML tags).

Last edited by Don Cragun; 12-05-2016 at 12:03 AM.. Reason: Change HTML tags to CODE tags; add ICODE tags.
# 2  
Old 12-05-2016
If you're trying to print lines that contain fewer than two hyphens (that is, none or one), try:
Code:
awk 'gsub(/-/, "-") < 2' file

If you're trying to print only lines that contain exactly one hyphen, try:
Code:
awk 'gsub(/-/, "-") == 1' file

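Both commands rely on gsub() returning the number of substitutions it made: replacing each hyphen with a hyphen leaves the line unchanged and yields the hyphen count. If you want to try this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk or nawk.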
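
As a quick check of the two commands above, here is a minimal sketch that runs them against the sample from post #1. The temporary file and the extra hyphen-free line `plain` are assumptions added for illustration, so that the difference between "fewer than two" and "exactly one" is visible.

```shell
# Sample data from post #1, plus "plain" (added here for illustration only).
sample=$(mktemp)
printf '%s\n' test-test test-test-test test-test-test-test-test \
    good-for-nothing plain > "$sample"

# gsub() returns the number of replacements; "-" -> "-" leaves the line as it was.
fewer_than_two=$(awk 'gsub(/-/, "-") < 2' "$sample")
exactly_one=$(awk 'gsub(/-/, "-") == 1' "$sample")

echo "fewer than two hyphens:"
echo "$fewer_than_two"
echo "exactly one hyphen:"
echo "$exactly_one"

rm -f "$sample"
```

With this input the first command keeps `test-test` and `plain`, while the second keeps only `test-test`.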

Last edited by Don Cragun; 12-05-2016 at 12:17 AM.. Reason: Fix typos: s/-'/-"/
# 3  
Old 12-05-2016
Many thanks, it worked. I replaced the ' with " in
Code:
gsub(/-/, "-') == 1

and it zipped through a huge file of more than 50,000 compounds.
As a matter of curiosity, was the single apostrophe a typo? I work in a DOS environment; was that the reason?
# 4  
Old 12-05-2016
Quote:
Originally Posted by gimley
Many thanks, it worked. I replaced the ' with " in
Code:
gsub(/-/, "-') == 1

and it zipped through a huge file of more than 50,000 compounds.
As a matter of curiosity, was the single apostrophe a typo? I work in a DOS environment; was that the reason?

Yes, it was a typo. I will edit post #2 to fix the typos.
# 5  
Old 12-05-2016
Thanks a lot for your kind help.
# 6  
Old 12-05-2016
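Using awk's field splitting: with the hyphen as the field separator, a line that contains exactly one hyphen has exactly two fields.
Code:
awk -F- 'NF==2' file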
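
The field-splitting approach can also cover the other half of the original request, identifying the strings that have more than one hyphen. The following is a rough sketch under the same assumptions as the thread's sample data; the temporary file and the variable names are hypothetical.

```shell
# Sample data from post #1, written to a temporary file.
sample=$(mktemp)
printf '%s\n' test-test test-test-test test-test-test-test-test \
    good-for-nothing > "$sample"

# With "-" as the separator, a non-empty line has NF - 1 hyphens.
counts=$(awk -F- '{ print NF - 1, $0 }' "$sample")

# Two fields means exactly one hyphen; more than two fields means several.
kept=$(awk -F- 'NF == 2' "$sample")
rejected=$(awk -F- 'NF > 2' "$sample")

echo "hyphen counts:";  echo "$counts"
echo "kept:";           echo "$kept"
echo "rejected:";       echo "$rejected"

rm -f "$sample"
```

For the sample, `kept` holds `test-test` and `rejected` holds the other three strings. Redirecting each awk command to its own file would give one list to retain and one to review.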

# 7  
Old 12-05-2016
Thanks. This solution also worked. I have stored it as a backup.