Help to identify blank space in a file



    #1  
Old 04-19-2018
gimley
Registered User
 

Hello,
I have a dictionary of over 400,000 words with the following structure:
Code:
source=target

The database contains single words as well as phrases. To train on the data, I need only the mappings without a space, i.e. those where neither the source nor the target contains a space.
I use UltraEdit as my editor and have been using the following Unix-style regex to identify lines without a blank space:
Code:
^[^ ]+$

Since the database is too large, the editor runs out of memory and cannot store all instances to the clipboard.
A small sample of the data is given below:
Code:
هيراآباد=हीरा आबाद
واسڪوڊيگاما=वास्कोडीगामा
کانسواءِ=खांसवाइ/खा सिवाइ
آوازنکي=आवाज़नखे
سانآهي=सान आहे
سڏبوآهي=सॾबो आहे
شڪارڪرڻ=शकार करण
ٺاهيندوآهي=ठाहींदो आहे
ٻولينجو=ॿोलीनजो
ٻولينجي=ॿोलीनजे
ڪنديآهي=कंदी आहे
گئسنجو=गैसन जो
ماموغلام=मामूग़ुलाम
زاهدچانڊيو=ज़ाहिद चांडियो
عطرڪمار=अतुरकुमार
غلاممحي=ग़ुलाममही
گلشيرڪوريجو=गुलशेर कोरीजो
زيرحراست=ज़ीर हिरासत

The script should identify only those entries in which neither the source nor the target contains a space and store them in a separate file, as in the sample output below:
Code:
واسڪوڊيگاما=वास्कोडीगामा
آوازنکي=आवाज़नखे
ٻولينجو=ॿोलीनजो
ٻولينجي=ॿोलीनजे
ماموغلام=मामूग़ुलाम
عطرڪمار=अतुरकुमार
غلاممحي=ग़ुलाममही

A Perl or awk script would help. I work in a Windows environment.
Many thanks.
    #2  
Old 04-19-2018
rovf
Registered User
 
Have a look at the -v option of grep:
Code:
grep -vF ' ' your_file
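
A minimal sketch of how this could be applied to the task: `-F` treats the pattern as a fixed string rather than a regex, and `-v` inverts the match, so only lines with no space anywhere are printed. The file names `dict.txt` and `nospace.txt` and the ASCII sample lines are placeholders, not taken from the original post. Because a space is the single byte 0x20 and UTF-8 multibyte sequences never contain that byte, the same filter should behave identically on the Sindhi/Devanagari data.

```shell
# Build a tiny sample in the source=target format (placeholder data).
printf '%s\n' 'ab=cd' 'ab=c d' 'a b=cd' 'ef=gh' > dict.txt

# Keep only lines that contain no space at all and store them separately.
grep -vF ' ' dict.txt > nospace.txt

# Show the result: the two lines without a space on either side.
cat nospace.txt
```

Redirecting to a file, rather than collecting matches in an editor, avoids the clipboard and memory limits mentioned in the question.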

    #3  
Old 04-19-2018
bakunin (Forum Staff)
Bughunter Extraordinaire
 
Quote:
Originally Posted by rovf
Have a look at the -v option of grep:
Code:
grep -vF ' ' your_file

True. Still, as a safety measure, I would strip leading and trailing spaces first:

Code:
sed -n 's/^[[:blank:]]*//;s/[[:blank:]]*$//;/ /!p' your_file > /result/file
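
A minimal sketch of the trim-then-filter idea, using two `s` commands with the `[[:blank:]]` character class followed by a negated print. The file names `padded.txt` and `trimmed.txt` and the sample lines are placeholders chosen for illustration.

```shell
# Sample with a trailing space, a leading space, and an interior space.
printf '%s\n' 'ab=cd ' ' ef=gh' 'ab=c d' > padded.txt

# Strip leading and trailing blanks, then print only the lines
# that still contain no space after trimming.
sed -n 's/^[[:blank:]]*//;s/[[:blank:]]*$//;/ /!p' padded.txt > trimmed.txt

cat trimmed.txt
```

The first two lines survive with their padding removed; the third is dropped because its space sits inside the target.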

I hope this helps.

bakunin
The Following User Says Thank You to bakunin For This Useful Post:
gimley (04-19-2018)
    #4  
Old 04-19-2018
rovf
Registered User
 
Quote:
Originally Posted by bakunin
True. Still, as a safety measure, I would strip leading and trailing spaces first:
For instance, using grep:

Code:
grep -v '[^ ] [^ ]' your_file
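
As a hedged illustration of what this pattern does and does not catch: it rejects a line only when a single space sits directly between two non-space characters. Lines with leading or trailing spaces are kept, with the padding left in place, and a run of two or more interior spaces is not matched either. The second command below is a suggested variant, not part of the original reply, that also covers such runs. File names and sample lines are placeholders.

```shell
# Trailing space, single interior space, double interior space.
printf '%s\n' 'ab=cd ' 'ab=c d' 'ab=c  d' > mixed.txt

# Original pattern: drops only the line with a single interior space.
grep -v '[^ ] [^ ]' mixed.txt > single.txt

# Variant (assumption): one or more spaces between two non-space characters.
grep -v '[^ ]  *[^ ]' mixed.txt > runs.txt

cat single.txt
cat runs.txt
```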

The Following User Says Thank You to rovf For This Useful Post:
gimley (04-19-2018)
    #5  
Old 04-19-2018
gimley
Registered User
 
Many thanks for all your kind help. My broadband connection was down all day, hence the delay. All the solutions worked. I had already ensured that my data had no trailing spaces, so that issue does not arise, but it is nice to have a solution that handles them.
Thanks once again.
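
Since the original question asked for a Perl or awk script, here is a rough awk sketch of the same filter that checks the source and the target separately. It assumes every valid entry has exactly one `=` (lines with none or several are skipped); the file names `dict_awk.txt` and `nospace_awk.txt` and the sample lines are placeholders.

```shell
# Placeholder sample in source=target format.
printf '%s\n' 'ab=cd' 'ab=c d' 'a b=cd' 'ef=gh' > dict_awk.txt

# Split each line on "=" and print it only if it has exactly two
# fields and neither field contains a space.
awk -F'=' 'NF == 2 && index($1, " ") == 0 && index($2, " ") == 0' \
    dict_awk.txt > nospace_awk.txt

cat nospace_awk.txt
```

awk reads the file line by line, so a dictionary of 400,000 entries should not run into the memory limits seen in the editor.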