Regular Expressions -- Find spaces outside
Post 302449161 by arduino411 on Saturday 28th of August 2010 06:53:53 PM (UNIX for Dummies Questions & Answers)

Hello,

I need help with using grep and regular expressions....

I have a long list of about 1000 lines of Chinese flashcards. Here's a small excerpt:

Code:
意文 yìwén (given name)
貴姓 guìxìng (honorable surname)
貴 guì (honorable)
姓 xìng (one's surname is; to be surnamed; surname)
呢 ne (interrogative particle)
叫 jiào (to be called; to call)
名字 míngzi (name)


syntax:
ChineseCharacter ChinesePinYin (EnglishTranslation)
(each has a space to separate it)

To import the list into the flashcard program on my iPod Touch, the information for each side must be separated by tabs, not spaces.
What regular expression will let me find the spaces outside the parentheses and replace them with tabs (I don't want the English text to be altered)?

Any help is greatly appreciated.

Thanks,
Michael

Last edited by arduino411; 08-28-2010 at 07:54 PM.. Reason: unclear syntax
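
A possible approach, sketched with sed rather than grep, since grep selects lines but does not rewrite them. It assumes every line follows the stated syntax exactly: the first space on a line separates the character from the pinyin, and the first space followed by an opening parenthesis starts the translation. The function name spaces_to_tabs is made up for illustration.

```shell
# Hypothetical helper: reads flashcard lines on stdin and writes them
# back out with tabs between the three fields.
spaces_to_tabs() {
    tab=$(printf '\t')    # a literal tab, so no sed-specific \t escape is needed
    # 1st expression: the first space on the line becomes a tab
    # 2nd expression: the space before the first "(" becomes a tab
    sed -e "s/ /$tab/" -e "s/ (/$tab(/"
}

# Sample line taken from the excerpt above
printf '%s\n' '貴姓 guìxìng (honorable surname)' | spaces_to_tabs
```

To convert a whole file, something like `spaces_to_tabs < cards.txt > cards.tsv` should work (both file names are placeholders). Spaces inside the parentheses are left alone because each sed expression replaces only its first match on a line.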
 

grep(1) 						      General Commands Manual							   grep(1)

Name
       grep, egrep, fgrep - search file for regular expression

Syntax
       grep [option...] expression [file...]

       egrep [option...] [expression] [file...]

       fgrep [option...] [strings] [file]

Description
       Commands of the grep family search the input files (standard input default) for lines matching a pattern.  Normally, each line found is
       copied to the standard output.

       The grep command patterns are limited regular expressions in the style of ex(1); grep uses a compact nondeterministic algorithm.  The
       egrep command patterns are full regular expressions; egrep uses a fast deterministic algorithm that sometimes needs exponential space.
       The fgrep command patterns are fixed strings; fgrep is fast and compact.

       In all cases the file name is shown if there is more than one input file.  Take care when using the characters $ * [ ^ | ( ) and \ in the
       expression because they are also meaningful to the Shell.  It is safest to enclose the entire expression argument in single quotes ' '.

       The fgrep command searches for lines that contain one of the (new line-separated) strings.

       The egrep command accepts extended regular expressions.  In the following description `character' excludes new line:

	      A \ followed by a single character other than new line matches that character.

	      The character ^ matches the beginning of a line.

	      The character $ matches the end of a line.

	      A .  (dot) matches any character.

	      A single character not otherwise endowed with special meaning matches that character.

	      A  string  enclosed in brackets [] matches any single character from the string.	Ranges of ASCII character codes may be abbreviated
	      as in `a-z0-9'.  A ] may occur only as the first character of the string.  A literal - must be placed where it can't be mistaken	as
	      a range indicator.

	      A  regular  expression  followed	by  an	* (asterisk) matches a sequence of 0 or more matches of the regular expression.  A regular
	      expression followed by a + (plus) matches a sequence of 1 or more matches of the regular expression.  A regular expression  followed
	      by a ? (question mark) matches a sequence of 0 or 1 matches of the regular expression.

	      Two regular expressions concatenated match a match of the first followed by a match of the second.

	      Two regular expressions separated by | or new line match either a match for the first or a match for the second.

	      A regular expression enclosed in parentheses matches a match for the regular expression.

       The  order  of  precedence  of  operators at the same parenthesis level is the following:  [], then *+?, then concatenation, then | and new
       line.
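
       The rules above can be tried out with a short sketch.  It uses grep -E, the usual spelling of egrep on current systems; that
       spelling is an assumption about the reader's system and is not documented on this page.  The pattern and sample lines are
       made up for illustration.

```shell
# Hypothetical pattern: "word, space, word, space, parenthesized text".
# ^ and $ anchor the line; [^ ]+ is one or more non-space characters;
# \( and \) are literal parentheses; .* is any run of characters.
pattern='^[^ ]+ [^ ]+ \(.*\)$'

# Only the first sample line has the three-field shape, so only it is printed.
printf '%s\n' '呢 ne (interrogative particle)' 'bad line' | grep -E "$pattern"

# Precedence: | binds more loosely than concatenation, so parentheses
# decide what the alternatives are.
printf '%s\n' ab cd abd acd | grep -E '^(ab|cd)$'    # prints ab and cd
printf '%s\n' ab cd abd acd | grep -E '^a(b|c)d$'    # prints abd and acd
```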

Options
       -b	   Precedes each output line with its block number.  This is sometimes useful in locating disk block numbers by context.

       -c	   Produces count of matching lines only.

       -e expression
		   Uses the next argument as the expression.  This is useful when the expression begins with a minus (-).

       -f file	   Takes regular expression (egrep) or string list (fgrep) from file.

       -i	   Considers upper and lowercase letters identical in making comparisons (grep and fgrep only).

       -l	   Lists files with matching lines only once, separated by a new line.

       -n	   Precedes each matching line with its line number.

       -s	   Silent mode: nothing is printed (except error messages).  This is useful for checking the error status (see DIAGNOSTICS).

       -v	   Displays all lines that do not match specified expression.

       -w	   Searches for the expression as a word (as if surrounded by `\<' and `\>').  For further information, see ex(1) (grep only).

       -x	   Prints only lines matched in their entirety (fgrep only).
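
       A brief sketch of three of the options above (-c, -n and -v), run against a few sample lines.  The variable name cards and
       its contents are illustrative only.

```shell
# Three sample lines held in a shell variable (hypothetical data).
cards=$(printf '%s\n' '貴 guì (honorable)' '名字 míngzi (name)' '叫 jiào (to be called; to call)')

# -c: print only the number of matching lines
printf '%s\n' "$cards" | grep -c 'name'    # prints 1

# -n: prefix each matching line with its line number
printf '%s\n' "$cards" | grep -n 'call'    # prints 3:叫 jiào (to be called; to call)

# -v: print the lines that do NOT match (here, lines 1 and 3)
printf '%s\n' "$cards" | grep -v 'name'
```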

Restrictions
       Lines are limited to 256 characters; longer lines are truncated.

Diagnostics
       Exit status is 0 if any matches are found, 1 if none, 2 for syntax errors or inaccessible files.
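
       The exit status can be inspected from the shell.  A minimal sketch follows; the helper name grep_status and the file path
       are hypothetical, and the path is assumed not to exist.

```shell
# Hypothetical helper: run grep with the given arguments, discard its
# output, and print the exit status it returned.
grep_status() {
    grep "$@" >/dev/null 2>&1 && status=0 || status=$?
    echo "$status"
}

printf 'abc\n' | grep_status 'b'           # prints 0: a match was found
printf 'abc\n' | grep_status 'z'           # prints 1: no match
grep_status 'b' /nonexistent/flashcards    # prints 2: the file cannot be opened
```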

See Also
       ex(1), sed(1), sh(1)
