The UNIX and Linux Forums  
  #1 (permalink)  
10-23-2008
maixu134 (Registered User)
Join Date: Oct 2008
Posts: 4
Record splitting with AWK

Hi all!

I need your help as quickly as possible.

My input file looks like this:

bạc t́nh ( 薄情) 1 . 薄情な.2. 夫婦或いは男女の不貞を指す。
bách (百,迫)1.100ドソ. tr a m b a c ともいう. 2.柏(カヽしわ)・ 3.圧迫する.4.差し迫った,

My purpose is to take the value in the first pair of brackets. I used commands like:
...if (index($3,"(")==1) $3=substr($3,2,index($3,")")-1);
else if (index($3,"(")==1) $3=substr($3,2,index($3,")1.")-3);
With the first line I get the value 薄情,
but with the second line the value is wrong, because it took 百,迫)1.100ドソ. tr a m b a c ともいう. 2.柏(カヽしわ.
I want to get 百,迫 instead.

So what can I do?
  #2 (permalink)  
10-23-2008
radoulov (Forum Staff, addict)
Join Date: Jan 2007
Location: Варна, България / Milano, Italia
Posts: 2,854
Try this:

Code:
awk -F'[)(]' '{print $2}' infile
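To see what this command does, it can help to print every field with its number. The sketch below is only an illustration: the sample record is shortened from the question, the variable names line and fields are made up here, and it assumes the brackets in the data are plain ASCII ( and ).

```shell
# Sample record shortened from the question; ASCII brackets are assumed.
line='bách (百,迫)1.100ドソ. 2.柏(カヽしわ)'

# -F'[)(]' turns every "(" or ")" into a field separator.
# Printing each field with its number shows where the pieces end up.
fields=$(printf '%s\n' "$line" |
  awk -F'[)(]' '{ for (i = 1; i <= NF; i++) printf "%d=[%s]\n", i, $i }')

printf '%s\n' "$fields"
```

Field 1 is the text before the first bracket and field 2 is the text inside the first pair, which is why print $2 picks out 百,迫 for this record.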
Note:

Moved to Q & A.
To the OP: please don't post to unrelated old threads, open a new one instead.
Thank you!

Last edited by radoulov; 10-23-2008 at 03:58 AM..
  #3 (permalink)  
10-23-2008
maixu134 (Registered User)
Thank you so much!

But I don't know what the -F option means.
  #4 (permalink)  
10-23-2008
Franklin52 (Forum Staff, Moderator)
Join Date: Feb 2007
Posts: 4,302
Try this, I have used ( or ) as field separators:

Code:
awk 'BEGIN{FS="\\(|\\)"} {print $2}' file
Regards
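On the earlier question about -F: it sets awk's field separator (the FS variable) from the command line, and a value longer than one character is treated as a regular expression. The following is a small sketch, not a definitive recipe, showing that -F and a BEGIN block that assigns FS give the same result. The sample line and the variable names a and b are invented for the illustration.

```shell
# Shortened sample record; ASCII brackets are assumed.
line='bách (百,迫)1.100ドソ.'

# Field separator given on the command line with -F ...
a=$(printf '%s\n' "$line" | awk -F'[()]' '{ print $2 }')

# ... and the same separator assigned to FS before any input is read.
b=$(printf '%s\n' "$line" | awk 'BEGIN { FS = "[()]" } { print $2 }')

printf '%s\n%s\n' "$a" "$b"
```

The bracket expression [()] needs no backslashes, so it sidesteps the question of how a particular awk treats escapes inside a string.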
  #5 (permalink)  
10-23-2008
maixu134 (Registered User)
Thanks!

Before, I did it like this:
BEGIN {
FS="\t";RS="\n";
}
because I want to take the words inside ( ), and $3 is the field that holds the value I want.
I wrote this:
if (index($3,"(")==0 && index($3,")")==0) $3="";
else if (index($3,"(")==1) $3=substr($3,2,index($3,")")-2);

My result should have 2 columns. The first is
bách
and the second is 百,迫,
but I have a problem when a line has two ( ) pairs, like this:
bách (百,迫)1.100ドソ. tr a m b a c ともいう. 2.柏(カヽしわ)

There is a tab before bách and another tab after it.

And my result is
bách 百,迫)1.100ドソ. tr a m b a c ともいう. 2.柏(カヽしわ

Could you help me fix my mistake?
  #6 (permalink)  
10-23-2008
Franklin52 (Forum Staff, Moderator)
The problem is that the file is not well structured. You can try:

Code:
awk -F'[)(]' '{print $1, $2}' infile
Regards
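If the file really is tab-separated as described in the post above (a leading tab, the word, another tab, then the rest), one possible alternative is to keep tab as the field separator and pull only the first bracketed group out of $3 with match(). This is a sketch under that assumption, not a tested fix for the real file: the sample record is shortened, the names out and inner are made up, and the brackets are assumed to be ASCII.

```shell
# Build one sample record: empty first field, the word, then the definition.
out=$(printf '\t%s\t%s\n' 'bách' '(百,迫)1.100ドソ. 2.柏(カヽしわ)' |
  awk 'BEGIN { FS = OFS = "\t" }
  {
      inner = ""
      # match() finds the leftmost "(...)" whose body contains no ")".
      if (match($3, /\([^)]*\)/))
          inner = substr($3, RSTART + 1, RLENGTH - 2)   # drop the brackets
      print $2, inner
  }')

printf '%s\n' "$out"
```

Because [^)]* cannot run past a closing bracket, a second ( ) pair later in the line is ignored. A full-width bracket such as ( would not be matched by this pattern.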
  #7 (permalink)  
10-23-2008
radoulov (Forum Staff, addict)
OK,
could you please post the output of the command below and explain what's wrong with it?
Just run it in your terminal:

Code:
awk -F'[)(]' '{print $1, $2}' name_of_your_input_file

Last edited by radoulov; 10-23-2008 at 07:53 AM..