Post 302949858 by jvoot on Thursday 16th of July 2015 04:29:08 PM
Find and replace single character w/awk given conditions

I have a file that looks like this:

Code:
14985      DPN                        verb                      PPa to spend.
12886      DPNDJN                                               bay tree.
15686      DQ                          verb                      to observe
15656      KC                          verb                      Pa to stay quiet
15835      KCJ                        verb                      Pp|PPa, PPp.

When there are two characters in $2, $3 does not line up with the rows that have other strings in $3. That is to say, when there are three characters in $2, there are 25 spaces before $3 begins. However, when there are two characters in $2, there are 26 spaces, and this throws off the justification of both $3 and every field from $4 onward.

What I want to do is search for when $3 begins on the 40th character and delete a space so that it begins on the 39th.

Thus:

Code:
14985      DPN                        verb                      PPa to spend.
12886      DPNDJN                                               bay tree.
15686      DQ                         verb                      to observe
15656      KC                         verb                      Pa to stay quiet
15835      KCJ                        verb                      Pp|PPa, PPp.
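The edit described above can be stated purely in terms of character positions: if column 39 is a blank and column 40 is not, drop the character at column 39. The following is only a sketch under that assumption. The function name `fix_col39` is made up for illustration, and the demo line is synthetic (built with printf so the column positions are known), not taken from the real file.

```shell
# Hypothetical helper: remove the pad space at column 39 when the
# third column starts one position late (at column 40).
fix_col39() {
    awk '
    # Column 39 is a blank and column 40 holds a non-blank character.
    substr($0, 39, 1) == " " && substr($0, 40, 1) ~ /[^ ]/ {
        # Rebuild the record without character 39.
        $0 = substr($0, 1, 38) substr($0, 40)
    }
    { print }
    ' "$@"
}

# Synthetic demo line: "verb" starts at column 40 before the fix.
printf '%-11s%-28s%-26s%s\n' 15686 DQ verb 'to observe' | fix_col39
```

Because the record is rebuilt from two substrings of $0, the rest of the spacing on the line is left as it was. On the real data it would be called as `fix_col39 file.txt`.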

In order to do this, I've attempted this awk code, but have had trouble combining conditional statements with substrings and substitutions.

Code:
gawk '{if(substr($0,39,1)==" " && $3 ~/verb/); sub(/^ verb/,"verb", $3);print}' file.txt

I've also tried this:

Code:
gawk '{if(substr($0,39,1)==" " && $3 ~/verb/); sub(substr($0,39,2)," ");print}' file.txt

...and this with a variable:

Code:
gawk '$3 ~/^verb$/{X=substr($0,39,1); sub(/ /,"",X)} 1 {print}' SEDRAt
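A few things in awk's rules seem to explain why these attempts do not behave as hoped. In `if (...);` the semicolon is an empty statement that serves as the body of the if, so the sub() after it runs on every line. With the default field separator, $3 never has a leading blank, so `/^ verb/` cannot match it, and any successful change to a field rebuilds $0 with single spaces between fields. In the second attempt, the two characters taken from column 39 are used as a regex that matches wherever it first occurs on the line, not necessarily at column 39. In the third, X is only a copy, so editing it leaves $0 untouched. Below is a sketch of one way to combine the three pieces, assuming the first occurrence of " verb" on the line is the one in front of $3. The function name `shift_verb` is made up for illustration.

```shell
# Hypothetical helper: conditional + substr() + sub() working together.
shift_verb() {
    awk '
    {
        # Braces make the sub() belong to the if; no semicolon after (...).
        if (substr($0, 39, 1) == " " && $3 ~ /^verb$/) {
            # Edit $0 itself rather than $3, so the existing padding
            # between the columns is preserved.
            sub(/ verb/, "verb")
        }
        print
    }
    ' "$@"
}

# Synthetic demo line with "verb" starting at column 40.
printf '%-11s%-28s%-26s%s\n' 15656 KC verb 'Pa to stay quiet' | shift_verb
```

This version only touches rows whose $3 is exactly "verb"; the position-only test on columns 39 and 40 would be needed for other values of $3.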

Perhaps I'm going about this all wrong. Ideally I'd like all of my columns to line up, but since my last column contains multiple spaces, I've had difficulty using printf(). Perhaps there is some use of gawk's FIELDWIDTHS that is escaping me. Nevertheless, I want to learn how to combine conditionals, substrings, and substitutions effectively in awk, which is why I'm asking for help in this manner.
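For the printf() route, one possible sketch is to split each line on runs of two or more spaces, so that single spaces inside the last column stay intact, and then print every column at a fixed width. This rests on several assumptions that may not hold for the whole file: columns are always separated by at least two spaces, the last column never contains two spaces in a row, lines have no leading or trailing blanks, and a line with only three fields is one whose $3 column is empty. The widths 11, 27 and 26 and the name `realign` are made up for illustration. gawk's FIELDWIDTHS expects the input columns to already sit at fixed positions, so it is probably less helpful while the input is still misaligned.

```shell
# Hypothetical helper: re-justify every column with printf.
realign() {
    awk -F '  +' '
    {
        # Three fields: assume the part-of-speech column is empty.
        if (NF == 3) { pos = "";  def = $3 }
        else         { pos = $3;  def = $4 }
        # Left-justify each column to an assumed fixed width.
        printf "%-11s%-27s%-26s%s\n", $1, $2, pos, def
    }
    ' "$@"
}

# Synthetic demo line with the third column one position late.
printf '%-11s%-28s%-26s%s\n' 15686 DQ verb 'to observe' | realign
```

Unlike deleting a single character, this rewrites the whole line, so any row that breaks the assumptions above would need to be checked by hand.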

Thank you all so much.
 
