Cleaning through perl or awk a Stemmer dictionary Post: 302813247

9 More Discussions You Might Find Interesting

1. AIX

doing some spring cleaning....

	USERS="me you jim joe sue"
	for user in ${USERS}; do
	    rmuser -p $user
	    usrdir=`cat /etc/passwd|grep $user|awk -F":" '{ print $6 }'`
	    rm -fr `cat /etc/passwd|grep $user|awk -F":" '{ print $6 }'`
	    echo Deleting: $user '\t' REMOVING: $usrdir
	done

This is for AIX ONLY!!! but easily ported to...

2. UNIX for Dummies Questions & Answers

Cleaning text files

I wish to clean a text file of the following characters: 1/2, 1/4, o (degrees). I can't display these characters. I have tried ALT+189 etc. (my terminal emulator is set to ASCII). How do I display the above? I am using HP-UX 10.

3. UNIX for Dummies Questions & Answers

AWK Data Cleaning

Hello, I am trying to analyze data I recently ran, and the only way to efficiently clean up the data is by using an awk file. I am very new to awk and am having great difficulty with it. In $8 and $9, for example, I am trying to delete numbers that contain 1. I cannot find any tutorials that...

4. Shell Programming and Scripting

File cleaning

Hi, I am getting the source data as below.

Source Data

	CDR_Data,,,,,
	F1,F2,F3,F4,F5,F6
	5,5,6,7,8,7
	6,6,g,,,
	7,7,76,,,
	8,8,gt,,,
	9,9,df ,d,d,d
	,,,,,

5. Shell Programming and Scripting

cleaning the file

Hi, I have a file with multiple rows. Each row has 8 columns. Column 8 has entries separated by commas. I want to exclude all the rows in which column 8 has more than 3 commas.

	1234#0/1 - ABC_1234 3 ATGCATGCATGC HHHIIIGIHVF 1 49:T>C,60:T>C,78:C>A,76:G>T,65:T>G

Thanks, Diya
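One possible approach to the filter described above, as a minimal sketch rather than an answer taken from the thread: split column 8 on commas in awk and keep the row only when the comma count is at most 3. It assumes the columns are whitespace-separated, as in the sample row. The function name `filter_rows` and the second sample row are made up for illustration.

```shell
# Keep only rows whose 8th whitespace-separated column has at most 3 commas.
# filter_rows is a hypothetical helper name; it reads files given as
# arguments, or standard input when called without arguments.
filter_rows() {
    awk '{
        n = split($8, parts, ",")   # n = number of comma-separated entries
        if (n - 1 <= 3) print       # commas = entries - 1; print row unchanged
    }' "$@"
}

# Sample input: the first row is the one from the post (4 commas in column 8),
# the second row is invented for illustration (1 comma in column 8).
sample='1234#0/1 - ABC_1234 3 ATGCATGCATGC HHHIIIGIHVF 1 49:T>C,60:T>C,78:C>A,76:G>T,65:T>G
5678#0/1 + ABC_5678 3 ATGCATGCATGC HHHIIIGIHVF 1 49:T>C,60:T>C'

# Only the second row should survive the filter.
printf '%s\n' "$sample" | filter_rows
```

Using `split` instead of `gsub` to count the commas leaves `$8` untouched, so awk does not rebuild the record and the original spacing of each kept row is preserved.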

6. Shell Programming and Scripting

Cleaning AWK code

Hi, I need some help to clean up the code I use to get the city location. wget -q -O - http://www.ip2location.com/ | grep chkRegionCity | awk 'END { print }' | awk -F"" '{print $4}' It gives me the city, but with a leading space. I am sure this could all be done by a single awk command. Also, if possible...

7. Shell Programming and Scripting

Cleaning output using awk

I have a small problem with my code.

data.html

	<TD class="statuscol2">c</TD>
	<TD class="statuscol3">18</TD>
	<TD class="statuscol4"><SPAN TITLE="#04">test4</SPAN></TD>
	<TD...

8. Shell Programming and Scripting

OCR text that needs cleaning

Hi, I have OCR'ed text that needs cleaning. Lines are delimited by parts of speech (POS); for example, each line will have either adj. OR s. f. OR s. m., etc. I need to uppercase all text before the POS, but all text within parentheses should be lowercase. Text after (and including) the POS...

9. Shell Programming and Scripting

awk xml dictionary script: could I get some input?

I completely understand if nobody wants to take a look at the ENTIRE code. What I am asking is that if anyone could browse quickly over the code and perhaps see if anything could be improved. You need not run the program, but you can if you want to. I have been using awk for about a week or so,...

test::use::ok

Test::use::ok(3)					User Contributed Perl Documentation					  Test::use::ok(3)

NAME

       Test::use::ok - Alternative to Test::More::use_ok

SYNOPSIS

	   use ok 'Some::Module';

DESCRIPTION

       According to the Test::More documentation, it is recommended to run "use_ok()" inside a "BEGIN" block, so functions are exported at
       compile-time and prototypes are properly honored.

       That is, instead of writing this:

	   use_ok( 'Some::Module' );
	   use_ok( 'Other::Module' );

       One should write this:

	   BEGIN { use_ok( 'Some::Module' ); }
	   BEGIN { use_ok( 'Other::Module' ); }

       However, people often either forget to add "BEGIN", or mistakenly group "use_ok" with other tests in a single "BEGIN" block, which can
       create subtle differences in execution order.

       With this module, simply change all "use_ok" in test scripts to "use ok", and they will be executed at "BEGIN" time.  The explicit space
       after "use" makes it clear that this is a single compile-time action.

SEE ALSO

       Test::More

CC0 1.0 Universal
       To the extent possible under law, XX has waived all copyright and related or neighboring rights to Test-use-ok.

       This work is published from Taiwan.

       <http://creativecommons.org/publicdomain/zero/1.0>

POD ERRORS

       Hey! The above document had some coding errors, which are explained below:

       Around line 45:
	   Non-ASCII character seen before =encoding in 'XX'. Assuming UTF-8

perl v5.18.2							    2012-09-11							  Test::use::ok(3)