How to delete corrupted characters and then do fuzzy searches?
Post 302456825 by Bashingaway on Sunday 26th of September 2010 07:23:50 AM
Hi

Because a fuzzy keyword search will be run against the page afterwards.

So you may be able to get a 'hit' on %&^%pated, which lets you check manually whether it's a match, whereas ^*&^ (or any block of consecutive non-alpha characters) is always a dud.

It's the results of the fuzzy search I'm really interested in, so I don't want to delete too much data that might 'hit', but I do want to delete as much 'garbage' as possible to speed up search times.

Hope that makes it clearer?

Scrutinizer

Thanks for the code, but it's not working the way I expected. For example:

Code:
echo "&^^%% a" | sed  -r 's/[[:punct:]]\{4,\}//g'

outputs &^^%% a

whereas I expected it to output

a

Have I missed something?

I'm using Fedora 13
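
One possible explanation, assuming GNU sed (which is what Fedora ships): with -r the pattern is read as an extended regular expression, where \{ and \} are literal brace characters and not an interval. The command above would then be looking for a punctuation character followed by the literal text {4,}, which never occurs in the input. A minimal sketch with unescaped braces follows; the function name strip_punct_runs is only illustrative and not from the thread.

```shell
#!/bin/sh
# Hypothetical helper: delete every run of 4 or more punctuation characters.
# Assumes GNU sed; with -r (extended regex) the interval is written {4,}
# without backslashes.
strip_punct_runs() {
  sed -r 's/[[:punct:]]{4,}//g'
}

# The run "&^^%%" is 5 characters long, so it is deleted.
echo "&^^%% a" | strip_punct_runs    # prints " a" (the leading space stays)

# A run of only 2 punctuation characters is left alone.
echo "ab!!cd" | strip_punct_runs     # prints "ab!!cd"
```

Note that the space before the a is not punctuation, so it survives; getting a bare a would need an extra step to trim leading whitespace. Without -r, the original escaped form \{4,\} should behave as an interval.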
 
