Post 302456725 by Bashingaway on Saturday 25th of September 2010 12:59:13 PM
How to delete corrupted characters and then do fuzzy searches?

Hi All

I have a whole block of pages that have come in from various sources. Unfortunately, in many instances the pages contain blocks of corrupted text. What I'm trying to do is write a sed line that will delete non-alphanumeric characters only if they're in a block of, say, three or four characters, i.e.

constipated would stay the same

con5tipated would stay the same

con%^|pated would stay the same

&*^%%pated would stay the same

^& would stay the same

&^*^ would get deleted

I was thinking along the lines of....

Code:
sed -r 's/^.*[[:punct:]]//g'

However, this seems to delete any block that contains a punctuation character, even if the block also contains valid alphanumerics.

I'm familiar with using \b for word boundaries, but unless I can get the core sed to work I'm stuck.

Could anyone possibly offer some pointers, with an explanation of why my example doesn't work and theirs does? That way it helps me learn.

Thanks in advance.
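
One possible reading of the problem, offered as a sketch rather than a definitive answer: the pattern ^.*[[:punct:]] is anchored at the start of the line and .* is greedy, so it swallows everything up to and including the last punctuation character on the line (con%^|pated becomes pated). The sketch below assumes that "block" means a whitespace-delimited token, and that a token should be deleted only when it consists entirely of punctuation and is at least three characters long. The function name strip_junk_runs and the threshold of 3 are illustrative choices, not something from the original post. It uses -E, which GNU sed treats the same as -r.

```shell
#!/bin/sh
# Hypothetical helper: reads lines on stdin and removes any whitespace-delimited
# token made up only of punctuation characters, if it is 3 or more characters long.
# Assumption: "block" = whitespace-delimited token; threshold 3 is a guess.
strip_junk_runs() {
    # (^|[[:space:]])   token must start at line start or after whitespace
    # [[:punct:]]{3,}   three or more punctuation characters and nothing else
    # ([[:space:]]|$)   token must end at whitespace or line end
    # The surrounding whitespace is put back via \1 and \2.
    # The :a / ta loop repeats the substitution until nothing matches, so that
    # two junk tokens separated by a single space are both removed.
    sed -E -e ':a' \
           -e 's/(^|[[:space:]])[[:punct:]]{3,}([[:space:]]|$)/\1\2/' \
           -e 'ta'
}

# The six sample strings from the post, one per line.
printf '%s\n' 'constipated' 'con5tipated' 'con%^|pated' \
              '&*^%%pated' '^&' '&^*^' | strip_junk_runs
```

With this reading, the first five samples pass through unchanged and the last one comes out as an empty line. The whitespace around a removed token is kept, so a junk token in the middle of a line leaves two spaces behind; squeezing those is a separate step if it matters.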
 
