Identify records having junk characters in unix
Hi Friends,
I need a Unix command that outputs all the records in a file that contain junk characters. I know that cat -tv <Filename> displays the file so that any junk characters can be spotted by eye, but my requirement is to fetch ONLY THOSE records having junk characters. Please suggest.
Thanks in advance,
Suresh.
Hi ,
Junk characters means something like this, which is what I got when I ran cat on the Unix file:
|ש××××ª× ×¢×× ×ר××¦× ××שר×ת ×××× ××× ×××ר×× ×- ×©× × ×ש××××××× ××××¢× ××××¤× ×××§× ×× ×××ר ×××ש×××××××××× ××× × ×סר××× ××©× × ××××××× ×©×× ×× × ××§××××. ×××§×©× ×ס×ר××××× ×©× × ×©×"×: 482304481-×ש××× ×©× 3 ×××××ת ×©× ××¡×¨× ×¨×§ ×©× ××××× ×©× ×סר ×××¢× ×ª× ×××ר ×××ש××××§××× ×ª×× 6 ×××× ××× ×¢××¨× ×××ש××××ס××¨× ×ס×פ×ת. ××¢××¨×ª× ×ת××× ×ת ×ת ××¢×× ×××. ×××ת|
Thanks and Regards,
Suresh
© has the character code 169 (0xA9), which lies outside the 7-bit ASCII range (0-127).
This link should be useful to you: Unicode/UTF-8-character table - starting from code position 0080.
Something like this should do it.
Code:
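#! /opt/third-party/bin/perl
use strict;
use warnings;

# Open the file named on the command line and read it as raw bytes.
open(my $fh, "<", $ARGV[0]) || die ("unable to open <$!>\n");
binmode($fh);

# Examine the file one byte at a time.
while( read($fh, my $data, 1) == 1 ) {
    my $ordVal = ord($data);
    if( $ordVal == 169 ) {
        # similarly for other characters as well,
        # better option would be to build a range for that
        # (for example, every value above 127)
        # do the processing here
    }
}
close($fh);
exit 0;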
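The range idea from the comments in the code above can also be tried directly from the shell. This is only a minimal sketch, assuming that "junk" means any byte outside printable ASCII and ordinary whitespace, and that the local grep treats every byte as a single character when run with LC_ALL=C. The function name junk_records is made up for illustration and does not come from this thread.

```shell
# junk_records FILE
# Hypothetical helper: print "line-number:record" for every record that
# contains at least one byte that is neither printable ASCII nor whitespace.
junk_records() {
    # LC_ALL=C makes grep work byte by byte, so bytes >= 0x80 and
    # control bytes fall outside both [:print:] and [:space:].
    LC_ALL=C grep -n '[^[:print:][:space:]]' -- "$1"
}
```

Calling junk_records yourfile lists only the offending records, each prefixed with its line number. Drop -n if the line numbers are not wanted.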
Maybe the file command can help you?
Otherwise you will need to be more specific. You may be using a character set that comes out strange in the terminal but is fine in any other application.
Example: running
    file * | grep text
in a random directory gives me something like
    ecl: ASCII text
    gitt: Bourne-Again shell script text executable
    HELP: ASCII English text
    t2s: POSIX shell script text executable
    time2Long.java: ASCII Java program text
(and the lines filtered out could be lines like
    Firefox_wallpaper.png: PNG image data, 1914 x 818, 8-bit/color RGB, non-interlaced
    FW6AK115310.pdf: PDF document, version 1.3
    itinerary-hotel-3S69Q2.RTF: Rich Text Format data, version 1, ANSI
)
Please be more specific if you can.
/Lakris
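If the question is really about the character set, one way to narrow it down per record is to check whether each line is valid in the encoding the file is supposed to use. The sketch below uses iconv rather than file, and assumes the expected encoding is UTF-8, which nothing in this thread confirms. The function name invalid_utf8_records is hypothetical.

```shell
# invalid_utf8_records FILE
# Hypothetical helper: print the line number of every record that is
# not valid UTF-8 (the expected encoding is an assumption).
invalid_utf8_records() {
    n=0
    # IFS= and -r keep each record exactly as it was read;
    # the extra test handles a last line without a trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        n=$((n + 1))
        # iconv exits non-zero when the input is not valid in the -f encoding.
        if ! printf '%s\n' "$line" | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1; then
            printf '%s\n' "$n"
        fi
    done < "$1"
}
```

Records that pass this check but still look wrong in the terminal are more likely a display or locale problem than bad data.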