The UNIX and Linux Forums  


#1  sureshg_sampat  11-14-2007
Identify records having junk characters in unix

Hi Friends,

I need a Unix command that outputs all the records in a file that contain junk characters.

I know that cat -tv <Filename> displays the file with non-printing characters made visible, so we can check for junk characters by eye.

But my requirement is to fetch ONLY THOSE records that have junk characters.
Please suggest.

Thanks in advance,
Suresh.

#2  matrixmadhan  11-14-2007

What do you mean by junk characters?

Characters within a specific ASCII range?
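
If "junk" is taken to mean any byte outside printable ASCII, the records can be picked out with grep alone. This is only a sketch under that assumption; the function name junk_lines is made up for illustration, and tabs and spaces are deliberately not treated as junk.

```shell
#!/bin/sh
# junk_lines FILE: print "lineno:line" for every line that holds a byte which
# is neither printable ASCII nor a blank (space or tab).
# The name junk_lines is hypothetical, not something from this thread.
junk_lines() {
    # LC_ALL=C forces byte-wise matching, so bytes 0x80-0xFF count as junk
    LC_ALL=C grep -n '[^[:print:][:blank:]]' "$1"
}
```

Calling junk_lines data.txt would list only the offending records with their line numbers. If a different range should count as junk, the bracket expression is the part to change.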

#3  porter  11-14-2007

Code:
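#!/bin/sh
# Echo each line of stdin that contains a junk byte.
# hasjunk was not defined in the post; this definition is an assumption:
# junk = any byte that is neither printable ASCII nor a space/tab.
hasjunk() {
        printf '%s\n' "$1" | LC_ALL=C grep -q '[^[:print:][:blank:]]'
}

while IFS= read -r N
do
        if hasjunk "$N"
        then
             printf '%s\n' "$N"
        fi
done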

#4  sureshg_sampat  11-14-2007

Hi,

By junk characters I mean something like this, which I saw when I ran cat on the Unix file:

|ש××××ª× ×¢×× ×ר××¦× ××שר×ת ×××× ××× ×××ר×× ×- ×©× × ×ש××××××× ××××¢× ××××¤× ×××§× ×× ×××ר ×××ש×××××××××× ××× × ×סר××× ××©× × ××××××× ×©×× ×× × ××§××××. ×××§×©× ×ס×ר××××× ×©× × ×©×"×: 482304481-×ש××× ×©× 3 ×××××ת ×©× ××¡×¨× ×¨×§ ×©× ××××× ×©× ×סר ×××¢× ×ª× ×××ר ×××ש××××§××× ×ª×× 6 ×××× ××× ×¢××¨× ×××ש××××ס××¨× ×ס×פ×ת. ××¢××¨×ª× ×ת××× ×ת ×ת ××¢×× ×××. ×××ת|

Thanks and Regards,
Suresh

#5  matrixmadhan  11-14-2007

© has character code 169 (0xA9). Strictly, that is outside 7-bit ASCII (0-127); 169 is its Latin-1/Unicode code point.

This link should be useful to you:

Unicode/UTF-8-character table - starting from code position 0080

Something like this should do it:


Code:
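#! /opt/third-party/bin/perl
use strict;
use warnings;

# Read the file named on the command line one byte at a time.
open(my $fh, "<", $ARGV[0]) || die("unable to open <$!>\n");
binmode($fh);    # raw bytes, no encoding layer

my $data;
while ( read($fh, $data, 1) == 1 ) {
  my $ordVal = ord($data);    # numeric value of the byte, 0-255
  if ( $ordVal == 169 ) {
    # similarly for other characters as well,
    # better option would be to build a range for that
    # do the processing here
  }
}

close($fh);

exit 0;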
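
To build the range mentioned in the comments, it helps to see which byte values actually occur in the file. The following is a rough sketch, assuming "junk" means anything other than tab, newline and printable ASCII (32-126); the function name junk_byte_values is hypothetical.

```shell
#!/bin/sh
# junk_byte_values FILE: list the distinct decimal values of every byte in
# FILE that is not tab, newline or printable ASCII (32-126).
# The name junk_byte_values is hypothetical, not something from this thread.
junk_byte_values() {
    LC_ALL=C tr -d '\011\012\040-\176' < "$1" |  # keep only the suspect bytes
        od -A n -v -t u1 |                       # print them as unsigned decimals
        tr -s ' ' '\n' |                         # one value per line
        sed '/^$/d' |                            # drop blank lines left by od's padding
        sort -nu                                 # distinct values, ascending
}
```

The values it prints can then be turned into the comparison range, instead of testing for 169 alone.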

#6  Lakris  11-14-2007

Maybe the file command can help you?
Otherwise you must be more specific: you may be using a character set that looks strange in the terminal but is fine in other applications.

Example:
file * | grep text
In a random directory it gives me something like
ecl: ASCII text
gitt: Bourne-Again shell script text executable
HELP: ASCII English text
t2s: POSIX shell script text executable
time2Long.java: ASCII Java program text

(and the lines filtered out could be lines like
Firefox_wallpaper.png: PNG image data, 1914 x 818, 8-bit/color RGB, non-interlaced
FW6AK115310.pdf: PDF document, version 1.3
itinerary-hotel-3S69Q2.RTF: Rich Text Format data, version 1, ANSI
)
Please be more specific if you can.

/Lakris
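
One way to test the character-set idea is to check whether the file decodes cleanly as UTF-8. If it does, the "junk" may simply be multi-byte text shown in the wrong character set rather than corrupt data. This is a sketch only: UTF-8 is a guess at the encoding (the thread never says what it is), and the function names are hypothetical.

```shell
#!/bin/sh
# is_valid_utf8 FILE: succeed if FILE decodes cleanly as UTF-8.
# Assumption: UTF-8 is the encoding worth testing; the thread does not say.
is_valid_utf8() {
    # iconv exits non-zero when it meets an invalid byte sequence
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

# report_encoding FILE: print a one-line verdict (hypothetical helper).
report_encoding() {
    if is_valid_utf8 "$1"; then
        echo "$1: valid UTF-8 (possibly just a display mismatch)"
    else
        echo "$1: contains byte sequences that are not valid UTF-8"
    fi
}
```

Note that a plain ASCII file also passes this check, so it only separates "well-formed text" from "broken bytes"; it does not say which lines look odd.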

#7  prowla  11-14-2007

Use a language which has the iscntrl() function (C has it in <ctype.h>); Perl offers the equivalent [[:cntrl:]] character class in regular expressions.
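
The same control-character test is available from the shell through grep's [[:cntrl:]] class. A minimal sketch follows; the function name cntrl_lines is hypothetical. Be aware that this class covers only control characters (tab included), so it would not flag high-bit bytes like the ones in the sample from post #4.

```shell
#!/bin/sh
# cntrl_lines FILE: print "lineno:line" for every line containing a control
# character (byte values 0-31 and 127; note that tab is one of them).
# The name cntrl_lines is hypothetical, not something from this thread.
cntrl_lines() {
    # LC_ALL=C keeps the class definition byte-based and locale-independent
    LC_ALL=C grep -n '[[:cntrl:]]' "$1"
}
```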