Full Discussion: Unicode file validation
Post 302618029 by Corona688 on Tuesday 3rd of April 2012 01:06:08 PM
Old 04-03-2012
Bumping up posts or double posting is not permitted in these forums.

Please read the rules, which you agreed to when you registered, if you have not already done so.

You may receive an infraction for this. If so, don't worry, just try to follow the rules more carefully. The infraction will expire in the near future.

Thank You.

The UNIX and Linux Forums.

Also, your attachment doesn't seem to be working.
 

9 More Discussions You Might Find Interesting

1. Programming

How to display unicode characters / unicode string

I have a stream of characters like "\u8BBE\u5907\u7BA1" and I want to display it. I tried the following things already without any luck. 1) printf("%s",L("\u8BBE\u5907\u7BA1")); 2) printf("%lc",0x8BBE); 3) setlocale followed by fwide followed by wprintf 4) also changed the locale manually... (3 Replies)
Discussion started by: jackdorso
3 Replies

2. UNIX for Dummies Questions & Answers

grep and UNICODE (utf-16) file

I'm using shell scripting in Applescript. When searching a file with the ANSEL character set (for GEDCOM files) using (grep '1 CHAR ANSEL' filepath) gives the expected result. When searching a UNICODE formatted file (utf-16), searching for text known to exist in the file using (grep '1 CHAR... (4 Replies)
Discussion started by: Whiterock
4 Replies

3. Shell Programming and Scripting

Find Unicode Character in File

I have a very large file in Unix that I would like to search for all instances of the unicode character 0x17. I need to remove these characters because the character is causing my SAX Parser to throw an exception. Does anyone know how to find a unicode character in a file? Thank you for your... (1 Reply)
Discussion started by: azelinsk
1 Reply

4. AIX

Dont want to change the codepage of a unicode file

I have a unicode file which needs to be modified in an AIX environment from within a shell (ksh). I am concerned that the modification may involve a change in the file's codepage. Is my concern correct? If so, what's the way around it? Thanks in advance. (0 Replies)
Discussion started by: shibajighosh
0 Replies

5. Shell Programming and Scripting

sed replacement in unicode file

Hi there, I have a file generated by a windows registry (it's unicode) and can't get to do some replacements on it. I want to join lines that end with backslash with the next one. santiago@ks354286:~$ cat win.reg ÿþWindows Registry Editor Version 5.00 ... (10 Replies)
Discussion started by: chebarbudo
10 Replies

6. Shell Programming and Scripting

Converting Unicode file to UTF8 format

Hi, I have a file on my desktop which is in a unicode format. After this file is transferred to Unix using FTP, we are seeing some special character (like rectangle box type) at the first line. The same file is saved as UTF8 (using textpad tool, selecting encode to UTF-8 option) on my desktop and... (7 Replies)
Discussion started by: vfrg
7 Replies

7. Shell Programming and Scripting

How to remove Unicode <feff> from top of file?

Experts, this has been dumped on me at the last minute.... I am having an issue with a few files: I'm getting files from the source with a BOM mark at the top of every file and I need to check for its existence and remove it. <feff> header Coulmn1|column2......n I know I can simply do sed on... (5 Replies)
Discussion started by: biztank
5 Replies

8. Shell Programming and Scripting

Reading/Viewing an Unicode file

We have a file coming from a server that has characters for 4-5 languages. If I download the file to my Windows PC and open it in Notepad++, I can clearly see the text in different languages. Notepad++ is able to render text that is in Portuguese, French, Thai etc. My objective is to do the following:... (2 Replies)
Discussion started by: vskr72
2 Replies

9. Shell Programming and Scripting

Wget download file content in unicode

Hi All, I am trying to download an XML file from a URL through wget and am successful in that, but the problem is that I have to check for some special characters inside that XML. But when I download through wget it transfers the content of the XML in plain text and I'm not able to search for those... (2 Replies)
Discussion started by: dips_ag
2 Replies
CGI::Untaint(3pm)					User Contributed Perl Documentation					 CGI::Untaint(3pm)

NAME
       CGI::Untaint - process CGI input parameters

SYNOPSIS
         use CGI::Untaint;

         my $q = new CGI;
         my $handler = CGI::Untaint->new( $q->Vars );
         my $handler2 = CGI::Untaint->new({
             INCLUDE_PATH => 'My::Untaint',
         }, $apr->parms);

         my $name     = $handler->extract(-as_printable => 'name');
         my $homepage = $handler->extract(-as_url => 'homepage');
         my $postcode = $handler->extract(-as_postcode => 'address6');

         # Create your own handler...

         package MyRecipes::CGI::Untaint::legal_age;
         use base 'CGI::Untaint::integer';
         sub is_valid {
             shift->value > 21;
         }

         package main;
         my $age = $handler->extract(-as_legal_age => 'age');

DESCRIPTION
       Dealing with large web based applications with multiple forms is a minefield. It's often hard enough to ensure you validate all your input at all, without having to worry about doing it in a consistent manner. If any of the validation rules change, you often have to alter them in many different places. And, if you want to operate taint-safe, then you're just adding even more headaches.

       This module provides a simple, convenient, abstracted and extensible manner for validating and untainting the input from web forms.

       You simply create a handler with a hash of your parameters (usually $q->Vars), and then iterate over the fields you wish to extract, performing whatever validations you choose. The resulting variable is guaranteed not only to be valid, but also untainted.

CONSTRUCTOR
   new
         my $handler = CGI::Untaint->new( $q->Vars );
         my $handler2 = CGI::Untaint->new({
             INCLUDE_PATH => 'My::Untaint',
         }, $apr->parms);

       The simplest way to construct an input handler is to pass a hash of parameters (usually $q->Vars) to new(). Each parameter can then be extracted later by calling an extract() method on it.

       However, you may also pass a leading reference to a hash of configuration variables. Currently the only such variable supported is 'INCLUDE_PATH', which allows you to specify a local path in which to find extraction handlers. See "LOCAL EXTRACTION HANDLERS".

METHODS
   extract
         my $homepage = $handler->extract(-as_url => 'homepage');
         my $state = $handler->extract(-as_us_state => 'address4');
         my $state = $handler->extract(-as_like_us_state => 'address4');

       Once you have constructed your Input Handler, you call the 'extract' method on each piece of data with which you are concerned.

       This takes an -as_whatever flag to state what type of data you require. This will check that the input value correctly matches the required specification, and return an untainted value. It will then call the is_valid() method, where applicable, to ensure that this doesn't just _look_ like a valid value, but actually is one.

       If you want to skip this stage, then you can call -as_like_whatever which will perform the untainting but not the validation.

   error
         my $error = $handler->error;

       If the validation failed, this will return the reason why.

LOCAL EXTRACTION HANDLERS
       As well as the handlers supplied with this module for extracting data, you may also create your own. In general these should inherit from 'CGI::Untaint::object', and must provide an '_untaint_re' method which returns a compiled regular expression, suitably bracketed such that $1 will return the untainted value required.

       e.g. if you often extract single digit variables, you could create

         package My::Untaint::digit;

         use base 'CGI::Untaint::object';

         sub _untaint_re { qr/^(\d)$/ }

         1;

       You should specify the path 'My::Untaint' in the INCLUDE_PATH configuration option. (See new() above.)

       When extract() is called CGI::Untaint will also check to see if you have an is_valid() method, and if so will run this against the value extracted from the regular expression (available as $self->value).

       If this returns a true value, then the extracted value will be returned, otherwise we return undef.

       is_valid() can also modify the value being returned, by assigning $self->value($new_value)

       e.g. in the above example, if you sometimes need to ensure that the digit extracted is prime, you would supply:

         sub is_valid { (1 x shift->value) !~ /^1?$|^(11+?)\1+$/ };

       Now, when users call extract(), it will also check that the value is valid, i.e. prime:

         my $number = $handler->extract(-as_digit => 'value');

       A user wishing to skip the validation, but still ensure untainting can call

         my $number = $handler->extract(-as_like_digit => 'value');

   Test::CGI::Untaint
       If you create your own local handlers, then you may wish to explore Test::CGI::Untaint, available from the CPAN. This makes it very easy to write tests for your handler. (Thanks to Profero Ltd.)

AVAILABLE HANDLERS
       This package comes with the following simplistic handlers:

         printable  - a printable string
         integer    - an integer
         hex        - a hexadecimal number (as a string)

       To really make this work for you you either need to write, or download from CPAN, other handlers. Some of the handlers available on CPAN include:

         asin        - an Amazon ID
         boolean     - boolean value
         country     - a country code or name
         creditcard  - a credit card number
         date        - a date (into a Date::Simple)
         datetime    - a date (into a DateTime)
         email       - an email address
         hostname    - a DNS host name
         html        - sanitized HTML
         ipaddress   - an IP address
         isbn        - an ISBN
         uk_postcode - a UK Postcode
         url         - a URL
         zipcode     - a US zipcode

BUGS
       None known yet.

SEE ALSO
       CGI. perlsec. Test::CGI::Untaint.

AUTHOR
       Tony Bowden

BUGS and QUERIES
       Please direct all correspondence regarding this module to: bug-CGI-Untaint@rt.cpan.org

COPYRIGHT and LICENSE
       Copyright (C) 2001-2005 Tony Bowden. All rights reserved.

       This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

perl v5.10.0                                                      2005-09-20                                              CGI::Untaint(3pm)
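
The two-stage check described under LOCAL EXTRACTION HANDLERS (untaint with a regular expression, then optionally call is_valid()) can be sketched outside Perl. The Python below is only an illustration of that flow and is not CGI::Untaint's API: the names Digit, extract and the validate flag are hypothetical, with validate=False standing in for the -as_like_digit form.

```python
import re
from typing import Mapping, Optional


class Digit:
    """Hypothetical stand-in for a local extraction handler (not the Perl module)."""

    def __init__(self, raw: str) -> None:
        self.raw = raw
        self.value: Optional[str] = None

    def untaint(self) -> bool:
        # Counterpart of _untaint_re: group 1 is the cleaned value.
        m = re.fullmatch(r"(\d)", self.raw, re.ASCII)
        if m is None:
            return False
        self.value = m.group(1)
        return True

    def is_valid(self) -> bool:
        # Unary primality test: write n as a string of n ones. The string
        # matches if n is 0 or 1, or if it splits into two or more copies
        # of one block of at least two ones (\1 is a backreference to that
        # block), i.e. n is composite. No match means n is prime.
        ones = "1" * int(self.value)
        return re.fullmatch(r"1?|(11+?)\1+", ones) is None


def extract(params: Mapping[str, str], field: str, validate: bool = True) -> Optional[str]:
    """Return the untainted value, or None if untainting or validation fails."""
    raw = params.get(field)
    if raw is None:
        return None
    handler = Digit(raw)
    if not handler.untaint():
        return None
    if validate and not handler.is_valid():
        return None
    return handler.value


if __name__ == "__main__":
    form = {"value": "7", "other": "4"}
    print(extract(form, "value"))                  # prime digit
    print(extract(form, "other"))                  # composite digit
    print(extract(form, "other", validate=False))  # untaint only
```

Passing validate=False mirrors the -as_like_ form: the pattern match still runs, so anything that is not a single digit is rejected, but the primality check is skipped.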
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.