Full Discussion: iconv and xmllint
Post 302139424 by cbkihong on Saturday 6th of October 2007 12:53:05 PM (UNIX for Advanced & Expert Users)
Yes, that's why it works well when you are doing validation and would normally expect everything to pass.

Otherwise this shortcut gets quite messy if some records really do have problems. That's why my other suggestion is to use libiconv: as far as I know, you can instruct it to ignore bytes that cannot be converted and carry on, without stopping the conversion. The iconv executable alone gives you less control. Its -c option silently discards characters that cannot be converted, but there are no "hooks" that let you handle each failure yourself from the command line.
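
To make the "ignore and proceed" idea concrete, here is a minimal sketch. It is an assumption on my part, not tested against your data: it uses only the standard iconv(3) call and skips one input byte whenever the call reports EILSEQ or EINVAL, rather than any libiconv-specific setting. The names convert_lossy and skipped are made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper: convert `in` (inlen bytes) into `out` (at most outcap
 * bytes) with an already opened descriptor `cd`. Whenever iconv() stops on an
 * invalid or unconvertible sequence (EILSEQ) or on a truncated sequence at the
 * end of the input (EINVAL), one input byte is dropped and conversion resumes.
 * Returns the number of bytes written, or (size_t)-1 if `out` is too small or
 * an unexpected error occurs. `*skipped` receives the number of dropped bytes. */
size_t convert_lossy(iconv_t cd, const char *in, size_t inlen,
                     char *out, size_t outcap, size_t *skipped)
{
    assert(skipped != NULL);

    char *inp = (char *)in;   /* iconv() takes char **, but only reads the input */
    char *outp = out;
    size_t outleft = outcap;
    *skipped = 0;

    while (inlen > 0) {
        if (iconv(cd, &inp, &inlen, &outp, &outleft) != (size_t)-1)
            break;                        /* everything was consumed */
        if (errno == EILSEQ || errno == EINVAL) {
            inp++;                        /* this is the "hook": drop one byte */
            inlen--;
            (*skipped)++;
        } else {
            return (size_t)-1;            /* E2BIG or something unexpected */
        }
    }
    return outcap - outleft;
}
```

The place where the byte is dropped is where you could instead log the offset or write a placeholder. Note that how unconvertible (as opposed to invalid) characters are reported can differ between iconv implementations, so check the behaviour on your platform.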

Loading the character tables is a very expensive operation, so starting iconv many times is bound to be slow. If you really have that many records, you should invest in a C program that uses libiconv and works on a concatenated sequence of records. I have a good feeling that this would work, based on my earlier exploration of libiconv, although I have not written anything similar myself.
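
A rough sketch of what such a program's core could look like, under several assumptions of mine: records are separated by a newline byte, both encodings are stateless and ASCII-compatible (so UTF-8 or ISO-8859-x, not UTF-16), and a record that fails to convert is dropped whole. The function name convert_records is hypothetical. The point is that the caller runs iconv_open() once and the same descriptor is reused for every record.

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: convert newline-separated records from `in` using one
 * shared descriptor `cd`. Each record that converts cleanly is appended to
 * `out` followed by '\n'; a record that fails is dropped as a whole.
 * Returns the number of dropped records, or -1 if `out` is too small.
 * `*outlen` receives the number of bytes written. */
long convert_records(iconv_t cd, const char *in, size_t inlen,
                     char *out, size_t outcap, size_t *outlen)
{
    assert(outlen != NULL);

    const char *end = in + inlen;
    size_t used = 0;
    long dropped = 0;

    while (in < end) {
        const char *nl = memchr(in, '\n', (size_t)(end - in));
        size_t reclen = nl ? (size_t)(nl - in) : (size_t)(end - in);

        char *inp = (char *)in;           /* iconv() only reads the input */
        size_t inleft = reclen;
        char *outp = out + used;
        size_t outleft = outcap - used;

        iconv(cd, NULL, NULL, NULL, NULL);   /* reset to the initial state */
        if (iconv(cd, &inp, &inleft, &outp, &outleft) == (size_t)-1) {
            if (errno == E2BIG)
                return -1;
            dropped++;                    /* `used` is not advanced, so any
                                             partial output is overwritten */
        } else {
            if (outleft == 0)
                return -1;
            *outp++ = '\n';               /* keep the record separator */
            used = (size_t)(outp - out);
        }
        in += reclen + (nl ? 1 : 0);
    }
    *outlen = used;
    return dropped;
}
```

The caller would open the descriptor once, for example with iconv_open("ISO-8859-1", "UTF-8"), feed the concatenated input through this function, and close it with iconv_close() at the end. Instead of only counting dropped records, you could record their positions so they can be inspected later.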
 

ICONV(1)                  Linux Programmer's Manual                  ICONV(1)

NAME
       iconv - character set conversion

SYNOPSIS
       iconv [OPTION...] [-f encoding] [-t encoding] [inputfile ...]
       iconv -l

DESCRIPTION
       The iconv program converts text from one encoding to another encoding.
       More precisely, it converts from the encoding given for the -f option
       to the encoding given for the -t option. Either of these encodings
       defaults to the encoding of the current locale. All the inputfiles are
       read and converted in turn; if no inputfile is given, the standard
       input is used. The converted text is printed to standard output.

       The encodings permitted are system dependent. For the libiconv
       implementation, they are listed in the iconv_open(3) manual page.

       Options controlling the input and output format:

       -f encoding, --from-code=encoding
              Specifies the encoding of the input.

       -t encoding, --to-code=encoding
              Specifies the encoding of the output.

       Options controlling conversion problems:

       -c     When this option is given, characters that cannot be converted
              are silently discarded, instead of leading to a conversion
              error.

       --unicode-subst=formatstring
              When this option is given, Unicode characters that cannot be
              represented in the target encoding are replaced with a
              placeholder string that is constructed from the given
              formatstring, applied to the Unicode code point. The
              formatstring must be a format string in the same format as for
              the printf command or the printf() function, taking either no
              argument or exactly one unsigned integer argument.

       --byte-subst=formatstring
              When this option is given, bytes in the input that are not
              valid in the source encoding are replaced with a placeholder
              string that is constructed from the given formatstring, applied
              to the byte's value. The formatstring must be a format string
              in the same format as for the printf command or the printf()
              function, taking either no argument or exactly one unsigned
              integer argument.

       --widechar-subst=formatstring
              When this option is given, wide characters in the input that
              are not valid in the source encoding are replaced with a
              placeholder string that is constructed from the given
              formatstring, applied to the byte's value. The formatstring
              must be a format string in the same format as for the printf
              command or the printf() function, taking either no argument or
              exactly one unsigned integer argument.

       Options controlling error output:

       -s, --silent
              When this option is given, error messages about invalid or
              unconvertible characters are omitted, but the actual converted
              text is unaffected.

       The iconv -l or iconv --list command lists the names of the supported
       encodings, in a system dependent format. For the libiconv
       implementation, the names are printed in upper case, separated by
       whitespace, and alias names of an encoding are listed on the same line
       as the encoding itself.

EXAMPLES
       iconv -f ISO-8859-1 -t UTF-8
              converts input from the old West-European encoding ISO-8859-1
              to Unicode.

       iconv -f KOI8-R --byte-subst="<0x%x>" --unicode-subst="<U+%04X>"
              converts input from the old Russian encoding KOI8-R to the
              locale encoding, substituting an angle bracket notation with
              hexadecimal numbers for invalid bytes and for valid but
              unconvertible characters.

       iconv --list
              lists the supported encodings.

SEE ALSO
       iconv_open(3), locale(7)

GNU                           January 22, 2006                       ICONV(1)