Full Discussion: iconv and xmllint
Forum: UNIX for Advanced & Expert Users. Post 302139403 by cbkihong on Saturday 6th of October 2007 05:07:33 AM
Invoking iconv once on a single large input is far more efficient than invoking it many times on small inputs. You can see that from this:

Code:
admin@np64gw:/dev/shm$ time perl -e 'while (<>) { open(ICONV, "| iconv -f big5 -t utf8 >/dev/null"); print ICONV $_; close ICONV }' <XLink.txt

real    0m4.224s
user    0m2.200s
sys     0m0.652s
admin@np64gw:/dev/shm$ time iconv -f big5 -t utf8 XLink.txt >/dev/null
real    0m0.009s
user    0m0.008s
sys     0m0.000s

So, if you have a way to concatenate the records into one single file before passing them to iconv, it will go a lot faster. iconv reports the file position at which the error occurred, so if you keep some kind of index that lets you accurately map a file position to a record number, that would likely work. If you are just doing validation and expect all records to pass normally, this may be enough for you.
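The position-to-record mapping could be as simple as counting newlines, assuming one record per line and a zero-based byte offset (how the position is reported depends on your iconv implementation, so check that first). The function name `record_at_offset` is made up for illustration. As far as I know, a Big5 trail byte is never 0x0A, so counting raw newline bytes should be safe for that encoding. A rough sketch:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: return the 1-based number of the record (line)
 * that contains the byte at `offset`.
 * Assumes one record per line and a zero-based byte offset. */
long record_at_offset(const char *buf, size_t len, size_t offset)
{
    assert(buf != NULL && offset < len);

    long record = 1;
    for (size_t i = 0; i < offset; i++) {
        if (buf[i] == '\n')   /* each newline before the offset ends a record */
            record++;
    }
    return record;
}
```

For a really large file you would rather build the index of newline offsets once and binary-search it, but for a one-off validation run a linear scan like this is probably fine.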

But can you reprogram that part of the script in C? I guess with libiconv you can control the process better in case many alien (invalid) bytes have sneaked in.
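To give an idea of what the C route might look like, here is a minimal sketch using the standard `iconv_open()` / `iconv()` / `iconv_close()` calls. It opens one conversion descriptor, feeds it one record at a time, and notes which records fail, so there is no process spawned per record. The function name `find_bad_records`, the one-record-per-line layout, and the choice to throw away the converted output are all my assumptions, not something from your script:

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: convert each newline-terminated record in buf with a
 * single conversion descriptor and note the records that fail.
 * Stores up to max_bad 1-based record numbers in bad[].
 * Returns the number of bad records, or -1 if iconv_open() fails. */
long find_bad_records(const char *from, const char *to,
                      const char *buf, size_t len,
                      long *bad, size_t max_bad)
{
    assert(buf != NULL && (bad != NULL || max_bad == 0));

    iconv_t cd = iconv_open(to, from);   /* note the order: to, then from */
    if (cd == (iconv_t)-1)
        return -1;

    long record = 0, nbad = 0;
    size_t pos = 0;
    while (pos < len) {
        const char *start = buf + pos;
        const char *nl = memchr(start, '\n', len - pos);
        size_t reclen = nl ? (size_t)(nl - start) + 1 : len - pos;
        record++;

        char *in = (char *)start;        /* iconv() does not modify the input */
        size_t inleft = reclen;
        while (inleft > 0) {
            char out[4096];              /* converted bytes are discarded */
            char *outp = out;
            size_t outleft = sizeof out;
            if (iconv(cd, &in, &inleft, &outp, &outleft) == (size_t)-1
                && errno != E2BIG) {
                /* EILSEQ (invalid sequence) or EINVAL (truncated sequence) */
                if ((size_t)nbad < max_bad)
                    bad[nbad] = record;
                nbad++;
                iconv(cd, NULL, NULL, NULL, NULL);  /* reset conversion state */
                break;
            }
        }
        pos += reclen;
    }
    iconv_close(cd);
    return nbad;
}
```

You would call it with something like `find_bad_records("BIG5", "UTF-8", data, size, bad, 100)` after reading the file into memory. Which encoding names are accepted depends on the iconv implementation, and on some systems you need to link with `-liconv`.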
 
