Using find in a directory containing large number of files
Post 302545328 by shoaibjameel123 on Monday 8th of August 2011 05:12:48 AM
Thanks. Sorry, I guess I was a bit vague here. When I wrote
Quote:
shell script which removes all the XML tags including the text inside the tags from some 4 million XML files
I meant the script deletes contents inside the tags like
Code:
<text>

Code:
 <?xml version="1.0" encoding="iso-8859-1" ?>

In other words, my script removes only tags like the ones above, together with the text inside the angle brackets (such as "text" and "?xml version="1.0" encoding="iso-8859-1" ?"), and keeps the main paragraphs of the files.
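To make the intended behaviour concrete, here is a minimal sketch of that kind of tag stripping with sed. It is not my actual script: the function name strip_tags is made up for illustration, and it assumes every tag opens and closes on the same line.

```shell
#!/bin/sh
# Hypothetical helper (not the original script): delete anything of the
# form <...> on a line and keep the text between tags.
# Assumption: no tag spans more than one line.
strip_tags() {
    sed -e 's/<[^>]*>//g'
}

# Small demonstration on the two kinds of tags mentioned above.
printf '%s\n' \
    '<?xml version="1.0" encoding="iso-8859-1" ?>' \
    '<text>Main paragraph of the file.</text>' | strip_tags
```

A line that holds only a tag comes out empty rather than being dropped, so a further pass (for example sed '/^$/d') would be needed if blank lines are unwanted.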

---------- Post updated at 05:12 PM ---------- Previous update was at 05:09 PM ----------

Oh great!

You've pointed out one more fault. It is indeed deleting everything. This I can fix myself.
 

MKDoc::XML::Stripper(3pm)      User Contributed Perl Documentation      MKDoc::XML::Stripper(3pm)

NAME
       MKDoc::XML::Stripper - Remove unwanted XML / XHTML tags and attributes

SYNOPSIS
         use MKDoc::XML::Stripper;

         my $stripper = new MKDoc::XML::Stripper;
         $stripper->allow (qw /p class id/);

         my $ugly = '<p class="para" style="color:red">Hello, <strong>World</strong>!</p>';
         my $neat = $stripper->process_data ($ugly);
         print $neat;

       Should print:

         <p class="para">Hello, World!</p>

SUMMARY
       MKDoc::XML::Stripper is a class which lets you specify a set of tags and attributes
       which you want to allow, and then cheekily strip any XML of unwanted tags and
       attributes.

       In MKDoc, this is used so that editors use structural XHTML rather than
       presentational tags, i.e. strip anything which looks like a <font> tag, a 'style'
       attribute or other tags which would break separation of structure from content.

DISCLAIMER
       This module does low level XML manipulation. It will somehow parse even broken XML
       and try to do something with it. Do not use it unless you know what you're doing.

API
   my $stripper = MKDoc::XML::Stripper->new()
       Instantiates a new MKDoc::XML::Stripper object.

   $stripper->load_def ($def_name);
       Loads a definition located somewhere in @INC under MKDoc/XML/Stripper.

       Available definitions are:

       xhtml10frameset
       xhtml10strict
       xhtml10transitional
       mkdoc16 - MKDoc 1.6. XHTML structural markup

       You can also load your own definition file, for instance:

         $stripper->load_def ('my_def.txt');

       Definitions are simple text files as follows:

         # allow p with 'class' and id
         p class
         p id

         # allow more stuff
         td class
         td id
         td style

         # etc...

   $stripper->allow ($tag, @attributes)
       Allows "<$tag>" to appear in the stripped XML. Additionally, allows @attributes to
       appear as attributes of <$tag>, so for instance:

         $stripper->allow ('p', 'class', 'id');

       Will allow the following:

         <p>
         <p class="foo">
         <p id="bar">
         <p class="foo" id="bar">

       However any extra attributes will be stripped, i.e.

         <p class="foo" id="bar" style="font-color: red">

       Will be rewritten as

         <p class="foo" id="bar">

   $stripper->disallow ($tag)
       Explicitly disallows a tag and all its associated attributes. By default everything
       is disallowed.

   $stripper->process_data ($some_xml);
       Strips $some_xml according to the rules that were given with the allow() and
       disallow() methods and returns the result. Does not modify $some_xml in place.

   $stripper->process_file ('/an/xml/file.xml');
       Strips '/an/xml/file.xml' according to the rules that were given with the allow()
       and disallow() methods and returns the result. Does not modify '/an/xml/file.xml'
       in place.

NOTES
       MKDoc::XML::Stripper does not really parse the XML file you're giving to it nor does
       it care if the XML is well-formed or not. It uses MKDoc::XML::Tokenizer to turn the
       XML / XHTML file into a series of MKDoc::XML::Token objects and strictly operates on
       a list of tokens.

       For this same reason MKDoc::XML::Stripper does not support namespaces.

AUTHOR
       Copyright 2003 - MKDoc Holdings Ltd.

       Author: Jean-Michel Hiver

       This module is free software and is distributed under the same license as Perl
       itself. Use it at your own risk.

SEE ALSO
       MKDoc::XML::Tokenizer
       MKDoc::XML::Token

perl v5.10.1                          2004-10-06                          MKDoc::XML::Stripper(3pm)