Help with Splitting a Large XML file based on size AND tags
Post 302907992 by Chubler_XL on Thursday 3rd of July 2014 12:43:27 AM
Sorry, I should have tested my code on more than one large URL: I had forgotten to reset the bytes variable when starting a new output file. Please accept this updated version:

Code:
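#!/bin/bash
export ORACLE_HOME=.........
export ORACLE_SID=...........
export PATH=........
. ./params        # contains the parameter sizelimit
...

# Split only when the file is larger than the limit (stat -c%s is GNU stat syntax)
if [ "$(stat -c%s "$FILE")" -gt "$sizelimit" ]
then
    # A multi-character RS needs gawk or mawk; LC_ALL=C makes length() count bytes
    LC_ALL=C awk -v limit="$sizelimit" '
        BEGIN { num=1 }
        {
          # Size of this record plus the </URL> separator written after it
          size = length($0) + length(RS)
          if ((bytes+=size)>limit) {
             close(FILENAME "." num)
             bytes=size      # reset the counter for the new output file
             num++
          }
          # RS is appended after every record, including the last one
          printf "%s%s",$0,RS > (FILENAME "." num)
        } ' RS="</URL>" "$FILE"
else
   echo "$FILE: already less than the limit of $sizelimit"
fi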
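
If you want to try the approach before running it on real data, here is a minimal sketch on a throwaway file. The function name split_by_url, the sample file and the 70-byte limit are all made up for illustration. It assumes an awk that accepts a multi-character RS (gawk or mawk), and it counts each record together with its </URL> separator.

```shell
#!/bin/bash
# Hypothetical helper: split file $1 on </URL> boundaries into $1.1, $1.2, ...
# so that each piece stays within $2 bytes where possible.
split_by_url() {
    LC_ALL=C awk -v limit="$2" '
        BEGIN { num = 1 }
        {
            size = length($0) + length(RS)   # record plus its separator
            if ((bytes += size) > limit) {
                close(FILENAME "." num)
                bytes = size                 # start counting afresh for the new piece
                num++
            }
            printf "%s%s", $0, RS > (FILENAME "." num)
        }' RS="</URL>" "$1"
}

# Build a sample of three 31-byte <URL> elements with no trailing newline.
workdir=$(mktemp -d)
sample="$workdir/sample.xml"
printf '<URL>%s</URL>' aaaaaaaaaaaaaaaaaaaa bbbbbbbbbbbbbbbbbbbb cccccccccccccccccccc > "$sample"

split_by_url "$sample" 70

# Show the size of every piece; delete "$workdir" when you are done.
for piece in "$sample".*; do
    echo "$piece: $(wc -c < "$piece") bytes"
done
```

With this input the first two elements should land in sample.xml.1 and the third in sample.xml.2, and concatenating the pieces in order gives back the original file. A single element larger than the limit still ends up in a piece of its own that exceeds the limit, because the split only happens on </URL> boundaries.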

