Sponsored Content
Top Forums Shell Programming and Scripting Using AWK to separate data from a large XML file into multiple files Post 302362688 by jp2542a on Saturday 17th of October 2009 12:05:28 AM
Old 10-17-2009
Try this awk code.

Code:
 /<METADATA>/ {
        getline
        while ( $0 !~ /<\/METADATA>/ ) {
                print > "fields.xml"
                getline
        }
        count=1
        nextline
}

/<ROW/ {
        rfile="row" count ".xml"
        getline
        while ($0 !~ "<\/ROW" ) {
                print > rfile
                getline
        }
        close(rfile)
        count++
        nextline
}

 

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

multiple smaller files from one large file

I have a file with a simple list of ids. 750,000 rows. I have to break it down into multiple 50,000 row files to submit in a batch process.. Is there an easy script I could write to accomplish this task? (2 Replies)
Discussion started by: rtroscianecki
2 Replies

2. Shell Programming and Scripting

handling multiple files using awk command and wants to get separate out file for each

hai all I am new to the world of shell scripting I wanted to extract two columns from multiple files say around 25 files and i wanted to get the separate outfile for each input file tired using the following command to extract two columns from 25 files awk... (2 Replies)
Discussion started by: hema dhevi
2 Replies

3. UNIX for Dummies Questions & Answers

Using AWK: Extract data from multiple files and output to multiple new files

Hi, I'd like to process multiple files. For example: file1.txt file2.txt file3.txt Each file contains several lines of data. I want to extract a piece of data and output it to a new file. file1.txt ----> newfile1.txt file2.txt ----> newfile2.txt file3.txt ----> newfile3.txt Here is... (3 Replies)
Discussion started by: Liverpaul09
3 Replies

4. UNIX for Dummies Questions & Answers

awk to match multiple regex and create separate output files

Howdy Folks, I have a list that looks like this: (file2.txt) AAA BBB CCC DDD and there are 24 of these short words. I am matching these patterns to another file with 755795 lines (file1.txt). I have this code for matching: awk -v f2=file2.txt ' BEGIN { while(... (2 Replies)
Discussion started by: heecha
2 Replies

5. Shell Programming and Scripting

How to split a data file into separate files with the file names depending upon a column's value?

Hi, I have a data file xyz.dat similar to the one given below, 2345|98|809||x|969|0 2345|98|809||y|0|537 2345|97|809||x|544|0 2345|97|809||y|0|651 9685|98|809||x|321|0 9685|98|809||y|0|357 9685|98|709||x|687|0 9685|98|709||y|0|234 2315|98|809||x|564|0 2315|98|809||y|0|537... (2 Replies)
Discussion started by: nithins007
2 Replies

6. Shell Programming and Scripting

create separate files from one excel file with multiple sheets

Hi, I have one requirement, create separate files (".csv") from one excel file(xlsx) with multiple sheets. These ".csv" files are my source files. So anybody please suggest me the process. Thanks in Advance. Regards, Harris (3 Replies)
Discussion started by: harris
3 Replies

7. Shell Programming and Scripting

Process multiple large files with awk

Hi there, I'm camor and I'm trying to process huge files with bash scripting and awk. I've got a dataset folder with 10 files (16 millions of row each one - 600MB), and I've got a sorted file with all keys inside. For example: a sample_1 200 a.b sample_2 10 a sample_3 10 a sample_1 10 a... (4 Replies)
Discussion started by: camor
4 Replies

8. Shell Programming and Scripting

Splitting a single xml file into multiple xml files

Hi, I'm having a xml file with multiple xml header. so i want to split the file into multiple files. Sample.xml consists multiple headers so how can we split these multiple headers into multiple files in unix. eg : <?xml version="1.0" encoding="UTF-8"?> <ml:individual... (3 Replies)
Discussion started by: Narendra921631
3 Replies

9. Shell Programming and Scripting

awk - Multiple files - 1 file with multi-line data

Greetings experts, Have 2 input files, of which 1 file has 1 record per line; in 2nd file, multiple lines constitute 1 record; Hence declared the RS=";" Now in the first file which ends with ";" at each line of the line; But \nis also being considered as part of the data due to which I am... (1 Reply)
Discussion started by: chill3chee
1 Replies

10. Shell Programming and Scripting

Split large xml into mutiple files and with header and footer in file

Split large xml into mutiple files and with header and footer in file tried below it splits unevenly and also i need help in adding header and footer command : csplit -s -k -f my_XML_split.xml extrfile.xml "/<Document>/" {1} sample xml <?xml version="1.0" encoding="UTF-8"?><Recipient>... (36 Replies)
Discussion started by: karthik
36 Replies
GETLINE(3)						     Linux Programmer's Manual							GETLINE(3)

NAME
getline, getdelim - delimited string input SYNOPSIS
#include <stdio.h> ssize_t getline(char **lineptr, size_t *n, FILE *stream); ssize_t getdelim(char **lineptr, size_t *n, int delim, FILE *stream); Feature Test Macro Requirements for glibc (see feature_test_macros(7)): Before glibc 2.10: getline(), getdelim(): _GNU_SOURCE Since glibc 2.10: getline(), getdelim(): _POSIX_C_SOURCE >= 200809L || _XOPEN_SOURCE >= 700 DESCRIPTION
getline() reads an entire line from stream, storing the address of the buffer containing the text into *lineptr. The buffer is null-termi- nated and includes the newline character, if one was found. If *lineptr is NULL, then getline() will allocate a buffer for storing the line, which should be freed by the user program. (In this case, the value in *n is ignored.) Alternatively, before calling getline(), *lineptr can contain a pointer to a malloc(3)-allocated buffer *n bytes in size. If the buffer is not large enough to hold the line, getline() resizes it with realloc(3), updating *lineptr and *n as necessary. In either case, on a successful call, *lineptr and *n will be updated to reflect the buffer address and allocated size respectively. getdelim() works like getline(), except a line delimiter other than newline can be specified as the delimiter argument. As with getline(), a delimiter character is not added if one was not present in the input before end of file was reached. RETURN VALUE
On success, getline() and getdelim() return the number of characters read, including the delimiter character, but not including the termi- nating null byte. This value can be used to handle embedded null bytes in the line read. Both functions return -1 on failure to read a line (including end-of-file condition). ERRORS
EINVAL Bad arguments (n or lineptr is NULL, or stream is not valid). VERSIONS
These functions are available since libc 4.6.27. CONFORMING TO
Both getline() and getdelim() were originally GNU extensions. They were standardized in POSIX.1-2008. EXAMPLE
#define _GNU_SOURCE #include <stdio.h> #include <stdlib.h> int main(void) { FILE *fp; char *line = NULL; size_t len = 0; ssize_t read; fp = fopen("/etc/motd", "r"); if (fp == NULL) exit(EXIT_FAILURE); while ((read = getline(&line, &len, fp)) != -1) { printf("Retrieved line of length %zu : ", read); printf("%s", line); } free(line); exit(EXIT_SUCCESS); } SEE ALSO
read(2), fgets(3), fopen(3), fread(3), gets(3), scanf(3), feature_test_macros(7) COLOPHON
This page is part of release 3.25 of the Linux man-pages project. A description of the project, and information about reporting bugs, can be found at http://www.kernel.org/doc/man-pages/. GNU
2010-06-12 GETLINE(3)
All times are GMT -4. The time now is 01:50 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy