Phrase XML with Huge Data Post: 302967638

SPLITXYZ(l)															       SPLITXYZ(l)

NAME

       splitxyz - filter to divide (x,y,z[,distance,heading]) data into (x,y,z) track segments.

SYNOPSIS

       splitxyz [ xyz[dh]file ] -Ccourse_change [ -Aazimuth/tolerance ] [ -Dminimum_distance ] [ -Fxy_filter/z_filter ] [ -Ggap_distance ]
       [ -H[nrec] ] [ -M ] [ -Nnamestem ] [ -S ] [ -V ] [ -Z ] [ -: ] [ -bi[s][n] ] [ -bo[s][n] ]

DESCRIPTION

       splitxyz reads a series of (x,y[,z]) records [or optionally (x,y,z,d,h); see -S option] from standard input  [or  xyz[dh]file]  and  splits
       this into separate lists of (x,y[,z]) series, such that each series has a nearly constant azimuth through the x,y plane.  There are options
       to choose only those series which have a certain orientation, to set a minimum length for series, and to high- or  low-pass  filter  the  z
       values  and/or  the x,y values. splitxyz is a useful filter between data extraction and pswiggle plotting, and can also be used to divide a
       large x,y,z dataset into segments. The output is always in the ASCII format; input may be ASCII or binary (see -b).

       xyz[dh]file(s)
              3 (but see -Z) [or 5] column ASCII file [or binary, see -b] holding (x,y,z[,d,h]) data values. To use (x,y,z,d,h) input, sorted so
              that d is non-decreasing, specify the -S option; default expects (x,y,z) only. If no file is specified, splitxyz will read from
              standard input.

       -C     Terminate a segment when a course change exceeding course_change degrees of heading is detected.
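
       The splitting rule can be illustrated with a minimal Python sketch. It is not the splitxyz implementation: it assumes Cartesian
       coordinates, computes each step's heading from delta x and delta y (clockwise from North), and compares it with the previous step's
       heading, which is only one plausible reading of "course change". The names split_track, heading and angle_diff are hypothetical; the
       gap and minimum-length defaults mirror the -G and -D defaults listed under OPTIONS.

```python
import math

def heading(p, q):
    """Azimuth of the step p -> q in degrees, clockwise from North (+y), in [0, 360)."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    return math.degrees(math.atan2(dx, dy)) % 360.0

def angle_diff(a, b):
    """Smallest absolute difference between two headings, in [0, 180]."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def split_track(points, course_change, gap_distance=10.0, minimum_distance=100.0):
    """Split (x, y[, z]) points into segments of nearly constant heading.

    Assumption: a course change is measured between consecutive steps.
    """
    segments = []
    current, length, prev_h = [], 0.0, None

    def close():
        # Keep the finished segment only if it is long enough (cf. -D).
        nonlocal current, length, prev_h
        if len(current) > 1 and length >= minimum_distance:
            segments.append(current)
        current, length, prev_h = [], 0.0, None

    for p in points:
        if current:
            last = current[-1]
            step = math.hypot(p[0] - last[0], p[1] - last[1])
            if step > gap_distance:
                # Gap too large (cf. -G): the next segment starts at p.
                close()
            else:
                h = heading(last, p)
                if prev_h is not None and angle_diff(h, prev_h) > course_change:
                    # Course change (cf. -C): the corner point starts the next segment.
                    close()
                    current.append(last)
                length += step
                prev_h = h
        current.append(p)
    close()
    return segments
```

       For an L-shaped track such as (0,0) (0,1) (0,2) (0,3) (1,3) (2,3) (3,3), calling split_track with course_change=45 and
       minimum_distance=2 yields two segments that share the corner point (0,3).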

OPTIONS

       -A     Write out only those segments which are within +/- tolerance degrees of azimuth in heading, measured  clockwise  from  North,  [0  -
	      360]. [Default writes all acceptable segments, regardless of orientation].

       -D     Do not write a segment out unless it is at least minimum_distance units long.  [Default = 100 distance units].

       -F     Filter the z values and/or the x,y values, assuming these are functions of the d coordinate. xy_filter and z_filter are filter
              widths in distance units. If a filter width is zero, the filtering is not performed. The absolute value of the width is the full
              width of a cosine-arch low-pass filter. If the width is positive, the data are low-pass filtered; if negative, the data are
              high-pass filtered by subtracting the low-pass value from the observed value. If z_filter is non-zero, the entire series of input
              z values is filtered before any segmentation is performed, so that the only edge effects in the filtering will happen at the
              beginning and end of the complete data stream. If xy_filter is non-zero, the data are first divided into segments and then the
              x,y values of each segment are filtered separately. This may introduce edge effects at the ends of each segment, but prevents a
              low-pass x,y filter from rounding off the corners of track segments. [Default = no filtering].

       -G     Do not let a segment have a gap exceeding gap_distance; instead, split it into two segments. [Default = 10 distance units].

       -H     Input file(s) has Header record(s). Number of header records can be changed by editing your .gmtdefaults file. If used, GMT  default
	      is 1 header record.  Not used with binary data.

       -M     Use Map units. Then x,y are in degrees of longitude, latitude, and distances in kilometers. [Default: distances are Cartesian in
              same units as x,y].

       -N     Create Named output files, writing each segment to a separate file in the working directory named namestem.profile#, where #
              increases consecutively from 1. [Default writes entire output to stdout, separating segments by sub-headings that start with >
              marks].

       -S     d and h are supplied. In this case, input contains x,y,z,d,h. [Default expects (x,y,z) input, and d,h are computed from delta x,
              delta y, according to -M option].

       -V     Selects verbose mode, which will send progress reports to stderr [Default runs "silently"].

       -Z     Data have x,y only (no z-column).

       -:     Toggles between (longitude,latitude) and (latitude,longitude) input/output. [Default is (longitude,latitude)]. Applies to
              geographic coordinates only.

       -bi    Selects binary input. Append s for single precision [Default is double]. Append n for the number of columns in the binary file(s).
              [Default is 2, 3, or 5 input columns as set by -S, -Z].

       -bo    Selects binary output. Append s for single precision [Default is double].
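
       The sign convention of the -F widths can be sketched in Python as follows. This is an illustration under stated assumptions, not the
       splitxyz code: the cosine-arch weight is taken to be 1 + cos(2*pi*t/W) for |t| <= W/2, weights are renormalized near the ends of the
       series, and the function names cosine_arch_lowpass and apply_filter are hypothetical.

```python
import math

def cosine_arch_lowpass(d, z, width):
    """Low-pass z(d) with a cosine-arch window of full width `width` (> 0)."""
    half = width / 2.0
    out = []
    for di in d:
        num = den = 0.0
        for dj, zj in zip(d, z):
            t = dj - di
            if abs(t) <= half:
                # Assumed weight: 2 at the window centre, 0 at its edges.
                w = 1.0 + math.cos(2.0 * math.pi * t / width)
                num += w * zj
                den += w
        # den >= 2 because the sample itself always falls in the window.
        out.append(num / den)
    return out

def apply_filter(d, z, width):
    """Apply the -F sign convention: 0 = off, > 0 = low-pass, < 0 = high-pass."""
    if width == 0:
        return list(z)
    low = cosine_arch_lowpass(d, z, abs(width))
    if width > 0:
        return low
    # High-pass: subtract the low-pass value from the observed value.
    return [zi - li for zi, li in zip(z, low)]
```

       With this convention the low-pass and high-pass outputs for widths of equal magnitude always sum to the observed values.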

EXAMPLES

       Suppose you want to make a wiggle plot of magnetic anomalies on segments oriented approximately east-west from a cruise called cag71 in the
       region -R300/315/12/20. You want to use a 100 km low-pass filter to smooth the tracks and a 500 km high-pass filter to detrend the magnetic
       anomalies. Try this:

       gmtlist cag71 -R300/315/12/20 -Fxyzdh | splitxyz -A90/15 -F100/-500 -M -S -V | pswiggle -R300/315/12/20 -Jm0.6 -Ba5f1:.cag71: -T1 -W3 -G200
       -Z200 > cag71_wiggles.ps

       MGD-77 users: For this application we recommend that you extract d, h from gmtlist rather than have splitxyz compute them separately.

       Suppose you have been given a binary, double-precision file containing lat, lon, gravity values from a survey, and you want to split it
       into profiles named survey.profile# (when gap exceeds 100 km). Try this:

       splitxyz survey.bin -Nsurvey -V -G100 -: -M -bi3
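
       When -N is not used, the segments arrive on stdout separated by sub-heading lines that start with >. A minimal Python sketch for
       reading such output back into per-segment lists is shown here. The exact text after the > mark is not specified above, so the sketch
       ignores it; parse_segments and profile_name are hypothetical helper names.

```python
def parse_segments(text):
    """Split multi-segment ASCII output into lists of numeric records.

    Any line starting with '>' opens a new segment; its remaining text is ignored.
    """
    segments = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            current = []
            segments.append(current)
            continue
        if current is None:
            # Records before the first sub-heading form their own segment.
            current = []
            segments.append(current)
        current.append(tuple(float(v) for v in line.split()))
    # Drop sub-headings that were not followed by any record.
    return [s for s in segments if s]

def profile_name(namestem, n):
    """File name pattern described for -N: namestem.profile#, with # counted from 1."""
    return f"{namestem}.profile{n}"
```

       For example, the third segment returned by parse_segments would correspond to the file survey.profile3 in the -Nsurvey example above.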

SEE ALSO

       gmt(1gmt), gmtlist(1gmt), pswiggle(1gmt)

								    1 Jan 2004							       SPLITXYZ(l)