How to remove a subset of data from a large dataset based on values on one line
Post 302576277 by davegen on Thursday 24th of November 2011 07:12:35 AM
Thanks, that's really helpful! I hate to ask for more, but could I alter that to make it run through every position?
 

H5TOVTK(1)                          h5utils                          H5TOVTK(1)

NAME
       h5tovtk - convert datasets in HDF5 files to VTK format

SYNOPSIS
       h5tovtk [OPTION]... [HDF5FILE]...

DESCRIPTION
       h5tovtk is a program to generate VTK data files from multidimensional datasets in HDF5 files.

       VTK, the Visualization ToolKit, is an open-source, freely available software system for 3D computer graphics, image processing, and visualization. VTK itself is a programming library, but it is also the basis for a number of end-user graphical visualization programs.

       HDF5 is a free, portable binary format and supporting library developed by the National Center for Supercomputing Applications at the University of Illinois in Urbana-Champaign. A single h5 file can contain multiple datasets; by default, h5tovtk takes the first dataset, but this can be changed via the -d option, or by using the syntax HDF5FILE:DATASET.

       1d/2d/3d datasets are converted into 3d VTK datasets. Normally, a single scalar VTK dataset is output, but vectors and fields can be output via the -o option below.

       A typical invocation is of the form 'h5tovtk foo.h5', which will output a VTK data file foo.vtk from the data in foo.h5.

OPTIONS
       -h
              Display help on the command-line options and usage.

       -V
              Print the version number and copyright info for h5tovtk.

       -v
              Verbose output.

       -o file
              Save all the input datasets to a single VTK file. If there is only one dataset, it is output to a VTK scalar dataset; if there are three datasets, they are output as a VTK vector dataset; all other numbers of datasets are combined into a VTK field dataset. Otherwise, the default behavior is to save each dataset to a separate VTK file, with the .h5 suffix of the input filename replaced by .vtk in the output filename.

              Only three-dimensional datasets may be written to the VTK file. If you have a four (or more) dimensional data set, then you must take a three-dimensional "slice" of the multi-dimensional data. To do this, you specify coordinates in one (or more) slice dimension(s), via the -xyzt options.

       -1, -2, -4
              Use 1, 2, or 4 bytes to store each data point in the output file. Fewer bytes require less storage and memory, but will decrease the resolution in the values. -1 will break up the data values into one of 256 possible values (on a linear scale from the minimum to the maximum value in your data), -2 will allow 65536 possible values, and -4 (the default) will use 4-byte floating-point numbers for an "exact" representation.

       -a
              Output in ASCII format; otherwise, VTK's more compact, but less readable and somewhat less portable binary format is used.

       -n
              For binary output (see -a above), by default the data is written in bigendian byte order, which is normally the order that VTK expects. However, some external tools and a few VTK classes use the native byte ordering instead (which may not be bigendian), and the -n option causes h5tovtk to output binary data in the native ordering.

       -m min, -M max
              When -1 or -2 are used, the input data are converted to a linear integer scale. Normally, the bottom and top of this scale correspond to the minimum and maximum values in the data. Using the -m and -M options, you can make the bottom and top of the scale correspond to min and max instead, respectively. Data values below or above this range will be treated as if they were min or max respectively. See also the -Z option.

       -Z
              For -1 or -2 output, center the linear integer scale on the value zero in the data.

       -r
              Invert the output values (map the minimum to the maximum and vice versa).

       -x ix, -y iy, -z iz, -t it
              This tells h5tovtk to use a particular slice of a multi-dimensional dataset. e.g. -x uses the subset (with one less dimension) at an x index of ix (where the indices run from zero to one less than the maximum index in that direction). Here, x/y/z correspond to the first/second/third dimensions of the HDF5 dataset. The -t option specifies a slice in the last dimension, whichever that might be. See also the -0 option to shift the origin of the x/y/z slice coordinates to the dataset center.

       -0
              Shift the origin of the x/y/z slice coordinates to the dataset center, so that e.g. -0 -x 0 (or more compactly -0x0) returns the central x plane of the dataset instead of the edge x plane. (-t coordinates are not affected.)

       -d name
              Use dataset name from the input files; otherwise, the first dataset from each file is used. Alternatively, use the syntax HDF5FILE:DATASET, which allows you to specify a different dataset for each file. You can use the h5ls command (included with hdf5) to find the names of datasets within a file.

BUGS
       Send bug reports to S. G. Johnson, stevenj@alum.mit.edu.

AUTHORS
       Written by Steven G. Johnson. Copyright (c) 2005 by the Massachusetts Institute of Technology.

h5utils                          March 9, 2002                          H5TOVTK(1)
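
The linear integer scale that the man page describes for the -1 and -2 options, together with the -m/-M clamping, can be illustrated with a small Python sketch. This is not h5tovtk's actual code: the function name quantize_linear and its parameters are hypothetical, and rounding to the nearest level is an assumption, since the man page does not say how values between levels are rounded.

```python
def quantize_linear(values, levels=256, vmin=None, vmax=None):
    """Map floats onto integer codes 0..levels-1 on a linear scale.

    levels=256 mirrors the -1 option and levels=65536 mirrors -2.
    vmin/vmax play the role of -m/-M; when omitted, the minimum and
    maximum of the data are used, as the man page describes.
    Rounding to the nearest level is an assumption of this sketch.
    """
    values = list(values)
    if not values:
        return []
    lo = min(values) if vmin is None else vmin
    hi = max(values) if vmax is None else vmax
    if hi <= lo:
        # Degenerate range: map everything to the bottom of the scale.
        return [0 for _ in values]
    top = levels - 1
    codes = []
    for v in values:
        # Values outside [lo, hi] are treated as lo or hi respectively.
        clamped = min(max(v, lo), hi)
        codes.append(round((clamped - lo) / (hi - lo) * top))
    return codes


if __name__ == "__main__":
    print(quantize_linear([0.0, 0.2, 1.0]))                       # → [0, 51, 255]
    print(quantize_linear([-5.0, 0.0, 20.0], vmin=0.0, vmax=10.0))  # → [0, 0, 255]
```

The second call shows why -m and -M matter: with a fixed range, out-of-range values saturate at the ends of the scale instead of stretching it, which keeps the mapping comparable across several files.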