Remove lines that are subsets of other lines in File
Post 302941866 by MisterJellyBean on Wednesday 22nd of April 2015 07:17:47 AM
Hello RudiC,


Well, I managed to trim down the dataset with "sort -u input > output", but that only removes exact duplicates. Running my script on the filtered dataset still takes ages :/

I presume 'sed' could help, but I can't figure out what regex to feed it.
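
For what it's worth, here is a minimal sketch of one way to approach this without sed. It assumes "subset" means that a line occurs as a contiguous substring of some other line; the thread does not show the data, so that reading may be wrong. The function name remove_contained_lines is made up for illustration. The idea is to order the unique lines longest first, then keep a line only if no already-kept line contains it.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): drop every line that occurs as a
# substring of another line in the file given as $1.
# ASSUMPTION: "subset" means "contiguous substring". Swap the index() test
# for a field-wise comparison if your data needs that instead.
remove_contained_lines() {
    # 1. sort -u removes exact duplicates (LC_ALL=C compares raw bytes).
    # 2. The first awk prefixes each line with its length and a tab.
    # 3. sort -k1,1nr orders the lines longest first; cut strips the prefix.
    # 4. The second awk prints a line only if no previously kept line,
    #    which is at least as long, contains it.
    LC_ALL=C sort -u "$1" |
    awk '{ printf "%d\t%s\n", length($0), $0 }' |
    LC_ALL=C sort -k1,1nr |
    cut -f2- |
    awk '{
        for (i = 1; i <= n; i++)
            if (index(kept[i], $0)) next
        kept[++n] = $0
        print
    }'
}
```

Called as `remove_contained_lines input > output`, this writes the surviving lines longest first rather than in their original order. The containment check is still quadratic in the number of kept lines, so it may remain slow on a very large dataset.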
 

H5MATH(1)                          h5utils                          H5MATH(1)

NAME
       h5math - combine/create HDF5 files with math expressions

SYNOPSIS
       h5math [OPTION]... OUTPUT-HDF5FILE [INPUT-HDF5FILES...]

DESCRIPTION
       h5math takes any number of HDF5 files as input, along with a mathematical expression, and combines them to produce a new HDF5 file.

       HDF5 is a free, portable binary format and supporting library developed by the National Center for Supercomputing Applications at the University of Illinois in Urbana-Champaign. A single h5 file can contain multiple data sets; by default, h5math creates a dataset called "h5math", but this can be changed via the -d option, or by using the syntax HDF5FILE:DATASET. The -a option can be used to append new datasets to an existing HDF5 file. The same syntax is used to specify the dataset used in the input file(s); by default, the first dataset (alphabetically) is used.

       A simple example of h5math's usage is:

              h5math -e "d1 + 2*d2" out.h5 foo.h5 bar.h5:blah

       which produces a new file, out.h5, by adding the first dataset in foo.h5 with twice the "blah" dataset in bar.h5. In the expression (specified by -e), the first input dataset (from left to right) is referred to as d1, the second as d2, and so on.

       In addition to input datasets, you can also use the x/y/z coordinates of each point in the expression, referenced by "x" "y" and "z" variables (for the first three dimensions) as well as a "t" variable that refers to the last dimension. By default, these are integers starting at 0 at the corner of the dataset, but the -0 option will change the x/y/z origin to the center of the dataset (t is unaffected), and the -r res option will specify the "resolution", dividing the x/y/z coordinates by res.

       All of the input datasets must have the same dimensions, which are also the dimensions of the output. If there are no input files, and you are defining the output purely by a mathematical formula, you can specify the dimensions of the output explicitly via the -n size option, where size is e.g. "2x2x2".

       Sometimes, however, you want to use only a smaller-dimensional "slice" of multi-dimensional data. To do this, you specify coordinates in one (or more) slice dimension(s), via the -xyzt options.

OPTIONS
       -h     Display help on the command-line options and usage.

       -V     Print the version number and copyright info for h5math.

       -v     Verbose output.

       -a     If the HDF5 output file already exists, append the data as a new dataset rather than overwriting the file (the default behavior). An existing dataset of the same name within the file is overwritten, however.

       -e expression
              Specify the mathematical expression that is used to construct the output (generally in " quotes to group the expression as one item in the shell), in terms of the variables for the input datasets and the coordinates as described above. Expressions use a C-like infix notation, with most standard operators and mathematical functions (+, sin, etc.) being supported. This functionality is provided (and its features determined) by GNU libmatheval.

       -f filename
              Name of a text file to read the expression from, if no -e expression is specified. Defaults to stdin.

       -x ix, -y iy, -z iz, -t it
              This tells h5math to use a particular slice of a multi-dimensional dataset. e.g. -x uses the subset (with one less dimension) at an x index of ix (where the indices run from zero to one less than the maximum index in that direction). Here, x/y/z correspond to the first/second/third dimensions of the HDF5 dataset. The -t option specifies a slice in the last dimension, whichever that might be. See also the -0 option to shift the origin of the x/y/z slice coordinates to the dataset center.

       -0     Shift the origin of the x/y/z slice coordinates to the dataset center, so that e.g. -0 -x 0 (or more compactly -0x0) returns the central x plane of the dataset instead of the edge x plane. (-t coordinates are not affected.) This also shifts the origin of the x/y/z variables in the expression so that 0 is the center of the dataset.

       -r res Use a resolution res for x/y/z (but not t) variables in the expression, so that the data "grid" coordinates are divided by res. The default res is 1. For example, if the x dimension has 21 grid steps, setting a res of 20 will mean that x variables in the expression run from 0.0 to 1.0 (or -0.5 to 0.5 if -0 is specified), instead of 0 to 20. -r does not affect the coordinates used for slices, which are always integers.

       -n size
              The output dataset must be the same size as the input datasets. If there are no input datasets (if you are defining the output purely by a formula), then you must specify the output size manually with this option: size is of the form MxNxLx... (with M, N, L being integers) and may be of any dimensionality.

       -d name
              Write to dataset name in the output; otherwise, the output dataset is called "data" by default. Also use dataset name in the input; otherwise, the first input dataset (alphabetically) in a file is used. Alternatively, use the syntax HDF5FILE:DATASET (which overrides the -d option).

BUGS
       Send bug reports to S. G. Johnson, stevenj@alum.mit.edu.

AUTHORS
       Written by Steven G. Johnson. Copyright (c) 2005 by the Massachusetts Institute of Technology.

h5utils                          May 23, 2005                          H5MATH(1)
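
The coordinate arithmetic described for the -r and -0 options of h5math can be checked with a small sketch. It does not call h5math; it only reproduces the division by res and the origin shift from the description. The function name grid_coords is hypothetical, and placing the center at (n - 1) / 2 is an assumption that matches the 21-point example in the page but may not match what h5math does for an even number of points.

```shell
#!/bin/sh
# Illustrative arithmetic only; this does NOT run h5math.
# $1 = number of grid points along x (21 in the man page example)
# $2 = res, the value that would be passed to -r
# $3 = 1 to mimic -0 (origin at the dataset center), 0 for the corner origin
# ASSUMPTION: the center index is (n - 1) / 2.
grid_coords() {
    awk -v n="$1" -v res="$2" -v centered="$3" 'BEGIN {
        origin = centered ? (n - 1) / 2 : 0
        # Print each integer grid index next to its scaled x coordinate.
        for (i = 0; i < n; i++)
            printf "%d %.3f\n", i, (i - origin) / res
    }'
}
```

With 21 points and a res of 20, `grid_coords 21 20 0` runs from 0.000 at index 0 to 1.000 at index 20, and `grid_coords 21 20 1` runs from -0.500 to 0.500, in line with the ranges quoted under -r.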