Using 'sed' to delete or ignore columns in a dataset

02-29-2008



Hi,

I've already posted this elsewhere, but I'm posting again here because I'm a newbie. I hope you'll forgive me this time.

I want to know if it's possible to delete or ignore columns in a large dataset using 'sed'. For example, I have the following dataset:

20060714,X.XX,1,043004,Q,T,24.0000,1,25.5000,4,
20060714,X.XX,1,081209,Q,T,24.0000,1,25.5000,5,

As you can see, each row has 10 columns (plus a trailing comma), but the table I am inserting into has only 8 columns.

I want to delete the 3rd column (i.e. the 1's), and I want to delete the comma between Q and T so that those two columns become one. Finally, I want to delete the comma at the end of each line.

Is this possible with sed? Can anyone help me with this, please?

I'd be extremely grateful if someone could help with this! You can PM me or post back here.

Many thanks, asif.

aarif
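
For reference, here is a minimal sketch of how the three changes described above might be done with plain POSIX sed. It assumes that no field contains an embedded or quoted comma, and that "deleting the comma between Q and T" means merging the two columns into a single value (QT). The function name trim_columns is hypothetical and used only for illustration.

```shell
# Hypothetical helper: reads comma-separated lines on stdin and writes
# the trimmed lines to stdout. Assumes fields never contain commas.
#
# Expression 1: keep fields 1-2, drop field 3 and the comma after it.
# Expression 2: field 3 is gone, so Q is now field 4; drop the comma
#               right after field 4 to join it with T.
# Expression 3: drop the comma at the end of the line.
trim_columns() {
    sed -e 's/^\([^,]*,[^,]*,\)[^,]*,/\1/' \
        -e 's/^\([^,]*,[^,]*,[^,]*,[^,]*\),/\1/' \
        -e 's/,$//'
}

# The two sample rows from the post.
printf '%s\n' \
    '20060714,X.XX,1,043004,Q,T,24.0000,1,25.5000,4,' \
    '20060714,X.XX,1,081209,Q,T,24.0000,1,25.5000,5,' |
    trim_columns
# prints:
# 20060714,X.XX,043004,QT,24.0000,1,25.5000,4
# 20060714,X.XX,081209,QT,24.0000,1,25.5000,5
```

For a real file, something like `trim_columns < input.csv > output.csv` would apply the same edits (the file names are placeholders). Each output row then has 8 columns, matching the target table.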


