07-04-2012
Command line / script option to filter a data set by values of one column
Hi all!
I have a data set in tab-separated format with three columns: Label, Value1, Value2.
An example, "data.txt":
0 1 1
-1 2 3
0 2 2
I would like to parse this data set and generate one file per label: one that has only the rows with label 0 and another with only the rows with label -1. For example, my outputs should be:
data0.txt
0 1 1
0 2 2
and data-1.txt
-1 2 3
These are large datasets, and I do not know in advance how many labels there are. Assuming the labels are l1...ln, I would like the outputs stored in data_<label>.txt where <label> is one of l1...ln
Can someone here suggest a quick way to script / command-line this?
Thanks in advance!
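One possible approach, sketched here with awk and not tested on large inputs: read the file once and append each row to a file named after its first column. The function name split_by_label is made up for this example. It assumes the labels are safe to use in file names, and that no data_<label>.txt files are left over from an earlier run, since it appends. Closing the previous output whenever the label changes keeps only one output file open at a time, so the number of distinct labels does not run into the open-file limit.

```shell
# split_by_label FILE
# Hypothetical helper: writes each row of a tab-separated FILE to
# data_<label>.txt in the current directory, where <label> is column 1.
split_by_label() {
    awk -F'\t' '
    {
        out = "data_" $1 ".txt"          # e.g. data_0.txt, data_-1.txt
        if (out != prev) {               # label differs from the previous row
            if (prev != "") close(prev)  # keep only one output file open
            prev = out
        }
        print >> out                     # append, because a label can come back later
    }' "$1"
}
```

It would be called as: split_by_label data.txt. If the rows are already grouped by label, each output file is opened only once. If the labels alternate a lot, the repeated open and close costs some speed, but the script still works.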
paste(1) General Commands Manual paste(1)
Name
paste - merge file data
Syntax
paste file1 file2...
paste -dlist file1 file2...
paste -s [-dlist] file1 file2...
Description
In the first two forms, concatenates corresponding lines of the given input files file1, file2, etc. It treats each file as a column or
columns of a table and pastes them together horizontally (parallel merging).
In the last form, the command combines subsequent lines of the input file (serial merging).
In all cases, lines are glued together with the tab character, or with characters from an optionally specified list. Output is to the
standard output, so it can be used as the start of a pipe, or as a filter, if - is used in place of a file name.
Options
- Used in place of any file name, to read a line from the standard input. (There is no prompting).
-dlist Replaces characters of all but last file with nontabs characters (default tab). One or more characters immediately following -d
replace the default tab as the line concatenation character. The list is used circularly, i. e. when exhausted, it is reused. In
parallel merging (i. e. no -s option), the lines from the last file are always terminated with a new-line character, not from the
list. The list may contain the special escape sequences:
\n (new-line), \t (tab), \\ (backslash), and \0 (empty string, not a null
character). Quoting may be necessary, if characters have special meaning to the shell (for example, to get one backslash, use
-d"\\\\" ).
Without this option, the new-line characters of each but the last file (or last line in case of the -s option) are replaced by a
tab character. This option allows replacing the tab character by one or more alternate characters (see below).
-s Merges subsequent lines rather than one from each input file. Use tab for concatenation, unless a list is specified with -d
option. Regardless of the list, the very last character of the file is forced to be a new-line.
Examples
ls | paste -d" " -
list directory in one column
ls | paste - - - -
list directory in four columns
paste -s -d"\t\n" file
combine pairs of lines into lines
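The following sketch shows the parallel and serial forms side by side. It assumes a current POSIX-style paste that accepts the \t and \n escapes in the -d list. The file names letters.txt and numbers.txt are invented for the example.

```shell
# Work in a throwaway directory so no existing files are touched.
tmp=$(mktemp -d)
printf 'a\nb\nc\nd\n' > "$tmp/letters.txt"
printf '1\n2\n3\n4\n' > "$tmp/numbers.txt"

# Parallel merge: line N of each file, joined by the default tab.
parallel=$(paste "$tmp/letters.txt" "$tmp/numbers.txt")

# Parallel merge with a colon instead of the tab.
colon=$(paste -d: "$tmp/letters.txt" "$tmp/numbers.txt")

# Serial merge: all lines of one file joined onto a single line.
serial=$(paste -s -d, "$tmp/letters.txt")

# Serial merge with a two-character list that is used circularly
# (tab, new-line, tab, new-line, ...), so consecutive lines are paired.
pairs=$(paste -s -d'\t\n' "$tmp/letters.txt")

printf '%s\n' "$parallel" "$colon" "$serial" "$pairs"
rm -r "$tmp"
```

With the sample files above, the serial merge gives a,b,c,d and the circular list gives two lines, each holding a pair of letters separated by a tab.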
Diagnostics
line too long
Output lines are restricted to 511 characters.
too many files
Except for -s option, no more than 12 input files may be specified.
See Also
cut(1), grep(1), pr(1)