07-05-2015
Selecting random columns from large dataset in UNIX
Dear folks,
I have a large data set that contains 400K columns. I want to select 50K specific columns out of those 400K. Is there a UNIX command that can do this for me? I should also mention that I have the column IDs stored in one file, which may help in selecting those columns out of the whole 400K.
Regards,
Saj
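One possible approach, as a minimal sketch rather than a definitive answer: it assumes the data file is tab-separated, that its first row holds the column IDs, and that the ID file lists one wanted ID per line. None of this is stated in the post, and the function name select_columns is made up for illustration.

```shell
# select_columns IDS_FILE DATA_FILE  (hypothetical helper name)
# Assumptions: DATA_FILE is tab-separated with column IDs in its first row;
# IDS_FILE holds one wanted column ID per line.
select_columns() {
    awk -F'\t' -v OFS='\t' '
        # First file: remember every wanted column ID.
        FILENAME == ARGV[1] { want[$1] = 1; next }
        # Header row of the data file: record positions of wanted columns.
        FNR == 1 {
            n = 0
            for (i = 1; i <= NF; i++)
                if ($i in want) keep[++n] = i
        }
        # Every row (header included): print only the recorded positions.
        {
            for (j = 1; j <= n; j++)
                printf "%s%s", $(keep[j]), (j < n ? OFS : ORS)
        }
    ' "$1" "$2"
}
```

It could be called as `select_columns ids.txt data.tsv > subset.tsv`, where the file names are placeholders. Columns come out in the order they appear in the data file, not in the order of the ID file. Some awk implementations limit the number of fields per record, so it is worth trying this on a small sample of the 400K-column file first.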
LEARN ABOUT OPENDARWIN
column
COLUMN(1) BSD General Commands Manual COLUMN(1)
NAME
column -- columnate lists
SYNOPSIS
column [-tx] [-c columns] [-s sep] [file ...]
DESCRIPTION
The column utility formats its input into multiple columns. Rows are filled before columns. Input is taken from file operands, or, by
default, from the standard input. Empty lines are ignored.
The options are as follows:
-c Output is formatted for a display columns wide, where columns is the argument given to -c.
-s Specify a set of characters to be used to delimit columns for the -t option.
-t Determine the number of columns the input contains and create a table. Columns are delimited with whitespace, by default, or with
the characters supplied using the -s option. Useful for pretty-printing displays.
-x Fill columns before filling rows.
DIAGNOSTICS
The column utility exits 0 on success, and >0 if an error occurs.
ENVIRONMENT
COLUMNS The environment variable COLUMNS is used to determine the size of the screen if no other information is available.
EXAMPLES
(printf "PERM LINKS OWNER GROUP SIZE MONTH DAY " ; \
printf "HH:MM/YEAR NAME\n" ; \
ls -l | sed 1d) | column -t
SEE ALSO
colrm(1), ls(1), paste(1), sort(1)
HISTORY
The column command appeared in 4.3BSD-Reno.
BSD
June 6, 1993 BSD