Form balanced matrix by filtering data
Post 302922026 by senhia83 on Tuesday 21st of October 2014 05:23:38 PM
The following approach might work for a smaller data set, but the millions of rows I have will need a more sophisticated approach.
I have broken it down into steps:

Code:
 
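# List the first-column keys that occur more than twice
awk '{print $1}' mydata | sort | uniq -c | awk '$1 > 2 {print $2}' > tmp

# Keep only rows whose first column is one of those keys
# (grep -f would match the keys anywhere on the line, including as substrings)
awk 'NR == FNR {keep[$1]; next} $1 in keep' tmp mydata > mydata_filtered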
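
The count-then-filter step can also be done in a single awk invocation that reads the file twice. This avoids the sort and the temporary file, and it matches keys exactly on the first column. This is only a minimal sketch, assuming whitespace-separated columns and enough memory to hold one counter per distinct key; the function name filter_by_count and the min variable are made up for illustration.

```shell
# Keep only rows whose first-column key occurs more than `min` times.
# filter_by_count and min are illustrative names, not from the original post.
# Usage: filter_by_count mydata > mydata_filtered
filter_by_count() {
    # $1 = input file, $2 = count a key must exceed (defaults to 2)
    awk -v min="${2:-2}" '
        NR == FNR { count[$1]++; next }   # first pass: tally each key
        count[$1] > min                   # second pass: print qualifying rows
    ' "$1" "$1"
}
```

Passing the file name twice makes awk read it two times; NR == FNR is true only during the first read.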

Then I read the filtered data into R and use the reshape package to cast it into a matrix:

Code:
 
library(reshape)
# read.table names the columns V1, V2, V3 by default
mydata <- read.table('mydata_filtered')
# rows = V1, columns = V2, cell values = V3
y <- cast(mydata, V1 ~ V2, value = 'V3')
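
If loading millions of rows into R is the bottleneck, the same long-to-wide pivot could be sketched in awk. This is not the reshape method, just a rough equivalent under stated assumptions: three whitespace-separated columns (row key, column key, value), at most one value per (row, column) pair (a later duplicate overwrites an earlier one), and missing cells printed as NA. The function name to_matrix and the "id" corner label are made up for illustration. awk still keeps every cell in memory, so this only helps while the matrix itself fits in RAM.

```shell
# Pivot "row col value" triples into a tab-separated row-by-column matrix.
# to_matrix and the "id" header label are illustrative, not from the original post.
# Usage: to_matrix mydata_filtered > mydata_matrix
to_matrix() {
    awk '
        {
            # remember row and column keys in order of first appearance
            if (!($1 in rseen)) { rseen[$1]; rows[++nrow] = $1 }
            if (!($2 in cseen)) { cseen[$2]; cols[++ncol] = $2 }
            cell[$1, $2] = $3
        }
        END {
            # header line: corner label followed by the column keys
            line = "id"
            for (j = 1; j <= ncol; j++) line = line "\t" cols[j]
            print line
            # one output line per row key, NA where no value was seen
            for (i = 1; i <= nrow; i++) {
                line = rows[i]
                for (j = 1; j <= ncol; j++)
                    line = line "\t" (((rows[i], cols[j]) in cell) ? cell[rows[i], cols[j]] : "NA")
                print line
            }
        }
    ' "$1"
}
```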

 
