10-10-2009
Hello again,
Again, I apologize for the confusion. I made a mistake in the first post: the letters should be recoded to -1, 0, 1. This is the tricky part. I need to recode the letters on a per-column basis, in alphabetical order. There are several different combinations that can occur within a column:
AA, AC, CC = -1, 0, 1
AA, AG, GG = -1, 0, 1
AA, AT, TT = -1, 0, 1
CC, CG, GG = -1, 0, 1
CC, CT, TT = -1, 0, 1
GG, GT, TT = -1, 0, 1
Therefore any mixed data point (AC, AG, AT, CG, CT, GT) will ALWAYS = 0, AA will ALWAYS = -1, and TT will ALWAYS = 1. The problem comes when recoding CC and GG. As you can see, in some columns CC comes first in the alphabet and will be recoded as -1 (when the combo is CC, CG, GG or CC, CT, TT). However, in other columns CC does not come first in the alphabet and will be coded as 1 (when the combo is AA, AC, CC). The same problem occurs with GG: it is 1 when the combo is AA, AG, GG or CC, CG, GG, but -1 when the combo is GG, GT, TT. Is there any solution to this issue? I hope I explained it better this time!!
Thank you so much for your patience!!
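For what it's worth, here is a minimal sketch of the per-column rule described above, written in Python rather than shell. It assumes the data has already been read into a list of rows, that every cell is a two-letter genotype, and that mixed genotypes are written in alphabetical order (AC, never CA), as in the list above. The function names `recode_column` and `recode_table` are made up for illustration. A column that contains only one letter (for example, nothing but CC) cannot be ordered from its own contents, so this sketch leaves such a column unchanged.

```python
def recode_column(genotypes):
    """Recode one column of two-letter genotypes to -1, 0, 1."""
    # The two alleles present in this column, in alphabetical order.
    alleles = sorted({letter for g in genotypes for letter in g})
    if len(alleles) != 2:
        # Assumption: a column with a single allele (or more than two)
        # cannot be ordered unambiguously, so it is returned unchanged.
        return list(genotypes)
    first, second = alleles
    # First letter doubled -> -1, mixed -> 0, second letter doubled -> 1.
    codes = {first * 2: -1, first + second: 0, second * 2: 1}
    # Anything unexpected (e.g. a missing-data marker) is passed through.
    return [codes.get(g, g) for g in genotypes]


def recode_table(rows):
    """Recode every column of a table given as a list of rows."""
    columns = list(zip(*rows))
    recoded = [recode_column(col) for col in columns]
    return [list(row) for row in zip(*recoded)]


if __name__ == "__main__":
    table = [
        ["AA", "CC", "GG"],
        ["AC", "CG", "GT"],
        ["CC", "GG", "TT"],
    ]
    for row in recode_table(table):
        print(row)
```

In the sample table, CC becomes 1 in the first column (AA, AC, CC) but -1 in the second column (CC, CG, GG), which is the column-dependent behaviour the question asks about.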
gd_alter_encoding(3) GETDATA gd_alter_encoding(3)
NAME
gd_alter_encoding -- modify the binary encoding of data in a dirfile
SYNOPSIS
#include <getdata.h>
int gd_alter_encoding(DIRFILE *dirfile, unsigned int encoding, int fragment_index, int recode);
DESCRIPTION
The gd_alter_encoding() function sets the binary encoding of the format specification fragment given by fragment_index to encoding in the
dirfile(5) database specified by dirfile. The binary encoding of a fragment indicates the encoding of data stored in binary files
associated with RAW fields defined in the specified fragment. The binary encoding of a fragment containing no RAW fields is ignored.
The encoding argument should be one of the following:
GD_UNENCODED, GD_BZIP2_ENCODED, GD_GZIP_ENCODED, GD_LZMA_ENCODED, GD_SLIM_ENCODED, GD_TEXT_ENCODED.
See gd_cbopen(3) and dirfile-encoding(5) for the meanings of these symbols and details on the supported encoding schemes.
In addition to being simply a valid fragment index, fragment_index may also be the special value GD_ALL_FRAGMENTS, which indicates that the
encoding of all fragments in the database should be changed.
If the recode argument is non-zero, this call will recode the binary data of affected RAW fields to account for the change in binary encod-
ing. If the encoding of the fragment is encoding insensitive, or if the data type is only one byte in size, no change is made. If recode
is zero, affected binary files are left untouched.
RETURN VALUE
Upon successful completion, gd_alter_encoding() returns zero. On error, it returns -1 and sets the dirfile error to a non-zero error val-
ue. Possible error values are:
GD_E_ACCMODE
The specified dirfile was opened read-only.
GD_E_ALLOC
The library was unable to allocate memory.
GD_E_BAD_DIRFILE
The supplied dirfile was invalid.
GD_E_BAD_INDEX
The supplied index was out of range.
GD_E_PROTECTED
The metadata of the given format specification fragment was protected from change, or the binary data of the fragment was protected
from change and binary file recoding was requested.
GD_E_RAW_IO
An I/O error occurred while attempting to recode a binary file.
GD_E_UNCLEAN_DB
An error occurred while moving the recoded file into place. As a result, the database may be in an unclean state. See the NOTES
section below for recovery instructions. In this case, the dirfile will be flagged as invalid, to prevent further database corrup-
tion. It should be immediately closed.
GD_E_UNKNOWN_ENCODING
The encoding scheme of the fragment is unknown.
GD_E_UNSUPPORTED
The encoding scheme of the fragment does not support binary file recoding.
The dirfile error may be retrieved by calling gd_error(3). A descriptive error string for the last error encountered can be obtained from
a call to gd_error_string(3).
NOTES
A binary file recoding occurs out-of-place. As a result, sufficient space must be present on the filesystem for the binary files of all
RAW fields in the fragment both before and after translation. If all fragments are updated by specifying GD_ALL_FRAGMENTS, the recoding
occurs one fragment at a time.
An error code of GD_E_UNCLEAN_DB indicates a system error occurred while moving the re-encoded binary data into place or when deleting the
old data. If this happens, the database may be left in an unclean state. The caller should check the filesystem directly to ascertain the
state of the dirfile data before continuing. For recovery instructions, see the file /usr/share/doc/getdata/unclean_database_recovery.txt.
SEE ALSO
gd_cbopen(3), gd_error(3), gd_error_string(3), gd_encoding(3), dirfile(5), dirfile-format(5)
Version 0.7.0 20 July 2010 gd_alter_encoding(3)