12-02-2010
Thanks for both your replies. Sorry about the lack of detail, but rdcwayx was spot on with what I was attempting. Also thanks for the reference to the useless backticks. I'm trying to teach myself how to program and am bound to make some mistakes. I first tried the example:
awk 'NR==FNR{a[$1];next} $2 in a {print $2,$3,$4}' keyIndvs.txt, tst.txt
but nothing printed to the screen, and I'm still not 100% sure what FNR etc. was doing, so I tried the following:
awk '{print $1}' keyIndvs.txt | while read i; do echo $i; awk -F ' ' '{if ($2 == $i) print $2,$3,$4}' tst.txt;done
However, I can't get anything to work and I really don't have a clue why.
In the end I've decided to use R to solve my problem, as I'm a bit more familiar with it. I was hoping I could use awk to do the job, because the files I have to load into R are huge and it really takes a long time... Anyway, here's the code I used, if anyone is interested:
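> R
> # read the key IDs and the data; neither file has a header row
> indv = read.table('keyIndv.txt', header=FALSE)
> dat = read.table('tst.txt', header=FALSE)
> # keep only the rows whose second column (V2) is one of the key IDs
> outF = subset(dat, V2 %in% indv[,1])
> write.table(outF, 'keyDat.txt')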
Thanks again for your help. Unfortunately I had to go to R for my solution, but I picked up some awk commands and lessons that may come in handy.
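For anyone who lands here later, here is a rough sketch of how the two awk attempts above could probably be made to work. It rests on a few assumptions: the key file really is called keyIndvs.txt, the IDs sit in column 1 of the key file and column 2 of tst.txt, and both files are whitespace-separated. The sample rows are made up purely for illustration. Two things look off in the original commands. The comma after keyIndvs.txt becomes part of the file name, so awk looks for a file literally called "keyIndvs.txt,". And inside single quotes the shell never expands $i, so awk sees its own unset variable i and compares $2 against $0 (the whole line). Passing the value in with -v avoids the second problem.

```shell
# Work in a scratch directory so no real files are overwritten.
cd "$(mktemp -d)" || exit 1

# Hypothetical sample data, invented for illustration only.
printf 'ind2\nind4\n' > keyIndvs.txt
printf 'fam1 ind1 0 0\nfam1 ind2 1 0\nfam2 ind3 0 1\nfam2 ind4 1 1\n' > tst.txt

# Attempt 1 repaired: no comma between the file names.
# NR counts lines across all files, FNR restarts at 1 for each file,
# so NR==FNR is true only while the first file is being read.
# First file: remember each key in array a. Second file: print the
# row when its second field is one of the remembered keys.
awk 'NR==FNR {a[$1]; next} $2 in a {print $2, $3, $4}' keyIndvs.txt tst.txt

# Attempt 2 repaired: hand the shell variable to awk with -v.
# This rereads tst.txt once per key, so it is much slower on big files.
while read -r i; do
    awk -v id="$i" '$2 == id {print $2, $3, $4}' tst.txt
done < keyIndvs.txt
```

With the sample data both versions print "ind2 1 0" and "ind4 1 1". The first form reads each file only once, which is likely the better choice for huge inputs.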