Sponsored Content
Top Forums Shell Programming and Scripting Failure using regex with awk in 'while read file' loop Post 302940524 by pathunkathunk on Monday 6th of April 2015 11:26:55 PM
Old 04-07-2015
Failure using regex with awk in 'while read file' loop

I have a file1.txt with several 100k lines, each of which has a column 9 containing one of 60 "label" identifiers. Using an labels.txt file containing a list of labels, I'd like to extract 200 random lines from file1.txt for each of the labels in index.txt.

Using a contrived mini-example:
Code:
$ cat file1.txt 
H	0	328	100.0	-	0	0	38D150M140D	M01433:68:000000000-AAT0D:1:1111:13371:3239;barcodelabel=c8;	OTU_1;size=17947;
H	1	325	100.0	+	0	0	150M175D	M01433:68:000000000-AAT0D:1:1105:27659:19941;barcodelabel=c12;	OTU_2;size=101;
H	4	411	99.3	+	0	0	24D150M237D	M01433:68:000000000-AAT0D:1:2107:16393:23698;barcodelabel=g10;	OTU_5;size=64;
H	2	283	98.7	+	0	0	150M133D	M01433:68:000000000-AAT0D:1:2104:21919:3018;barcodelabel=c12;	OTU_3;size=80;
H	1	277	98.5	-	0	0	15I135M142D	M01433:68:000000000-AAT0D:1:2108:12616:12185;barcodelabel=c12;	OTU_2;size=101;
H	0	295	100.0	+	0	0	14D150M131D	M01433:68:000000000-AAT0D:1:1108:4978:15986;barcodelabel=g10;	OTU_1;size=17947;
H	29	312	97.6	-	0	0	25I125M187D	M01433:68:000000000-AAT0D:1:1109:20934:22671;barcodelabel=g15;	OTU_30;size=8;
H	0	315	99.3	-	0	0	88D150M77D	M01433:68:000000000-AAT0D:1:2114:17509:23920;barcodelabel=g10;	OTU_1;size=17947;

$ cat labels.txt
c12
g10

This is what I'm trying, but it results in empty files:
Code:
$ while read file
> do
> awk '/${file}/' file1.txt | gshuf -n 200 > ${file}.txt
> done < labels.txt

Desired output--two random lines for each label in labels.txt (i.e. may vary except for "label=c12" or "label=g12", respectively):
Code:
$ cat c12.txt
H	1	325	100.0	+	0	0	150M175D	M01433:68:000000000-AAT0D:1:1105:27659:19941;barcodelabel=c12;	OTU_2;size=101;
H	2	283	98.7	+	0	0	150M133D	M01433:68:000000000-AAT0D:1:2104:21919:3018;barcodelabel=c12;	OTU_3;size=80;

$ cat g10.txt
H	0	295	100.0	+	0	0	14D150M131D	M01433:68:000000000-AAT0D:1:1108:4978:15986;barcodelabel=g10;	OTU_1;size=17947;
H	0	315	99.3	-	0	0	88D150M77D	M01433:68:000000000-AAT0D:1:2114:17509:23920;barcodelabel=g10;	OTU_1;size=17947;

It seems like the problem is with the " awk '/${file}/' "? I say this because I can extract lines for each label but only if I explicitly specify the label regex (in this case g10.txt also has two random lines with "label=c12" instead of g10):
Code:
$ while read file
> do
> awk '/c12/' file1.txt | gshuf -n 2 > ${file}.txt
> done < labels.txt
$ cat c12.txt 
H	1	277	98.5	-	0	0	15I135M142D	M01433:68:000000000-AAT0D:1:2108:12616:12185;barcodelabel=c12;	OTU_2;size=101;
H	1	325	100.0	+	0	0	150M175D	M01433:68:000000000-AAT0D:1:1105:27659:19941;barcodelabel=c12;	OTU_2;size=101;

Thanks for any pointers.
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Read from a file and use the strings in a loop

Hello all, I need some help to create a script which contain a few strings on every line, and use those strings in a loop to fire some commands. for exmaple the file looks like tom dave bill andy paul I want to read one line at a time and use it in loop like command tom command dave... (3 Replies)
Discussion started by: xboxer21
3 Replies

2. UNIX for Dummies Questions & Answers

How to read a file in unix using do....done loop

Hi , can some give me idea about how to use do...done while loop in UNIX to read the contents of a file.. (2 Replies)
Discussion started by: sreenusola
2 Replies

3. Shell Programming and Scripting

How to Read the entire file using while loop

Guys, I am trying to read the whole file using while loop but when i run the ssh part of the script it reads only the first line and exit after that. There are in total 134 lines in the file, but when the output is redirected, it does only for one line and comes to command prompt. pls help..... (11 Replies)
Discussion started by: sdosanjh
11 Replies

4. SCO

file system not getting mounted in read write mode after system power failure

After System power get failed File system is not getting mounted in read- write mode (1 Reply)
Discussion started by: gtkpmbpl
1 Replies

5. Shell Programming and Scripting

IF awk in a while read line-loop

Hi As a newbe in scripting, i struggle hard with my first script. What i want to do is, bringing data of two files together. file1: .... 05/14/12-04:00:00 41253 4259 5135 5604 5812 5372 05/14/12-04:10:00 53408 5501 6592 7402 7354 6639 05/14/12-04:20:00 58748 6037 7292 8223... (13 Replies)
Discussion started by: IMPe
13 Replies

6. UNIX for Dummies Questions & Answers

read regex from ID file, print regex and line below from source file

I have a file of protein sequences with headers (my source file). Based on a list of IDs (which are included in some of the headers), I'd like to print out only the specified sequences, with only the ID as header. In other words, I'd like to search source.txt for the terms in IDs.txt, and print... (3 Replies)
Discussion started by: pathunkathunk
3 Replies

7. Shell Programming and Scripting

Using awk instead of while loop to read file

Hello, I have a huge file, I am currently using while loop to read and do some calculation on it, but it is taking a lot of time. I want to use AWK to read and do those calculations. Please suggest. currently doing: cat input2 | while read var1 do num=`echo $var1 | awk... (6 Replies)
Discussion started by: anand2308
6 Replies

8. Shell Programming and Scripting

For loop inside awk to read and print contents of files

Hello, I have a set of files Xfile0001 - Xfile0021, and the content of this files (one at a time) needs to be printed between some line (lines start with word "Generated") that I am extracting from another file called file7.txt and all the output goes into output.txt. First I tried creating a for... (5 Replies)
Discussion started by: jaldo0805
5 Replies

9. Shell Programming and Scripting

Use while loop to read file and use ${file} for both filename input into awk and as string to print

I have files named with different prefixes. From each I want to extract the first line containing a specific string, and then print that line along with the prefix. I've tried to do this with a while loop, but instead of printing the prefix I print the first line of the file twice. Files:... (3 Replies)
Discussion started by: pathunkathunk
3 Replies

10. Shell Programming and Scripting

Failure: if grep "$Var" "$line" inside while read line loop

Hi everybody, I am new at Unix/Bourne shell scripting and with my youngest experiences, I will not become very old with it :o My code: #!/bin/sh set -e set -u export IFS= optl="Optl" LOCSTORCLI="/opt/lsi/storcli/storcli" ($LOCSTORCLI /c0 /vall show | grep RAID | cut -d " "... (5 Replies)
Discussion started by: Subsonic66
5 Replies
MPI_KMEANS(1)						      General Commands Manual						     MPI_KMEANS(1)

NAME
mpi_kmeans - K-Means clustering tool SYNOPSIS
mpi_kmeans [options] DESCRIPTION
mpi_kmeans is a program that uses k-means clustering to produce a list of cluster centers. The resulting data can be used by mpi_assign(1) to assign points to those cluster centers. OPTIONS
A summary of options is included below. Generic Options: --help Produce help message Input/Output Options: --data file Training file, one datum per line (default: "data.txt") --output file Output file, one cluster center per line (default: "output.txt") K-Means Options: --k num Number of clusters to generate (default: 100) --restarts num Number of k-means restarts (default: 0 = single run) --maxiter num Maximum number of k-means iterations (default: 0 = infinity) EXAMPLES
mpi_kmeans --k 2 --data example.txt --output clusters.txt SEE ALSO
mpi_assign(1) AUTHOR
mpi_kmeans was written by Peter Gehler <peter.gehler@tuebingen.mpg.de>. This manual page was written by Christian Kastner <debian@kvr.at>, for the Debian project (and may be used by others). April 11, 2011 MPI_KMEANS(1)
All times are GMT -4. The time now is 06:07 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy