Post 302940524 by pathunkathunk on Monday 6th of April 2015 11:26:55 PM
Failure using regex with awk in 'while read file' loop

I have a file1.txt with several hundred thousand lines, each of which has a column 9 containing one of 60 "label" identifiers. Using a labels.txt file containing a list of labels, I'd like to extract 200 random lines from file1.txt for each of the labels in labels.txt.

Using a contrived mini-example:
Code:
$ cat file1.txt 
H	0	328	100.0	-	0	0	38D150M140D	M01433:68:000000000-AAT0D:1:1111:13371:3239;barcodelabel=c8;	OTU_1;size=17947;
H	1	325	100.0	+	0	0	150M175D	M01433:68:000000000-AAT0D:1:1105:27659:19941;barcodelabel=c12;	OTU_2;size=101;
H	4	411	99.3	+	0	0	24D150M237D	M01433:68:000000000-AAT0D:1:2107:16393:23698;barcodelabel=g10;	OTU_5;size=64;
H	2	283	98.7	+	0	0	150M133D	M01433:68:000000000-AAT0D:1:2104:21919:3018;barcodelabel=c12;	OTU_3;size=80;
H	1	277	98.5	-	0	0	15I135M142D	M01433:68:000000000-AAT0D:1:2108:12616:12185;barcodelabel=c12;	OTU_2;size=101;
H	0	295	100.0	+	0	0	14D150M131D	M01433:68:000000000-AAT0D:1:1108:4978:15986;barcodelabel=g10;	OTU_1;size=17947;
H	29	312	97.6	-	0	0	25I125M187D	M01433:68:000000000-AAT0D:1:1109:20934:22671;barcodelabel=g15;	OTU_30;size=8;
H	0	315	99.3	-	0	0	88D150M77D	M01433:68:000000000-AAT0D:1:2114:17509:23920;barcodelabel=g10;	OTU_1;size=17947;

$ cat labels.txt
c12
g10

This is what I'm trying, but it results in empty files:
Code:
$ while read file
> do
> awk '/${file}/' file1.txt | gshuf -n 200 > ${file}.txt
> done < labels.txt

Desired output: two random lines for each label in labels.txt (the specific lines may vary, but every line must contain "label=c12" or "label=g10", respectively):
Code:
$ cat c12.txt
H	1	325	100.0	+	0	0	150M175D	M01433:68:000000000-AAT0D:1:1105:27659:19941;barcodelabel=c12;	OTU_2;size=101;
H	2	283	98.7	+	0	0	150M133D	M01433:68:000000000-AAT0D:1:2104:21919:3018;barcodelabel=c12;	OTU_3;size=80;

$ cat g10.txt
H	0	295	100.0	+	0	0	14D150M131D	M01433:68:000000000-AAT0D:1:1108:4978:15986;barcodelabel=g10;	OTU_1;size=17947;
H	0	315	99.3	-	0	0	88D150M77D	M01433:68:000000000-AAT0D:1:2114:17509:23920;barcodelabel=g10;	OTU_1;size=17947;

The problem seems to be with the awk '/${file}/' part. I say this because I can extract lines if I hard-code the label regex, although then every output file gets the same label (in this case g10.txt also ends up with two random "label=c12" lines instead of g10 lines):
Code:
$ while read file
> do
> awk '/c12/' file1.txt | gshuf -n 2 > ${file}.txt
> done < labels.txt
$ cat c12.txt 
H	1	277	98.5	-	0	0	15I135M142D	M01433:68:000000000-AAT0D:1:2108:12616:12185;barcodelabel=c12;	OTU_2;size=101;
H	1	325	100.0	+	0	0	150M175D	M01433:68:000000000-AAT0D:1:1105:27659:19941;barcodelabel=c12;	OTU_2;size=101;
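
A quick way to check that suspicion is to print the pattern text exactly as the shell hands it over. This is only a diagnostic sketch; the variable names `single` and `double` are made up for illustration. Inside single quotes the shell does no expansion, so awk receives the literal characters `${file}` rather than the label, which is consistent with the empty output files.

```shell
#!/bin/sh
file=c12

# Single quotes: the shell passes the text through untouched, so awk would
# receive a pattern containing a literal dollar sign and braces.
single='/${file}/'

# Double quotes: the shell substitutes the value before awk ever sees it.
double="/${file}/"

printf 'single-quoted: %s\n' "$single"
printf 'double-quoted: %s\n' "$double"
```

The first line printed still shows `${file}`, while the second shows the label itself.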

Thanks for any pointers.
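
One possible way around this, offered as a sketch and not as the thread's accepted answer, is to hand the label to awk with `-v` instead of embedding it in the single-quoted program. The demo directory `awk_label_demo`, the shortened sample rows, and the extra label `c1` are assumptions added for illustration. The script also assumes GNU shuf is installed either as `shuf` or as `gshuf` (the name used in the post).

```shell
#!/bin/sh
# Sketch only: directory name and sample rows are abbreviated stand-ins
# for the real file1.txt and labels.txt.
demo=awk_label_demo
mkdir -p "$demo"

# Use shuf if present, otherwise fall back to gshuf.
SHUF=$(command -v shuf || command -v gshuf) || {
  echo "neither shuf nor gshuf found" >&2
  exit 1
}

# Four shortened rows: two for c12, one for g10, one for c1.
{
  printf 'H\t1\tread_a;barcodelabel=c12;\tOTU_2;size=101;\n'
  printf 'H\t4\tread_b;barcodelabel=g10;\tOTU_5;size=64;\n'
  printf 'H\t2\tread_c;barcodelabel=c12;\tOTU_3;size=80;\n'
  printf 'H\t0\tread_d;barcodelabel=c1;\tOTU_1;size=17947;\n'
} > "$demo/file1.txt"
printf 'c12\ng10\nc1\n' > "$demo/labels.txt"

while IFS= read -r label
do
  # -v copies the shell value into an awk variable named label.
  # index() is a literal substring test; wrapping the label in
  # "barcodelabel=" and ";" keeps c1 from also selecting c12 rows.
  awk -v label="$label" 'index($0, "barcodelabel=" label ";")' "$demo/file1.txt" |
    "$SHUF" -n 200 > "$demo/$label.txt"
done < "$demo/labels.txt"

# Show how many lines each output file received.
wc -l "$demo/c12.txt" "$demo/g10.txt" "$demo/c1.txt"
```

Matching on the full `barcodelabel=<label>;` string instead of a bare regex is a design choice: a bare `/c1/` would also hit every c12 line, whereas here c1.txt receives only the single c1 row. When fewer than 200 lines match, shuf simply returns all of them in random order.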
 
