Failure using regex with awk in 'while read file' loop Post: 302940524

Post 302940524 by pathunkathunk on Monday 6th of April 2015 11:26:55 PM

04-07-2015

I have a file1.txt with several hundred thousand lines, each of which has a column 9 containing one of 60 "label" identifiers. Using a labels.txt file containing a list of labels, I'd like to extract 200 random lines from file1.txt for each of the labels in labels.txt.

Using a contrived mini-example:

Code:

$ cat file1.txt 
H	0	328	100.0	-	0	0	38D150M140D	M01433:68:000000000-AAT0D:1:1111:13371:3239;barcodelabel=c8;	OTU_1;size=17947;
H	1	325	100.0	+	0	0	150M175D	M01433:68:000000000-AAT0D:1:1105:27659:19941;barcodelabel=c12;	OTU_2;size=101;
H	4	411	99.3	+	0	0	24D150M237D	M01433:68:000000000-AAT0D:1:2107:16393:23698;barcodelabel=g10;	OTU_5;size=64;
H	2	283	98.7	+	0	0	150M133D	M01433:68:000000000-AAT0D:1:2104:21919:3018;barcodelabel=c12;	OTU_3;size=80;
H	1	277	98.5	-	0	0	15I135M142D	M01433:68:000000000-AAT0D:1:2108:12616:12185;barcodelabel=c12;	OTU_2;size=101;
H	0	295	100.0	+	0	0	14D150M131D	M01433:68:000000000-AAT0D:1:1108:4978:15986;barcodelabel=g10;	OTU_1;size=17947;
H	29	312	97.6	-	0	0	25I125M187D	M01433:68:000000000-AAT0D:1:1109:20934:22671;barcodelabel=g15;	OTU_30;size=8;
H	0	315	99.3	-	0	0	88D150M77D	M01433:68:000000000-AAT0D:1:2114:17509:23920;barcodelabel=g10;	OTU_1;size=17947;

$ cat labels.txt
c12
g10

This is what I'm trying, but it results in empty files:

Code:

$ while read file
> do
> awk '/${file}/' file1.txt | gshuf -n 200 > ${file}.txt
> done < labels.txt

Desired output: two random lines for each label in labels.txt (i.e. the lines may vary, but each must contain "label=c12" or "label=g10", respectively):

Code:

$ cat c12.txt
H	1	325	100.0	+	0	0	150M175D	M01433:68:000000000-AAT0D:1:1105:27659:19941;barcodelabel=c12;	OTU_2;size=101;
H	2	283	98.7	+	0	0	150M133D	M01433:68:000000000-AAT0D:1:2104:21919:3018;barcodelabel=c12;	OTU_3;size=80;

$ cat g10.txt
H	0	295	100.0	+	0	0	14D150M131D	M01433:68:000000000-AAT0D:1:1108:4978:15986;barcodelabel=g10;	OTU_1;size=17947;
H	0	315	99.3	-	0	0	88D150M77D	M01433:68:000000000-AAT0D:1:2114:17509:23920;barcodelabel=g10;	OTU_1;size=17947;

The problem seems to be the awk '/${file}/' pattern. I say this because I can extract lines when I hard-code the label regex, although then every output file gets the same label (in this case g10.txt also ends up with two random "label=c12" lines instead of g10 lines):

Code:

$ while read file
> do
> awk '/c12/' file1.txt | gshuf -n 2 > ${file}.txt
> done < labels.txt
$ cat c12.txt 
H	1	277	98.5	-	0	0	15I135M142D	M01433:68:000000000-AAT0D:1:2108:12616:12185;barcodelabel=c12;	OTU_2;size=101;
H	1	325	100.0	+	0	0	150M175D	M01433:68:000000000-AAT0D:1:1105:27659:19941;barcodelabel=c12;	OTU_2;size=101;

Thanks for any pointers.
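
A possible fix, offered as a minimal sketch rather than a definitive answer: the shell does not expand ${file} inside single quotes, so awk receives the literal text /${file}/ and never sees the label. Passing the value in with awk's -v option avoids this. The sample data, the variable names (label, lab, workdir) and the choice of a literal index() match on column 9 are assumptions made for illustration. GNU shuf is used where it is installed under that name, gshuf otherwise.

```shell
#!/bin/sh
# Sketch only: the tiny sample data and the names label, lab and workdir
# are illustrative assumptions, not taken from the real data set.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Tab-separated sample with the label in column 9, shaped like file1.txt.
printf 'H\t1\t325\t100.0\t+\t0\t0\t150M175D\tr1;barcodelabel=c12;\tOTU_2;size=101;\n' >  file1.txt
printf 'H\t4\t411\t99.3\t+\t0\t0\t24D150M237D\tr2;barcodelabel=g10;\tOTU_5;size=64;\n' >> file1.txt
printf 'H\t2\t283\t98.7\t+\t0\t0\t150M133D\tr3;barcodelabel=c12;\tOTU_3;size=80;\n'   >> file1.txt
printf 'c12\ng10\n' > labels.txt

# GNU shuf is installed as gshuf on some systems; use whichever is found.
SHUF=$(command -v shuf || command -v gshuf) || exit 1

while IFS= read -r label
do
    # -v copies the shell value into the awk variable "lab".
    # index() is a literal substring test restricted to column 9,
    # so a label such as c1 cannot accidentally match c12.
    awk -F '\t' -v lab="barcodelabel=${label};" 'index($9, lab)' file1.txt |
        "$SHUF" -n 200 > "${label}.txt"
done < labels.txt
```

Applied to the loop in the post, this amounts to replacing awk '/${file}/' with an awk call that receives the label through -v; the -n 200 limit simply returns every matching line when fewer than 200 exist.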
