Match and store numerical prefix to update files
Post 302990907 by cmccabe on Thursday 2nd of February 2017 02:38:24 PM
Code:
#!/bin/bash

# gzip the vcf files
logfile=/home/cmccabe/Desktop/NGS/test/process.log
for f in /home/cmccabe/Desktop/NGS/test/*.vcf ; do
     echo "Start vcf.gz creation: $(date) - file: $f"
     gzip "$f"
     echo "End vcf.gz creation: $(date) - file: $f"
done >> "$logfile"

# find undefined annotations
logfile=/home/cmccabe/Desktop/NGS/test/process.log
for f in /home/cmccabe/Desktop/NGS/test/*.vcf.gz ; do
     echo "Start vcf missing header creation: $(date) - file: $f"
     bname=$(basename "$f")
     pref=${bname%%.vcf.gz}
     bcftools view -h "$f" > "/home/cmccabe/Desktop/NGS/test/${pref}_header.txt"
     echo "End missing header creation: $(date) - file: $f"
done >> "$logfile"

# match files
IAm=${0##*/}

InDir1='/home/cmccabe/Desktop/NGS/test'
InDir2='/home/cmccabe/Desktop/NGS/test'
OutDir='/home/cmccabe/Desktop/NGS/test'

cd "$InDir1"
for file1 in *.txt
do	# Grab file prefix.
	p=${file1%%_*}

	# Find matching file2.
	file2=$(printf '%s' "$InDir2/$p"_*.vcf)
	if [ ! -f "$file2" ]
	then	printf '%s: No single file matching %s found.\n' "$IAm" \
		    "$file1" >&2
		continue
	fi
# store matches
    out=${file1##*/} && ${file2##*/}

# edit the header
logfile=/home/cmccabe/Desktop/NGS/test/process.log
for f in /home/cmccabe/Desktop/NGS/test/*.vcf.gz ; do
     echo "Start vcf edit header creation: $(date) - file: $f"
     bname=`basename $f`
     pref=${bname%%.vcf.gz}
      bcftools reheader -h $file1 $file2 > ${pref}_fixed.vcf.gz
     echo "End edit header creation: $(date) - file: $f"
done >> "$logfile"

Current output
The script stops with "syntax error: unexpected end of file", though the portion in bold creates the files that are needed in the directory.
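
For what it is worth, bash usually reports "unexpected end of file" when a compound command is still open when the script ends. In the script above, the `for file1 in *.txt` loop opens with `do` but never reaches a closing `done`, which would explain the message. A minimal sketch of checking for this without running anything, using `bash -n` (parse only); the helper name `check_syntax` is only an illustrative name:

```shell
#!/bin/bash
# Parse a script without executing it; returns non-zero on a syntax error.
# check_syntax is a hypothetical helper name, not part of the original script.
check_syntax() {
    bash -n "$1"
}
```

Running `check_syntax ./script.sh` should print the same error until the missing `done` is added after the matching step.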

Desired output
Code:
file1_header.txt is stored as $file1 matched with file1.vcf.gz stored as $file2
file2_header.txt is stored as $file1 matched with file2.vcf.gz stored as $file2
file3_header.txt is stored as $file1 matched with file3.vcf.gz stored as $file2
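
One possible reason the match never succeeds: the pattern `"$InDir2/$p"_*.vcf` expects an underscore after the prefix and a name ending in `.vcf`, while the directory listing in this post shows names like `file1.vcf.gz`. A minimal sketch of the pairing described above, assuming each header file is named `<prefix>_header.txt` and its partner is `<prefix>.vcf.gz` in the same directory (`find_vcf` is a made-up helper name):

```shell
#!/bin/bash
# Print the .vcf.gz in directory $2 whose name starts with the prefix of
# header file $1 (everything before the first "_").
# find_vcf is a hypothetical helper name used only for this sketch.
find_vcf() {
    local header prefix vcf
    header=${1##*/}          # strip any leading directory
    prefix=${header%%_*}     # file1_header.txt -> file1
    vcf=$2/$prefix.vcf.gz
    if [ -f "$vcf" ]; then
        printf '%s\n' "$vcf"
    else
        printf 'No file matching %s found.\n' "$header" >&2
        return 1
    fi
}
```

With this, `file2=$(find_vcf "$file1" "$InDir2") || continue` could replace the `printf` glob and the `[ ! -f ... ]` test inside the loop.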

Currently no arguments are being passed to reheader (in italics in the command).

Desired arguments passed to reheader
Code:
$file1
$file2

The loop would then use each matched pair of arguments in the command below to create a new file with the fixed header:
Code:
bcftools reheader -h $file1 $file2 > ${pref}_fixed.vcf.gz
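
Putting the matching and the reheader step into a single loop might look like the sketch below, so that `$file1` and `$file2` are always set when `bcftools reheader` runs. It assumes the `<prefix>_header.txt` / `<prefix>.vcf.gz` naming from the listing in this post; `reheader_all` and the `BCFTOOLS` variable are illustrative names that are not in the original script.

```shell
#!/bin/bash
# Sketch: for every <prefix>_header.txt in directory $1, rewrite the header of
# <prefix>.vcf.gz and save the result as <prefix>_fixed.vcf.gz.
# BCFTOOLS and reheader_all are illustrative names, not from the original script.
BCFTOOLS=${BCFTOOLS:-bcftools}

reheader_all() {
    local dir=$1 file1 file2 pref
    for file1 in "$dir"/*_header.txt; do
        [ -f "$file1" ] || continue            # glob matched nothing
        pref=${file1##*/}                      # file1_header.txt
        pref=${pref%%_*}                       # file1
        file2=$dir/$pref.vcf.gz
        if [ ! -f "$file2" ]; then
            printf 'No file matching %s found.\n' "$file1" >&2
            continue
        fi
        echo "Start edit header: $(date) - file: $file2"
        "$BCFTOOLS" reheader -h "$file1" "$file2" > "$dir/${pref}_fixed.vcf.gz"
        echo "End edit header: $(date) - file: $file2"
    done
}
```

It could be called as `reheader_all /home/cmccabe/Desktop/NGS/test >> "$logfile"` to keep the same logging as the earlier steps.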

Code:
ls -l
total 12288
-rwxrwx---+ 1 cmccabe Domain Users 2031823 Jan 31 11:07 file1.vcf.gz
-rwxrwx---+ 1 cmccabe Domain Users    9312 Feb  2 13:17 file1_header.txt
-rwxrwx---+ 1 cmccabe Domain Users 2361873 Jan 31 11:07 file2.vcf.gz
-rwxrwx---+ 1 cmccabe Domain Users    9315 Feb  2 13:17 file2_header.txt
-rwxrwx---+ 1 cmccabe Domain Users 1816662 Jan 31 11:07 file3.vcf.gz
-rwxrwx---+ 1 cmccabe Domain Users    9313 Feb  2 13:17 file3_header.txt
-rwxrwx---+ 1 cmccabe Domain Users    1356 Feb  2 13:22 process.log
-rwxrwx---+ 1 cmccabe Domain Users    1278 Feb  2 13:17 process.log~

I hope this helps and thank you very much :).
 
