Check ID in a file matches to the name of the file
Post 302998806 by nans on Wednesday 7th of June 2017 05:42:07 PM
Thank you both. Don Cragun's code works for me.

@Rudi, how is it checking whether the ID matches? For example, I tried the code on this file, 2.vcf, which looks like this:

Code:
##fileformat=VCFv4.0
##fileDate=20090805
##source=myImputationProgramV3.1
##reference=1000GenomesPilot-NCBI36
##phasing=partial
##INFO=<ID=NS,Number=1,Type=Integer,Description="Number of Samples With Data">
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Depth">
##INFO=<ID=AF,Number=.,Type=Float,Description="Allele Frequency">
##INFO=<ID=AA,Number=1,Type=String,Description="Ancestral Allele">
##INFO=<ID=DB,Number=0,Type=Flag,Description="dbSNP membership, build 129">
##INFO=<ID=H2,Number=0,Type=Flag,Description="HapMap2 membership">
##FILTER=<ID=q10,Description="Quality below 10">
##FILTER=<ID=s50,Description="Less than 50% of samples have data">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Read Depth">
##FORMAT=<ID=HQ,Number=2,Type=Integer,Description="Haplotype Quality">
#CHROM POS     ID        REF ALT    QUAL FILTER INFO                              FORMAT      44
20     14370   rs6054257 G      A       29   PASS   NS=3;DP=14;AF=0.5;DB;H2           GT:GQ:DP:HQ 0|0:48:1:51,51 1|0:48:8:51,51 1/1:43:5:.,.
20     17330   .         T      A       3    q10    NS=3;DP=11;AF=0.017               GT:GQ:DP:HQ 0|0:49:3:58,50 0|1:3:5:65,3   0/0:41:3

The code should report that the ID 44 in the file does not match the file name 2.vcf, but it returns a value of 1.
VCF (Variant Call Format) is just a tab-delimited text file.
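
For reference, the check being asked for could be sketched roughly as below. This is not the code posted by Don Cragun or Rudi; `check_vcf_id` is a hypothetical helper name, and it assumes that the ID to compare is the first sample name (field 10 of the `#CHROM` header line) and that the expected ID is the file name without its `.vcf` suffix.

```shell
#!/bin/sh
# Hypothetical helper (not from this thread): compare the sample ID in the
# #CHROM header line of a VCF file with the base name of the file.
check_vcf_id() {
    file=$1
    # Expected ID: file name with the directory and the .vcf suffix removed.
    expected=$(basename "$file" .vcf)
    # Assumption: the ID is the 10th whitespace-separated field of the
    # #CHROM header line (the first sample column).
    actual=$(awk '$1 == "#CHROM" { print $10; exit }' "$file")
    if [ "$actual" = "$expected" ]; then
        echo "$file: ID $actual matches"
    else
        echo "$file: ID $actual does not match file name"
        return 1
    fi
}
```

Called as `check_vcf_id 2.vcf` on the file above, this would print `2.vcf: ID 44 does not match file name` and return a non-zero status. Only the first sample column is compared, so a multi-sample file would need a different rule.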