Sponsored Content
Top Forums UNIX for Beginners Questions & Answers Check ID in a file matches to the name of the file Post 302998779 by nans on Wednesday 7th of June 2017 07:05:17 AM
Old 06-07-2017
Check ID in a file matches to the name of the file

I have a number of text tab files in my directory named 1.vcf 2.vcf etc. Each file file has headers of 120-130 rows starting with "#", it looks like this

Code:
...
##contig=<ID=GL000194.1,length=191469,assembly=hg19>
##contig=<ID=GL000225.1,length=211173,assembly=hg19>
##contig=<ID=GL000192.1,length=547496,assembly=hg19>
##contig=<ID=vcontig,length=337,assembly=hg19>
##reference=human_hg19.fasta
##source=SelectVariants
#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  1
1       12012010        rs1000002       A       C       14325.14        .       AC=1;AF=0.500;AN=2;BaseQRankSum=-13...

As these files are created with an automated pipeline, I wish to introduce an id check, to see if each file name (1.vcf,2.vcf..) corresponds to the correct ID within the content file.
The ID is always present is the last line of the header after 'FORMAT'.
The files are always named according to ID.
I have been doing this manually so far, is there a way to script it ?

Last edited by nans; 06-07-2017 at 02:06 PM..
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Until the file extension matches

Hi All, In my script I am trying to input data from user and I want the promt to appear again if the input data is not the one expected. I tried something like this: echo " \n\n\t Enter the dump filename:\c";read dump pst=${dump##*.} until (test $pst = dmp) do ... (7 Replies)
Discussion started by: Sreejith_VK
7 Replies

2. Shell Programming and Scripting

find matches in file

Hi, im have log file ~100000 lines, 192.168.29.1 at 10/08/09 13:58:55 192.168.60.1 at 10/08/09 14:11:28 192.168.58.171 at 10/08/09 14:12:45 192.168.61.12 at 10/08/09 14:15:44 192.168.60.1 at 10/08/09 14:16:36 192.168.60.1 at 10/08/09 14:17:43 192.168.61.12 at 10/08/09 14:18:08... (9 Replies)
Discussion started by: Trump
9 Replies

3. Shell Programming and Scripting

get value that matches file name pattern

Hi I have files with names that contain the date in several formats as, YYYYMMDD, DD-MM-YY,DD.MM.YY or similar combinations. I know if a file fits in one pattern or other, but i donīt know how to extract the substring contained in the file that matches the pattern. For example, i know that ... (1 Reply)
Discussion started by: pjrm
1 Replies

4. Solaris

Before I delete any file in Unix, How can I check no open file handle is pointing to that file?

I know how to check if any file has a unix process using a file by looking at 'lsof <fullpath/filename>' command. I think using lsof is very expensive. Also to make it accurate we need to inlcude fullpath of the file. Is there another command that can tell if a file has a truely active... (12 Replies)
Discussion started by: kchinnam
12 Replies

5. Shell Programming and Scripting

Does uniq -d only check for consecutive matches?

Hi All I have a rather large text file of approx 1m records in the format:- 20110877837-2.PDF 20100298984-3.PDF et al... I want to run uniq against the file to make sure there are no duplicate names..... uniq -d /path/to/input/file.txt However this is not producing any... (1 Reply)
Discussion started by: Bashingaway
1 Replies

6. UNIX for Dummies Questions & Answers

Pipe binary file matches grep results to file

I am using grep to match a pattern, but the output is strange. $ grep -r -o "pattern" * Gives me: Binary file foo1 matches Binary file foo2 matches Binary file foo3 matches To find the lines before/after, I then have to use the following on each file: $ strings foo1 | grep -A1 -B1... (0 Replies)
Discussion started by: chipperuga
0 Replies

7. Shell Programming and Scripting

FTP a file if the date matches

Hi, I am trying to write a script where I need to pull any file if the date is from yesterday. Can you please help me on how to check the dates for the files on the remote server? Please let me know for any questions. Thanks Ajay (4 Replies)
Discussion started by: ajayakunuri
4 Replies

8. Shell Programming and Scripting

Required 3 lines above the file and below file when string matches

i had requirement like i need to get "error" line of above 3 and below 3 from a file .I tried with the below script.But it's not working. y='grep -n -i error /home/file.txt|cut -c1' echo $y head -$y /home/file.txt| tail -3 >tmp.txt tail -$y /home/file.txt head -3 >>tmp.txt (4 Replies)
Discussion started by: bhas85
4 Replies

9. Shell Programming and Scripting

Replace string of a file with a string of another file for matches using grep,sed,awk

I have a file comp.pkglist which mention package version and release . In 'version change' and 'release change' line there are two versions 'old' and 'new' Version Change: --> Release Change: --> cat comp.pkglist Package list: nss-util-devel-3.28.4-1.el6_9.x86_64 Version Change: 3.28.4 -->... (1 Reply)
Discussion started by: Paras Pandey
1 Replies

10. Shell Programming and Scripting

Match text to lines in a file, iterate backwards until text or text substring matches, print to file

hi all, trying this using shell/bash with sed/awk/grep I have two files, one containing one column, the other containing multiple columns (comma delimited). file1.txt abc12345 def12345 ghi54321 ... file2.txt abc1,text1,texta abc,text2,textb def123,text3,textc gh,text4,textd... (6 Replies)
Discussion started by: shogun1970
6 Replies
PPI::Node(3)						User Contributed Perl Documentation					      PPI::Node(3)

NAME
PPI::Node - Abstract PPI Node class, an Element that can contain other Elements INHERITANCE
PPI::Node isa PPI::Element SYNOPSIS
# Create a typical node (a Document in this case) my $Node = PPI::Document->new; # Add an element to the node( in this case, a token ) my $Token = PPI::Token::Word->new('my'); $Node->add_element( $Token ); # Get the elements for the Node my @elements = $Node->children; # Find all the barewords within a Node my $barewords = $Node->find( 'PPI::Token::Word' ); # Find by more complex criteria my $my_tokens = $Node->find( sub { $_[1]->content eq 'my' } ); # Remove all the whitespace $Node->prune( 'PPI::Token::Whitespace' ); # Remove by more complex criteria $Node->prune( sub { $_[1]->content eq 'my' } ); DESCRIPTION
The "PPI::Node" class provides an abstract base class for the Element classes that are able to contain other elements PPI::Document, PPI::Statement, and PPI::Structure. As well as those listed below, all of the methods that apply to PPI::Element objects also apply to "PPI::Node" objects. METHODS
scope The "scope" method returns true if the node represents a lexical scope boundary, or false if it does not. add_element $Element The "add_element" method adds a PPI::Element object to the end of a "PPI::Node". Because Elements maintain links to their parent, an Element can only be added to a single Node. Returns true if the PPI::Element was added. Returns "undef" if the Element was already within another Node, or the method is not passed a PPI::Element object. elements The "elements" method accesses all child elements structurally within the "PPI::Node" object. Note that in the base of the PPI::Structure classes, this "DOES" include the brace tokens at either end of the structure. Returns a list of zero or more PPI::Element objects. Alternatively, if called in the scalar context, the "elements" method returns a count of the number of elements. first_element The "first_element" method accesses the first element structurally within the "PPI::Node" object. As for the "elements" method, this does include the brace tokens for PPI::Structure objects. Returns a PPI::Element object, or "undef" if for some reason the "PPI::Node" object does not contain any elements. last_element The "last_element" method accesses the last element structurally within the "PPI::Node" object. As for the "elements" method, this does include the brace tokens for PPI::Structure objects. Returns a PPI::Element object, or "undef" if for some reason the "PPI::Node" object does not contain any elements. children The "children" method accesses all child elements lexically within the "PPI::Node" object. Note that in the case of the PPI::Structure classes, this does NOT include the brace tokens at either end of the structure. Returns a list of zero of more PPI::Element objects. Alternatively, if called in the scalar context, the "children" method returns a count of the number of lexical children. schildren The "schildren" method is really just a convenience, the significant-only variation of the normal "children" method. In list context, returns a list of significant children. In scalar context, returns the number of significant children. child $index The "child" method accesses a child PPI::Element object by its position within the Node. Returns a PPI::Element object, or "undef" if there is no child element at that node. schild $index The lexical structure of the Perl language ignores 'insignificant' items, such as whitespace and comments, while PPI treats these items as valid tokens so that it can reassemble the file at any time. Because of this, in many situations there is a need to find an Element within a Node by index, only counting lexically significant Elements. The "schild" method returns a child Element by index, ignoring insignificant Elements. The index of a child Element is specified in the same way as for a normal array, with the first Element at index 0, and negative indexes used to identify a "from the end" position. contains $Element The "contains" method is used to determine if another PPI::Element object is logically "within" a "PPI::Node". For the special case of the brace tokens at either side of a PPI::Structure object, they are generally considered "within" a PPI::Structure object, even if they are not actually in the elements for the PPI::Structure. Returns true if the PPI::Element is within us, false if not, or "undef" on error. find $class | &wanted The "find" method is used to search within a code tree for PPI::Element objects that meet a particular condition. To specify the condition, the method can be provided with either a simple class name (full or shortened), or a "CODE"/function reference. # Find all single quotes in a Document (which is a Node) $Document->find('PPI::Quote::Single'); # The same thing with a shortened class name $Document->find('Quote::Single'); # Anything more elaborate, we so with the sub $Document->find( sub { # At the top level of the file... $_[1]->parent == $_[0] and ( # ...find all comments and POD $_[1]->isa('PPI::Token::Pod') or $_[1]->isa('PPI::Token::Comment') ) } ); The function will be passed two arguments, the top-level "PPI::Node" you are searching in and the current PPI::Element that the condition is testing. The anonymous function should return one of three values. Returning true indicates a condition match, defined-false (0 or '') indicates no- match, and "undef" indicates no-match and no-descend. In the last case, the tree walker will skip over anything below the "undef"-returning element and move on to the next element at the same level. To halt the entire search and return "undef" immediately, a condition function should throw an exception (i.e. "die"). Note that this same wanted logic is used for all methods documented to have a "&wanted" parameter, as this one does. The "find" method returns a reference to an array of PPI::Element objects that match the condition, false (but defined) if no Elements match the condition, or "undef" if you provide a bad condition, or an error occurs during the search process. In the case of a bad condition, a warning will be emitted as well. find_first $class | &wanted If the normal "find" method is like a grep, then "find_first" is equivalent to the Scalar::Util "first" function. Given an element class or a wanted function, it will search depth-first through a tree until it finds something that matches the condition, returning the first Element that it encounters. See the "find" method for details on the format of the search condition. Returns the first PPI::Element object that matches the condition, false if nothing matches the condition, or "undef" if given an invalid condition, or an error occurs. find_any $class | &wanted The "find_any" method is a short-circuiting true/false method that behaves like the normal "find" method, but returns true as soon as it finds any Elements that match the search condition. See the "find" method for details on the format of the search condition. Returns true if any Elements that match the condition can be found, false if not, or "undef" if given an invalid condition, or an error occurs. remove_child $Element If passed a PPI::Element object that is a direct child of the Node, the "remove_element" method will remove the "Element" intact, along with any of its children. As such, this method acts essentially as a 'cut' function. If successful, returns the removed element. Otherwise, returns "undef". prune $class | &wanted The "prune" method is used to strip PPI::Element objects out of a code tree. The argument is the same as for the "find" method, either a class name, or an anonymous subroutine which returns true/false. Any Element that matches the class|wanted will be deleted from the code tree, along with any of its children. The "prune" method returns the number of "Element" objects that matched and were removed, non-recursively. This might also be zero, so avoid a simple true/false test on the return false of the "prune" method. It returns "undef" on error, which you probably should test for. TO DO
- Move as much as possible to PPI::XS SUPPORT
See the support section in the main module. AUTHOR
Adam Kennedy <adamk@cpan.org> COPYRIGHT
Copyright 2001 - 2011 Adam Kennedy. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself. The full text of the license can be found in the LICENSE file included with this module. perl v5.18.2 2011-02-25 PPI::Node(3)
All times are GMT -4. The time now is 08:53 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy