I would be grateful for your help with the following.
I have the following file (file.txt), which is about 10,000 lines long:
The IDs in the first two columns can each occur between 1 and 10 times in the file (in either column 1 or column 2).
What I want to achieve:
I want to scan this file line by line, and print IDs to an ever-growing exclusion list if they meet the following criteria:
So applying this to row 1, either ID1 or ID2 should have been added to my exclusion list.
I then want to delete all lines in the file where that ID from the exclusion list appears. This can be up to 10 rows.
Output for file.txt once row 1 has been scanned:
ID3 ID4 0 0 0.4 0.8
ID6 ID2 1 0 0.4 0.8
And exclusionlist.txt:
ID1
I then want to start again at the new row 1, and execute the same process, but keep adding my exclusion from the new row 1 to the same exclusion list.
The commands that I have at my disposal are:
But there are problems inherent in this:
The exclusionlist.txt does not 'keep growing'.
Also, how do I loop it back so that it starts again at line 1?
I would be grateful for any solutions.
Thank you,
A.B.
Last edited by aberg; 08-11-2017 at 01:15 PM..
Reason: extra code tags
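The loop described above (take row 1, record one ID, delete every row containing that ID, start again at the new row 1) could be sketched roughly as follows. The selection criteria are not reproduced in this thread, so the rule inside `pick_id` is purely a placeholder assumption; the function names `pick_id` and `build_exclusion_list` are also hypothetical. Treat this as a sketch, not a finished solution.

```shell
#!/bin/sh
# ASSUMPTION: the real selection criteria are not shown in the thread.
# Placeholder rule: exclude column 1 unless column 5 is larger than column 6.
pick_id() {
    printf '%s\n' "$1" | awk '{ print (($5 > $6) ? $2 : $1) }'
}

# Repeatedly take row 1, record one ID, then drop every row containing it.
# $1 = working copy of the data file (it is consumed down to 0 lines)
# $2 = exclusion list (IDs are appended, never overwritten)
build_exclusion_list() {
    work=$1
    excl=$2
    while [ -s "$work" ]; do                 # loop until the file is empty
        first=$(head -n 1 "$work")           # current row 1
        id=$(pick_id "$first")
        printf '%s\n' "$id" >> "$excl"       # '>>' keeps the list growing
        # Keep only rows where neither ID column equals the excluded ID.
        awk -v id="$id" '$1 != id && $2 != id' "$work" > "$work.tmp"
        mv "$work.tmp" "$work"
    done
}
```

Because the loop empties its input, it is safer to run it on a copy, for example `cp file.txt work.txt` followed by `build_exclusion_list work.txt exclusionlist.txt`. No renaming between passes is needed, since each pass writes to a temporary file and moves it back over the working copy.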
Yes, I want to append to the same list (rather than over-writing), and I'm not sure how to do that.
Also, I want this to loop so that it starts again at the new line 1 once the original line 1 (plus any other lines containing the exclusion) has been removed.
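On the appending question, a minimal sketch: in the shell, `>` truncates the target file before writing, while `>>` adds to the end of it. The helper name `add_exclusion` below is hypothetical and only there for illustration.

```shell
#!/bin/sh
# Append one ID to the exclusion list.
# $1 = ID to record, $2 = path of the exclusion list
add_exclusion() {
    printf '%s\n' "$1" >> "$2"    # '>>' appends; '>' would overwrite the list
}
```

The same distinction applies inside awk: `print id >> "exclusionlist.txt"` appends across separate awk runs, whereas `print id > "exclusionlist.txt"` starts the file afresh each time awk is invoked.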
Let me paraphrase your request: you select either of the IDs in field 1 or 2 depending on conditions in the rest of the line, and then remove all occurrences of the selected ID in the rest of the file. Do you HAVE to populate the exclusion file, i.e. do you need it afterwards? Or would a single-pass operation be sufficient, removing ALL the applicable IDs?
Yes, your interpretation is correct, and it is vitally important that I populate an exclusion file. The original file itself should eventually grind itself down to 0 lines, and it is the exclusion file that I am interested in.
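Since only the exclusion list matters in the end, the repeated rewriting of the file can probably be avoided. A row only ever becomes "row 1" if none of the IDs excluded so far appears in it, so a single pass that remembers the excluded IDs in an awk array should produce the same list. This is a sketch under assumptions: the selection rule is again a placeholder (the real criteria are not shown in this thread), and the function name `single_pass_exclusions` is hypothetical.

```shell
#!/bin/sh
# Print the exclusion list for the file given as $1 in one pass.
single_pass_exclusions() {
    awk '
        NF < 2 { next }                               # ignore blank/short lines
        ($1 in excluded) || ($2 in excluded) { next } # row already "deleted"
        {
            # ASSUMPTION: placeholder rule, not the criteria from the thread.
            id = ($5 > $6) ? $2 : $1
            excluded[id] = 1                          # remember the excluded ID
            print id
        }
    ' "$1"
}
```

Called as `single_pass_exclusions file.txt > exclusionlist.txt`, this leaves file.txt untouched and writes the whole list in one go.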
My latest attempt is as follows (it involves having to rename file.txt to 1.txt):
Because of my poor scripting skills, I (1) have to rename my file after each loop in order for it to be processed again, and (2) end up with a new exclusion list per loop, rather than a single 'master' exclusion list. I can easily concatenate them all at the end, so this is not a major problem, but it's messy.
The problem I have when I execute this script is that it seems to scan through the whole file on the first pass (rather than just line 1), creating a long exclusion list just from the first run.
Any help/suggestions would be greatly appreciated.
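Without seeing the script it is hard to say for certain, but one likely cause is that awk applies each rule to every input line unless the rule is restricted. A minimal sketch of limiting the selection to line 1 is shown below; the rule itself is a placeholder assumption (the real criteria are not shown in this thread) and the function name `first_row_id` is hypothetical.

```shell
#!/bin/sh
# Print the ID selected from row 1 only of the file given as $1.
first_row_id() {
    # NR == 1 restricts the action to the first line; exit stops awk
    # before it reads the remaining lines.
    # ASSUMPTION: placeholder rule, not the criteria from the thread.
    awk 'NR == 1 { print (($5 > $6) ? $2 : $1); exit }' "$1"
}
```

The result can be captured with `id=$(first_row_id file.txt)` and then used to filter the remaining rows.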