UNIX - 2 tab delimited files, conditional column extraction
Post 303014981 by RudiC on Sunday 25th of March 2018 04:17:05 AM
May I ask why file1 defines multiple contiguous intervals when a single line giving min and max would do? E.g. 10 - 20, 20 - 30, and 30 - 40 could be condensed to 10 - 40 for Sc1 (which is what Scrutinizer's code does internally; it also assumes that the intervals come in ascending order).
Would it be correct to assume that the multiple interval definitions are there to allow gaps between intervals? If yes, try this (shown with e.g. the Sc1 20 30 line removed from the sample file1):
Code:
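awk -F"\t" '
# file1: collect all intervals of a key as tab-separated "min-max" strings
NR == FNR       {INT[$1] = INT[$1] $2 "-" $3 FS
                 next
                }
# file2: print $4 if $2 lies in one of the intervals of its key (min inclusive, max exclusive), else NA
                {split (INT[$1], T)
                 OUT = "NA"
                 for (t in T)   {split (T[t], LM, "-")
                                 if ($2 >= LM[1] && $2 < LM[2]) OUT = $4
                                }
                 print OUT
                }
' file1 file2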
NA
NA
25
40
NA
NA
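
For anyone who wants to experiment with the gap behaviour without the files from post #1, here is a small self-contained variation. It is only a sketch: the sample data, the temporary file names (intervals.tsv, records.tsv) and the array names (CNT, MIN, MAX) are made up for illustration, and it keeps the bounds in arrays indexed by key and counter instead of packing them into a "min-max" string. The test is the same half-open one as above (min inclusive, max exclusive).

```shell
# Hypothetical sample data, written to a temporary directory
dir=$(mktemp -d)
# Intervals (key, min, max); Sc1 has a gap between 20 and 30
printf 'Sc1\t10\t20\nSc1\t30\t40\nSc2\t0\t50\n' > "$dir/intervals.tsv"
# Records (key, position, unused column, value)
printf 'Sc1\t25\tx\t100\nSc1\t35\tx\t25\nSc2\t10\tx\t40\nSc3\t10\tx\t7\n' > "$dir/records.tsv"

result=$(awk -F'\t' '
# First file: store each bound under (key, running count for that key)
NR == FNR { n = ++CNT[$1]; MIN[$1, n] = $2; MAX[$1, n] = $3; next }
# Second file: report column 4 if column 2 falls into [min, max) of any interval
{
    out = "NA"
    for (i = 1; i <= CNT[$1]; i++)
        if ($2 + 0 >= MIN[$1, i] + 0 && $2 + 0 < MAX[$1, i] + 0) { out = $4; break }
    print out
}
' "$dir/intervals.tsv" "$dir/records.tsv")

printf '%s\n' "$result"
rm -rf "$dir"
```

The first record (position 25) falls into the gap and yields NA, and the key Sc3 has no intervals at all, so it yields NA as well.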

With the unmodified file1 from post #1 (no gap for Sc1), the result is
Code:
100
NA
25
40
NA
NA
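
The condensing of contiguous intervals mentioned at the start of this post (10 - 20, 20 - 30, 30 - 40 becoming 10 - 40) could be sketched roughly as below. This is not Scrutinizer's actual code, just a minimal illustration; it assumes three tab-separated columns (key, min, max), intervals grouped by key and sorted ascending, and made-up sample data. The names emit, key, lo and hi are my own. A new output line is started only where a gap appears.

```shell
# Hypothetical intervals (key, min, max): Sc1 is contiguous, Sc2 has a gap
merged=$(printf 'Sc1\t10\t20\nSc1\t20\t30\nSc1\t30\t40\nSc2\t5\t15\nSc2\t25\t35\n' |
awk -F'\t' -v OFS='\t' '
# Print the interval collected so far, if there is one
function emit() { if (key != "") print key, lo, hi }
# Same key and the new min touches or overlaps the current max: extend it
$1 == key && $2 + 0 <= hi + 0 { if ($3 + 0 > hi + 0) hi = $3; next }
# Otherwise (new key or a gap): print the old interval and start a new one
{ emit(); key = $1; lo = $2; hi = $3 }
END { emit() }
')

printf '%s\n' "$merged"
```

With this sample the three Sc1 lines collapse into a single Sc1 10 40 line, while the two Sc2 intervals stay separate because of the gap between 15 and 25.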
