Merge 4 bim files by keeping only the overlapping variants (unique rs values)
Post 303042724 by fondan on Saturday 4th of January 2020 11:17:26 AM

Dear community, I am facing a problem and would kindly ask for your help.

I have 4 different data sets that come from 3 different types of array.

In each file, column 1 is the chromosome, column 2 is the SNP id, and so on. Let's say I have the following (bim) datasets:


x2014:
Code:
1       rs3094315       0       752566  G       A
1       rs3131972       0       752721  G       A

... 550,000 more rows


x2016:
Code:
0       200610-10       0       0       G       A
0       200610-108      0       0       G       A

...


x2017:
Code:
0       200610-10       0       0       G       A
0       200610-108      0       0       G       A

...



x2018:
Code:
0       200610-10       0       0       G       A
0       200610-108      0       0       G       A

... 550K more rows




How can I merge all files together, without having any duplicate values based on the 2nd column (rs_id)?

Last edited by vbe; 01-04-2020 at 12:40 PM. Reason: code tags please
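
One possible way to approach the question above is sketched here in Python. It is only a sketch under stated assumptions, not a definitive answer: it assumes the files are whitespace-separated with the variant id in column 2, that each file fits in memory, and that the ids are written the same way in every file (the sample rows from x2016 to x2018 show ids such as 200610-10 rather than rs ids, so the overlap with x2014 may be small unless the ids are harmonised first). The helper names `read_bim` and `merge_bim` are made up for this illustration.

```python
import io


def read_bim(handle):
    """Map variant id (column 2) to its split row, keeping the first row per id."""
    rows = {}
    for line in handle:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        rows.setdefault(fields[1], fields)
    return rows


def merge_bim(handles, only_shared=True):
    """Merge .bim tables on column 2.

    only_shared=True  -> keep ids present in every file (rows taken from the first file).
    only_shared=False -> keep every id once (first occurrence across files wins).
    """
    tables = [read_bim(h) for h in handles]
    if not tables:
        return []
    if only_shared:
        first, rest = tables[0], tables[1:]
        return [row for vid, row in first.items() if all(vid in t for t in rest)]
    merged = {}
    for table in tables:
        for vid, row in table.items():
            merged.setdefault(vid, row)
    return list(merged.values())


if __name__ == "__main__":
    # Tiny in-memory stand-ins for real files; the second table is made up for illustration.
    a = io.StringIO("1\trs3094315\t0\t752566\tG\tA\n1\trs3131972\t0\t752721\tG\tA\n")
    b = io.StringIO("1\trs3131972\t0\t752721\tG\tA\n0\t200610-10\t0\t0\tG\tA\n")
    for row in merge_bim([a, b]):
        print("\t".join(row))
```

With real data, the in-memory tables would be replaced by opened files, for example `open("x2014.bim")` (the file names are assumed here). Passing `only_shared=False` gives the union without duplicate ids instead of the overlap. Note that this only handles the variant table; the genotype data belonging to each bim file would still need to be subset to the same variants.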
 
