UNIX for Beginners Questions & Answers: Post 303042724 by fondan on Saturday 4th of January 2020 11:17:26 AM
Merge 4 bim files by keeping only the overlapping variants (unique rs values )

Dear community, I am facing a problem and would appreciate your help.


I have 4 different data sets that come from 3 different types of array.



In each file, column 1 is the chromosome, column 2 is the SNP ID, etc. Let's say I have the following (bim) datasets:


x2014:
Code:
1       rs3094315       0       752566  G       A
1       rs3131972       0       752721  G       A

... (550,000 more rows)


x2016:
Code:
0       200610-10       0       0       G       A
0       200610-108      0       0       G       A

...


x2017:
Code:
0       200610-10       0       0       G       A
0       200610-108      0       0       G       A

...



x2018:
Code:
0       200610-10       0       0       G       A
0       200610-108      0       0       G       A

... (550K more rows)




How can I merge all files together, without having any duplicate values based on the 2nd column (rs_id)?
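
One possible reading of the question is: keep only the variants whose ID in column 2 appears in all four files, and write each of those IDs once. Below is a minimal sketch of that reading in Python, not a definitive solution. The function names `read_bim` and `merge_overlapping` are made up for illustration, and the sketch assumes whitespace-separated files that fit in memory. When an ID occurs more than once in a file, the first row is kept; the output rows are taken from the first file, in its original order.

```python
def read_bim(path):
    """Return {variant_id: fields} for one bim file (hypothetical helper).

    Only the first row seen for each ID in column 2 is kept.
    """
    rows = {}
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            if len(fields) < 2:
                continue  # skip blank or malformed lines
            rows.setdefault(fields[1], fields)
    return rows


def merge_overlapping(paths):
    """Keep variants whose ID occurs in every file, one row per ID."""
    tables = [read_bim(p) for p in paths]
    if not tables:
        return []
    first = tables[0]
    # IDs present in the first file and in all of the other files
    shared = set(first).intersection(*(set(t) for t in tables[1:]))
    # dicts preserve insertion order, so rows follow the first file's order
    return [fields for vid, fields in first.items() if vid in shared]
```

Assuming the four files are saved as x2014.bim, x2016.bim, x2017.bim and x2018.bim (file names guessed from the labels above), `merge_overlapping(["x2014.bim", "x2016.bim", "x2017.bim", "x2018.bim"])` returns the shared rows as lists of fields, which can be joined with tabs and written out. IDs are compared as exact strings, so a row such as 200610-10 only overlaps with a file that uses the same ID. The sketch also does not check whether the position or alleles for the same ID agree between arrays.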

Last edited by vbe; 01-04-2020 at 12:40 PM. Reason: code tags please
 
