Combine Columns
Post 302962214 by greycells on Wednesday 9th of December 2015 05:29:40 PM

Input
Code:
 
NJ090237_0263_GRP,NJ090237_0263_VIEW,NJ090237_0263_PSGRP,NJ090237_0263_GOLD_CSGRP,06E:0_08E:0_09E:0_11E:0,0CE5
NJ090237_0264_GRP,NJ090237_0263_VIEW,NJ090237_0264_PSGRP,NJ090237_0263_GOLD_CSGRP,06E:0_08E:0_09E:0_11E:0,0CE5
NJ090233_0263_GRP,NJ090233_0263_VIEW,NJ090233_0263_PSGRP,NJ090233_0263_GOLD_CSGRP,06E:0_08E:0_09E:0_11E:0,0CE6
NJ090233_0263_GRP,NJ090233_0263_VIEW,NJ090233_0264_PSGRP,NJ090233_0263_GOLD_CSGRP,06E:0_08E:0_09E:0_11E:0,0CE6

Basically, when column 6 is the same on several lines of the input file, combine $1, $2, $3, $4 and $5 from those lines using ";" as the separator.
But if a value in any of columns 1-5 repeats within the group, use it only once.

Code:
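gawk '
  # Runs on both passes: i is the group key, p tells whether the key is already stored.
  {
    i=$6
    p=(i in A)
  }
  # Pass 1 (first read of input1): append every value, with ";" after the first one.
  # Nothing here checks for repeated values, so they are joined as well.
  NR==FNR {
    A[i]=A[i] (p?";":x) $1
    B[i]=B[i] (p?";":x) $2
    C[i]=C[i] (p?";":x) $3
    D[i]=D[i] (p?";":x) $4
    E[i]=E[i] (p?";":x) $5
    next
  }
  # Pass 2: on the first line of each group, print the joined columns,
  # then delete the key so the remaining lines of the group are skipped.
  p {
    $1=A[i]
    $2=B[i]
    $3=C[i]
    $4=D[i]
    $5=E[i]
    delete A[i]
    delete B[i]
    delete C[i]
    delete D[i]
    delete E[i]
    print
  }
' FS=, OFS=, input1 input1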

I am getting the output below. It also joins values that are identical, instead of keeping just one copy:

Code:
NJ090237_0263_GRP;NJ090237_0264_GRP,NJ090237_0263_VIEW;NJ090237_0263_VIEW,NJ090237_0263_PSGRP;NJ090237_0264_PSGRP,NJ090237_0263_GOLD_CSGRP;NJ090237_0263_GOLD_CSGRP,06E:0_08E:0_09E:0_11E:0;06E:0_08E:0_09E:0_11E:0,0CE5
NJ090233_0263_GRP;NJ090233_0263_GRP,NJ090233_0263_VIEW;NJ090233_0263_VIEW,NJ090233_0263_PSGRP;NJ090233_0264_PSGRP,NJ090233_0263_GOLD_CSGRP;NJ090233_0263_GOLD_CSGRP,06E:0_08E:0_09E:0_11E:0;06E:0_08E:0_09E:0_11E:0,0CE6

But the output I need is:
Code:
NJ090237_0263_GRP;NJ090237_0264_GRP,NJ090237_0263_VIEW,NJ090237_0263_PSGRP;NJ090237_0264_PSGRP,NJ090237_0263_GOLD_CSGRP,06E:0_08E:0_09E:0_11E:0,0CE5
NJ090233_0263_GRP,NJ090233_0263_VIEW,NJ090233_0263_PSGRP;NJ090233_0264_PSGRP,NJ090233_0263_GOLD_CSGRP,06E:0_08E:0_09E:0_11E:0,0CE6

 
