Sponsored Content
Top Forums Shell Programming and Scripting Join common patterns in multiple lines into one line Post 302886684 by MadeInGermany on Monday 3rd of February 2014 07:42:01 AM
Old 02-03-2014
I see this is the same problem as in your previous post.
It is best solved with linked lists, where you can easily join two lists.
Old C can easily deal with pointers, but the rest (string handling etc.) is not as good as with a script interpreter.
On a script interpreter one can simulate such a list by a helper array that holds an integer that points at the following array element. The helper array goes along with the actual data array.
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Find multiple patterns on multiple lines and concatenate output

I'm trying to parse COBOL code to combine variables into one string. I have two variable names that get literals moved into them and I'd like to use sed, awk, or similar to find these lines and combine the variables into the final component. These variable names are always VAR1 and VAR2. For... (8 Replies)
Discussion started by: wilg0005
8 Replies

2. Shell Programming and Scripting

How to use SED to join multiple lines?

Hi guys, anyone know how can i join multiples lines using sed till the end of a file and output to another file in a single line? The end of each line will be replaced with a special char "#". I am using the below SED command, however it seems to remove the last 2 lines. Also not all lines... (12 Replies)
Discussion started by: DrivesMeCrazy
12 Replies

3. Shell Programming and Scripting

Get common lines from multiple files

FileA chr1 31237964 NP_001018494.1 PUM1 M340L chr1 31237964 NP_055491.1 PUM1 M340L chr1 33251518 NP_037543.1 AK2 H191D chr1 33251518 NP_001616.1 AK2 H191D chr1 57027345 NP_001004303.2 C1orf168 P270S FileB chr1 ... (9 Replies)
Discussion started by: genehunter
9 Replies

4. Shell Programming and Scripting

Join multiple files based on 1 common column

I have n files (for ex:64 files) with one similar column. Is it possible to combine them all based on that column ? file1 ax100 20 30 40 ax200 22 33 44 file2 ax100 10 20 40 ax200 12 13 44 file2 ax100 0 0 4 ax200 2 3 4 (9 Replies)
Discussion started by: quincyjones
9 Replies

5. UNIX for Dummies Questions & Answers

How to use the the join command to join multiple files by a common column

Hi, I have 20 tab delimited text files that have a common column (column 1). The files are named GSM1.txt through GSM20.txt. Each file has 3 columns (2 other columns in addition to the first common column). I want to write a script to join the files by the first common column so that in the... (5 Replies)
Discussion started by: evelibertine
5 Replies

6. Shell Programming and Scripting

Find common lines between multiple files

Hello everyone A few years Ago the user radoulov posted a fancy solution for a problem, which was about finding common lines (gene variation names) between multiple samples (files). The code was: awk 'END { for (R in rec) { n = split(rec, t, "/") if (n > 1) dup = dup ?... (5 Replies)
Discussion started by: bibb
5 Replies

7. Shell Programming and Scripting

Join multiple lines

Hi I have a source file ( written i C ) where a funtion call is spread over multiple lines, for example : func( a, b, c ); I want this to be joined into one single line : func(a,b,c); How can this be done with awk and sed ? Regards. Hench (2 Replies)
Discussion started by: hench
2 Replies

8. Shell Programming and Scripting

Find common patterns in multiple file

Hi, I need help to find patterns that are common or matched in a specified column in multiple files. File1.txt ID1 555 ID23 8857 ID4 4454 ID05 555 File2.txt ID74 4454 ID96 555 ID322 4454 (4 Replies)
Discussion started by: redse171
4 Replies

9. Shell Programming and Scripting

Join columns across multiple lines in a Text based on common column using BASH

Hello, I have a file with 2 columns ( tableName , ColumnName) delimited by a Pipe like below . File is sorted by ColumnName. Table1|Column1 Table2|Column1 Table5|Column1 Table3|Column2 Table2|Column2 Table4|Column3 Table2|Column3 Table2|Column4 Table5|Column4 Table2|Column5 From... (6 Replies)
Discussion started by: nv186000
6 Replies

10. UNIX for Beginners Questions & Answers

Delete multiple lines between blank lines containing two patterns

Hi all, I'm looking for a way (sed or awk) to delete multiple lines between blank lines containing two patterns ex: user: alpha parameter_1 = 15 parameter_2 = 1 parameter_3 = 0 user: alpha parameter_1 = 15 parameter_2 = 1 parameter_3 = 0 user: alpha parameter_1 = 16... (3 Replies)
Discussion started by: ce9888
3 Replies
bup-margin(1)						      General Commands Manual						     bup-margin(1)

NAME
bup-margin - figure out your deduplication safety margin SYNOPSIS
bup margin [options...] DESCRIPTION
bup margin iterates through all objects in your bup repository, calculating the largest number of prefix bits shared between any two entries. This number, n, identifies the longest subset of SHA-1 you could use and still encounter a collision between your object ids. For example, one system that was tested had a collection of 11 million objects (70 GB), and bup margin returned 45. That means a 46-bit hash would be sufficient to avoid all collisions among that set of objects; each object in that repository could be uniquely identified by its first 46 bits. The number of bits needed seems to increase by about 1 or 2 for every doubling of the number of objects. Since SHA-1 hashes have 160 bits, that leaves 115 bits of margin. Of course, because SHA-1 hashes are essentially random, it's theoretically possible to use many more bits with far fewer objects. If you're paranoid about the possibility of SHA-1 collisions, you can monitor your repository by running bup margin occasionally to see if you're getting dangerously close to 160 bits. OPTIONS
--predict Guess the offset into each index file where a particular object will appear, and report the maximum deviation of the correct answer from the guess. This is potentially useful for tuning an interpolation search algorithm. --ignore-midx don't use .midx files, use only .idx files. This is only really useful when used with --predict. EXAMPLE
$ bup margin Reading indexes: 100.00% (1612581/1612581), done. 40 40 matching prefix bits 1.94 bits per doubling 120 bits (61.86 doublings) remaining 4.19338e+18 times larger is possible Everyone on earth could have 625878182 data sets like yours, all in one repository, and we would expect 1 object collision. $ bup margin --predict PackIdxList: using 1 index. Reading indexes: 100.00% (1612581/1612581), done. 915 of 1612581 (0.057%) SEE ALSO
bup-midx(1), bup-save(1) BUP
Part of the bup(1) suite. AUTHORS
Avery Pennarun <apenwarr@gmail.com>. Bup unknown- bup-margin(1)
All times are GMT -4. The time now is 12:54 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy