Sponsored Content
Top Forums Shell Programming and Scripting Print only lines where fields concatenated match strings Post 302758255 by Ophiuchus on Friday 18th of January 2013 04:11:15 PM
Old 01-18-2013
Hello Don,

Many thanks for your answer, I've tested with 53 lines and worked fine.

My real file has more than a million lines, I hope work relative fast and with correct output.

Many thanks again!
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Strings from one file which exactly match to the 1st column of other file and then print lines.

Hi, I have two files. 1st file has 1 column (huge file containing ~19200000 lines) and 2nd file has 2 columns (small file containing ~6000 lines). ################################# huge_file.txt a a ab b ################################## small_file.txt a 1.5 b 2.5 ab ... (4 Replies)
Discussion started by: AshwaniSharma09
4 Replies

2. Shell Programming and Scripting

Splitting Concatenated Words With Largest Strings First

hello, I had posted earlier help for a script for splitting concatenated words . The script was supposed to read words from a master file and split concatenated words in the slave/input file. Thanks to the help I got, the following script which works very well was posted. It detects residues by... (14 Replies)
Discussion started by: gimley
14 Replies

3. Shell Programming and Scripting

awk strings search + print next column after match

Hi, I have a file filled with search strings which have a blank in between and look like this: S. g. Ehr. o. Jg. v. d. Chijs g. Ehr. Now i would like to search for the strings and it also shall return the next column after the match. awk -v FILE="search_strings.txt" 'BEGIN {... (10 Replies)
Discussion started by: sdf
10 Replies

4. Shell Programming and Scripting

Print strings that match pattern with awk

I have a file with many lines which contain strings like .. etc. But with no rule regarding field separators or anything else. I want to print ONLY THE STRING from each line , not the entire line !!! For example from the lines : Flow on service executed with success in . Performances... (5 Replies)
Discussion started by: black_fender
5 Replies

5. Shell Programming and Scripting

Returning two lines if they both match strings

Hi I have a problem where I have a large amount of files that I need to scan and return a line and its following line, but only when the following line begins with a string. String one - line one must begin with 'Bill' String two - line two must begin with 'Jones'. If these two... (7 Replies)
Discussion started by: majormajormajor
7 Replies

6. UNIX for Dummies Questions & Answers

awk - (URGENT!) Print lines sort and move lines if match found

URGENT HELP IS NEEDED!! I am looking to move matching lines (01 - 07) from File1 and 77 tab the matching string from File2, to File3.txt. I am almost done but - Currently, script is not printing lines to File3.txt in order. - Also the matching lines are not moving out of File1.txt ... (1 Reply)
Discussion started by: High-T
1 Replies

7. Shell Programming and Scripting

Match the value & print lines from the match

Hello, I have a file contains two columns. I need to print the lines after “xxx” so i'm trying to match "xxx" & cut the lines after that. I'm trying with the grep & cut command, if there any simple way to extract this please help me. Sample file : name id AAA 123 AAB 124 AAC 125... (4 Replies)
Discussion started by: Shenbaga.d
4 Replies

8. Shell Programming and Scripting

awk to print fields that match using conditions and a default value for non-matching in two files

Trying to use awk to match the contents of each line in file1 with $5 in file2. Both files are tab-delimited and there may be a space or special character in the name being matched in file2, for example in file1 the name is BRCA1 but in file2 the name is BRCA 1 or in file1 name is BCR but in file2... (6 Replies)
Discussion started by: cmccabe
6 Replies

9. Shell Programming and Scripting

awk to combine lines if fields match in lines

In the awk below, what I am attempting to do is check each line in the tab-delimeted input, which has ~20 lines in it, for a keyword SVTYPE=Fusion. If the keyword is found I am splitting $3 using the . (dot) and reading the portion before and after the dot in an array a. If it does have that... (12 Replies)
Discussion started by: cmccabe
12 Replies

10. Shell Programming and Scripting

awk to print match or non-match and select fields/patterns for non-matches

In the awk below I am trying to output those lines that Match between file1 and file2, those Missing in file1, and those missing in file2. Using each $1,$2,$4,$5 value as a key to match on, that is if those 4 fields are found in both files the match, but if those 4 fields are not found then missing... (0 Replies)
Discussion started by: cmccabe
0 Replies
bup-margin(1)						      General Commands Manual						     bup-margin(1)

NAME
bup-margin - figure out your deduplication safety margin SYNOPSIS
bup margin [options...] DESCRIPTION
bup margin iterates through all objects in your bup repository, calculating the largest number of prefix bits shared between any two entries. This number, n, identifies the longest subset of SHA-1 you could use and still encounter a collision between your object ids. For example, one system that was tested had a collection of 11 million objects (70 GB), and bup margin returned 45. That means a 46-bit hash would be sufficient to avoid all collisions among that set of objects; each object in that repository could be uniquely identified by its first 46 bits. The number of bits needed seems to increase by about 1 or 2 for every doubling of the number of objects. Since SHA-1 hashes have 160 bits, that leaves 115 bits of margin. Of course, because SHA-1 hashes are essentially random, it's theoretically possible to use many more bits with far fewer objects. If you're paranoid about the possibility of SHA-1 collisions, you can monitor your repository by running bup margin occasionally to see if you're getting dangerously close to 160 bits. OPTIONS
--predict Guess the offset into each index file where a particular object will appear, and report the maximum deviation of the correct answer from the guess. This is potentially useful for tuning an interpolation search algorithm. --ignore-midx don't use .midx files, use only .idx files. This is only really useful when used with --predict. EXAMPLE
$ bup margin Reading indexes: 100.00% (1612581/1612581), done. 40 40 matching prefix bits 1.94 bits per doubling 120 bits (61.86 doublings) remaining 4.19338e+18 times larger is possible Everyone on earth could have 625878182 data sets like yours, all in one repository, and we would expect 1 object collision. $ bup margin --predict PackIdxList: using 1 index. Reading indexes: 100.00% (1612581/1612581), done. 915 of 1612581 (0.057%) SEE ALSO
bup-midx(1), bup-save(1) BUP
Part of the bup(1) suite. AUTHORS
Avery Pennarun <apenwarr@gmail.com>. Bup unknown- bup-margin(1)
All times are GMT -4. The time now is 08:28 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy