Two files, remove lines from second based on lines in first
Post 302886410 by Yoda on Friday 31st of January 2014 09:10:32 AM
Use hexdump or od to check if you have any such characters in your file.
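For example, od -c makes carriage returns and other non-printing characters visible, and hexdump -C shows the raw bytes; stray \r characters from DOS line endings are a common reason a grep -f style comparison between two files silently fails. A minimal check (the file name here is a placeholder):

  # Make non-printing characters visible; a DOS-formatted file
  # shows \r before each \n.
  od -c file.txt | head

  # Same check as raw bytes: 0d 0a pairs mean CRLF line endings.
  hexdump -C file.txt | head

  # If carriage returns show up, strip them before comparing:
  tr -d '\r' < file.txt > file.clean.txt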
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Remove lines sorted with time-based columns using awk & sort

Hi, I have a file (MediaErr.log) as follows:
84 Server1 Policy1 Schedule1 master1 05/08/2008 02:12:16
84 Server1 Policy1 Schedule1 master1 05/08/2008 02:22:47
84 Server1 Policy1 Schedule1 master1 05/08/2008 03:41:26
84 Server1 Policy1 ... (1 Reply)
Discussion started by: karthikn7974
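The request is truncated, but with data like this the usual goal is one line per key, keeping the earliest (or latest) timestamp. A sketch, assuming the key is fields 1-5 and the file is already in time order so the wanted line comes first per key:

  # Keep the first line seen for each combination of fields 1-5
  # (the earliest timestamp, since the sample is in time order).
  awk '!seen[$1,$2,$3,$4,$5]++' MediaErr.log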

2. Shell Programming and Scripting

Remove lines based on score criteria

Hi guys, please guide me to a solution.
PART-I INPUT FILE (has 2 columns, ID and score):
TC5584_1 93.9
DV161411_2 79.5
BP132435_5 46.8
EB682112_1 34.7
BP132435_4 29.5
TC13860_2 10.1
OUTPUT FILE (it shouldn't contain the line 'BP132435_4 29.5', as BP132435 is repeated... (2 Replies)
Discussion started by: smriti_shridhar
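The sample input is already sorted by score in descending order, so keeping the first occurrence of each base ID (the part of column 1 before the underscore) keeps its highest-scoring line. A sketch along those lines:

  # Split the ID on '_' and print a line only the first time
  # its base ID appears.
  awk '{ split($1, a, "_"); if (!seen[a[1]]++) print }' input.txt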

3. Shell Programming and Scripting

Remove lines based on contents of another file

So, this issue is driving me nuts! I was hoping to get a helping hand here... I have 2 files:
file1.txt contains:
this is example1
this is example2
this is example3
this is example4
this is example5
file2.txt contains:
example3
example5
Basically, I need a script or command to... (4 Replies)
Discussion started by: bashshadow1979
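The question is cut off, but with this setup the usual goal is to remove from file1.txt every line containing a string listed in file2.txt, which grep does directly:

  # -F: patterns are fixed strings, -f: read them from file2.txt,
  # -v: print only the lines of file1.txt that do NOT match.
  grep -vFf file2.txt file1.txt > filtered.txt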

4. Shell Programming and Scripting

Remove lines from XML based on condition

Hi, I need to remove some lines from an XML file if the value within a tag is empty. Imagine this scenario:
<acd><acdID>2</acdID><logon></logon></acd>
<acd><acdID></acdID><logon></logon></acd>
<acd><acdID></acdID><logon></logon></acd>
<acd><acdID></acdID><logon></logon></acd>
I... (3 Replies)
Discussion started by: giles.cardew
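For flat, one-element-per-line XML like the sample, deleting lines with an empty tag works with sed (a real XML parser is the safer tool for anything less regular). A sketch that removes lines whose acdID element is empty:

  # Delete every line that contains an empty <acdID> element.
  sed '/<acdID><\/acdID>/d' input.xml > output.xml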

5. UNIX for Dummies Questions & Answers

Remove duplicate lines based on two columns, judging from a third one

Hello all, I have an input file with four columns and a lot of lines. For example, line 1 and line 5 match because the first 4 characters match and the fourth column matches too. I want to keep the line that has the lowest number in the third column, so I discard line 5.... (5 Replies)
Discussion started by: TheTransporter
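The sample rows didn't survive the page, but the stated rule fits the usual sort-then-first-wins pattern. A sketch, assuming the key is the first 4 characters of column 1 plus column 4, and the lowest value in column 3 wins:

  # Sort by the key (chars 1-4 of field 1, then field 4), then
  # numerically by field 3, and keep the first line per key.
  sort -k1.1,1.4 -k4,4 -k3,3n input.txt |
  awk '!seen[substr($1,1,4), $4]++'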

6. Shell Programming and Scripting

Remove lines based on column value

Hi all, I just need a quick fix here. I need to delete all lines containing "." in the 6th column. Input:
1 1055498 . G T 5.46 .
1 1902377 . C T 7.80 .
1 1031540 . A G 34.01 PASS
1 ... (2 Replies)
Discussion started by: Hkins552
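In the sample rows the "." flag is the 7th whitespace-separated field (the poster counts it as the 6th data column), which looks like VCF-style output. Assuming that is the field to test:

  # Keep only the lines whose filter field is not "."
  awk '$7 != "."' input.txt > output.txt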

7. Shell Programming and Scripting

Remove duplicate lines based on field and sort

I have a csv file from which I would like to remove duplicate lines based on field 1, and then sort. I don't care about any of the other fields, but I still want to keep their data intact. I was thinking I could do something like this, but I have no idea how to print the full line with it. Please show any method... (8 Replies)
Discussion started by: cokedude
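awk can key on one field while printing whole lines, which is the part the poster was missing. A sketch for a comma-separated file keyed on field 1:

  # Print the full line the first time its first field is seen,
  # then sort the surviving lines.
  awk -F',' '!seen[$1]++' file.csv | sort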

8. Shell Programming and Scripting

Remove certain lines from a file based on start of line, except the first and last occurrences

Hi, I have multiple large files in the format below. I am trying to write an awk or sed script to remove all occurrences of the 00 record except the first, and all of the 80 records except the last one. Any help would be greatly appreciated. (10 Replies)
Discussion started by: nwalsh88
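Keeping only the last 80 record requires knowing where the last one is, so a two-pass awk (reading the file twice) is the straightforward route. A sketch, assuming the record type is the first two characters of each line:

  # Pass 1 (NR==FNR): remember the line number of the last 80 record.
  # Pass 2: print the first 00 record, the last 80 record, and
  # every other line unchanged.
  awk 'NR == FNR { if (substr($0,1,2) == "80") last = FNR; next }
       substr($0,1,2) == "00" { if (!seen00++) print; next }
       substr($0,1,2) == "80" { if (FNR == last) print; next }
       { print }' bigfile bigfile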

9. UNIX for Dummies Questions & Answers

Remove lines in a positional file based on string value

Gurus, I am relatively new to Unix scripting and am stuck with a problem in my script. I have a positional input file which has a FLAG indicator at position 11 in every record. If the flag has the value Y, then the record from the input needs to be written to a new file. However, if... (3 Replies)
Discussion started by: gsam
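Fixed positions are a natural fit for awk's substr. A sketch that writes flag-Y records to a new file; the truncated "However, if..." suggests the remaining records go elsewhere, so this splits both ways (output file names are illustrative):

  # Route each record on the single character at position 11.
  awk '{ if (substr($0, 11, 1) == "Y") print > "flag_y.txt";
         else                          print > "flag_other.txt" }' input.txt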

10. Shell Programming and Scripting

Remove duplicate lines from file based on fields

Dear community, I have to remove duplicate lines from a file that contains a very big amount of rows (millions?), based on the 1st and 3rd columns. The data are like this:
Region 23/11/2014 09:11:36 41752
Medio 23/11/2014 03:11:38 4132
Info 23/11/2014 05:11:09 4323... (2 Replies)
Discussion started by: Lord Spectre
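One awk pass with a composite key handles this, and memory stays proportional to the number of unique keys rather than the total row count. A sketch keyed on columns 1 and 3:

  # Print a line only the first time its column 1 + column 3
  # combination is seen.
  awk '!seen[$1, $3]++' bigfile.txt > deduped.txt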
bup-margin(1)                    General Commands Manual                    bup-margin(1)

NAME
       bup-margin - figure out your deduplication safety margin

SYNOPSIS
       bup margin [options...]

DESCRIPTION
       bup margin iterates through all objects in your bup repository, calculating the
       largest number of prefix bits shared between any two entries. This number, n,
       identifies the longest subset of SHA-1 you could use and still encounter a
       collision between your object ids.

       For example, one system that was tested had a collection of 11 million objects
       (70 GB), and bup margin returned 45. That means a 46-bit hash would be
       sufficient to avoid all collisions among that set of objects; each object in
       that repository could be uniquely identified by its first 46 bits.

       The number of bits needed seems to increase by about 1 or 2 for every doubling
       of the number of objects. Since SHA-1 hashes have 160 bits, that leaves 115 bits
       of margin. Of course, because SHA-1 hashes are essentially random, it's
       theoretically possible to use many more bits with far fewer objects.

       If you're paranoid about the possibility of SHA-1 collisions, you can monitor
       your repository by running bup margin occasionally to see if you're getting
       dangerously close to 160 bits.

OPTIONS
       --predict
              Guess the offset into each index file where a particular object will
              appear, and report the maximum deviation of the correct answer from the
              guess. This is potentially useful for tuning an interpolation search
              algorithm.

       --ignore-midx
              Don't use .midx files, use only .idx files. This is only really useful
              when used with --predict.

EXAMPLE
       $ bup margin
       Reading indexes: 100.00% (1612581/1612581), done.
       40
       40 matching prefix bits
       1.94 bits per doubling
       120 bits (61.86 doublings) remaining
       4.19338e+18 times larger is possible

       Everyone on earth could have 625878182 data sets like yours, all in one
       repository, and we would expect 1 object collision.

       $ bup margin --predict
       PackIdxList: using 1 index.
       Reading indexes: 100.00% (1612581/1612581), done.
       915 of 1612581 (0.057%)

SEE ALSO
       bup-midx(1), bup-save(1)

BUP
       Part of the bup(1) suite.

AUTHORS
       Avery Pennarun <apenwarr@gmail.com>.

Bup unknown-                                                                bup-margin(1)