Finding records NOT on another file
Post 303025577 by Don Cragun on Tuesday 6th of November 2018 06:04:24 AM
One could also try:
Code:
awk 'FNR == 1 { fc++ } fc < 3 {d[$0]; next } !($0 in d)' DIFF MATCH ALL

which has been tested.

This requires enough space for the unique records in DIFF and MATCH to be held in memory, but doesn't require space in memory for the unique records in ALL.
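
As a rough illustration of how the command above behaves, here is a minimal sketch run against made-up sample data. The file names DIFF, MATCH, and ALL come from the post; the temporary directory, the `tmp` and `result` variables, and the file contents are assumptions chosen only for the demonstration.

Code:
```shell
# Hypothetical sample data in a scratch directory (contents are invented).
tmp=$(mktemp -d)
printf '%s\n' a b > "$tmp/DIFF"
printf '%s\n' c > "$tmp/MATCH"
printf '%s\n' a b c d e d > "$tmp/ALL"

# FNR == 1 is true on the first record of each input file, so fc counts files.
# While fc < 3 (the first two files), each record is stored as a key of d.
# For the third file, a record is printed only if it is not a key of d.
result=$(awk 'FNR == 1 { fc++ } fc < 3 { d[$0]; next } !($0 in d)' \
    "$tmp/DIFF" "$tmp/MATCH" "$tmp/ALL")
printf '%s\n' "$result"

rm -rf "$tmp"
```

With this sample data the records d, e, and d are printed: records in ALL are never stored, so duplicates in ALL are not collapsed. One caveat worth checking before relying on it: an empty DIFF or MATCH file contributes no records, so FNR == 1 never fires for it and fc would then be off by one when ALL is read.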
 
