This requires enough space for the unique records in DIFF and MATCH to be held in memory, but doesn't require space in memory for the unique records in ALL.
I have a flat file and need to count the number of records in it, excluding the header and trailer records.
I would appreciate any and all assistance.
Thanks
Hadi Lalani (2 Replies)
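One possible way to approach the count, sketched in Python. It assumes the header is exactly the first line and the trailer exactly the last line; the function name and sample records are hypothetical.

```python
def count_data_records(lines):
    """Count records, excluding the first (header) and last (trailer) line.

    Assumption: exactly one header line and one trailer line.
    """
    total = sum(1 for _ in lines)
    # With fewer than two lines there is no room for both header and trailer.
    return max(total - 2, 0)


# Hypothetical sample: one header, three data records, one trailer.
sample = ["HDR20240101", "rec1", "rec2", "rec3", "TRL0003"]
data_count = count_data_records(sample)
print(data_count)
```

Because the function only iterates, an open file object can be passed in place of the list, so the file does not have to be read into memory.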
I am running the following Korn shell script:
#!/usr/bin/ksh
num_records=`sas "select count(*) from /users/abc/123/sasdata.sas7bdat"`
echo "$num_records"
The script keeps returning an invalid file error even though I am certain that the file really exists. Does anyone see anything wrong... (1 Reply)
Hi,
I have a .txt file (uniqfields.txt) with 3 fields separated by " | " (pipe symbol). This file contains unique values with respect to all these 3 fields taken together. There are about 40,000 SORTED records (rows) in this file. Sample records are given below.
1TVAO|OVEPT|VO... (2 Replies)
Hi everyone.
I am a newbie to Linux stuff. I have this kind of problem which I couldn't solve alone. I have a text file with records separated by empty lines like this:
ID: 20
Name: X
Age: 19
ID: 21
Name: Z
ID: 22
Email: xxx@yahoo.com
Name: Y
Age: 19
I want to grep records that... (4 Replies)
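The selection criterion is cut off above, so the following Python sketch only shows the general shape: split the text on empty lines into records, then filter with a predicate. The predicate used here (records that contain an Age field) is purely a hypothetical stand-in, and the placement of the empty lines in the sample is an assumption.

```python
def split_records(text):
    """Split text into records separated by one or more empty lines.

    Each record becomes a dict mapping field name to value.
    """
    records = []
    current = {}
    for line in text.splitlines():
        if not line.strip():
            # An empty line closes the current record, if any.
            if current:
                records.append(current)
                current = {}
            continue
        key, _, value = line.partition(":")
        current[key.strip()] = value.strip()
    if current:
        records.append(current)
    return records


# Sample from the post; where the empty lines fall is assumed.
text = """ID: 20
Name: X
Age: 19

ID: 21
Name: Z

ID: 22
Email: xxx@yahoo.com
Name: Y
Age: 19
"""

records = split_records(text)
# Hypothetical predicate: keep the IDs of records that have an Age field.
with_age = [r["ID"] for r in records if "Age" in r]
print(with_age)
```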
Hi all!
Sorry guys for my dumb question, but I really need help.
So, we have a file with many, many fields, like this one:
201001002359 blablabla 87654321 201001002359 123,56 77272588300 blablabla/123 91823778544
and I wrote this awk command:
awk '{if($6~/(2588300|2580000|2587021)$/)print}'
so,... (8 Replies)
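For readers more comfortable with Python, the same filter can be restated as a sketch: keep lines whose sixth whitespace-separated field ends with one of the three suffixes. The rest of the question is cut off, so this only mirrors the awk command as posted; the function name is hypothetical.

```python
# Suffixes taken from the awk regular expression in the post.
SUFFIXES = ("2588300", "2580000", "2587021")


def matching_lines(lines):
    """Yield lines whose 6th whitespace-separated field ends with a suffix."""
    for line in lines:
        fields = line.split()
        # Like awk, a line with fewer than six fields simply does not match.
        if len(fields) >= 6 and fields[5].endswith(SUFFIXES):
            yield line


sample = [
    "201001002359 blablabla 87654321 201001002359 123,56 77272588300 blablabla/123 91823778544",
    "201001002359 blablabla 87654321 201001002359 123,56 77270000000 blablabla/123 91823778544",
]
matches = list(matching_lines(sample))
print(len(matches))
```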
I have a sample txt file which has different variable lengths of 2,10,3,15.
What command do I need to use in order to get the count of records that have length '3'?
Thanks (3 Replies)
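A minimal Python sketch of one reading of this question: count the lines whose length, not counting the trailing newline, equals a given value. The sample lines are hypothetical and were chosen to have lengths 2, 10, 3, 15 and 3.

```python
def count_records_of_length(lines, length):
    """Count lines whose length, excluding the trailing newline, equals length."""
    return sum(1 for line in lines if len(line.rstrip("\n")) == length)


# Hypothetical sample with record lengths 2, 10, 3, 15 and 3.
sample = ["ab\n", "abcdefghij\n", "abc\n", "abcdefghijklmno\n", "xyz\n"]
count = count_records_of_length(sample, 3)
print(count)
```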
I have 2 files
"File 1" is delimited by ";" and "File 2" is delimited by "|".
File 1 below (3 records shown):
Doc1;03/01/2012;New York;6 Main Street;Mr. Smith 1;Mr. Jones
Doc2;03/01/2012;Syracuse;876 Broadway;John Davis;Barbara Lull
Doc3;03/01/2012;Buffalo;779 Old Windy Road;Charles... (2 Replies)
I have a fixed-width file. The records look something like this:
Type ID SSN NAME .....AND SOME MORE FIELDS
A1 1234 .....
A1 1234 .....
B1 1234 .....
M2 4567 .....
M2 4567 .....
N2 4567 .....
N2 4567 .....
A1 9999
N2 9999
Now if A1 is present then B1 has to be present.... (2 Replies)
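The one rule that is visible above (if A1 is present for an ID, B1 must be present too) could be checked along these lines. This is a Python sketch under assumptions: the type and ID are taken as the first two whitespace-separated fields rather than true fixed-width slices, and the function name is hypothetical.

```python
from collections import defaultdict


def ids_missing_b1(lines):
    """Return the IDs that have an A1 record but no B1 record."""
    types_by_id = defaultdict(set)
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        rec_type, rec_id = fields[0], fields[1]
        types_by_id[rec_id].add(rec_type)
    return sorted(
        rec_id
        for rec_id, types in types_by_id.items()
        if "A1" in types and "B1" not in types
    )


# Sample records from the post (header line omitted).
sample = [
    "A1 1234 .....",
    "A1 1234 .....",
    "B1 1234 .....",
    "M2 4567 .....",
    "M2 4567 .....",
    "N2 4567 .....",
    "N2 4567 .....",
    "A1 9999",
    "N2 9999",
]
violations = ids_missing_b1(sample)
print(violations)
```

For a genuinely fixed-width layout, the `split()` call would be replaced by slices at the real column offsets, which are not given in the post.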
Hi,
I have a requirement. For example, I have a text file with the pipe symbol (|) as the delimiter and 4 columns: a, b, c, d. Here a and b are the primary key columns.
I want to process that file to find the duplicates and null values in the primary key columns (a, b). I want to write the unique records in which... (5 Replies)
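A rough Python sketch of the split described here, with several assumptions since the request is cut off: a "null" key is taken to mean an empty or whitespace-only value, the first record seen for a key counts as the unique one, and later records with the same key are rejected. The function name and sample rows are hypothetical.

```python
def split_by_key(lines):
    """Separate records with a valid, first-seen (a, b) key from the rest.

    Returns (unique, rejected). A record is rejected when either key
    column is empty or when its key has already been seen.
    """
    seen = set()
    unique, rejected = [], []
    for line in lines:
        fields = line.rstrip("\n").split("|")
        key = tuple(fields[:2])
        has_null = len(fields) < 2 or any(not part.strip() for part in key)
        if has_null or key in seen:
            rejected.append(line)
        else:
            seen.add(key)
            unique.append(line)
    return unique, rejected


# Hypothetical rows: columns a|b|c|d with (a, b) as the primary key.
sample = [
    "1|x|c1|d1",
    "1|x|c2|d2",   # duplicate key (1, x)
    "|y|c3|d3",    # null in column a
    "2||c4|d4",    # null in column b
    "2|y|c5|d5",
]
unique, rejected = split_by_key(sample)
print(unique)
print(rejected)
```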
Discussion started by: praveenraj.1991
stag-diff
STAG-DIFF(1p) User Contributed Perl Documentation STAG-DIFF(1p)
NAME
stag-diff - finds the difference between two stag files
SYNOPSIS
stag-diff -ignore foo-id -ignore bar-id file1.xml file2.xml
DESCRIPTION
Compares two data trees and reports whether they match. If they do not match, the mismatch is reported.
ARGUMENTS
-help|h
shows this document
-ignore|i ELEMENT
these nodes are ignored for the purposes of comparison. Note that attributes are treated as elements, prefixed by the containing
element id. For example, if you have
<foo ID="wibble">
And you wish to ignore the ID attribute, then you would use the switch
-ignore foo-ID
You can specify multiple elements to ignore like this
-i foo -i bar -i baz
You can also specify paths
-i foo/bar/bar-id
-parser|p FORMAT
which parser to use. The default is XML. This can also be autodetected by the file suffix. Other alternatives are sxpr and itext. See
Data::Stag for details.
-report|r ELEMENT
report mismatches as they occur on each element of type ELEMENT
multiple elements can be specified
-verbose|v
used in conjunction with the -report switch
shows the tree of the mismatching element
OUTPUT
If a mismatch is reported, a report is generated displaying the subpart of the tree that could not be matched. This will look like this:
REASON: no_matching_node: annotation
no_matching_node: feature_set
no_matching_node: feature_span
no_matching_node: evidence
no_matching_node: evidence-id
data_mismatch(:15077290 ne :15077291): evidence-id AND evidence-id
Due to the nature of tree matching, it can be difficult to specify exactly how trees do not match. To investigate this, you may need to use
the -r and -v options. For the above output, I would recommend using
stag-diff -r feature_span -v
ALGORITHM
Both trees are recursively traversed... see the actual code for how this works
The order of elements is not important; eg
<foo>
<bar>
<baz>1</baz>
</bar>
<bar>
<baz>2</baz>
</bar>
</foo>
matches
<foo>
<bar>
<baz>2</baz>
</bar>
<bar>
<baz>1</baz>
</bar>
</foo>
The recursive nature of this algorithm means that certain tree comparisons will explode wrt time and memory. I think this will only happen
with very deep trees where nodes high up in the tree can only be differentiated by nodes low down in the tree.
Both trees are loaded into memory to begin with, so it may thrash with very large documents
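To make the order-independence concrete, here is a small Python sketch. It is not stag-diff's algorithm: instead of recursive matching it uses a different technique, reducing each tree to a canonical form with sorted children and comparing the results. The function names are hypothetical, and only Python's standard xml.etree.ElementTree module is used.

```python
import xml.etree.ElementTree as ET


def canonical(elem):
    """Reduce an element to a nested tuple that ignores child order."""
    children = sorted(canonical(child) for child in elem)
    text = (elem.text or "").strip()
    attrs = tuple(sorted(elem.attrib.items()))
    return (elem.tag, attrs, text, tuple(children))


def trees_match(xml_a, xml_b):
    """Return True if both documents are equal up to sibling order."""
    return canonical(ET.fromstring(xml_a)) == canonical(ET.fromstring(xml_b))


# The two documents from the example above, plus one that differs.
a = "<foo><bar><baz>1</baz></bar><bar><baz>2</baz></bar></foo>"
b = "<foo><bar><baz>2</baz></bar><bar><baz>1</baz></bar></foo>"
c = "<foo><bar><baz>1</baz></bar><bar><baz>3</baz></bar></foo>"

print(trees_match(a, b))
print(trees_match(a, c))
```

Like stag-diff, this sketch loads both trees fully into memory. Unlike stag-diff, it reports only whether the trees match, not where they differ.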
AUTHOR
Chris Mungall cjm at fruitfly dot org
SEE ALSO
Data::Stag
perl v5.10.0 2008-12-23 STAG-DIFF(1p)