Check/print missing number in a consecutive range and remove duplicate numbers
Hi,
In an ideal scenario, I will have a listing of database transaction logs that get copied to a DR site, and if I have them all, they will be numbered consecutively like below.
There will be scenarios where files are not copied for some reason, for example a network failure, so there will be gaps in the list of files.
So the list above may look something like below, where there is a gap in what is supposed to be a consecutive list.
Does anyone know a quick way of checking which numbers are missing from the consecutive range?
At the moment, I am cutting the list into 3 separate lists based on the first character. The first field is the transaction group, the second field is the transaction number and the third field is the db id, which is a constant.
Then I read each new list, set the first number as a 'base', increment it by 1, assign it to a variable and compare that number with what I read next. If they don't match, then I print that as the missing number or gap. It is a very long, tedious process.
I am hoping someone knows a trick for checking which numbers are missing from the consecutive range and printing them.
I don't want to insert the missing numbers into the existing list. I will be redirecting them to an exception list so I know which transaction logs are missing and have to be re-copied.
Also, in some instances, I will have a listing that contains duplicates, like below.
This is where the database log was sent to the DR site multiple times. Is there a way to check which lines are duplicates and how many copies of each there are?
For example, in the latest listing above, 1_79812_01234567.arc has 3 entries, 2_86756_01234567.arc has 3 entries, 3_82694_01234567.arc has 2 entries and so on.
Any advice will be much appreciated. Thanks in advance.
I'm afraid there's no "quick way" to do what you request. It has to be done like what you describe, line by line, value by value. Why don't you post your attempt to be discussed, analysed, and hopefully improved? And, post the desired output for problem 2.
By the way, a similar problem has been solved here.
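For what it's worth, the line-by-line comparison described above can be pushed into a single sort and awk pass. This is only a sketch under some assumptions: the names look like group_number_dbid.arc (as in 1_79812_01234567.arc), the transaction number has no leading zeros that need preserving, and the function name find_gaps is made up for illustration.

```shell
# find_gaps: read file names such as 1_79812_01234567.arc on stdin and
# print the names that are missing from each group's consecutive run.
# Assumes '_' separates group, transaction number and db id.
find_gaps() {
    # Sort by group, then by transaction number; -u drops repeated entries.
    sort -t_ -k1,1n -k2,2n -u |
    awk -F_ '
        # Still in the same group as the previous line:
        # print every transaction number that was skipped.
        NR > 1 && $1 == group {
            for (n = prev + 1; n < $2; n++)
                print $1 "_" n "_" $3
        }
        # Remember this line for the next comparison.
        { group = $1; prev = $2 }
    '
}
```

You could then feed it your listing and redirect the result to the exception list, for example `ls | find_gaps > exceptions.txt` (the output file name is just a placeholder). It only reports gaps between the lowest and highest number seen in each group, so a missing file at either end of the range will not show up.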
To look for 'missing' files, could you get a list of the files you have into a temporary file and then generate another that contains the names you think you should have? You can then do the following:-
You have to be careful that you match exactly between the two files. If you expect to have one file named a123 and another named a12345, then searching in this way will not report a problem if a123 is present but a12345 is missing.
If this is a concern, build your expected list to be like this:-
This will match the string and anchor the ends to beginning and end of line so you get a complete match.
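As one possible way to put the two-list idea into practice without editing anchors into the list, grep's -x and -F options force whole-line, fixed-string matches. This is a sketch, not necessarily the exact command meant above; the function name missing_files and the two file arguments are placeholders.

```shell
# missing_files EXPECTED HAVE
# Print every line of EXPECTED that has no identical line in HAVE.
#   -F  treat each pattern as a fixed string, not a regular expression
#   -x  a pattern must match the whole line (a123 will not match a12345)
#   -v  print the lines that did NOT match any pattern
#   -f  read the patterns from the file HAVE
missing_files() {
    grep -vxFf "$2" "$1"
}
```

Called as `missing_files expected.txt have.txt`, it prints only the expected names that are absent from the list of files you actually have.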
Does that help at all? It's good to share and we might be able to suggest some improvements.
I've uploaded the script that I am using at the moment and some test data. It works as I intend it to; I just thought maybe it could be improved somehow. All the ones marked FAILED are the ones missing from the consecutive series.
I didn't know I could use sort | uniq -c to check for duplicates, thanks to RudiC.
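For anyone finding this later, here is roughly how the sort | uniq -c idea can be wrapped up so that only the duplicated names are reported. It is a minimal sketch that assumes one file name per line with no whitespace in the names; the function name count_dups is made up.

```shell
# count_dups: read one file name per line on stdin and report every
# name that occurs more than once, together with its count.
count_dups() {
    # uniq -c prefixes each distinct line with its number of occurrences;
    # awk keeps only the lines whose count is greater than 1.
    sort | uniq -c | awk '$1 > 1 { print $2 " has " $1 " entries" }'
}
```

Piping the listing through count_dups gives lines of the form "name has N entries", and names that appear only once are left out.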
Checking up on the link below if it can be used instead.