I have an array, SPLNO, with approximately 10k numbers. I want to look up each subscriber number from the MDN.TXT file (approximately 1.5 lakh, i.e. 150,000, records) in that array. If the subscriber number is found in the array, the operation below is performed. My issue is that this takes too long: each number is searched against the whole 10k-entry array, so for 1.5 lakh records the loop runs about 1.5 lakh × 10k times. Please suggest a more efficient way.
If you want to find one of 10k items in a large file, there's no straightforward, easy way to avoid comparing each item against each line in the file. What you can do is break out of the loop as soon as a match is found.
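As a rough illustration of breaking out of the loop on the first hit, here is a minimal awk sketch. The sample numbers, the temporary file names, and the exact-match test are assumptions made for the example; the script in the question uses a different rule (a match at the start of the string plus a length check).

```shell
# Hypothetical sample data, not taken from the real files.
tmp=$(mktemp -d)
printf '%s\n' 111 222 333 > "$tmp/splno.txt"
printf '%s\n' 222 999 > "$tmp/mdn.txt"

break_result=$(awk '
    # First file: load the lookup numbers into an indexed array.
    NR == FNR { splno[++n] = $1; next }
    # Second file: scan the array, stopping at the first match.
    {
        hit = 0
        for (i = 1; i <= n; i++) {
            comparisons++
            if ($1 == splno[i]) { hit = 1; break }
        }
        print $1, (hit ? "found" : "not found")
    }
    END { print "comparisons:", comparisons }
' "$tmp/splno.txt" "$tmp/mdn.txt")

printf '%s\n' "$break_result"
rm -rf "$tmp"
```

With this sample data the loop performs 5 comparisons instead of the 6 it would need without the break. The saving only applies to numbers that are found; a number that is not in the array still costs a full scan.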
You could also explain what the SPLNOMAXLEN[] values are trying to accomplish. (If you were trying to perform an exact match for one of two numbers ($1 or "91"$1) instead of a string match at the start followed by a string length comparison, it would be tremendously faster.)
And, please explain why printing found or not found 1.5 billion times with no indication of what was or was not found is going to be useful to anyone. This appears to be a useless exercise. And, in that case, why does the speed matter?
But, showing us the output you're hoping to produce from those sample input files might help us understand what you're trying to do.
Note that if 1/3 of your MDN.TXT lines are duplicates (as they are in your sample), you might speed things up considerably by getting rid of the duplicates before running the search loop. That alone would probably save you about 25% on your script's running time.
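A minimal sketch of dropping duplicate lines before the search loop, assuming one number per line and that each distinct number only needs to be checked once. The sample values and the temporary file name are made up for the example.

```shell
# Hypothetical sample input containing one repeated line.
tmp=$(mktemp -d)
printf '%s\n' 8542054921 854215595 8542054921 > "$tmp/mdn.txt"

# Keep only the first occurrence of each line; input order is preserved.
unique_mdn=$(awk '!seen[$0]++' "$tmp/mdn.txt")

printf '%s\n' "$unique_mdn"
rm -rf "$tmp"
```

The expression `!seen[$0]++` is true only the first time a given line is read, so later copies are skipped without sorting the file.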
Hi Don,
Please find below the answers to the queries raised
SPLNOMAXLEN[] is for checking the maximum length of the input string, i.e. the one from MDN.TXT. I am not trying for an exact match on one of two numbers, because the input string may start with or without "91"; that is why this condition is used
and
I will be carrying out further steps depending on whether the number was found or not found, for example populating fields from
. The found and non-matching cases will be handled accordingly.
So output from found will be
MDN.TXT will definitely contain duplicate lines; that is part of the requirement and I cannot avoid it. SPLNO.TXT, however, will certainly not contain duplicates.
Please let me know whether the processing time can be reduced.
Hi siramitsharma,
Forum rules prohibit sending private messages asking people to respond to your posts.
Note that with the two lines in SPLNO.TXT:
any of the following lines in MDN.TXT:
would match the 1st line. And any of the following lines in MDN.TXT:
would match the 2nd line.
So maybe you could pre-build a table of the values that could match entries from the first file. Then, instead of performing lots of matches and comparisons in a loop, you could just test if((mdn in table) || ("91"mdn in table)) instead of the four slower tests you are currently using to see if there is a match.
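The table lookup could be sketched roughly as follows. This is only an illustration under stated assumptions: the comma-separated layout of SPLNO.TXT, one number per line in MDN.TXT, the number 7000000000, and the "found"/"not found" output are all made up for the example, and the second field of SPLNO.TXT is ignored.

```shell
# Assumed sample data; the real file layouts may differ.
tmp=$(mktemp -d)
printf '%s\n' '918542054921,30,1,2' '854215595,12,1,2' > "$tmp/SPLNO.TXT"
printf '%s\n' 8542054921 918542054921 854215595 7000000000 > "$tmp/MDN.TXT"

lookup_result=$(awk -F, '
    # First file: remember every SPLNO number as an array index.
    NR == FNR { table[$1]; next }
    # Second file: one or two hash lookups per line, no inner loop.
    {
        mdn = $1
        if ((mdn in table) || (("91" mdn) in table))
            print mdn, "found"
        else
            print mdn, "not found"
    }
' "$tmp/SPLNO.TXT" "$tmp/MDN.TXT")

printf '%s\n' "$lookup_result"
rm -rf "$tmp"
```

With this sample input the first three numbers are reported as found and 7000000000 as not found. Each MDN.TXT line costs at most two array lookups regardless of how many entries SPLNO.TXT has, which is where the speed-up over the nested loop would come from.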
Since every entry in the input has the final two fields with the values 1 and 2, respectively, all we need to know is whether or not there is a match; not what values appear if there is a match. This is important because both entries above match lines in MDN.TXT containing the values:
You also need to explain why the 2nd field in the 1st line above has the value 30. Since field 1 is 12 characters (918542054921), the longest possible string that can be matched is 12 characters. And, on the 2nd line we have a 1st field (854215595) with length 9 and a second field containing 12. So, I repeat, what use is the 2nd field in SPLNO.TXT other than to give you two more tests to slow down your loop?