fast searching algorithm

# 1  
Old 02-25-2006

Hello,

I need a searching algorithm on Unix. My input file is very large, so I need a really fast algorithm for matching words. I am already using grep.
# 2  
Old 02-27-2006
One option is to run multiple concurrent instances of the search.
The other is to write your own algorithm, based on the search pattern, that narrows the search domain.
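
If several instances search the same file concurrently, each one needs its own slice of the file, and a slice should not begin in the middle of a line. A rough sketch of how the slice boundaries might be computed, assuming newline-terminated lines; next_line_start and slice_for_worker are hypothetical names used only for illustration:

```c
#include <stdio.h>

/* Hypothetical helper: return the offset of the first line that starts
   at or after pos, so a worker never begins in the middle of a line.
   Returns -1 on a seek error. */
long next_line_start(FILE *fp, long pos)
{
    int c;

    if (pos <= 0)
        return 0;
    /* Start one byte early: if that byte is a newline,
       pos itself already begins a line. */
    if (fseek(fp, pos - 1, SEEK_SET) != 0)
        return -1;
    while ((c = getc(fp)) != EOF) {
        if (c == '\n')
            break;
    }
    return ftell(fp);
}

/* Hypothetical helper: compute the byte range [*start, *end) that
   worker i of n should search in a file of the given size.
   Returns 0 on success, -1 on bad arguments or a seek error. */
int slice_for_worker(FILE *fp, long size, int i, int n,
                     long *start, long *end)
{
    if (n <= 0 || i < 0 || i >= n || size < 0)
        return -1;
    *start = next_line_start(fp, size / n * i);
    *end = (i == n - 1) ? size : next_line_start(fp, size / n * (i + 1));
    return (*start < 0 || *end < 0) ? -1 : 0;
}
```

Each worker would then read only the lines between its start and end offsets. Whether this beats a single grep depends on the disk and the number of CPUs, so it is worth measuring first.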
# 3  
Old 02-27-2006
grep already uses a good algorithm for searching.

If the file is ordered with respect to your search key, you can use a simple binary search on files with fixed-length records:

Code:
/*****************************************
*   ffind.c --
*   find a value in a file using random access
*   usage ffind <value> <nrec> <filename>
*   requires a fixed length record file
*   possibly with a newline for each record,
*   reclen= data + newline
*   returns:
*    0 if the value is found,
*    1 if not found
**********************************/
#define RECL 10

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>

FILE *in=NULL;
char _Wvalue[RECL*2]={0x0};

void usage(void)       /* help */
{
    fprintf(stderr,"%s\n",
        "usage: ffind <value to find> <number records> <filename>");
    exit(EXIT_FAILURE);
}

char *setfilepos(long pos)  /* read record number pos from the file */
{
     if(fseek(in,pos*RECL,SEEK_SET)!=0)
     {  /* note we do not check for EINTR which is possible */
        perror("Parameter error - incorrect number of records");
        exit(EXIT_FAILURE);
     }
     memset(_Wvalue,0x0,sizeof(_Wvalue));
     /* read at most RECL chars so the copy in compar() cannot overflow */
     if(fgets(_Wvalue,RECL+1,in)==NULL)
     {
        if(!feof(in))
        {
            perror("File read error");
            exit(EXIT_FAILURE);
        }      
     }
     return _Wvalue;
}

int compar(const char *key,long rec) /* compare key & test value */
{
    char value[RECL+1]={0x0};
    char *p=NULL;

    strcpy(value,setfilepos(rec));
    p=strchr(value,'\n');
    if(p!=NULL) *p=0x0;
    return strcmp(key,value);
}

/* binary search against a fixed Rec Len  file */
int fbsearch(const char *key, long nrecs)
{
    int retval=0;
    long offset=0;
    long guess=0;

    if(nrecs<=0) return 1;  /* nothing to search: report not found */
    nrecs--;
    guess=(nrecs - offset)/2;
    for(;;)
    {
        retval=compar(key,guess);
        if(retval)
        {
            if(retval>0)
            {
                offset=guess+1;
            }
            else
            {
                nrecs=guess-1;
            }
            if(offset > nrecs) break;

            guess=offset + (nrecs-offset)/2;
            continue;
        }
        break;
    }
    return retval;
}

int main(int argc, char *argv[])
{
    int result=0;
    if(argc<4)
    {
        usage();
    }
    in=fopen(argv[3],"r");
    if(in==NULL)
    {
        perror("Error opening input file");
        exit(EXIT_FAILURE);
    }
    result=fbsearch(argv[1],atol(argv[2]));
    return (result!=0);
}
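
As a possible refinement, the record count does not have to be passed on the command line, because it can be derived from the file size when every record has the same length. A minimal sketch, assuming a POSIX system where ftell reports the offset in bytes; count_records is a hypothetical helper and not part of ffind.c above:

```c
#include <stdio.h>

/* Hypothetical helper, not part of ffind.c: derive the number of
   fixed-length records from the file size. Returns -1 on error or
   if the size is not a whole multiple of reclen. */
long count_records(FILE *fp, long reclen)
{
    long size;

    if (reclen <= 0 || fseek(fp, 0L, SEEK_END) != 0)
        return -1;
    size = ftell(fp);            /* offset of end of file = size in bytes */
    if (size < 0 || size % reclen != 0)
        return -1;               /* a partial record means a bad file */
    rewind(fp);                  /* leave the stream at the start */
    return size / reclen;
}
```

With something like this, main could call fbsearch(argv[1], count_records(in, RECL)) after opening the file. The remainder check also catches a file whose records are not all RECL bytes long.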

# 4  
Old 02-28-2006
Jim,

I have a few questions.

1) Using the binary search algorithm, you have the additional overhead of sorting the file before you can make use of it. I believe your example assumes the keys are already sorted.

2) The file is not closed; it should be.

3) In compar, what is the need for the character pointer 'p'?
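
For reference on question 3, the lines in compar that use p can be read in isolation as below. This is only a sketch that pulls them into a hypothetical helper named strip_newline; it is not code from the post above:

```c
#include <string.h>

/* Hypothetical helper mirroring what compar() does with p:
   cut the string at the first newline, if there is one. */
void strip_newline(char *s)
{
    char *p = strchr(s, '\n');  /* NULL when s contains no newline */

    if (p != NULL)
        *p = '\0';              /* end the string where the newline was */
}
```

fgets keeps the trailing newline of each record, and the key from the command line has none, so strcmp would not report a match for equal values unless the newline is removed first.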