Suitable data structure for a large number of heterogeneous records -- posted by Corona688, 03-20-2011
There's no way it should be taking 16 hours and eating that much memory. Mergesort halves the list at every level, so even a million items means only about 20 levels of recursion (2^20 is just over a million). You must be hitting some rare condition in that code where it recurses forever. With no useful comments anywhere, I'm left reverse-engineering your code, so I don't completely understand it yet -- but the runaway-recursion theory is easy to test, see the sketch just below. After that is a list mergesort that might serve you better. I wrote it way back in college.
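
To test the theory, thread a depth counter through your recursive calls and abort past any sane bound. A minimal sketch of the technique -- the function here is just a stand-in, since I don't know your function or type names:

Code:
#include <assert.h>
#include <stdio.h>

/* Stand-in for any divide-and-conquer recursion.  Each call halves
   the problem, so the depth can never legitimately exceed ~64. */
static long halve(long n, int depth)
{
  assert(depth < 64 && "runaway recursion -- base case never reached");
  if(n <= 1)
    return(n);
  return halve(n / 2, depth + 1);
}

int main(void)
{
  halve(1000000L, 0);  /* bottoms out after about 20 levels */
  puts("recursion depth OK");
  return(0);
}

Drop the same assert into your own recursive function; if it ever fires, you've found your infinite loop. Now, the promised mergesort: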

Code:
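#include <stddef.h>   /* for NULL */

/* The node type wasn't in the original post; this sketch assumes
   something like the following, with sval as the sort key: */
struct node {
  int sval;            /* sort key -- assumed to be an int here */
  struct node *next;   /* next node in the singly-linked list */
};
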
/* Merges two sorted lists together into one sorted list */
struct node *merge_list(struct node *l1, struct node *l2)
{
  struct node *cur=NULL;  /* Last node in list, where we work from */
  struct node *head=NULL; /* First node in list, our return value */

  /* If either list is empty, there's nothing to merge */
  if(l1==NULL)      return(l2);
  else if(l2==NULL) return(l1);

  while(l1||l2) /* Loop while either list is not empty */
  {
    struct node *tmp;
    if((!l1)||(!l2)) /* If one of the lists is empty */
    {
      /* cur is guaranteed to not be NULL.  This loop will have
         executed at least once by the time we reach here. */
      if(!l1) cur->next=l2; /* Append l2 to the end of the list */
      else    cur->next=l1; /* Append l1 to the end of the list */
      return(head); /* Return the new list */
    }
    else if(l1->sval < l2->sval) /* l2's key is larger -- taking the
                                    larger key first sorts descending */
    {
      if(head==NULL) /* empty list?  Don't append, just set */
      {
        head=cur=l2;
        l2=l2->next;
      }
      else /* append, go to next node */
      {
        cur->next=l2;
        cur=cur->next;
        l2=l2->next;
        cur->next=NULL;
      }
    }
    else /* append first node from l1 */
    {
      if(head==NULL) /* empty list?  don't append, just set */
      {
        head=cur=l1;
        l1=l1->next;
      }
      else /* append, go to next node */
      {
        cur->next=l1;
        cur=cur->next;
        l1=l1->next;
        cur->next=NULL;
      }
    }
  }
  return(head); /* Return the merged list */
}

/* Recursively sorts a linked list with the mergesort algorithm */
struct node *mergesort_list(struct node *list)
{
  /* pointers to first and last nodes of new lists */
  struct node *head[2],*tail[2];

  /* Empty list, return immediately */
  if(list==NULL)
    return(NULL);

  /* Set up list #0 */
  head[0]=tail[0]=list;
  list=list->next;

  /* Only one node, no sorting to be done */
  if(list==NULL)
    return(head[0]);

  /* Set up list #1 */
  head[1]=tail[1]=list;
  list=list->next;

  /* Distribute the remaining nodes alternately between the two
     half-lists, appending to tail[0] and tail[1] in turn */
  while(list!=NULL)
  {
    tail[0]->next=list;
    tail[0]=tail[0]->next;
    list=list->next;

    if(list!=NULL)
    {
      tail[1]->next=list;
      tail[1]=tail[1]->next;
      list=list->next;
    }    
  }

  /* Properly terminate the lists */
  tail[0]->next=NULL;
  tail[1]->next=NULL;

  /* recursively mergesort the two half-lists. */
  head[0]=mergesort_list(head[0]);
  head[1]=mergesort_list(head[1]);
  /* merge the now-sorted lists together, returning new head pointer. */
  return(merge_list(head[0],head[1]));
}
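
For completeness, here's roughly how you'd call it -- a sketch using the assumed node type above. Note that merge_list takes the larger key first, so the result comes back in descending order; flip the < in merge_list if you want ascending:

Code:
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
  int keys[] = { 42, 7, 19, 3, 88 };
  struct node *list=NULL, *n;
  size_t i;

  /* Build a small unsorted list by prepending each key */
  for(i = 0; i < sizeof(keys)/sizeof(keys[0]); i++)
  {
    n = malloc(sizeof(*n));
    if(n == NULL) { perror("malloc"); return(1); }
    n->sval = keys[i];
    n->next = list;
    list = n;
  }

  list = mergesort_list(list);

  for(n = list; n != NULL; n = n->next)
    printf("%d ", n->sval);  /* prints: 88 42 19 7 3 */
  printf("\n");
  return(0);
}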

Testing it on my example above, it ends up about five times slower than the ordinary C qsort(), even counting the time it takes to copy every node into an array for qsort. There's something to be said for being able to access your data in an indexed manner -- all that repeated pointer-chasing down the list wastes a huge amount of time.
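
For reference, the array-and-qsort version I mean looks roughly like this -- a sketch under the same assumed node type, with a hypothetical qsort_list() helper. count must be the exact number of nodes, and the comparator sorts descending to match merge_list:

Code:
#include <stdlib.h>

/* Comparator over node pointers: descending by sval, matching the
   order merge_list produces */
int cmp_node(const void *a, const void *b)
{
  const struct node *na = *(const struct node * const *)a;
  const struct node *nb = *(const struct node * const *)b;
  return (nb->sval > na->sval) - (nb->sval < na->sval);
}

/* Copy the list into an array of pointers, qsort it, re-link it.
   count must be the exact number of nodes in the list. */
struct node *qsort_list(struct node *list, size_t count)
{
  struct node **arr;
  size_t i;

  if(count == 0) return(NULL);

  arr = malloc(count * sizeof(*arr));
  if(arr == NULL) return(list); /* out of memory; leave list as-is */

  for(i = 0; i < count; i++, list = list->next)
    arr[i] = list;

  qsort(arr, count, sizeof(*arr), cmp_node);

  for(i = 0; i + 1 < count; i++) /* restitch the next pointers */
    arr[i]->next = arr[i+1];
  arr[count-1]->next = NULL;

  list = arr[0];
  free(arr);
  return(list);
}

Even paying for the copy in and out, the indexed access wins by a wide margin.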
