Visit Our UNIX and Linux User Community


String pointer does not work


 
Thread Tools Search this Thread
Top Forums Programming String pointer does not work
# 1  
Old 01-30-2014
String pointer does not work

Hello, I am trying to reverse complement one string and reverse another (NO complement!), both with pointer. My code compiled without error, but did not do the job I wanted.
Code:
#include <stdio.h>
#include <stdlib.h>
#include <zlib.h>
#include "kseq.h"
 
// STEP 1: declare the type of file handler and the read() function
KSEQ_INIT(gzFile, gzread)
 
char *rev_comp(char *seq);
char *strrev(char *str);
 
int main(int argc, char *argv[])
{
    gzFile fp;
    kseq_t *seq;
    int l;
 
    char *entry_rc;
    char *qual_rv;

    if (argc != 2) {
        fprintf(stderr, "Usage: %s <infile.fastq> ", argv[0]);
        return 1;
    }
 
    fp = gzopen(argv[1], "r");                      // STEP 2: open the file handler
    seq = kseq_init(fp);                              // STEP 3: initialize seq
 
   
    while ((l = kseq_read(seq)) >= 0) {          // STEP 4: read sequence
      if (strlen(seq->seq.s) ==  strlen(seq->qual.s))   
{
//  entry_rc = malloc(sizeof(char)*strlen(seq->seq.s));          //Line 39:  This line is not needed! But Why????
   qual_rv =  malloc(sizeof(char)*strlen(seq->qual.s));          //Line 40: Not sure this is the right way either.
    entry_rc = rev_comp(seq->seq.s);                             //Line 41: Seems fine to have the DNA rev_complemented!
    qual_rv = strrev(seq->qual.s);                               //Line 42: This line did not work!
    printf(">%s\n%s\n%s\n%s\n%s\n", seq->name.s, seq->seq.s, seq->qual.s, entry_rc, qual_rv );
    }
 }
 
    kseq_destroy(seq);                         // STEP 5: destroy seq
    gzclose(fp);                                    // STEP 6: close the file handler
 
    return 0;
}
 
/*************************** Functions ***************************************************/
/*1. reverse and complement the sequence*/
 
char *rev_comp(char *seq) {
    if( seq == NULL || !(*seq) )
            return NULL;
 
    int i, j = strlen(seq)-1;
    char *seq_rc;
 
    seq_rc = malloc(sizeof(char) * (j+1));
 
    for(i=0; j>=0; i++, j--) {
        *(seq_rc+i) = *(seq+j);
        if (*(seq_rc+i) == 'A') *(seq_rc+i) ='T';
        else if (*(seq_rc+i) == 'C') *(seq_rc+i) ='G';
            else if (*(seq_rc+i) == 'G') *(seq_rc+i) ='C';
                else if (*(seq_rc+i) == 'T') *(seq_rc+i) ='A';
                    else
                         *(seq_rc+i) ='N';
    }
 
    return seq_rc;
}
 
 
/*2. reverse the quality string */
 
char *strrev( char *s)
  {
  char  c;
  char* s0 = s - 1;
  char* s1 = s;
 
  /* Find the end of the string */
  while (*s1) ++s1;
 
  /* Reverse it */
  while (s1-- > ++s0)
    {
    c   = *s0;
    *s0 = *s1;
    *s1 =  c;
    }
 
  return s;
  }

The problem seems to be with Line 42.
I believe strrev() function is fine as it was tested to be working well with another program. It may be related to the fact of seq->qual.s, which is a string pointer like seq->seq.s, which function rev_comp() worked well with. From my previous post, I seemed to understand the difference between string pointer and string array, so I tried use malloc to allocate memory for both of them, especially here I want reverse the string seq->qual.s in place with strrev() function.

My questions are:
1) What is wrong with Line 42 and Line 40?
2) Why Line 39 is not needed (tested!) and Line 41 worked well with string pointer as parameter?
Code:
infile:
@HWI-EAS121
CGTTACGAGATCGGAAGAGCGGTTCAGCAG
+HWI-EAS121
aaaaa`b_aa`aa`YaX]aZ`aZM^Z]YRa
@HWI-EAS122
GGGTGGGCATTTCCACTCGCAGTATGGGTT
+HWI-EAS122
a``^\__`_```^a``a`^a_^__]a_]\]
@HWI-EAS123
CGTTTATGTTTTTGAATATGTCTTATCTTA
+HWI-EAS123
abaa`^aaaaabbbaababbbbbb`bbbb_

I have attached the kseq.h header file which is needed to handle zipped or unzipped sequence files. What I expected to have 2nd, 6th, 10th row of string reverse complemented (DNA), and the 4th, 8th, 12th row of strings reversed of the infile, but the program did not do the job. Here is the output:
Code:
output:
>HWI-EAS121
CGTTACGAGATCGGAAGAGCGGTTCAGCAG
aRY]Z^MZa`Za]XaY`aa`aa_b`aaaaa
CTGCTGAACCGCTCTTCCGATCTCGTAACG
aRY]Z^MZa`Za]XaY`aa`aa_b`aaaaa
>HWI-EAS122
GGGTGGGCATTTCCACTCGCAGTATGGGTT
]\]_a]__^_a^`a``a^```_`__\^``a
AACCCATACTGCGAGTGGAAATGCCCACCC
]\]_a]__^_a^`a``a^```_`__\^``a
>HWI-EAS123
CGTTTATGTTTTTGAATATGTCTTATCTTA
_bbbb`bbbbbbabaabbbaaaaa^`aaba
TAAGATAAGACATATTCAAAAACATAAACG
_bbbb`bbbbbbabaabbbaaaaa^`aaba

I know this post is a little too long, but the idea of code is to call two functions to manipulate the string within the file. Could not figure out the bug by myself while trying to catch string pointer. Thanks a lot in advance!
# 2  
Old 01-30-2014
rev_comp malloc's memory for it's result, so you don't need Line 39 to do that.

strrev has a side-effect: it actually changes the input. then it returns the same pointer you give it! look how the two lines are the same

Code:
>HWI-EAS121
CGTTACGAGATCGGAAGAGCGGTTCAGCAG
aRY]Z^MZa`Za]XaY`aa`aa_b`aaaaa # qual.s
CTGCTGAACCGCTCTTCCGATCTCGTAACG
aRY]Z^MZa`Za]XaY`aa`aa_b`aaaaa # qual_rv

easiest solution...

Code:
if (strlen(seq->seq.s) ==  strlen(seq->qual.s)) {
    entry_rc = rev_comp(seq->seq.s);
    printf(">%s\n%s\n%s\n%s\n%s\n", seq->name.s, seq->seq.s, seq->qual.s, entry_rc, strrev(seq->qual.s));
    free(entry_rc);
}

seq->qual.s will be reversed now in memory, but you don't seem to use it again ...
# 3  
Old 01-30-2014
Thanks Scott!
Quote:
rev_comp malloc's memory for it's result, so you don't need Line 39 to do that.
Yes, this part is understood now. Ah, always I can't debug until someone points it out!
look how the two lines are the same. That's the problem I was trying to solve with Line 42, and your suggestion as strrev(seq->qual.s)which I have tried, but did not work.
By the way what do you mean
Quote:
seq->qual.s will be reversed now in memory, but you don't seem to use it again ..
by this sentence? I was trying to replace the string in place, but not sure I am doing it the right way. Thanks again!
# 4  
Old 01-30-2014
strlen(string) is one too short to store a string, since it doesn't count the NULL terminator as part of the length.

If you want the string to inhabit new memory, you have to actually put it in the new memory. For qual_rv, you do:

Code:
pointer=(pointer to new memory via malloc);
pointer=(pointer to the same old memory after overwriting it);

...which makes the malloc pretty pointless, no?

In summary:

Pointers do absolutely nothing magic for you. Assigning a pointer does not assign its contents.
# 5  
Old 01-30-2014
Assigning a pointer does not assign its contents. I had thought I understand this, but actually I do not! Here is another example I thought I understood, but I am so unsure even it compiles fine and runs well as I expected.
Code:
#include<stdio.h>
#include<stdlib.h>
#include<string.h>

//_____________Version 1_______________
void *strrev1(char* begin){
    char temp;
    char* end;
    end = begin + strlen(begin)-1;

    while(end>begin){
        temp = *end;
        *end = *begin;
        *begin = temp;
        end--;
        begin++;
    } 
}

//_____________Version 2_______________
char *strrev2( char *s)
  {
  char  c;
  char* s0 = s - 1;
  char* s1 = s;
 
  /* Find the end of the string */
  while (*s1) ++s1;
 
  /* Reverse it */
  while (s1-- > ++s0)
    {
    c   = *s0;
    *s0 = *s1;
    *s1 =  c;
    }
 
  return s;
  }

//_____________Version 3_______________
char *strrev3(char* seq){
  if( seq == NULL || !(*seq) ) 
            return NULL;

    int i, j = strlen(seq)-1;
    char *seq_rv;

    seq_rv = malloc(sizeof(char) * (j+1));

   for(i=0; j>=0; i++, j--) 
        *(seq_rv+i) = *(seq+j);

    return seq_rv;
//    free(seq_rv);                        //Line 57: Is this needed?
}

int main(){
    char *string, *copy1, *copy2, *copy3;            //using pointer for strings
    string=malloc(sizeof(char)*100);
    copy1=malloc(sizeof(char)*100);
    copy2=malloc(sizeof(char)*100);
    copy3=malloc(sizeof(char)*100);

 scanf("%s",string);
    strcpy(copy1, string);
    strcpy(copy2, string);
    strcpy(copy3, string);

    strrev1(string);

    printf("\n%s\n", copy1);
    printf("%s\n", string);
    printf("%s\n", strrev2(copy2));
    printf("%s\n", strrev3(copy3));

free(string);
free(copy1);
free(copy2);
free(copy3);

return 0;

}

The first two functions are adapted from stackoverflow, and the third one by myself. Now my questions are:
1)Both strrev1() and strrev2() reverse the string in place, but strrev1() does not have return value as the latter which returns a pointer; I tried to integrate to my original program for seq->qual.s, neither one worked. Why?
2) strrev3() malloc memory and has return value too. Does it take double amount of memory as compared with strrev2()? If I have 100 millions of sequence(Yes, that's normal with my files!) Then replace the original string in place would be a good idea, isn't it?
3) When malloc is used in function with returned value, should I free the memory in the function after the value is returned (Line 57) ? Memory should automatically be released on the stack once the function call is done, is this right?

Last edited by yifangt; 02-01-2014 at 11:00 AM.. Reason: typo
# 6  
Old 01-30-2014
Just because a function returns a value, doesn't mean you need to use it. The in-place functions return pointers for convenience -- so you can do printf("%s\n", strrev(string)); instead of strrev(string); printf("%s\n", string); But the pointer itself should not change (they'd hardly be "in place" if they did).

Notably, it means you cannot do this:

Code:
char buf[]="slartibartfast";
buf=strrev(buf);

But this will work just fine for a function that works in-place:
Code:
char buf[]="slartibartfast";
strrev(buf);

That other function on the other hand actually does allocate new memory which should be freed later. Take a good look at line 57. What, exactly would it do? You need a free() somewhere, but that's not really a useful place to put it. the return() causes the function to end before its freed. ...and if you put it first, you'd be returning invalid memory -- memory that'd already been freed. That's no good either. So where does it need to go?


Code:
char *newpointer;
printf("%s\n", newpointer=copyreverse(oldpointer));
free(newpointer);

As for why your program didn't work, I'd have to see it to guess. You piled several problems in your example for brevity but they interacted with each other and obscured the point.

Last edited by Corona688; 01-30-2014 at 05:05 PM..
This User Gave Thanks to Corona688 For This Post:

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. UNIX for Beginners Questions & Answers

C shell concatenate string doesn't work

I have the following code: #!/bin/csh clear set cloud_file="/home/labs/koren/davidsr/general_scripts/MFP_10_PP_Parmas.txt" # to fill set mie_tables_dir='/home/labs/koren/davidsr/SHDOM_MAIN/MIE_TABLES/non_polo_wave_0.7_water_50R_s0.5_e25_max_70.mie' # to fill set prp_dir='${pp_dir}/pp_prp/'... (2 Replies)
Discussion started by: student_wiz
2 Replies

2. Shell Programming and Scripting

How to extract work in line string.?

Hello all, Soon I will be receiving a new file. I've asked the source system to put "TRAILER.1+0000007+1" for the trailer to indicate full receipt of file. I need to know how to separate TRAILER so I can use it in a if statement. I used the tail command but not sure how to incorporate awk or... (11 Replies)
Discussion started by: pone2332
11 Replies

3. Shell Programming and Scripting

Why does bc work with 'here string' and not via pipe in this example?

Hi! I actually got it running, but I still would like to understand, why and how, since I am a beginner in bash scripting. I Need floating numbers and thus use bc in my bash script. Here it is: #!/bin/bash num1="10^-15" | bc -l #power function piped to bc - DOES NOT WORK echo $num1... (4 Replies)
Discussion started by: McHale
4 Replies

4. Programming

Scanf() string pointer problem

I have a problem with scanf() for string pointer as member of a struct. #include <stdio.h> #include <stdlib.h> struct Student { int studentNumber; int phoneNumber; char *studentName; //line 7 // char studentName; //line 8 }; int... (10 Replies)
Discussion started by: yifangt
10 Replies

5. Programming

How i use pointer as a string in c programing?

I'm newbie learner. My all friend use windows just only me use linux. so i can't solve any problem by myself. i need a solution. how can i use pointer as a string. #include<string.h> #include<stdio.h> int main() { char *s='\0'; gets(s); puts(s); return 0; } This code work on... (6 Replies)
Discussion started by: raihan004
6 Replies

6. UNIX for Dummies Questions & Answers

Counting vowels in string. "Comparison pointer-integer".

I'm trying to write a programme which scans strings to find how many vowels they contain. I get an error saying that I'm trying to compare a pointer and an integer inif(*v == scanme){. How can I overcome this ? Also, the programme seems to scan only the first word of a string e.g.: if I type "abc... (1 Reply)
Discussion started by: fakuse
1 Replies

7. Programming

segfault in pointer to string program

hello all, my question is not about How code can be rewritten, i just wanna know even though i am not using read only memory of C (i have declared str) why this function gives me segfault :wall:and the other code executes comfortably though both code uses same pointer arithmetic. ... (4 Replies)
Discussion started by: zius_oram
4 Replies

8. Solaris

string extraction won't work. Why?

#!/usr/bin/ksh set -x testfile=my.test.file.flag echo ${testfile: (-4)} #/home/maldohe/scripts/spawn1& sleep 3 echo myspawn is now ending exit Background: I am trying to extract the word flag from anf given file name. This is a demo script that I am working on to fix a production issue.... (8 Replies)
Discussion started by: Harleyrci
8 Replies

9. Programming

pass a pointer-to-pointer, or return a pointer?

If one wants to get a start address of a array or a string or a block of memory via a function, there are at least two methods to achieve it: (1) one is to pass a pointer-to-pointer parameter, like: int my_malloc(int size, char **pmem) { *pmem=(char *)malloc(size); if(*pmem==NULL)... (11 Replies)
Discussion started by: aaronwong
11 Replies

10. Programming

String and pointer problem

i am having a string like " X1 " ---> string lenght is 30 I have stored this to a chararry . ref so here ref = " X1 " now i trim the left space by my function . Si the string now becomes "X1 " ---> string lenght is 15... (3 Replies)
Discussion started by: arunkumar_mca
3 Replies

Featured Tech Videos