Merge two strings by overlapped region


 
# 8  
Old 04-02-2014
Problem:
Code:
char *str1 = argv[1]; 
const char *str2 = argv[2];

Why is that one not const?

Remember, programming isn't about "fixing compiler errors". If you get one, think about what it's telling you.
# 9  
Old 04-02-2014
I thought str1 was the destination and would be modified, i.e. have str2 appended to it. With both declared as const char *, I got these warnings:
Code:
prog009c2.c:73:2: warning: passing argument 1 of ‘strmerg’ discards ‘const’ qualifier from pointer target type [enabled by default]
prog009c2.c:21:7: note: expected ‘char *’ but argument is of type ‘const char *’

# 10  
Old 04-02-2014
Is it? I didn't think it changed, but that may be my mistake. In any case, you have to think about the code I give you, not blindly use it -- especially not make blind changes just to "fix compiler errors". Can you tell me why I said to make these variables const? And how I told you to deal with those warnings?

As for question 2 -- you're back at square one with pointers again. What, exactly, have you assumed that str3 = ... is going to do?

Last edited by Corona688; 04-02-2014 at 01:44 PM..
# 11  
Old 04-02-2014
My understanding of "why they are const" is that it prevents the parameters from being modified by accident. For my strmerg(char *dest, char *src) function, the original design was that dest and src are interchangeable, i.e. src can be appended to dest and vice versa.
At the beginning I did not know that const char * is a restriction that stops you from editing the parameters.
True, at this moment I am trying to get past the stage of "making blind changes ... to fix the compiler error". When you asked me "why I said to make these variables const", I thought I had understood what you said, but not really. ------Should I give up on C now? Thank you!
# 12  
Old 04-02-2014
I declared them 'const' for a reason -- because if you try to modify them it will not work.

The compiler error happened because you tried to modify them.

Blindly removing the 'const' to fix the compiler error didn't make it work, just made the compiler error go away.

I am writing another in-depth explanation of pointers. Patience please.
# 13  
Old 04-02-2014
I am dumping raw memory contents to show you what happens when you declare a variable on the stack, and allocate memory with malloc.

For the purposes of this, you can ignore the contents of the 'printpage' and 'spew' functions; they're convenience functions I made to dump memory and are otherwise unrelated.

Code:
#include <stdio.h>
#include <unistd.h>
#include <string.h>
#include <stdlib.h>

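/* Write the whole memory page that contains 'pointer' to 'fout'. */
void printpage(void *pointer, FILE *fout) {
  unsigned long p=(unsigned long int) pointer;
  /* Mask that clears the offset-within-page bits of an address */
  unsigned long mask=~(unsigned long)(getpagesize()-1);
  unsigned char *pp;
  p &= mask;    /* round down to the start of the page */
  pp=(unsigned char *)p;
  fwrite(pp, getpagesize(), 1, fout);
}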

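/* Print 'ptr' as address(page+offset), then hexdump the page it lives in. */
void spew(const char *msg, const void *ptr) {
  FILE *fp=popen("hexdump -C", "w");
  /* Mask that keeps only the offset-within-page bits of an address */
  unsigned long size=(unsigned long)(getpagesize()-1);
  if(fp == NULL) {
    perror("popen");
    return;
  }
  printf("%7s @ %08lx(%08lx+%08lx)\n", msg,
        (unsigned long)ptr,
        ((unsigned long)ptr)&(~size),
        (unsigned long)ptr&size);
  fflush(stdout);   /* print our header before hexdump writes to the same stdout */
  printpage((void *)ptr, fp);
  pclose(fp);       /* waits for hexdump to finish */
  printf("\n");
  fflush(stdout);
}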

int main(int argc, char *argv[])
{
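  char *mem1=malloc(16);
  char *mem2=malloc(16);
  char *mem3=malloc(16);
  char *mem4=malloc(16);
  if(!mem1 || !mem2 || !mem3 || !mem4) {
    perror("malloc");
    return(1);
  }
  mem1[0]='A';  mem2[0]='B';    mem3[0]='C';    mem4[0]='D';

  /* %p expects a void *, so cast explicitly */
  printf("mem1=%p mem2=%p mem3=%p mem4=%p\n",
        (void *)mem1, (void *)mem2, (void *)mem3, (void *)mem4);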

  spew("heap", mem1);
  spew("stack", &mem1);

  // Now, what happens if I do mem1=mem3 ?
  mem1=mem3;
  spew("heap after", mem1);
  spew("stack after",&mem1);
  return(0);
}

Remember that memory can be considered to be one gigantic array of bytes, from address 00000000 all the way up to ffffffff (on a 32-bit machine). Pointers are just array indexes inside this array.

So we allocate four blocks of 16 bytes, set their first element to something, and dump memory to see where they ended up:

Code:
mem1=0x085ef008 mem2=0x085ef020 mem3=0x085ef038 mem4=0x085ef050
   heap @ 085ef008(085ef000+00000008)
00000000  00 00 00 00 19 00 00 00  41 00 00 00 00 00 00 00  |........A.......|
00000010  00 00 00 00 00 00 00 00  00 00 00 00 19 00 00 00  |................|
00000020  42 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |B...............|
00000030  00 00 00 00 19 00 00 00  43 00 00 00 00 00 00 00  |........C.......|
00000040  00 00 00 00 00 00 00 00  00 00 00 00 19 00 00 00  |................|
00000050  44 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |D...............|
00000060  00 00 00 00 b1 00 00 00  84 2c ad fb 00 00 8c b7  |.........,......|
00000070  00 00 8c b7 00 00 8c b7  00 00 8c b7 00 00 8c b7  |................|
00000080  00 10 8c b7 00 00 8c b7  00 10 8c b7 00 00 00 00  |................|
00000090  00 00 00 00 00 00 00 00  00 00 00 00 60 45 8b b7  |............`E..|
000000a0  04 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000000b0  08 f1 5e 08 ff ff ff ff  ff ff ff ff 00 00 00 00  |..^.............|
000000c0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000000d0  ff ff ff ff 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000000e0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000000f0  00 00 00 00 00 00 00 00  00 00 00 00 00 36 8b b7  |.............6..|
00000100  5f 74 00 00 00 00 00 00  01 00 00 00 01 00 00 00  |_t..............|
00000110  c0 e6 76 b7 f1 0e 02 00  00 00 00 00 00 00 00 00  |..v.............|
00000120  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00001000
(continued)

...but what about mem1 itself? Something, somewhere, has to remember that address of 0x85ef008, yes? And it does, on the "stack", which is a big block of memory which the processor uses as temporary space.

There's actually quite a lot of stuff on the stack. It's not just us that's using it. Every time you call a function, it uses the stack to pass arguments, create local variables, and remember where to return. It gets used so much, in fact, that you can't trust a local variable not to contain previously-used garbage unless you set it to something yourself.
Code:
  stack @ bf8596b0(bf859000+000006b0)
00000000  81 cb 7a b7 c0 44 8b b7  84 88 04 08 02 00 00 00  |..z..D..........|
00000010  00 00 00 00 88 88 04 08  05 00 00 00 01 00 00 00  |................|
00000020  c0 ff ff ff c0 ff ff ff  c0 ff ff ff 54 bb 7a b7  |............T.z.|
<lots of garbage snipped>
000006a0  50 f0 5e 08 f4 3f 8b b7  c8 96 85 bf c9 87 04 08  |P.^..?..........|
000006b0  08 f0 5e 08 20 f0 5e 08  38 f0 5e 08 50 f0 5e 08  |..^. .^.8.^.P.^.|
000006c0  b0 87 04 08 d0 84 04 08  28 97 85 bf b5 5b 78 b7  |........(....[x.|
000006d0  01 00 00 00 54 97 85 bf  5c 97 85 bf 98 2a 8e b7  |....T...\....*..|
000006e0  b0 26 8c b7 01 00 00 00  01 00 00 00 00 00 00 00  |.&..............|
000006f0  f4 3f 8b b7 b0 87 04 08  d0 84 04 08 28 97 85 bf  |.?..........(...|
...
(continued)

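It's there all right: at offset 6b0 you can see 08 f0 5e 08, which is 0x085ef008 with its bytes stored lowest-first (little-endian), followed by the other three pointers. Slightly out of order, but it's there. That's the order an x86 processor stores all numbers, nothing weird. Humans are the weird ones, wanting highest-to-lowest digit order instead of something easily mechanically processable, which is why we have printf to handle that job for us.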

Now, we want to put a different string into mem1. What will the statement 'mem1=mem3' do?

Code:
heap after @ 085ef038(085ef000+00000038)
00000000  00 00 00 00 19 00 00 00  41 00 00 00 00 00 00 00  |........A.......|
00000010  00 00 00 00 00 00 00 00  00 00 00 00 19 00 00 00  |................|
00000020  42 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |B...............|
00000030  00 00 00 00 19 00 00 00  43 00 00 00 00 00 00 00  |........C.......|
00000040  00 00 00 00 00 00 00 00  00 00 00 00 19 00 00 00  |................|
00000050  44 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |D...............|
00000060  00 00 00 00 b1 00 00 00  84 2c ad fb 00 00 8c b7  |.........,......|
00000070  00 00 8c b7 00 00 8c b7  00 00 8c b7 00 00 8c b7  |................|
00000080  00 10 8c b7 00 00 8c b7  00 10 8c b7 00 00 00 00  |................|
00000090  00 00 00 00 00 00 00 00  00 00 00 00 60 45 8b b7  |............`E..|
000000a0  04 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000000b0  08 f1 5e 08 ff ff ff ff  ff ff ff ff 00 00 00 00  |..^.............|
000000c0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000000d0  ff ff ff ff 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000000e0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000000f0  00 00 00 00 00 00 00 00  00 00 00 00 00 36 8b b7  |.............6..|
00000100  61 74 00 00 00 00 00 00  01 00 00 00 01 00 00 00  |at..............|
00000110  c0 e6 76 b7 f1 0e 02 00  00 00 00 00 00 00 00 00  |..v.............|
00000120  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00001000

It looks like -- absolutely nothing. Except... What happened on the stack?

Code:
stack after @ bf8596b0(bf859000+000006b0)
00000000  81 cb 7a b7 c0 44 8b b7  84 88 04 08 02 00 00 00  |..z..D..........|
00000010  00 00 00 00 88 88 04 08  05 00 00 00 01 00 00 00  |................|
00000020  c0 ff ff ff c0 ff ff ff  c0 ff ff ff 54 bb 7a b7  |............T.z.|
<lots of garbage snipped>
000006a0  50 f0 5e 08 f4 3f 8b b7  c8 96 85 bf c9 87 04 08  |P.^..?..........|
000006b0  38 f0 5e 08 20 f0 5e 08  38 f0 5e 08 50 f0 5e 08  |8.^. .^.8.^.P.^.|
000006c0  b0 87 04 08 d0 84 04 08  28 97 85 bf b5 5b 78 b7  |........(....[x.|
000006d0  01 00 00 00 54 97 85 bf  5c 97 85 bf 98 2a 8e b7  |....T...\....*..|

If mem1 is a pointer to a string, mem1=... does not modify the string. It alters the pointer.

Moderator's Comments:
pointer = ... cannot, will not, absolutely not, at all, ever, even for the sake of argument, ever do anything except alter the pointer.


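Also: free() requires the exact same pointer that malloc() gave you. If you give it a different pointer, even a slightly different one, the behavior is undefined and it will usually crash.

If you give it the same pointer twice, it will usually crash too. (Which means that if we called free() on all four of our pointers right now, the program would crash: after mem1=mem3, mem1 and mem3 hold the same address, so that block would be freed twice, and the block mem1 originally pointed to could never be freed at all.)

If you write beyond the end of the memory you allocated, it will probably crash. (Because, as you can see in the dump, if you write beyond that you're stomping on top of something else.)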

Last edited by Corona688; 04-02-2014 at 03:15 PM..
# 14  
Old 04-02-2014
We've had this conversation four or five times. You have a hard time telling when you are modifying the pointer instead of its contents. But the compiler can tell you that easily, so I have a suggestion:

Whenever you do char *mem=malloc(300); ...do this instead: char * const mem=malloc(300); This will make the mistake you keep repeating a compiler error -- "assignment of read-only variable". (It of course sets it, once, when you declare it. But thereafter it's considered 'fixed'.)

You are still free to modify its contents, like with strcpy(mem, originalstring) or mem[5]='Q' or *(mem+5)=37 or any other way you please. But if you try to alter the pointer, that's an error. You can understand "assignment of read-only variable" to mean "whoops, I mixed up a pointer's value and its contents".

If you find yourself needing complicated circumlocutions to get around 'assignment of read-only variable', you can be fairly sure you've taken a wrong turn somewhere.

Last edited by Corona688; 04-02-2014 at 03:34 PM..