Merge two strings by overlapped region (Programming forum, post 302896120 by Corona688, Friday 4th of April 2014, 12:56:40 PM)
Quote:
Originally Posted by yifangt
Questions: 1) As no malloc() was used to allocate memory for out, str1 and str2 in your strmerg() function. Is it because they are all declared const char* ?
Look closer, they're not all const char *.

It's nothing to do with malloc.

The two which are 'const char *' are set that way because I don't need to write to their contents.

The one which isn't 'const char *' is that way because I need to write to its contents.

I could have made them all plain 'char *' and it would have still worked. The 'const' is a reminder to me, the programmer, of which arguments I should be writing to and which I shouldn't. It's also a safety mechanism: if I try to write through a 'const' pointer, or pass a 'const' string where a writable one is expected, the compiler will complain.
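A rough sketch of what I mean. The function copy_upper is made up for illustration, it isn't from your code or mine; it only shows which parameter gets 'const' and which doesn't:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

// Hypothetical helper: copies 'src' into 'dst', converting to upper case.
// 'src' is const char * because it is only read.
// 'dst' is plain char * because its contents are written.
// The caller must make 'dst' big enough for 'src' plus the terminating NUL.
char *copy_upper(char *dst, const char *src)
{
        size_t i;

        assert(dst != NULL && src != NULL);

        for (i = 0; src[i] != '\0'; i++) {
                dst[i] = (char)toupper((unsigned char)src[i]);
                // src[i] = 'x';  // would not compile: src points to const data
        }
        dst[i] = '\0';
        return dst;
}
```

Calling it with a writable buffer, e.g. char buf[16]; copy_upper(buf, "gatat");, is fine. Swapping the arguments so that a 'const char *' lands in the 'dst' slot should make the compiler complain, which is the whole point.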

Quote:
2) In main() char out[4096] was used to hold the merged string, in practice, string 1 and string 2 can be as large as mega-bases. I thought dynamic allocation of the memory for out using a pointer would be better. It seems there is no such thing in C to dynamically allocate space for the merged string based on the two inputs (???).
...and having said so, you go ahead and do the "impossible" in the same breath :)

I avoided malloc because pointers still confuse you. I just thought it was more straightforward to do it that way.

Quote:
Can I ask if there is anything inappropriate with my modification of the main() function?
Actually that looks fine. It allocates more memory than it needs to so isn't ideal, but will do what you want it to do, and won't crash.

Here's what I would do:

Code:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <sys/types.h> // ssize_t (POSIX, not standard C)

// Returns newly-allocated memory which you must 'free' later.
char *strmerg(const char *str1, const char *str2) {
        // Note:  size_t and ssize_t are technically more correct for numbers holding string length,
        // particularly for LARGE strings.  I had to use ssize_t for 'a' since I need a negative number
        // for my while loop.

        // Position in 'str1'.  Counts backwards from end.
        // Each loop, str1+a will be 1 longer, i.e.  "t", "at", "tat", "atat", "Gatat"...
        ssize_t a=strlen(str1)-1;
        // How many bytes of overlap have been found.  0 if none.
        size_t found=0;
        // How many characters of overlap to check for.  Counts up from 1.
        size_t b=1;

        while(a >= 0) // Loop until a is negative
        {
                // Compare the last 'b' bytes of str1, to the first 'b' bytes of str2.
                // Using strncmp instead of strcmp prevents it from checking ALL of str2,
                // because strncmp takes maximum length as an argument.
                // It will return 0 if they are equal.
                if(str1[a] == str2[0]) // Optimization: only call strncmp when the first bytes match
                        if(strncmp(str1+a, str2, b) == 0) found=b;
                a--;
                b++;
        }

        b=strlen(str1); // don't count 5 megabases more times than necessary

        // Code block only because some compilers require you to declare
        // all variables at the top of a code block
        {
                char * const out=malloc(sizeof(char)*(b+strlen(str2+found)+1));
                if(out == NULL) return(NULL); // Out of memory; the caller must check for NULL
                strcpy(out, str1);
                strcpy(out+b, str2+found); // Faster than strcat
                return(out);
        }
}

int main(int argc, char *argv[])
{
        if (argc != 3) {
                fprintf(stderr, "Error!\nUsage: %s string1 string2\n", argv[0]);
                exit(EXIT_FAILURE);
        }
        else
        {
                char * const out=strmerg(argv[1], argv[2]);
                if(out == NULL) {
                        fprintf(stderr, "Out of memory\n");
                        exit(EXIT_FAILURE);
                }
                printf("string 1 is %s\n", argv[1]);
                printf("string 2 is %s\n", argv[2]);
                printf("Output is %s\n", out);
                free(out);
        }
        return(0);
}
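
If the ssize_t counter bothers you, the overlap search can also be written with size_t alone by counting the overlap length up instead of counting a position down. This is only a sketch under my own assumptions: overlap_len is a hypothetical name, not part of the code above, and I use memcmp because both lengths are already known.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Hypothetical helper: length of the longest suffix of str1
// that is also a prefix of str2.  Returns 0 if there is no overlap.
size_t overlap_len(const char *str1, const char *str2)
{
        size_t len1, len2, n, found = 0;

        assert(str1 != NULL && str2 != NULL);

        len1 = strlen(str1);
        len2 = strlen(str2);

        // An overlap can never be longer than the shorter string.
        for (n = 1; n <= len1 && n <= len2; n++) {
                // Compare the last n bytes of str1 to the first n bytes of str2.
                if (memcmp(str1 + (len1 - n), str2, n) == 0)
                        found = n; // Keep the longest match seen so far
        }
        return found;
}
```

With that, the merged string needs exactly strlen(str1) + strlen(str2) - overlap_len(str1, str2) + 1 bytes, the +1 being for the terminating NUL.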


Last edited by Corona688; 04-04-2014 at 02:13 PM..