C: inputting string of unknown length


 
# 1  
Old 11-11-2019

I realize this general issue (inputting strings of variable length in C) has been addressed in myriad locations before, but I'm interested in knowing why my specific approach is not working. (BTW I'm intentionally keeping the size increments small so that I can more easily follow what's going on. After it works on a small scale, I can increase the size to something more reasonable. The main motivation for this approach is that I want to increase the size by a fixed increment, not by doubling the allocated memory each time.)

Here is the code:
Code:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

#define SIZE 4

int main(void)
{
	int mem = SIZE;
	char *str = malloc(mem); // let's keep str pointing to beginning of string...
	char *next_read = str; // ...and next_read pointing to where next character should go

	fgets(next_read, mem, stdin);
	next_read--; // so that after we add SIZE to pointer, it points to current '\0'

	while(str[strlen(str)-1] != '\n') // if we got whole string, the last char will be '\n'
	{
		mem += SIZE;
		str = realloc(str, mem); 
		next_read += SIZE;
		fgets(next_read, SIZE+1, stdin); // read the rest (hopefully) of the line into the new space
		printf("str is now %s\n", str);
	}

	printf("final str is %s", str);
	// free(str);
	return 0;
}

The code works fine for short strings, but stops working (the program seems to get stuck) if the string is longer:
Code:
bruno@thinkpad:~/Desktop434$ gcc getstring.c 
bruno@thinkpad:~/Desktop436$ ./a.out 
I love linux
str is now I love 
str is now I love linu
str is now I love linux

final str is I love linux
bruno@thinkpad:~/Desktop436$ ./a.out 
I love linux and the C programming language
str is now I love 
str is now I love linu
str is now I love linux an
str is now I love linux and th
str is now I love linux and the C 
str is now I love linux and the C 
str is now I love linux and the C 
str is now I love linux and the C 
str is now I love linux and the C 
str is now I love linux and the C 
str is now I love linux and the C

I'm a newbie in C and would like to learn something by debugging this.

Last edited by DevuanFan; 11-11-2019 at 06:00 PM..
# 2  
Old 11-11-2019
realloc() is not guaranteed to reallocate in situ, which is why you do
Code:
str = realloc(str, mem);

rather than
Code:
realloc(str, mem);

The next_read variable should be recomputed after each realloc() with something like
Code:
next_read = str + strlen(str);

Basically, once realloc() moved the block, you were still writing into the old area of memory while str was pointing to the new area.


Andrew
These 4 Users Gave Thanks to apmcd47 For This Post:
# 3  
Old 11-11-2019
Andrew, thank you so much for your beautiful explanation. I understand the issue and can confirm that this works exactly as expected:

Code:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

#define SIZE 4

int main(void)
{
	int mem = SIZE;
	char *str = malloc(mem); // let's keep str pointing to beginning of string...
	char *next_read = str; // ...and next_read pointing to where next character should go

	fgets(next_read, mem, stdin);
	printf("str is now %s\n", str);

	while(str[strlen(str)-1] != '\n') // if we got whole string, the last char will be '\n'
	{
		mem += SIZE;
		str = realloc(str, mem); 
		next_read = str + strlen(str);
		fgets(next_read, SIZE+1, stdin); // read the rest (hopefully) of the line into the new space
		printf("str is now %s\n", str);
	}

	printf("final str is %s", str);
	// free(str);
	return 0;
}

Code:
bruno@thinkpad:~/Desktop444$ gcc getstring.c
bruno@thinkpad:~/Desktop445$ ./a.out 
I love linux and the C programming language
str is now I l
str is now I love 
str is now I love linu
str is now I love linux an
str is now I love linux and th
str is now I love linux and the C 
str is now I love linux and the C prog
str is now I love linux and the C programm
str is now I love linux and the C programming 
str is now I love linux and the C programming lang
str is now I love linux and the C programming language
str is now I love linux and the C programming language

final str is I love linux and the C programming language

It is strange that realloc() reallocates in situ at first and then stops doing so. Knowing that there are no guarantees was the key to the mystery. THANK YOU!
This User Gave Thanks to DevuanFan For This Post:
# 4  
Old 11-12-2019
The problem relates to memory management. The OS sets a "start point" and an "end point" for a process's working set (memory) when the process begins.

There are many flavors of the malloc (and realloc) routines, many of them based on Doug Lea's original malloc. His version calls brk() when it determines that a request would go beyond the bounds of the memory it currently has. That brk() call may change the end point of the process, but only when your malloc asks for more and bumps heads with the end of the data segment or the existing stack. So the start and end of your string may be moved.

The size command (size followed by the name of your executable) shows the sizes of the program's text, data, and bss segments.

Tutorial with great examples:

Memory Layout of C Programs - GeeksforGeeks
This User Gave Thanks to jim mcnamara For This Post:
# 5  
Old 11-12-2019
Thank you, Jim. That's a great tutorial.

In case this helps other newbies, I moved the getstring function to a separate file containing custom functions. Also, I changed getstring's logic to go ahead and double the allocated memory each time more space is required, so that long input is read with fewer reallocations. When the entire string has been read, a final call to realloc shrinks the allocated memory so that it is just enough to hold the string--e.g., 13 bytes for "I love linux" (there's always one more byte needed than the number of characters because of the '\0' string terminator).

Code:
// filename: mylib.c

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include "mylib.h"

#define SIZE 256

char *getstring(void)   // this function reads a line from standard input,
{                    // chops off the final \n, then shrinks allocated memory to exact bytes needed to hold the string
	int mem = SIZE;
	char *str = malloc(mem); // I'll keep str pointing to beginning of string...
	if (str == NULL)
		report_alloc_error();
	char *next_char = str; // ...and next_char pointing to where next character should go

	if (fgets(next_char, mem, stdin) == NULL)
		str[0] = '\0'; // nothing was read (EOF or error), so treat it as an empty line

	size_t len = strlen(str);
	while (len > 0 && str[len - 1] != '\n') // when we get whole line, last char will be '\n'
	{
		mem *= 2;
		str = realloc(str, mem); 
		if (str == NULL)
			report_alloc_error();
		next_char = str + len;
		if (fgets(next_char, mem/2 + 1, stdin) == NULL)
		{
			*next_char = '\0'; // input ended without a newline; keep what we have
			break;
		}
		len += strlen(next_char);
	}

	// chop off trailing newline from string, if there is one
	if (len > 0 && str[len - 1] == '\n')
		str[--len] = '\0';

	// trim mem down to exact bytes needed to hold string
	mem = len + 1;
	str = realloc(str, mem);
	if (str == NULL)
		report_alloc_error();

	// for debugging:
	//printf("final str is %s\n", str);
	//printf("final mem is %d\n", mem);
	
	return str;
}

void clean_stdin(void)
{
	int c;
	do 
	{
		c = getchar();
	} while (c != '\n' && c != EOF);
}

void report_alloc_error(void)
{
	fprintf(stderr, "Memory allocation failed. Exiting.\n"); // errors go to stderr, not stdout
	exit(EXIT_FAILURE);
}

Code:
// filename: mylib.h

#ifndef MYLIB_H_ // This guards against including this header more than once
#define MYLIB_H_

char *getstring(void);
void clean_stdin(void); 
void report_alloc_error(void);

#endif

Code:
// filename: example.c
// to compile: gcc -o example example.c mylib.c

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include "mylib.h"

int main(void)
{
	char c;
	int i;
	float f;
	char *s;

	printf("Enter a character: ");
	scanf(" %c", &c); // the space tells scanf to ignore leading white space (if that's what you want).
	clean_stdin();	  // note that for format specifiers other than %c, scanf automatically ignores leading whitespace.

	printf("Enter an integer: ");
	scanf("%d", &i);
	clean_stdin();

	printf("Enter a float: ");
	scanf("%f", &f);
	clean_stdin();

	printf("Enter a string: ");
	s = getstring();

	printf("\nYour character: %c\n", c);
	printf("Your integer: %d\n", i);
	printf("Your float: %f\n", f);
	printf("Your string: %s\n", s);
	free(s);

	return 0;
}

Code:
$ gcc -o example example.c mylib.c
$ ./example
Enter a character: y some garbage 38.9
Enter an integer: 3 more garbage 89
Enter a float: 3.14159 7389junk
Enter a string: I love C programming, yes I do!

Your character: y
Your integer: 3
Your float: 3.141590
Your string: I love C programming, yes I do!


Last edited by DevuanFan; 11-12-2019 at 05:58 PM..
This User Gave Thanks to DevuanFan For This Post:
# 6  
Old 01-01-2020
I'd just like to add that a good way to use realloc is to assign its result to a temp variable rather than to the variable holding the original pointer. That way, if realloc fails, you haven't lost the pointer to your original memory, i.e.


Code:
char* tmp = realloc (str, len);
if (tmp == NULL) {
    free (str); // str is still valid here, because a failed realloc leaves the original block untouched
    return NULL; // or something to signal an allocation failure
}

str = tmp; // the reallocation worked and now we assign tmp to str

-Greg.