Knowing the size and location of variables in a C program


 
# 1  
Old 07-17-2013

So I need some help with this. Pardon me if I'm posting in the wrong forum; after some googling for my answer and finding nothing, I found this forum, and it seemed appropriate for what I was seeking. I just didn't find a forum that concerned the use of GDB. I'm learning to use the C language and GDB. What I don't understand is how the computer knows how big each piece of a program is in memory, and how I could find my variables in memory using GDB.

For example how does the computer know that the disassembled instructions from main() are <main+##>? Is there a flag between each variable in memory on the stack? Or does the CPU reference the text segment with the variable in memory to know where a variable begins and ends?

I mean, if all memory is just numbered, how can anyone, including the CPU, know where a word or giant or whatever starts and ends?

If I wanted to find my variable in memory after setting a breakpoint and reading the $esp register, how would I know where my variables began and ended?

When I use the examine command "x" I don't know how to know where my variable begins and ends. Would it be the $ESP register on the stack minus the word size of my variable? $EIP shows how many bytes from main and the previous instruction when you disassemble something but everything on the stack is just numbers.

Any help would be much appreciated!
# 2  
Old 07-17-2013
Quote:
Originally Posted by Cambria
So I need some help with this. Pardon me if I'm posting in the wrong forum; after some googling for my answer and finding nothing, I found this forum, and it seemed appropriate for what I was seeking. I just didn't find a forum that concerned the use of GDB. I'm learning to use the C language and GDB. What I don't understand is how the computer knows how big each piece of a program is in memory, and how I could find my variables in memory using GDB.
To get nice debugging information like that, you have to build the executable with debugging information (e.g. gcc's -g or -ggdb option). This embeds lots of offsets and labels inside the program file for gdb's convenience.

This is also why gdb has trouble when it steps into code outside your program, like libc... Libraries are probably not built with debugging information, so details about their insides will be very limited.

Quote:
I mean, if all memory is just numbered, how can anyone, including the CPU, know where a word or giant or whatever starts and ends?
To put it bluntly -- it doesn't. In the compiled code, variables all become hardcoded offsets in the end, either from the stack pointer or into a data segment. Without debugging information, you're left with detective work.

Quote:
If I wanted to find my variable in memory after setting a breakpoint and reading the $esp register, how would I know where my variables began and ended?
If your executable wasn't built with debugging info, that'd mean detective work.
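
As a rough illustration of what the debugging information buys you, here is a small sketch. The file names vars.c and main.c and the function name sum_locals are made up for this example. The gdb commands in the comment are standard ones, but the addresses and sizes you see will depend on your machine and compiler.

```c
/* vars.c (hypothetical file name).
 * Build with debugging info and no optimization, together with a main()
 * of your own that calls sum_locals():
 *     gcc -g -O0 vars.c main.c
 *
 * Inside gdb, after "break sum_locals", "run" and a few "next" steps:
 *     info locals           list the locals and their current values
 *     print &count          address of count
 *     print sizeof(count)   size of count in bytes
 *     x/4xb &count          dump 4 bytes there (the usual size of int on x86)
 *     info frame            show where this stack frame lives
 */
#include <stdio.h>

int sum_locals(void)
{
    int count = 3;      /* typically 4 bytes on x86 */
    char tag = 'A';     /* always 1 byte */
    short small = 7;    /* typically 2 bytes */

    /* Print what gdb would report, so the two can be compared. */
    printf("count: %zu bytes at %p\n", sizeof count, (void *)&count);
    printf("tag:   %zu bytes at %p\n", sizeof tag, (void *)&tag);
    printf("small: %zu bytes at %p\n", sizeof small, (void *)&small);

    return count + tag + small;   /* 3 + 65 + 7 when 'A' is ASCII */
}
```

Without -g, gdb still runs the program, but "info locals" and "print &count" have no names to work from, which is where the detective work starts.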

Last edited by Corona688; 07-17-2013 at 05:38 PM..
# 3  
Old 07-17-2013
Wow, many questions. More magic than mechanism, it turns out. The sizeof every variable is pretty predictable, and the packing can be discovered from the offsets of pointers to variables. Let's say you give main an automatic int variable. The compiler just pulls the stack pointer down 4 places (stacks usually grow down from FFFFFFFF or whatever) and calls that space the int. If it were static, the compiler would instead reserve the next 4 bytes of the static data area (data or BSS, not the stack) and remember that offset as the int.

gdb finds structures for linking that identify most variables, and clues for debugging left by the compiler, if not stripped. These also contain pointers to linkable subroutines like main().

There is no punctuation in modern computers; they go by count/size. The CPU does not know where variables begin and end, which is why you sometimes get SEGV faults when you overreach, unless you just read or write adjacent data of your own. C is like an assembly language for a computer that does not exist but is close enough for everyone to adapt to. Stack and heap pointers might be in registers, or in memory outside the CPU; it doesn't matter as long as the compiler knows.
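
To make the count/size point concrete, here is a minimal sketch (the function name element_stride is invented for illustration). Nothing in memory marks where one int ends and the next begins; the compiler knows sizeof(int) and generates addresses that many bytes apart.

```c
#include <stddef.h>
#include <stdio.h>

/* Returns the distance in bytes between two neighbouring ints in an
 * array. There is no marker between them; the gap is just sizeof(int). */
size_t element_stride(void)
{
    int values[4] = {10, 20, 30, 40};
    unsigned char *first  = (unsigned char *)&values[0];
    unsigned char *second = (unsigned char *)&values[1];

    printf("sizeof(char)=%zu sizeof(int)=%zu sizeof(long)=%zu sizeof(void *)=%zu\n",
           sizeof(char), sizeof(int), sizeof(long), sizeof(void *));
    printf("values[0] at %p, values[1] at %p\n",
           (void *)first, (void *)second);

    return (size_t)(second - first);
}
```

The sizes printed vary between platforms (long is a common one to differ), which is why they should be checked rather than assumed.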

Some computers cannot access words that are not aligned, but the x86 lets words float free. For speed it helps to align them to multiples of 2, 4, 8 or whatever so one RAM fetch does the trick. Compilers often add padding so things hit those boundaries. It may take extra processing to compute with a misaligned word.
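
The padding can be seen directly with offsetof. This is only a sketch; struct sample and padding_after_tag are names invented here, and the exact amount of padding is up to the compiler and target.

```c
#include <stddef.h>
#include <stdio.h>

/* A char followed by an int: most compilers insert padding after the
 * char so that the int starts on its natural alignment boundary. */
struct sample {
    char tag;
    int  number;
};

/* Number of padding bytes the compiler placed between tag and number. */
size_t padding_after_tag(void)
{
    size_t gap = offsetof(struct sample, number) - sizeof(char);

    printf("offset of tag         = %zu\n", offsetof(struct sample, tag));
    printf("offset of number      = %zu\n", offsetof(struct sample, number));
    printf("sizeof(struct sample) = %zu\n", sizeof(struct sample));
    printf("padding after tag     = %zu\n", gap);
    return gap;
}
```

When you examine a struct in raw memory with x, those padding bytes show up as whatever junk was there before, so they are worth knowing about.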

Byte order matters too. Big-endian means the most significant byte goes at the lowest address; little-endian means the least significant byte does. x86 is little-endian; SPARC and IP network byte order are big-endian, though some SPARC can change to little-endian (a slow process, I was told), perhaps to emulate an x86. Keep this in mind when interpreting the stack or heap.
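
A quick way to see the byte order of the machine you are on is to look at the raw bytes of a known value. The helper names is_little_endian and store_big_endian below are made up for this sketch.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 on a little-endian machine and 0 otherwise, by checking
 * which byte of a known value sits at the lowest address. */
int is_little_endian(void)
{
    uint32_t value = 0x11223344u;
    unsigned char bytes[sizeof value];

    memcpy(bytes, &value, sizeof value);   /* copy out the raw bytes */
    printf("bytes in memory: %02x %02x %02x %02x\n",
           bytes[0], bytes[1], bytes[2], bytes[3]);
    return bytes[0] == 0x44;               /* least significant byte first? */
}

/* Writes value into out[] most significant byte first (big-endian,
 * the order used on the network), whatever the host order is. */
void store_big_endian(uint32_t value, unsigned char out[4])
{
    out[0] = (unsigned char)(value >> 24);
    out[1] = (unsigned char)(value >> 16);
    out[2] = (unsigned char)(value >> 8);
    out[3] = (unsigned char)value;
}
```

On x86 the first function prints 44 33 22 11, which is the same order gdb's x/4xb shows for an int holding 0x11223344.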

Routines are relocatable, so the loader can place main wherever it wants and call it. Part of run-time linking is giving the code the right pointers to actual code and data. Automatic variables are addressed relative to the stack, so it can be faster and simpler. The usual model is that all the code goes at the bottom, then the constants, and finally the initialized and uninitialized static variables, with the heap above them, but dynamic loading of libraries may layer in more code, constants and variables. Memory is usually virtual, and often, to segregate code from data, they go on different pages with different flags, so code is not writable and data is not executable.
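
One way to get a feel for that layout is to print an address from each region and compare them. This is a sketch with invented names (print_layout and the two globals); the actual ordering and the addresses depend on the OS, the loader and address randomization, so nothing here should be read as guaranteed.

```c
#include <stdio.h>
#include <stdlib.h>

int initialized_global = 42;   /* usually placed in the data segment */
int uninitialized_global;      /* usually placed in BSS (zero-filled) */

/* Prints one address from each region so they can be compared.
 * Returns 0 on success, -1 if malloc fails. */
int print_layout(void)
{
    int automatic = 0;                        /* on the stack */
    const char *literal = "constant text";    /* usually read-only data */
    int *dynamic = malloc(sizeof *dynamic);   /* on the heap */

    if (dynamic == NULL)
        return -1;

    /* Converting a function pointer to void * is not strict ISO C,
     * but it works on the usual UNIX compilers. */
    printf("code     %p\n", (void *)print_layout);
    printf("constant %p\n", (void *)literal);
    printf("data     %p\n", (void *)&initialized_global);
    printf("bss      %p\n", (void *)&uninitialized_global);
    printf("heap     %p\n", (void *)dynamic);
    printf("stack    %p\n", (void *)&automatic);

    free(dynamic);
    return 0;
}
```

Running it a few times usually shows the stack and heap addresses moving between runs while staying far apart from each other.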

Data breakpoints (watchpoints) are managed by running the code a bit at a time and watching the location, or by hardware debug registers where the CPU has them. Code breakpoints are done by substituting a trap instruction at the breakpoint, saving aside the original code. Ditto for stepping.

I like to use 'where' to examine the stack for calls. Mostly I do not use GDB; I use code with careful formatting, structure, error checking and logging. Sometimes I add debug printouts to narrow a problem. Sometimes I use tusc/truss/strace to trace the running process (very educational about UNIX). Usually it debugs very quickly.

I do use GDB to find out where a core died. I have even written cron scripts to pick up core files, gdb them, send mail, compress them and stash them in /tmp so new core files can be detected. You never know how many core dumps happen in prod if you do not look!

So, as I said, not much mechanism, lots of smarts about how things work. Compilers may also have calls mark the stack so it is easy to 'where'. Stacks may have a mix of hardware, CPU-laid-out data and programmer automatic variables, but some systems have two stacks, one for the automatic stuff and one for the CPU-defined stuff. Otherwise, if the CPU starts loading registers from overwritten automatic data, anything can happen, usually a fault on the process, which passes the CPU to a signal handler.

The amount of stuff on the stack can vary a lot, depending on whether variables are passed in registers and whether the stack frame is for a more significant change, not intra-thread but inter-thread or inter-process (like a disk controller interrupt). Sometimes bits in registers or in the call itself control how much CPU data goes onto the stack for a call. Good luck reading the stack barefoot.

Mostly, just remember that if an automatic is overwritten, look for an array declared later being written past the end, or an array declared before being written past the beginning. Hackers make a great living off of finding programs that do not limit how much they read, and passing them carefully structured oversized input. So, never use gets(), use scanf() with great care, and make do with fgets(), getc(), fread() when possible (in the FILE* world). Less to debug!
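
For the fgets() advice, a bounded line reader might look something like this. The name read_line and its return convention are just choices made for this sketch.

```c
#include <stdio.h>
#include <string.h>

/* Reads one line from stream into buf, never writing more than size
 * bytes, and strips the trailing newline if there is one.
 * Returns the length of the stored string, or -1 on EOF or error.
 * If the line is longer than the buffer, the rest stays in the stream
 * for the next call instead of overrunning buf.
 * Assumes size fits in an int, which fgets() requires. */
long read_line(FILE *stream, char *buf, size_t size)
{
    if (size == 0 || fgets(buf, (int)size, stream) == NULL)
        return -1;

    size_t len = strcspn(buf, "\n");   /* index of the newline, or of the end */
    buf[len] = '\0';
    return (long)len;
}
```

Unlike gets(), the caller always passes the buffer size, so an overlong input line is truncated rather than written over whatever automatic variable sits next to the buffer.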