./a.out readonly strings

Ever wondered why this program crashes?

int main(void) {
    char *str = "NonWritable";
    str[0] = 'X'; /* << Booom */
    return 0;
}

Compiling the same program with gcc's -fwritable-strings option, however, did not result in a segmentation fault.

Let me warn you before you try it yourself: GCC removed support for writable strings in version 4.0, so newer versions no longer accept the -fwritable-strings option. Use named character arrays when you need a writable string.

What does that mean?

It means you should write char str[] = {"Writable"}; instead of char * str = "Non Writable";

Aha, now I get it… somehow char * str is marked read-only? But where and how?

OK, let's dig in…

What Kinds of C Statements End Up in Which Segments? (Courtesy: “Expert C Programming”)

Segments? Does this have to do with segmentation and paging in OS architecture? Hell no! It has to do with how executables such as ./a.out are organized and laid out. Many authors who want to distinguish the two call these parts sections rather than segments, but for now I will keep calling them segments.

Try this: Linux has a utility named size, which prints the sizes of the various segments (or sections) of an executable.

For example: size ./a.out

text     data     bss       dec    hex   filename
1167  +   528  +  60032  =  61727  f11f  ./a.out

The actual size of ./a.out is 9324 bytes (about 9 KB), as reported by the Linux equivalent of dir (for example, ls -l).

What is the size program printing? It is printing the sizes of the segments into which the executable ./a.out is divided. As you might have guessed, they are text, data and bss.

Here is the example copied from “Expert C Programming”:

[Figure: screenshot-4]

Here is the summary:
- All initialized global/static variables go into the data segment.

- All uninitialized global/static variables go into the BSS segment. Since such variables are initialized to zero by default, there is no need to store their contents in the file; recording their total size is enough. That is probably why the BSS segment was created.

- The string literal in char * str = "Non Writable" goes into the text segment, which is marked read-only when the loader maps it into memory. That is why the above program crashes!

Now I am wondering how we can write a self-mutating program when the text segment is marked read-only.

- Variables declared on the stack are not captured in any segment. Thus declaring a big array on the stack does not increase your a.out size.

But will declaring a huge array as global or static increase a.out size?

Let's see:

Declare a global array char arr[], compile, run size, and also check the actual size of the file a.out:

size.c:

char arr[64000];
int main(void) {
    return 0;
}

Running gcc size.c; size ./a.out gave this output:

declaration          text  data      bss      dec     hex  filename | actual size of ./a.out
char arr[64000];     1012   484    64032    65528    fff8  a.out    | 9121 bytes
char arr[6400000];   1012   484  6400032  6401528  61adf8  a.out    | 9121 bytes

Clearly, just declaring a big array does not change the actual size of the binary.

However, declaring and initializing the array is a different story:

char darr[6400000] = {"test"};
int main(void) {
    return 0;
}

text     data  bss      dec     hex  filename | actual size of ./a.out
1012  6400512   16  6401540  61ae04  ./a.out  | 6409145 bytes

That makes the executable 6409145 bytes, ouch! Compare it with the 9121 bytes above: the difference, 6400024 bytes, is almost exactly the size of the array.

The only change was that the array was initialized, which moved it into the data segment (instead of the BSS segment). Since the BSS segment only holds variables that don't have a value yet, the file doesn't need to store an image of these variables, only their total size.

Until next time…

