Understanding String Literal Initialization in C99

Asked 1 year ago, Updated 1 year ago, 41 views

In C, you can initialize a char array as follows:

int main(void)
{
    char str[] = "abc";
    ...
}

This is based on paragraph 14 of 6.7.8 Initialization in the C99 standard:

An array of character type may be initialized by a character string literal, optionally enclosed in braces. Successive characters of the character string literal (including the terminating null character if there is room or if the array is of unknown size) initialize the elements of the array.

On the other hand, paragraph 5 of 6.4.5 String literals says:

(...) The multibyte character sequence is then used to initialize an array of static storage duration and length just sufficient to contain the sequence. (...)

Does this rule also apply when a string literal is used as an initializer?

In other words, when compiling the code at the top, does a strictly conforming compiler have to place "abc" in static storage before initializing str?

If you compile code like the one at the top with gcc -std=c99 -O0 -S, the corresponding part of the output is

movl $6513249, 24(%esp)

So the array is initialized with an immediate value (6513249 = 0x636261), and this does not seem to meet the requirements of 6.4.5.

c

2022-09-30 21:12

2 Answers

C99 6.4.5 p5 states that a string literal is used to initialize an array of static storage duration, but it does not specify when that initialization takes place.
In other words, the static array does not have to be initialized at the point where the literal appears. For example, with the ELF format, I think it happens when the program is loaded into memory.
Also, nothing says that initializing an array variable has to refer to that static array.

Therefore, this code can be interpreted as follows:

As for 3, it may seem suspicious that the literal is removed even when optimization is disabled in gcc, but that is in fact what happens.
For example, if you put a statement such as "def"; somewhere in the function, no storage corresponding to "def" is emitted.
Looking beyond strings, a division by the constant 10 is expanded into code that does not use a division instruction.
I think these transformations are built into very basic parts of the compiler and are applied even when optimization is disabled.


2022-09-30 21:12

clang seems different from gcc.

$ clang --version
Ubuntu clang version 3.7.1-1ubuntu3 (tags/RELEASE_371/final) (based on LLVM 3.7.1)
Target: x86_64-pc-linux-gnu

$ clang -std=c99 -O0 -S a.c

  movl .Lmain.str, %eax
       :

  .section .rodata.str1.1,"aMS",@progbits,1
  .Lmain.str:
      .asciz "abc"
      .size .Lmain.str, 4

$ clang -std=c99 -O0 a.c
$ objdump -s -j .rodata a.out
# or
$ readelf -x .rodata a.out

 4005a0 01000200 61626300 25730a00  ....abc.%s..

The string literal is placed in the .rodata (read-only data) section.

This is off topic, but the gcc -O0 option does not actually disable every optimization.

$ gcc -Q --help=optimizers -O0
             :
  -faggressive-loop-optimizations [enabled]
  -falign-functions [disabled]
             :
  -fasynchronous-unwind-tables [enabled]
  -fauto-inc-dec [enabled]
             :


2022-09-30 21:12
