Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suspected bug in elaboration phase supporting -fbitfields introducing sequence point violation #395

Closed
tduckwo opened this issue Apr 30, 2021 · 1 comment

Comments

@tduckwo
Copy link

tduckwo commented Apr 30, 2021

Consider the following C source, bug.c. I believe it should result in the
printing of 0, but with CompCert it prints 1.

int printf(char *, ...);

struct {
  int m1 : 4;
  int m2 : 4;
} volatile s = {1, 1};

int f() {
  s.m2 = 0;
  return 0;
}

int main() {
  s.m1 = f();
  printf("%d\n", s.m2);
  return 0;
}

Based on my understanding of C99, there's a trivial sequencing of the evaluation
of f() before the assignment of its result to s.m1 (which, to my knowledge,
does not imply any read of the previous value of s.m1 or s.m2 except perhaps
during the execution of f). This implies that s.m2 is set to 0 during the
evaluation of f(), and that s.m1 is set to 0 thereafter as part of the
assignment in main. Thus, we expect that the read of s.m2 as part of the
call to printf should return 0, and that 0 should be printed to standard
out. Using CompCert 3.8 compiled for x86_64-linux, we may observe the
following (this also appears to hold for a newer revision,
e4542668e6d348e0300e76bb77105af24aff4233).

$ ccomp -fbitfields -interp -quiet bug.c
1

$ ccomp -fbitfields -o bug bug.c

$ ./bug
1

$ gcc -o nobug bug.c

$ ./nobug
0

$ ccomp -fbitfields -interp -quiet -random bug.c
0

Using -dparse, we arrive at the following elaborated program.

extern void __builtin_debug(int kind, ...);

int printf(char * , ...);

struct _139;

struct _139 {
  unsigned char __bf1;
};

struct _139 volatile s =
  {(unsigned char) ((1 & 0xFU) << 0 | (1 & 0xFU) << 4), };

int f(void)
{
  s.__bf1 = s.__bf1 & ~0xF0U | (unsigned int) 0 << 4U & 0xF0U;
  return 0;
}

int main(void)
{
  s.__bf1 = s.__bf1 & ~0xFU | (unsigned int) f() << 0U & 0xFU;
  printf("%d\n", (int) (s.__bf1 << 24) >> 28);
  return 0;
  return 0;
}

The bitfield elaboration appears to introduce a sequence point violation.
Depending on the order in which s.__bf1 is read and written (read on the
left-hand side of the | operator in main, and written by the call to f in
main on the other side of |), the value printed may be either 0 or 1.

If s.__bf1 is read before f() is evaluated, then its value is 0x11. The
masking and logical or with 0 then results in a final value of 0x10. The
shifts performed immediately before the call to printf, then, produce a value
of 1. This is CompCert's usual interpretation.

If, on the other hand, f() is evaluated before s.__bf1, then the value read
for s.__bf1 in the first line of main is 0x01, which means that s.__bf1 & ~0xFU evaluates to 0. As a result, the whole of s.__bf1 is then assigned a
final value of 0x00, and the shifts and call to printf write 0 to standard
out. CompCert's interpreter, using the -random flag, will sometimes take this
interpretation.

Using -dclight may make the issue somewhat more clear by offering a different
perspective.

struct _139;
struct _139 {
  unsigned char __bf1;
};

extern unsigned int __compcert_va_int32(void *);
extern unsigned long long __compcert_va_int64(void *);
extern double __compcert_va_float64(void *);
extern void *__compcert_va_composite(void *, unsigned long long);
extern long long __compcert_i64_dtos(double);
extern unsigned long long __compcert_i64_dtou(double);
extern double __compcert_i64_stod(long long);
extern double __compcert_i64_utod(unsigned long long);
extern float __compcert_i64_stof(long long);
extern float __compcert_i64_utof(unsigned long long);
extern long long __compcert_i64_sdiv(long long, long long);
extern unsigned long long __compcert_i64_udiv(unsigned long long, unsigned long long);
extern long long __compcert_i64_smod(long long, long long);
extern unsigned long long __compcert_i64_umod(unsigned long long, unsigned long long);
extern long long __compcert_i64_shl(long long, int);
extern unsigned long long __compcert_i64_shr(unsigned long long, int);
extern long long __compcert_i64_sar(long long, int);
extern long long __compcert_i64_smulh(long long, long long);
extern unsigned long long __compcert_i64_umulh(unsigned long long, unsigned long long);
extern void __builtin_debug(int, ...);
extern int printf(signed char *, ...);
int f(void);
int main(void);
signed char const __stringlit_1[4] = "%d\012";

struct _139 volatile s = { 17, };

int f(void)
{
  register unsigned char volatile $62;
  $62 = builtin volatile load int8u(&s.__bf1);
  builtin volatile store int8u
    (&s.__bf1, $62 & ~240U | (unsigned int) 0 << 4U & 240U);
  return 0;
}

int main(void)
{
  register unsigned char volatile $64;
  register int $63;
  register unsigned char volatile $62;
  $62 = builtin volatile load int8u(&s.__bf1);
  $63 = f();
  builtin volatile store int8u
    (&s.__bf1, $62 & ~15U | (unsigned int) $63 << 0U & 15U);
  $64 = builtin volatile load int8u(&s.__bf1);
  printf(__stringlit_1, (int) ($64 << 24) >> 28);
  return 0;
  return 0;
}

Exchanging the order in which $62 and $63 are evaluated (i.e. switching
their lines) alters the determinization of the elaborated program, but the
original program did not have any nondeterminism here to begin with. In this
case, the introduction of nondeterminism and the choice of determinization
appears to produce an incorrect result.

@xavierleroy
Copy link
Contributor

Thank you @tduckwo to have registered on Github just to report this issue.

I agree with your analysis: the bit-field emulation pass of CompCert produces CompCert C code that , in conjunction with the nondeterminism of expression evaluation in CompCert C, can produce wrong results, as your example shows.

Please keep in mind that bit-fields are not part of the verified part of CompCert, and their emulation is turned off by default and should be considered a best effort at this time.

Interestingly, if the structure is not volatile, the determinization pass of CompCert picks "the right" evaluation order, where the current contents of the struct are read after the call to f(). This suggests a simple fix for the reported issue, which plays on the evaluation order picked by the determinization pass. I'll push this fix shortly.

But this is just a workaround. The proper solution is to support bit-fields directly in CompCert C, as part of the verified part of CompCert. I have some ideas on how to do that, but it will take some work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants