davidmmenendez

When we designed Alive, our interpretation for the shift instruction produced undefined behavior when the shift amount is greater than the width of the integer. We have been thinking about alternative interpretations for the shift instruction because we think this interpretation is inconsistent with LLVM’s language reference.

When using a shift operation on n-bit integers, LLVM requires the second argument (i.e., shift amount) to be less than n. One might expect that, say, left-shifting by more than n would result in zero, since all the bits have been shifted out, but LLVM does not require this because several popular processor architectures, including x86 and ARM, do not work this way. On x86, for example, shifting a 32-bit integer by 33 places is the same as shifting it 1 place. By restricting the shift amount, LLVM is free to map its shift instructions directly to the processor’s instructions without needing to insert range checks.

This leaves the question of what to do when the shift amount is too large, and this is where Alive (i.e., the main branch of Alive) diverged from LLVM. Alive treated too-large shifts as undefined behavior, similar to division by zero.

However, the relevant text in the LLVM Language Reference states:

If [the shift amount] is (statically or dynamically) equal to or larger than the number of bits in [the first argument], the result is undefined.

Compare this with the language used for the division and remainder operations:

Division by zero is undefined behavior.

While the language reference doesn’t actually define what “the result is undefined” means, the intention seems to be something similar to undef, LLVM’s most restricted form of undefined behavior. That is, a shift with a too-large shift amount can produce an arbitrary bit-pattern, but cannot produce a poison value or have any side-effect.

A broken optimization

We have implemented this semantics in a variant of Alive-NJ, and using it we have discovered an optimization in the Alive suite, which Alive had incorrectly validated earlier.

  %Op0 = shl 1, %Y
  %r   = mul %Op0, %Op1
=>
  %r   = shl %Op1, %Y

This transforms (1 << Y) * X to X << Y, an optimization which seems acceptable. But Alive-NJ reports a counterexample:

ERROR: Mismatch in values for i4 %r                                                                 

Example:
i4 %Y   = 0x8 (8, -8)
i4 %Op0 = 0x0 (0)
i4 %Op1 = 0x0 (0)
source: 0x0 (0)
target: 0x1 (1)

Essentially, if %Y is too large, then the shift %Op0 returns an undefined value. In the source, this might be subsequently multiplied by 0 or an even number, resulting in 0 or a partially-defined value. In contrast, the target produces a fully-undefined (undef) value regardless of the value of %Op1

For example, if %Y is 8 and %Op1 is 2, then %r will always be an even number in the source, but %r in the target may take any value. The optimization has added new possible behaviors to the program, and must therefore be rejected.

We have confirmed that this optimization is performed by LLVM 4.0. For example, this function:

int foo(int x, int y) {
    return (1 << y) * x;
}

will have a single shift operation after optimization.

But is it actually broken?

The argument that (1 << y) * x and x << y are different relies on LLVM’s semantics for undef, but these arguably do not reflect actual processor behavior. For example, LLVM says 2 << 33 could be any 32-bit integer, but we are not aware of any processor that would produce an odd value here.

In the forthcoming paper, “Taming undefined behavior” (PLDI 2017), Lee and others propose combining undef with LLVM’s notion of poison values. The relevant part here is that any arithmetic operator with a poison argument always returns poison. While multiplying undef by 0 yields 0, multiplying poison by 0 yields poison.

If shifts return poison when the shift amount is too large, then the optimization above is valid, as the original and optimized code both return poison in that case.

What about `nsw`, `nuw`, and `exact`?

LLVM provides three attributes which make stronger claims about the range of values its arguments may take. LLVM’s left shift, shl, can be modified by one or both of nsw, meaning no-signed-wrap, and nuw, meaning no-unsigned-wrap. Essentially, these are promises that the shift x << y is actually the same as x⋅2^y. Similarly, the arithmetic and logical right shifts, ashr and lshr, can be modified by exact, which promises that no one bits are shifted out. If a shift with an attribute is given arguments which do not meet its requirements, it produces a poison value.

It is tempting to assume that a shift with an attribute always produces poison if the shift amount is too large, but this does not appear to be what the language reference specifies.

For shl nsw, the reference states that it is poison “if it shifts out any bits that disagree with the resultant sign bit.” When the shift amount is too large, the resultant sign bit is undefined, so we can always choose a sign bit such that the shifted bits disagree.

In other words, this optimization is valid:

Pre: C u>= width(%x)
  %r = shl nsw %x, C
=>
  %r = poison

In contrast, the reference states that shl nuw is poison “if it shifts out any non-zero bits.” If the argument is 0, the shift will not shift out any non-zero bits, so a shl nuw with a too-large shift amount produces poison for every input except 0.

That means that we cannot perform this optimization:

Pre: C u>= width(%x)
  %r = shl nuw %x, C
=>
  %r = poison

Because %r is not poison when %x is zero.

Similarly, ashr exact and lshr exact are poison “if any of the bits shifted out are non-zero.”

Of course, if undef and poison are merged, then it would no longer be necessary to make this distinction.

The proposal for combining undef and poison in “Taming undefined behavior” (PLDI 2017) addresses many of the inconsistencies that are currently present in the LLVM language reference including the issues highlighted in the above examples.

Research at Rutgers Architecture and Programming Languages Research Group

Author: davidmmenendez

LLVM’s shifty semantics

A broken optimization

But is it actually broken?

What about `nsw`, `nuw`, and `exact`?

A broken optimization

But is it actually broken?

What about nsw, nuw, and exact?

What about `nsw`, `nuw`, and `exact`?