The `inline` attribute is only meant for functions, but a bug in the
Rust compiler let it through on other items in some cases. As of
https://github.com/rust-lang/rust/issues/31769, such usage will cause
builds to fail.
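
As a small illustration of the rule (the item names below are made up for this sketch and are not taken from the crate):

```rust
// `#[inline]` is a hint about a function body, so it belongs on functions.
#[inline]
pub fn rotate_right_32(x: u64) -> u64 {
    x.rotate_right(32)
}

// Hypothetical misuse of the kind described above: the attribute has no
// meaning on a non-function item, so the fix is simply to drop it.
// #[inline]
pub struct Counter {
    pub t: [u64; 2],
}
```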

The Rust compiler validates the extern ABI while parsing the `extern`
keyword, so normal conditional compilation (`#[cfg(...)]`) isn't enough
to hide the ABI from Rust versions that don't know it.

I tried hiding the extern ABI using a macro, but the contents of an
`extern` block aren't a valid `item`, and I couldn't find any other
working way to pass the function declarations to the macro.

The solution that worked in the end was to use `include!`. This
prevents the compiler from even trying to parse the `extern` block
unless the nightly-only cargo feature "simd" is enabled.
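
A minimal sketch of the `include!` approach. The module name `simd_ffi`, the file name `simd_extern.rs`, and the helper `simd_enabled` are assumptions made for this example; the actual layout may differ. With the feature disabled, the `cfg`-stripped module is dropped before `include!` is expanded, so the file is never opened or parsed:

```rust
// Hypothetical layout: `simd_extern.rs` would hold the `extern` block
// with the nightly-only ABI. It is only read when the feature is on.
#[cfg(feature = "simd")]
mod simd_ffi {
    include!("simd_extern.rs");
}

/// Reports whether the SIMD path was compiled in (helper for this sketch).
pub fn simd_enabled() -> bool {
    cfg!(feature = "simd")
}
```

Without the "simd" feature, this compiles even if `simd_extern.rs` does not exist, and `simd_enabled()` returns `false`.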

Before the conversion to SIMD-like vectors, deriving these traits was
not possible because the array had more than 32 elements, and the traits
are only implemented for arrays of up to 32 elements.

After the conversion, the array has only 2 elements, so deriving these
traits is possible and simplifies the code.
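
Roughly, the derived form looks like the sketch below. The type names, the field name, and the particular traits are assumptions chosen for illustration; the message above does not say which traits were involved:

```rust
// Stand-in for a SIMD-like vector of four 64-bit lanes (hypothetical name).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct U64x4(pub u64, pub u64, pub u64, pub u64);

// The array now has just two elements, well under the 32-element limit,
// so the traits can be derived instead of implemented by hand.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct State {
    pub h: [U64x4; 2],
}
```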

For each round, BLAKE2 loads a different set of words from the message,
controlled by the SIGMA array. This seems like an obvious place to use a
SIMD gather instruction. To allow for further experimentation, move the
gather of the message words to the SIMD code.
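
A scalar sketch of what such a gather does, using plain arrays in place of SIMD vectors. The function names are made up for this example, and only the first two rows of SIGMA (as given in RFC 7693) are included:

```rust
/// First two rows of the BLAKE2 message schedule (RFC 7693).
pub const SIGMA: [[usize; 16]; 2] = [
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15],
    [14, 10, 4, 8, 9, 15, 13, 6, 1, 12, 0, 2, 11, 7, 5, 3],
];

/// Hypothetical helper: pick four message words by index into one "vector".
pub fn gather(m: &[u64; 16], i0: usize, i1: usize, i2: usize, i3: usize) -> [u64; 4] {
    [m[i0], m[i1], m[i2], m[i3]]
}

/// Message operands for the column step of a round: the four G functions
/// take m[s[0]], m[s[2]], m[s[4]], m[s[6]] first, then the odd entries.
pub fn column_operands(m: &[u64; 16], round: usize) -> ([u64; 4], [u64; 4]) {
    let s = &SIGMA[round];
    (
        gather(m, s[0], s[2], s[4], s[6]),
        gather(m, s[1], s[3], s[5], s[7]),
    )
}
```

Keeping the gather behind a helper like this means a SIMD backend can swap in a hardware gather without touching the round function.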

The compiler doesn't seem to be able to convert 8-bit rotating shuffles
of 64-bit elements into the VEXT instruction.
Unfortunately, this code crashes the LLVM compiler used by rustc.
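
For context, rotating a 64-bit element by a multiple of 8 bits is the same as rotating its bytes, which is the kind of operation a byte-extract instruction such as VEXT can perform. A scalar sketch of that equivalence (the function names are made up here, and this is not the code that crashed):

```rust
/// Rotate right by 24 bits using the builtin integer operation.
pub fn rotr24(x: u64) -> u64 {
    x.rotate_right(24)
}

/// The same rotation expressed as a byte shuffle: in little-endian order,
/// result byte i is source byte (i + 3) % 8.
pub fn rotr24_bytes(x: u64) -> u64 {
    let mut b = x.to_le_bytes();
    b.rotate_left(3);
    u64::from_le_bytes(b)
}
```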

Also simplifies the SIMD code, and uses builtin operations for shifts
and shuffles. On x86 and x86_64, and for VREV shuffles on ARM, the
compiler does a good job.

Unfortunately, the compiler fails to use VEXT for shuffles on ARM, and
the inline assembly for it crashes the LLVM compiler, so I removed it in
this commit.