The Rust compiler validates the extern ABI while parsing the `extern`
keyword, so normal conditional compilation (`#[cfg(...)]`) isn't enough
to hide the ABI from Rust versions that don't recognize it.
I tried hiding the extern ABI behind a macro, but the contents of an
`extern` block don't match the `item` fragment, and I couldn't find any
other working way to pass the function declarations to the macro.
The solution that worked in the end was `include!`. This prevents the
compiler from even trying to parse the `extern` block unless the
nightly-only cargo feature `simd` is enabled.
Before the conversion to SIMD-like vectors, deriving these traits was
not possible: the array had more than 32 elements, and the standard
library implements them only for arrays of up to 32 elements.
After the conversion, the array has only 2 elements, so deriving these
traits is possible and simplifies the code.
For each round, BLAKE2 loads a different set of words from the message,
selected by the SIGMA array. This seems like an obvious place to use a
SIMD gather instruction. To allow for further experimentation, move the
gather of the message words into the SIMD code.
The compiler doesn't seem to be able to convert 8-bit rotating shuffles
of 64-bit elements into the VEXT instruction.
Unfortunately, this code crashes the LLVM compiler used by rustc.
This also simplifies the SIMD code and uses built-in operations for
shifts and shuffles. The compiler does a good job on x86 and x86_64,
and with VREV shuffles on arm.
Unfortunately, the compiler fails to use VEXT for shuffles on arm, and
the inline assembly for it crashes the LLVM compiler, so I removed it in
this commit.
It seems the compiler is not smart enough to avoid the abstraction
penalty in the 32-bit case. Passing the vector directly, instead of
encapsulating it in a struct, made the SIMD Blake2s code very slightly
*faster* than the non-SIMD Blake2s code.
Go all the way and manually vectorize the code using a generic
4-element vector type, adding both a fallback implementation and an
attempt at an SSE2 implementation using LLVM intrinsics.
The experiment was not successful; for some reason, the SIMD code is
much slower. Interestingly, however, the fallback code was much faster
for Blake2s, and only slightly slower for Blake2b. Both implementations
are now around 80% of the speed of the SIMD-optimized reference libb2
code.
Change the order of the round operations to make more parallelism
visible to the compiler's autovectorizer.
The results were mixed; on my 64-bit machine, it made Blake2b 14%
faster, but Blake2s became 3% slower.