    crypto: sha3-generic - rewrite KECCAK transform to help the compiler optimize · 83dee2ce
    Ard Biesheuvel authored
    
    
    The way the KECCAK transform is currently coded involves many references
    into the state array using indexes that are calculated at runtime using
    simple but non-trivial arithmetic. This forces the compiler to treat the
    state matrix as an array in memory rather than keep it in registers,
    which results in poor performance.
    
    So instead, let's rephrase the algorithm using fixed array indexes only.
    This helps the compiler keep the state matrix in registers, resulting
    in the following speedup (SHA3-256 performance in cycles per byte):
    
                                                before   after   speedup
      Intel Core i7 @ 2.0 GHz (2.9 turbo)        100.6    35.7     2.8x
      Cortex-A57 @ 2.0 GHz (64-bit mode)         101.6    12.7     8.0x
      Cortex-A53 @ 1.0 GHz                       224.4    15.8    14.2x
      Cortex-A57 @ 2.0 GHz (32-bit mode)         201.8    63.0     3.2x
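
    As an illustration of the difference (a minimal sketch, not the code
    from this patch), the theta step of the permutation can be written
    either with indexes computed at runtime or with constant indexes only.
    The names theta_indexed, theta_fixed and theta_variants_agree are
    hypothetical and chosen for this example; the rest of the round
    (rho, pi, chi, iota) is omitted.

```c
#include <assert.h>   /* only for the self-check mentioned below */
#include <stdint.h>
#include <string.h>

static inline uint64_t rol64(uint64_t x, unsigned int n)
{
	return (x << n) | (x >> (64 - n));	/* valid for 0 < n < 64 */
}

/* Theta with indexes computed at runtime (modulo and loop arithmetic). */
void theta_indexed(uint64_t st[25])
{
	uint64_t bc[5], t;
	int i, j;

	for (i = 0; i < 5; i++)
		bc[i] = st[i] ^ st[i + 5] ^ st[i + 10] ^ st[i + 15] ^ st[i + 20];

	for (i = 0; i < 5; i++) {
		t = bc[(i + 4) % 5] ^ rol64(bc[(i + 1) % 5], 1);
		for (j = 0; j < 25; j += 5)
			st[j + i] ^= t;
	}
}

/* Same step using constant indexes only; every access is known statically. */
void theta_fixed(uint64_t st[25])
{
	uint64_t bc0 = st[0] ^ st[5] ^ st[10] ^ st[15] ^ st[20];
	uint64_t bc1 = st[1] ^ st[6] ^ st[11] ^ st[16] ^ st[21];
	uint64_t bc2 = st[2] ^ st[7] ^ st[12] ^ st[17] ^ st[22];
	uint64_t bc3 = st[3] ^ st[8] ^ st[13] ^ st[18] ^ st[23];
	uint64_t bc4 = st[4] ^ st[9] ^ st[14] ^ st[19] ^ st[24];

	uint64_t t0 = bc4 ^ rol64(bc1, 1);
	uint64_t t1 = bc0 ^ rol64(bc2, 1);
	uint64_t t2 = bc1 ^ rol64(bc3, 1);
	uint64_t t3 = bc2 ^ rol64(bc4, 1);
	uint64_t t4 = bc3 ^ rol64(bc0, 1);

	st[0] ^= t0; st[5] ^= t0; st[10] ^= t0; st[15] ^= t0; st[20] ^= t0;
	st[1] ^= t1; st[6] ^= t1; st[11] ^= t1; st[16] ^= t1; st[21] ^= t1;
	st[2] ^= t2; st[7] ^= t2; st[12] ^= t2; st[17] ^= t2; st[22] ^= t2;
	st[3] ^= t3; st[8] ^= t3; st[13] ^= t3; st[18] ^= t3; st[23] ^= t3;
	st[4] ^= t4; st[9] ^= t4; st[14] ^= t4; st[19] ^= t4; st[24] ^= t4;
}

/* Returns 1 if both variants transform an arbitrary test state identically. */
int theta_variants_agree(void)
{
	uint64_t a[25], b[25];
	int i;

	for (i = 0; i < 25; i++)
		a[i] = b[i] = UINT64_C(0x9e3779b97f4a7c15) * (uint64_t)(i + 1);

	theta_indexed(a);
	theta_fixed(b);
	return memcmp(a, b, sizeof(a)) == 0;
}
```

    Both functions compute the same result, which can be checked with
    assert(theta_variants_agree()). Whether a given compiler keeps the
    state in registers for either form depends on the target and the
    optimization level, so any speedup should be measured rather than
    assumed.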
    
    Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>