Commit b7b73cd5 authored by Eric Biggers's avatar Eric Biggers Committed by Herbert Xu

crypto: x86/salsa20 - remove x86 salsa20 implementations

The x86 assembly implementations of Salsa20 use the frame base pointer
register (%ebp or %rbp), which breaks frame pointer convention and
breaks stack traces when unwinding from an interrupt in the crypto code.
Recent (v4.10+) kernels will warn about this, e.g.

WARNING: kernel stack regs at 00000000a8291e69 in syzkaller047086:4677 has bad 'bp' value 000000001077994c

But after looking into it, I believe there's very little reason to still
retain the x86 Salsa20 code.  First, these are *not* vectorized
(SSE2/SSSE3/AVX2) implementations, which would be needed to get anywhere
close to the best Salsa20 performance on any remotely modern x86
processor; they're just regular x86 assembly.  Second, it's still
unclear that anyone is actually using the kernel's Salsa20 at all,
especially given that now ChaCha20 is supported too, and with much more
efficient SSSE3 and AVX2 implementations.  Finally, in benchmarks I did
on both Intel and AMD processors with both gcc 8.1.0 and gcc 4.9.4, the
x86_64 salsa20-asm is actually slightly *slower* than salsa20-generic
(~3% slower on Skylake, ~10% slower on Zen), while the i686 salsa20-asm
is only slightly faster than salsa20-generic (~15% faster on Skylake,
~20% faster on Zen).  The gcc version made little difference.

So, the x86_64 salsa20-asm is pretty clearly useless.  That leaves just
the i686 salsa20-asm, which based on my tests provides a 15-20% speed
boost.  But that's without updating the code to not use %ebp.  And given
the maintenance cost, the small speed difference vs. salsa20-generic,
the fact that few people still use i686 kernels, the doubt that anyone
is even using the kernel's Salsa20 at all, and the fact that a SSE2
implementation would almost certainly be much faster on any remotely
modern x86 processor yet no one has cared enough to add one yet, I don't
think it's worthwhile to keep.

Thus, just remove both the x86_64 and i686 salsa20-asm implementations.

Signed-off-by: 's avatarEric Biggers <>
Signed-off-by: 's avatarHerbert Xu <>
parent 0b3a830b
......@@ -15,7 +15,6 @@ obj-$(CONFIG_CRYPTO_GLUE_HELPER_X86) += glue_helper.o
obj-$(CONFIG_CRYPTO_AES_586) += aes-i586.o
obj-$(CONFIG_CRYPTO_TWOFISH_586) += twofish-i586.o
obj-$(CONFIG_CRYPTO_SALSA20_586) += salsa20-i586.o
obj-$(CONFIG_CRYPTO_SERPENT_SSE2_586) += serpent-sse2-i586.o
obj-$(CONFIG_CRYPTO_AES_X86_64) += aes-x86_64.o
......@@ -24,7 +23,6 @@ obj-$(CONFIG_CRYPTO_CAMELLIA_X86_64) += camellia-x86_64.o
obj-$(CONFIG_CRYPTO_BLOWFISH_X86_64) += blowfish-x86_64.o
obj-$(CONFIG_CRYPTO_TWOFISH_X86_64) += twofish-x86_64.o
obj-$(CONFIG_CRYPTO_TWOFISH_X86_64_3WAY) += twofish-x86_64-3way.o
obj-$(CONFIG_CRYPTO_SALSA20_X86_64) += salsa20-x86_64.o
obj-$(CONFIG_CRYPTO_CHACHA20_X86_64) += chacha20-x86_64.o
obj-$(CONFIG_CRYPTO_SERPENT_SSE2_X86_64) += serpent-sse2-x86_64.o
obj-$(CONFIG_CRYPTO_AES_NI_INTEL) += aesni-intel.o
......@@ -71,7 +69,6 @@ endif
aes-i586-y := aes-i586-asm_32.o aes_glue.o
twofish-i586-y := twofish-i586-asm_32.o twofish_glue.o
salsa20-i586-y := salsa20-i586-asm_32.o salsa20_glue.o
serpent-sse2-i586-y := serpent-sse2-i586-asm_32.o serpent_sse2_glue.o
aes-x86_64-y := aes-x86_64-asm_64.o aes_glue.o
......@@ -80,7 +77,6 @@ camellia-x86_64-y := camellia-x86_64-asm_64.o camellia_glue.o
blowfish-x86_64-y := blowfish-x86_64-asm_64.o blowfish_glue.o
twofish-x86_64-y := twofish-x86_64-asm_64.o twofish_glue.o
twofish-x86_64-3way-y := twofish-x86_64-asm_64-3way.o twofish_glue_3way.o
salsa20-x86_64-y := salsa20-x86_64-asm_64.o salsa20_glue.o
chacha20-x86_64-y := chacha20-ssse3-x86_64.o chacha20_glue.o
serpent-sse2-x86_64-y := serpent-sse2-x86_64-asm_64.o serpent_sse2_glue.o
This diff is collapsed.
This diff is collapsed.
* Glue code for optimized assembly version of Salsa20.
* Copyright (c) 2007 Tan Swee Heng <>
* The assembly codes are public domain assembly codes written by Daniel. J.
* Bernstein <>. The codes are modified to include indentation
* and to remove extraneous comments and functions that are not needed.
* - i586 version, renamed as salsa20-i586-asm_32.S
* available from <>
* - x86-64 version, renamed as salsa20-x86_64-asm_64.S
* available from <>
* Also modified to set up the initial state using the generic C code rather
* than in assembly.
* This program is free software; you can redistribute it and/or modify it
* under the terms of the GNU General Public License as published by the Free
* Software Foundation; either version 2 of the License, or (at your option)
* any later version.
#include <asm/unaligned.h>
#include <crypto/internal/skcipher.h>
#include <crypto/salsa20.h>
#include <linux/module.h>
asmlinkage void salsa20_encrypt_bytes(u32 state[16], const u8 *src, u8 *dst,
u32 bytes);
static int salsa20_asm_crypt(struct skcipher_request *req)
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
const struct salsa20_ctx *ctx = crypto_skcipher_ctx(tfm);
struct skcipher_walk walk;
u32 state[16];
int err;
err = skcipher_walk_virt(&walk, req, true);
crypto_salsa20_init(state, ctx, walk.iv);
while (walk.nbytes > 0) {
unsigned int nbytes = walk.nbytes;
if (nbytes <
nbytes = round_down(nbytes, walk.stride);
salsa20_encrypt_bytes(state, walk.src.virt.addr,
walk.dst.virt.addr, nbytes);
err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
return err;
static struct skcipher_alg alg = {
.base.cra_name = "salsa20",
.base.cra_driver_name = "salsa20-asm",
.base.cra_priority = 200,
.base.cra_blocksize = 1,
.base.cra_ctxsize = sizeof(struct salsa20_ctx),
.base.cra_module = THIS_MODULE,
.min_keysize = SALSA20_MIN_KEY_SIZE,
.max_keysize = SALSA20_MAX_KEY_SIZE,
.ivsize = SALSA20_IV_SIZE,
.chunksize = SALSA20_BLOCK_SIZE,
.setkey = crypto_salsa20_setkey,
.encrypt = salsa20_asm_crypt,
.decrypt = salsa20_asm_crypt,
static int __init init(void)
return crypto_register_skcipher(&alg);
static void __exit fini(void)
MODULE_DESCRIPTION ("Salsa20 stream cipher algorithm (optimized assembly version)");
......@@ -1436,34 +1436,6 @@ config CRYPTO_SALSA20
The Salsa20 stream cipher algorithm is designed by Daniel J.
Bernstein <>. See <>
config CRYPTO_SALSA20_586
tristate "Salsa20 stream cipher algorithm (i586)"
depends on (X86 || UML_X86) && !64BIT
Salsa20 stream cipher algorithm.
Salsa20 is a stream cipher submitted to eSTREAM, the ECRYPT
Stream Cipher Project. See <>
The Salsa20 stream cipher algorithm is designed by Daniel J.
Bernstein <>. See <>
config CRYPTO_SALSA20_X86_64
tristate "Salsa20 stream cipher algorithm (x86_64)"
depends on (X86 || UML_X86) && 64BIT
Salsa20 stream cipher algorithm.
Salsa20 is a stream cipher submitted to eSTREAM, the ECRYPT
Stream Cipher Project. See <>
The Salsa20 stream cipher algorithm is designed by Daniel J.
Bernstein <>. See <>
tristate "ChaCha20 cipher algorithm"
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment