PREVIEW, WIP
Here are some notes on the Espressif ESP32 Hardware Encryption Features for wolfSSL.
Any discussion of ESP32 (or any other) hardware encryption should address the generally non-updatable nature of the implementation. For example, early versions of the ESP32 were discovered by LimitedResults to have exploitable hardware vulnerabilities.
Vulnerabilities exist across the board, for pretty much all platforms: Hertzbleed and Spectre in hardware, Heartbleed in software (OpenSSL), and so on.
Clearly physical security is just as important as any software design.
Key to any security implementation is a prompt disclosure and response from the vendor. Espressif announced a Security Advisory concerning fault injection and eFuse protections (CVE-2019-17391) shortly after “LimitedResults provided a proof of concept report demonstrating fault injection attack and analysis to recover keys stored in eFuse”.
Note that newer ESP32 devices include a hardware revision that addresses the fault injection.
“The ESP32-D0WD-V3 chip has checks in ROM which prevent fault injection attack” – Espressif Security Advisory
Of equal importance is proper software implementation. See the wolfSSL blog on TLS Glitch Resistance.
Getting Started
Ensure the user settings header is enabled: define WOLFSSL_USER_SETTINGS via -DWOLFSSL_USER_SETTINGS in the CMake file at compile time.
The wolfSSL libraries are typically found in the components directory of either the local project ${CMAKE_HOME_DIRECTORY} or the ESP-IDF $ENV{IDF_PATH} directory.
Terminology
- ctx: context object
- DH: Diffie-Hellman
- HW: hardware encryption method
- SW: software encryption method
Hardware Acceleration Overview
There are two useful features of hardware security implemented in the ESP32: Storage and Computational Acceleration.
There’s an ability to store keys in the eFuse non-volatile memory.
From the ESP32 Datasheet, page 12:
Here’s the summary from page 38:
The interesting files for the wolfSSL hardware encryption for the ESP32 are found in the esp32-crypt.h and source files:
Of particular interest and importance: the Espressif hardware acceleration implementation is NOT RTOS friendly. ONLY ONE hash can be generated at a time. There is NO mechanism to save an in-progress computation to let something else use the hardware on an interim basis.
One of the concerns might be the encryption used by WiFi. At this time, at least in ESP-IDF, the WiFi crypto functions such as hmac_sha256_vector() are performed in software.
Another place of interest regarding the hardware crypto is found in the esp_rom component:
- ESP32 aes.h, rsa_pss.h, and sha.h, among others.
- ESP32-C2 [aes.h ?], rsa_pss.h, and sha.h, among others.
- ESP32-C3 aes.h, rsa_pss.h, and sha.h, among others.
- ESP32-H2 aes.h, rsa_pss.h, and sha.h, among others.
- ESP32-S2 aes.h, rsa_pss.h, and sha.h, among others.
- ESP32-S3 aes.h, rsa_pss.h, and sha.h, among others.
Unfortunately, the implementation seems to be proprietary, as there are only ld linker files with function assignments to addresses.
It is assumed the functions in question are not using the hardware acceleration. TODO how to confirm this?
Note that on multi-core ESP32 devices, there’s a concurrency warning:
Do not use these function in multi core mode due to inside they have no safe implementation (without DPORT workaround).
Reference Documents
The following documents are directly applicable to the crypto-acceleration functions:
- AES (FIPS PUB 197) Advanced Encryption Standard (AES)
- Hash SHA-2 (FIPS PUB 180-4) Secure Hash Standard (SHS)
wolfSSL Fine Tuning
If size is more important than speed for software computation, USE_SLOW_SHA can be defined.
See sha.c: the faster alternative is “nearly 1 K bigger in code size but 25% faster”.
wolfSSL ESP32 Hardware Encryption
Turn on with -DWOLFSSL_ESP32WROOM32_CRYPT. This is enabled by default for the ESP32-WROOM. Enables:
int esp_sha_process(struct wc_Sha* sha, const byte* data)
int esp_sha_digest_process(struct wc_Sha* sha, byte blockproc)
int esp_sha256_process(struct wc_Sha256* sha, const byte* data)
int esp_sha256_digest_process(struct wc_Sha256* sha, byte blockproc)
SHA Accelerator
See Chapter 23, page 573 of the ESP32 Technical Reference Manual and Section 5 of FIPS PUB 180-4 Secure Hash Standard, “SHS”.
By default, SHA acceleration is enabled with WOLFSSL_ESP32WROOM32_CRYPT.
To disable just SHA acceleration, use NO_WOLFSSL_ESP32WROOM32_CRYPT_HASH.
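The interaction of these two macros can be sketched as a compile-time check. This is a minimal illustration, assuming the gating is simply "hardware crypto on, and hashing not opted out"; the helper names `esp32_hw_crypt_enabled()` and `esp32_hw_sha_enabled()` are hypothetical and are not part of wolfSSL.

```c
#include <stdbool.h>

/* Normally these come from user_settings.h or -D compiler flags.
 * They are defined here only so the sketch stands alone. */
#ifndef WOLFSSL_ESP32WROOM32_CRYPT
#define WOLFSSL_ESP32WROOM32_CRYPT
#endif
/* Uncomment to keep hardware crypto but hash in software: */
/* #define NO_WOLFSSL_ESP32WROOM32_CRYPT_HASH */

/* Hypothetical helper: true when ESP32 hardware crypto is compiled in. */
static bool esp32_hw_crypt_enabled(void)
{
#if defined(WOLFSSL_ESP32WROOM32_CRYPT)
    return true;
#else
    return false;
#endif
}

/* Hypothetical helper: true when SHA specifically uses the accelerator. */
static bool esp32_hw_sha_enabled(void)
{
#if defined(WOLFSSL_ESP32WROOM32_CRYPT) && \
    !defined(NO_WOLFSSL_ESP32WROOM32_CRYPT_HASH)
    return true;
#else
    return false;
#endif
}
```

Printing both values at startup is a cheap way to confirm which path a given build actually took.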
See wolfSSL esp32_sha.c
Given const byte* V, a SHA-256 is calculated in wolfSSL:
byte data[DRBG_SEED_LEN];
len = (outSz / OUTPUT_BLOCK_LEN) + ((outSz % OUTPUT_BLOCK_LEN) ? 1 : 0);
XMEMCPY(data, V, sizeof(data));
for (i = 0; i < len; i++) {
    ret = wc_InitSha256(sha);
    if (ret == 0) {
        ret = wc_Sha256Update(sha, data, sizeof(data));
    }
    if (ret == 0) {
        ret = wc_Sha256Final(sha, digest);
    }
    wc_Sha256Free(sha);
}
Note that from a typical code implementation perspective, we don’t even know if the acceleration features are being used.
The exact same API interface is used, with the actual implementation being controlled by compiler #defines.
To aid in development, it can be helpful to have a web hash converter or desktop SHA-256 calculator. See System.Security.Cryptography.SHA256:
using System;
using System.Security.Cryptography;
namespace mySHA256_calculator
{
    class Program
    {
        public static void PrintByteArray(byte[] array)
        {
            for (int i = 0; i < array.Length; i++)
            {
                Console.Write($"{array[i]:X2}");
                if ((i % 4) == 3) Console.Write(" ");
            }
            Console.WriteLine();
        }
        // 32 words: 45725791 C47B3261 8CC57B88 343E2BCE EC3B0A01 B83BC97D 144A2CBC 11A20C3D
        // 16 words: 60E05BD1 B195AF2F 94112FA7 197A5C88 28905884 0CE7C6DF 9693756B C6250F55
        // 8 words: 84E0C0EA FAA95A34 C293F278 AC52E45C E537BAB5 E752A00E 6959A13A E103B65A
        //
        // 64 zeros (16 words of 4 bytes of "0" = 0x30)
        // 0000000000000000000000000000000000000000000000000000000000000000
        // 60E05BD1 B195AF2F 94112FA7 197A5C88 28905884 0CE7C6DF 9693756B C6250F55
        //
        // "0" = 0x30 (VS)
        // 5FECEB66 FFC86F38 D952786C 6D696C79 C2DBC239 DD4E91B4 6729D73A 27FB57E9
        //
        // 0x30000000
        // 4F8320D9 1E97D546 DC799848 E8D218E1 8050AF7A 7964E041 4DE9E547 9006D7E3
        static void Main(string[] args)
        {
            using (SHA256 mySHA256 = SHA256.Create())
            {
                int word_size = 8;
                byte[] buffer = new byte[word_size * 4];
                int byte_fill = word_size * 4;
                for (int i = 0; i < byte_fill; i++)
                {
                    buffer[i] = 0x30;
                }
                byte[] hashValue = mySHA256.ComputeHash(buffer);
                PrintByteArray(buffer); // 30303030 30303030 30303030 30303030 30303030 30303030 30303030 30303030
                PrintByteArray(hashValue); // 84E0C0EA FAA95A34 C293F278 AC52E45C E537BAB5 E752A00E 6959A13A E103B65A
                Console.WriteLine("");
                byte[] buffer2 = new byte[4];
                buffer2[0] = 0x30;
                buffer2[1] = 0x00;
                buffer2[2] = 0x00;
                buffer2[3] = 0x00;
                byte[] hashValue2 = mySHA256.ComputeHash(buffer2);
                PrintByteArray(buffer2); // 30000000
                PrintByteArray(hashValue2); // 4F8320D9 1E97D546 DC799848 E8D218E1 8050AF7A 7964E041 4DE9E547 9006D7E3
            }
        }
    }
}
There are some interesting notes about the SHA accelerator registers on the ESP32:
- The hash output is found in the same registers used for input, both starting at SHA_TEXT_0_REG at 0x3FF03000.
- There’s only 1 register set for all hash functions: SHA1, SHA256, SHA384, SHA512.
- The initial hash values (see SHA-2 pseudo-code, e.g. h0 := 0x6a09e667 in wolfcrypt sha256.c) do not need to be loaded.
- Once a hash process is started, the interim result is hidden and cannot be stashed to start on a different computation.
- At least one asm volatile("memw"); “should be executed in between every load or store to a volatile variable” (see the Xtensa ISA Reference Manual).
- Repeated calls to periph_module_enable(PERIPH_SHA_MODULE) are tracked for recursion. A call to periph_module_disable is only effective after all enables are unwrapped.
- A call to periph_module_reset will reset the device regardless of how many times periph_module_enable was called, and the call-counter is not reset.
- Data must be processed in 512-bit chunks for SHA256 (64 bytes stored in sixteen 4-byte words, starting at SHA_TEXT_0_REG at 0x3FF03000).
- The trailing “1” bit and the 64-bit message length must be manually applied to the last block, as noted on page 18 of FIPS PUB 180-4.
Reminder: The ESP32 SHA accelerator does not do final padding. The 0x80 byte and the 64-bit message length need to be manually added!
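The manual padding step can be sketched as follows. This is a minimal illustration of the FIPS PUB 180-4 padding rule for SHA-256, not the wolfSSL implementation; `sha256_pad_tail` is a hypothetical helper name, and the caller is assumed to have already fed every full 64-byte block to the hardware.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SHA256_BLOCK_SIZE 64u /* 512 bits */

/* Hypothetical helper, not a wolfSSL API.
 * Builds the final padded block(s) for a message of total_len bytes whose
 * last partial block is tail[0..tail_len-1], with tail_len < 64.
 * Writes 64 or 128 bytes to out and returns how many were written. */
static size_t sha256_pad_tail(const uint8_t *tail, size_t tail_len,
                              uint64_t total_len, uint8_t out[128])
{
    uint64_t bit_len = total_len * 8u;
    /* The 0x80 byte plus the 8-byte length must fit in the block;
     * if they do not, the padding spills into a second block. */
    size_t padded = (tail_len + 1u + 8u <= SHA256_BLOCK_SIZE)
                  ? SHA256_BLOCK_SIZE : 2u * SHA256_BLOCK_SIZE;
    size_t i;

    memset(out, 0, padded);
    if (tail_len > 0) {
        memcpy(out, tail, tail_len);
    }
    out[tail_len] = 0x80;          /* the single trailing "1" bit */
    for (i = 0; i < 8; i++) {      /* 64-bit message length in bits, big-endian */
        out[padded - 1u - i] = (uint8_t)(bit_len >> (8u * i));
    }
    return padded;
}
```

For the 3-byte message "abc", this yields one block: 61 62 63 80, zeros, and a final byte of 0x18 (24 bits). A 56-byte tail no longer leaves room for the length, so two blocks are produced.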
Each block of data is hashed into digest for wolfSSL:
Given the single-computation nature of the hardware accelerated hash content registers, note that even in a single-thread RTOS, multiple hashes may need to be computed concurrently. This will cause the second one to fall back to software calculations.
For example, in the ESP32 SSH to UART example, a non-blocking call to wolfSSH_accept is started upon connection. SendKexDhReply() (Send Key Exchange Diffie-Hellman Reply) then hashes together multiple different items into one big SHA256 hash result:
However, after about 30 increments to the hash calculation, another call is made to wc_ecc_make_key_ex(), which also needs a few small hashes:
As there can be only one hardware hash in progress at a given time, the code should detect this and fall back to software hashes, as seen here with verbose debugging turned on:
Extra care should be taken when computing hardware-accelerated hashes in a multi-thread RTOS environment.
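The "one hardware hash at a time, otherwise fall back to software" rule can be sketched with a single ownership flag. This is a minimal sketch using C11 atomics; every name in it (`hash_ctx_t`, `hash_begin`, `hash_end`, `hw_busy`) is hypothetical, and wolfSSL's own bookkeeping may differ.

```c
#include <stdatomic.h>

/* All names here are hypothetical, for illustration only. */
typedef enum { HASH_MODE_SW = 0, HASH_MODE_HW = 1 } hash_mode_t;

typedef struct {
    hash_mode_t mode;   /* which engine this hash is bound to */
} hash_ctx_t;

/* Set while some hash context owns the single hardware engine. */
static atomic_flag hw_busy = ATOMIC_FLAG_INIT;

/* Start a hash: claim the hardware if it is free, otherwise use software. */
static void hash_begin(hash_ctx_t *ctx)
{
    /* test_and_set returns the previous value: false means we now own it */
    if (!atomic_flag_test_and_set(&hw_busy)) {
        ctx->mode = HASH_MODE_HW;
    } else {
        ctx->mode = HASH_MODE_SW;
    }
}

/* Finish a hash: only the hardware owner releases the engine. */
static void hash_end(hash_ctx_t *ctx)
{
    if (ctx->mode == HASH_MODE_HW) {
        atomic_flag_clear(&hw_busy);
    }
    ctx->mode = HASH_MODE_SW;
}
```

The mode is fixed when the hash starts and kept until it ends, because the interim hardware state cannot be read out and handed to a software implementation mid-computation.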
RSA Accelerator
The RSA Accelerator is for math functions. See Espressif/esp32_mp.c and Chapter 24, page 582 of the ESP32 Technical Reference Manual.
NOTE:
The maximum operation length for RSA, ECC, Big Integer Multiply and Big Integer Modular Multiplication is 4096 bits
Turn on with -DWOLFSSL_ESP32WROOM32_CRYPT_RSA_PRI. Enables:
Large Number Modular Exponentiation
The operation is based on Montgomery multiplication. Aside from the arguments X, Y, and M, two additional ones are needed: r and M′. These arguments are calculated in advance by software.
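As an illustration of the software pre-calculation, here is a minimal sketch for M′. It assumes the common Montgomery definition for 32-bit words, M × M′ ≡ −1 (mod 2^32), which should be checked against the Technical Reference Manual; `calc_m_dash` is a hypothetical name, not the wolfSSL function.

```c
#include <stdint.h>

/* Hypothetical helper: returns M' with (m0 * M') mod 2^32 == 2^32 - 1,
 * where m0 is the least significant 32-bit word of an odd modulus M.
 * Returns 0 for an even m0, which has no inverse modulo 2^32. */
static uint32_t calc_m_dash(uint32_t m0)
{
    uint32_t inv;
    int i;

    if ((m0 & 1u) == 0u) {
        return 0u;
    }
    inv = m0;                     /* correct to 3 bits: m0 * m0 == 1 (mod 8) */
    for (i = 0; i < 4; i++) {     /* each Newton step doubles the valid bits */
        inv *= 2u - m0 * inv;     /* unsigned arithmetic wraps mod 2^32 */
    }
    return (uint32_t)(0u - inv);  /* negate: M' = -(m0^-1) mod 2^32 */
}
```

Only the lowest word of M is needed, because the inverse modulo 2^32 depends only on the low 32 bits. RSA and DH moduli are odd, so the even case is an error path.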
See Chapter 24.3.2, page 584 of the ESP32 Technical Reference Manual:
Z = (X ^ Y) mod M (sometimes in DH context referred to as Y = (G ^ X) mod P)
int esp_mp_exptmod(struct fp_int* X, /* G */
                   struct fp_int* Y, /* X */
                   word32 Xbits,
                   struct fp_int* M, /* P */
                   struct fp_int* Z) /* Y */
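When validating hardware results, a small software reference for Z = (X ^ Y) mod M is handy. The sketch below uses plain binary square-and-multiply on 64-bit integers. It is not the Montgomery method the hardware uses, and the function names are hypothetical.

```c
#include <stdint.h>

/* (a * b) mod m without overflow, by shift-and-add. Requires 0 < m < 2^63. */
static uint64_t mulmod_u64(uint64_t a, uint64_t b, uint64_t m)
{
    uint64_t r = 0;

    a %= m;
    while (b > 0) {
        if (b & 1u) {
            r = (r + a) % m;   /* r and a are both below m, so no overflow */
        }
        a = (a << 1) % m;      /* a is below 2^63, so the shift is safe */
        b >>= 1;
    }
    return r;
}

/* Z = (X ^ Y) mod M by binary square-and-multiply. Requires 0 < M < 2^63. */
static uint64_t exptmod_u64(uint64_t x, uint64_t y, uint64_t m)
{
    uint64_t z = 1 % m;        /* yields 0 when M == 1 */

    x %= m;
    while (y > 0) {
        if (y & 1u) {
            z = mulmod_u64(z, x, m);
        }
        x = mulmod_u64(x, x, m);
        y >>= 1;
    }
    return z;
}
```

In DH terms, `exptmod_u64(5, 6, 23)` computes Y = (G ^ X) mod P for G = 5, X = 6, P = 23 and returns 8.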
Large Number Modular Multiplication
See Chapter 24.3.3, page 584 of the ESP32 Technical Reference Manual:
Z = X * Y (mod M)
int esp_mp_mulmod(fp_int* X,
                  fp_int* Y,
                  fp_int* M,
                  fp_int* Z)
Support for large-number multiplication:
Z = X * Y
int esp_mp_mul(fp_int* X,
               fp_int* Y,
               fp_int* Z)
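For reference, Z = X * Y on multi-word operands can be sketched in software as a schoolbook multiply over 32-bit words. This is an illustration only: `mp_mul_words` is a hypothetical name, and the little-endian word order is an assumption of the sketch. The point to note is that the result needs twice as many words as each operand.

```c
#include <stddef.h>
#include <stdint.h>

/* Schoolbook multiply: z = x * y.
 * x and y each hold n 32-bit words, least significant word first.
 * z must have room for 2 * n words. */
static void mp_mul_words(const uint32_t *x, const uint32_t *y,
                         uint32_t *z, size_t n)
{
    size_t i, j;

    for (i = 0; i < 2 * n; i++) {
        z[i] = 0;
    }
    for (i = 0; i < n; i++) {
        uint64_t carry = 0;
        for (j = 0; j < n; j++) {
            /* 32x32 product + existing word + carry always fits in 64 bits */
            uint64_t t = (uint64_t)x[i] * y[j] + z[i + j] + carry;
            z[i + j] = (uint32_t)t;
            carry = t >> 32;
        }
        z[i + n] = (uint32_t)carry;   /* top word of this row */
    }
}
```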
Support for various lengths of operands:
AES Accelerator
See chapter 22 of ESP32 Technical Reference Manual.
The AES Accelerator supports six algorithms of FIPS PUB 197, specifically AES-128, AES-192 and AES-256 encryption and decryption.
int wc_esp32AesCbcEncrypt(struct Aes* aes, byte* out, const byte* in, word32 sz);
int wc_esp32AesCbcDecrypt(struct Aes* aes, byte* out, const byte* in, word32 sz);
int wc_esp32AesEncrypt(struct Aes *aes, const byte* in, byte* out);
int wc_esp32AesDecrypt(struct Aes *aes, const byte* in, byte* out);
ECC
Not to be confused with the Error Code Capture feature:
Error Code Capture (ECC) feature allows the TWAI controller to record the error type and bit position of a TWAI bus error in the form of an error code – Technical Reference Manual 21.5.8
The maximum operation length for RSA, ECC, Big Integer Multiply and Big Integer Modular Multiplication is 4096 bits – 4.1.19 Accelerator of ESP32 Datasheet
- todo
RNG Random Number Generator
- todo
Random number generator table:
Start Address | End Address | Size |
---|---|---|
0x3FF7_5000 | 0x3FF7_5FFF | 4KB |
wolfSSL utility library
Coding Convention
In wolfCrypt, APIs return 0 for success.
WOLFSSL_SUCCESS and WOLFSSL_FAILURE values should only be used in the ssl layer, not in wolfCrypt.
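The difference between the two conventions can be sketched as below. Both functions are hypothetical stand-ins rather than real wolfSSL calls. The macro values match wolfSSL's ssl layer (WOLFSSL_SUCCESS is 1, WOLFSSL_FAILURE is 0) and are defined here only so the sketch stands alone.

```c
/* Values as used by wolfSSL's ssl layer; defined here only so the
 * sketch compiles without the wolfSSL headers. */
#ifndef WOLFSSL_SUCCESS
#define WOLFSSL_SUCCESS 1
#endif
#ifndef WOLFSSL_FAILURE
#define WOLFSSL_FAILURE 0
#endif

/* Hypothetical wolfCrypt-style function: 0 on success, negative on error. */
static int crypt_layer_op(int should_fail)
{
    return should_fail ? -1 : 0;
}

/* Hypothetical ssl-layer wrapper: translates the wolfCrypt result into
 * WOLFSSL_SUCCESS or WOLFSSL_FAILURE for its own callers. */
static int ssl_layer_op(int should_fail)
{
    int ret = crypt_layer_op(should_fail);

    return (ret == 0) ? WOLFSSL_SUCCESS : WOLFSSL_FAILURE;
}
```

Because success is 0 in one layer and 1 in the other, testing a wolfCrypt result with `if (ret)` or comparing it to WOLFSSL_SUCCESS inverts the meaning; compare against 0 explicitly.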
Development
Install wolfSSL for WSL:
cd /mnt/c/workspace/wolfssl
./configure --enable-tls13 --prefix=/usr/ && make && sudo make install
# or
./configure --enable-dtls --enable-tls13 --prefix=/usr/ && make && sudo make install
Setup ESP-IDF
cd ~/esp/esp-idf
. $HOME/esp/esp-idf/export.sh
idf.py -p /dev/ttyS9 -b 230400 flash monitor
DTLS compile:
gcc -o server-dtls server-dtls.c -Wall -I/usr/local/include -Os -L/usr/local/lib -lm -lwolfssl -Wl,-rpath=/usr/local/lib
WSL TLS1.3 Server:
cd /mnt/c/workspace/wolfssl-examples/tls
./server-tls13 -v 4
Resources, Inspiration, Credits, and Other Links:
- wolfSSL wolfcrypt sha256.c = wolfSSL wolfcrypt esp32_sha.c
- wolfSSL Espressif
- wolfSSL docs/Espressif
- wolfSSL ESP32 Hardware Acceleration Support
- wolfSSL Porting Guide
- wolfSSL wolfcrypt stm32
- wolfSSL Building wolfSSL
- Espressif ESP32 Datasheet
- Espressif ESP32 Technical Reference Manual
- Espressif Blog Understanding ESP32’s Security Features
- Espressif Blog ESP32: TLS (Transport Layer Security) And IoT Devices
- Espressif Blog ESP32-S2: Digital Signature Peripheral
- Espressif GitHub Example ESP-MQTT SSL Mutual Authentication with Digital Signature
- Espressif Logging Library
- Espressif ESP-TLS
- LimitedResults Pwn the ESP32 Forever: Flash Encryption and Sec. Boot Keys Extraction
- wolfSSL Install
- Wireshark capture filters
- Adafruit ESP32uesday: The ESP32-S3 is More Than a Fancy S2
- OpenOCD Open On-Chip Debugger User Guide
- Stackoverflow How to debug “cannot open shared object file: No such file or directory”?
- Xtensa Instruction Set Architecture (ISA) Reference Manual
- Michael Driscoll’s The Animated Elliptic Curve
- Michael Driscoll’s The Illustrated TLS 1.3 Connection
- Microsoft Cipher Suites in TLS/SSL (Schannel SSP)
- Wikipedia SHA-2
- NIST FIPS PUB 180-4
- NIST SP 800-90A Deterministic Random Bit Generator Validation System (DRBGVS)
- Olimex ESP32 with ECC