The major points of departure from classical design are these:
One and only one cipher suite. No negotiation of secret key algorithms, modes, key strengths or MACs. The protocol is fixed in its use of all algorithms.
Fewer features. Specifically, there is no protection for replay, ordering or tampering attacks (The presence of the MAC was indicated more for DOS prevention than tamper protection). There is no provision for identity or authentication. There is no compression, and no fragmentation.
The Pad does the job of the IV. A Pad at the front of the plaintext includes sufficient uniqueness to form an initialisation vector.
Padding has been moved to the front. In order to reduce programming needs and to improve the robustness of the initialisation, padding is placed at the beginning, rather than being tacked on at the end, and can be made random.
Stand-alone. The protocol is stand-alone, making minimal references to other layers. All external requirements are concentrated in the context.
In SDP1, there is no handshake, alert, or change cipher spec protocol as there is in the TLS record protocol. These functions are all done by a higher layer, and placed in the context for SDP1. Extensions and remedies are effected by creating DP2, DP3...
As the protocol is stateless, there is no "connection state" but there is a context. There is no default or initial context.
To Be Done.
IP Encapsulating Security Payload - RFC 2406. The part played by ESP's Security Parameters Index (SPI) is played by SDP1's token. These features are practically identical, although they differ in minor detail (the SPI is a single 32 bit quantity whereas the SDP1 token is a variable length array of bytes).
The sequence number for SDP1, as suggested in Appendix Z, is included in the inner plaintext rather than the outer wrapping, as it is used to assist in uniqueness within the DIV.
ESP Triple DES Transform - RFC 1851. This transform does not define an IV method, but includes the IV as the first 8 bytes. Unlike DES's 8 byte blocks, AES has 16 byte blocks, which leaves room to pack in more useful IV information. Hence, SDP1 benefits from Appendix Z's layout of a counter and a time, and still has room left for a goodly sized chunk of random data. This means that a standard implementation should run out of the box without undue worry.
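As a rough illustration of how a counter, a time and random data might share one 16 byte AES block, here is a minimal Python sketch. The field widths (4 byte counter, 4 byte time, 8 bytes random), the byte order and the name make_pad are assumptions made for this example only; they are not taken from Appendix Z.

```python
import os
import struct
import time
from typing import Optional

PAD_LEN = 16  # one AES block


def make_pad(counter: int, now: Optional[float] = None) -> bytes:
    """Build a 16 byte leading Pad: counter | time | random.

    The 4/4/8 split is a hypothetical layout chosen for illustration.
    """
    if now is None:
        now = time.time()
    # Big-endian 32 bit counter followed by 32 bit seconds-since-epoch.
    head = struct.pack(">II", counter & 0xFFFFFFFF, int(now) & 0xFFFFFFFF)
    # Fill the rest of the block with random bytes.
    return head + os.urandom(PAD_LEN - len(head))
```

Even if the random source is weak, the counter and time fields still vary from one datagram to the next, which is the property the Pad relies on for uniqueness.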
DTLS is a new variant of TLS reworked for datagrams.
The record format includes an epoch and sequence number. It is not clear that the MAC protects these elements or what effect would be had if they were unprotected (the Security Analysis correctly points out that they are public numbers but does not consider any active attack). The epoch is similar to SDP1's token and identifies the encryption context for the receiving endpoints.
The sequence number is intended to protect against replays and to provide for a higher layer window protocol. Neither of these purposes is indicated in SDP1, due to the requirement to avoid any relationship between datagrams and the higher layer's responsibility to control replays (R2). However, it does raise a simple DOS attack for SDP1, in that a valid datagram repeated many times could consume MAC, encryption and higher layer activities. Plausibly, this could be covered by a cache of MACs?
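The cache of MACs floated above could look something like the following Python sketch. It is only one possible reading of the idea: the class MacCache, the function accept, the capacity, and the choice of HMAC-SHA1 over the datagram body are all assumptions for illustration, not part of SDP1.

```python
import hashlib
import hmac
from collections import OrderedDict


class MacCache:
    """Bounded set of recently accepted MACs (oldest entries are evicted)."""

    def __init__(self, capacity: int = 1024):
        self.capacity = capacity
        self._seen = OrderedDict()

    def contains(self, mac: bytes) -> bool:
        return mac in self._seen

    def add(self, mac: bytes) -> None:
        self._seen[mac] = None
        if len(self._seen) > self.capacity:
            self._seen.popitem(last=False)  # drop the oldest entry


def accept(body: bytes, mac: bytes, mac_key: bytes, cache: MacCache) -> bool:
    """Return True if the datagram should be passed on for decryption."""
    # Cheap lookup first: a repeat of an already accepted datagram is
    # dropped before any MAC or decryption work is spent on it.
    if cache.contains(mac):
        return False
    expected = hmac.new(mac_key, body, hashlib.sha1).digest()
    if not hmac.compare_digest(expected, mac):
        return False
    # Only verified MACs enter the cache, so forged packets cannot fill it.
    cache.add(mac)
    return True
```

Because the cache is bounded, a repeat that arrives after its entry has been evicted is processed again; the cache limits the cost of a flood of repeats rather than providing replay protection, which remains with the higher layer.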
The encryption mode uses an explicit IV which is equivalent to SDP1's Pad (when 16 bytes long). No mention is made of padding (notes below on OpenVPN imply this is handled by the EVP interface of OpenSSL).
Other handshake elements are outside the scope of SDP1.
OpenVPN is a datagram based, open source protocol to provide a VPN into userland. It uses a grab bag of tools to do the job, and seems to derive its packet format from ESP.
It provides both static key and TLS key sharing (we can ignore the latter). The static key is analogous to SDP1, and its 'static key' contains 4 independent keys: HMAC send, HMAC receive, encrypt, and decrypt. This is very similar to the SDP1 context (albeit with the lack of the CIVs).
The encrypted packet is formatted as follows:
The plaintext of the encrypted envelope is formatted as follows:
The HMAC and explicit IV are outside of the encrypted envelope. The per-packet IV is randomized using the OpenSSL PRNG.
HMAC, encryption, and decryption functions are provided by the OpenSSL EVP interface, which allows the user to select an arbitrary cipher, key size, and message digest for HMAC. Blowfish is the default cipher and SHA1 is the default message digest. The OpenSSL EVP interface handles padding to an even multiple of block size. CBC-mode cipher usage is encouraged but not required.
Comparison. The big difference is the more classical approach of delivering the explicit IV as a randomised block in the clear in each packet. SDP1 takes the novel approach of using the unique Pad within the encryption and its context-shared CIV, which overcomes the dependency on the PRNG, and addresses the danger of secret leakage via the public IV.
Otherwise, the choice of HMAC outside and covering the entire envelope is the same, although there is no token (ESP's SPI). The 64 bit sequence number appears (to me) to be in the wrong layer. A higher layer implementing a window protocol could quite happily implement its own, and do so with more care and appropriateness.
SRTP - Secure Real Time Protocol is a new protocol aimed at VoIP. Protocol docs.
The MKI in the packet is equivalent to SDP1's token; it identifies (a master key in) the cryptographic context.
SRTP offers optional MACs and also variable length MACs. This is justified because VoIP packets are often -- they say -- 10-20 bytes long, so our 20 byte SHA-1 HMAC would blow that away. It's certainly a good argument for allowing a variable length HMAC, and they suggest a 4 byte HMAC is good enough.
(As we know, the full 20 byte strength in SDP1 is overdone and out of balance. I also recently came across an argument that it should be 16 bytes, so as to create 16 byte core lengths for all fixed components. This makes it easier to optimise some internal code fragments.)
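A variable length MAC of the kind discussed here is usually obtained by truncating the full HMAC output. The Python sketch below shows that technique with HMAC-SHA1; the function names and the idea of taking the agreed length from the context are assumptions for illustration, not something SDP1 or SRTP specifies in this form.

```python
import hashlib
import hmac

FULL_LEN = hashlib.sha1().digest_size  # 20 bytes for SHA-1


def truncated_mac(key: bytes, data: bytes, length: int) -> bytes:
    """Return the leftmost `length` bytes of HMAC-SHA1(key, data)."""
    if not 1 <= length <= FULL_LEN:
        raise ValueError("MAC length must be between 1 and 20 bytes")
    return hmac.new(key, data, hashlib.sha1).digest()[:length]


def verify_truncated(key: bytes, data: bytes, tag: bytes, length: int) -> bool:
    """Check a tag against the length fixed in the (hypothetical) context."""
    # The expected length comes from the context, never from the packet,
    # so an attacker cannot shorten the tag to make guessing easier.
    if len(tag) != length:
        return False
    return hmac.compare_digest(truncated_mac(key, data, length), tag)
```

With this shape, moving from 20 bytes to 16 or to 4 is a change to one context parameter; the trade-off is that a shorter tag is easier to forge by guessing.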
It has been suggested that the "confounder" in Kerberos V is similar to the use of the Pad as the leading element, as a way of strengthening the IV.
A confounder is an extra block of random plaintext that is prepended to a message prior to encryption with a block cipher in CBC (or CTS) mode; the resulting extra block of ciphertext must also be sent to the peer. [Nicolas Williams, cryptography list. 25.04.2007]