Session Protocol: Technical implementation details
December 15, 2020 / Technical
Note: This blog is a highly specialised writeup containing detailed information about the new Session Protocol. For a less technical overview of what’s happening and why, check out our overview of the Session Protocol.
Since Session’s first release, we’ve used the Signal protocol to provide end-to-end encryption (E2EE) for all one-on-one chats, and the Sender Keys system for encryption in closed groups. Taken together, the Signal protocol and the Sender Keys system provide certain cryptographic properties to each conversation, specifically Perfect Forward Secrecy (PFS), deniability, and (in the case of the Signal protocol) self-healing. In some theoretical scenarios, these properties do protect users; however, the utility of these protections in real-world scenarios is often more limited in scope than might be expected. We must also consider that these safeguards come at the expense of additional complexity, decreased account portability, and multi-device limitations. These protocols were simply not designed to be run over a decentralised network.
To empower future development efforts, and thus ensure that Session is able to acquire as many users as possible over the course of 2021, it is necessary for us to use a more streamlined, purpose-built protocol. This will allow us to focus on delivering increased reliability, implementing features such as a scalable version of multi-device and better restoration from backups, and massively simplifying the codebase. The new Session Protocol accomplishes all of this while maintaining E2EE and sacrificing very few practical benefits over the Signal protocol.
Properties of the Signal protocol
We will begin by examining the properties provided by the Signal protocol, and how these properties function to protect users both theoretically and in real-world scenarios.
Perfect Forward Secrecy (PFS)
PFS is a feature provided by the Signal protocol which protects past conversations or messages from being read if long-term key information is exposed. This is achieved through key ratcheting: periodically deriving new shared ephemeral keys which are used for message encryption. Old keys are deleted from the device each time new keys are derived, meaning messages encrypted with them cannot be decrypted even if your long-term private key is later compromised.
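The ratcheting idea can be illustrated with a minimal symmetric key chain. This is a sketch only, not the Signal protocol's full double ratchet: the function names are hypothetical, and the use of HMAC-SHA256 with the constants `0x01` and `0x02` is an assumption chosen for illustration. Each step derives a message key and the next chain key with a one-way function, so once an old chain key is discarded it cannot be recomputed from later state.

```python
import hashlib
import hmac


def ratchet(chain_key: bytes) -> tuple[bytes, bytes]:
    """Derive (next_chain_key, message_key) from the current chain key."""
    # Two different constants give two independent one-way outputs.
    message_key = hmac.new(chain_key, b"\x01", hashlib.sha256).digest()
    next_chain_key = hmac.new(chain_key, b"\x02", hashlib.sha256).digest()
    return next_chain_key, message_key


def derive_message_keys(initial_chain_key: bytes, count: int) -> tuple[bytes, list[bytes]]:
    """Advance the chain `count` times; return the final chain key and the message keys."""
    chain_key = initial_chain_key
    message_keys = []
    for _ in range(count):
        # The previous chain key is overwritten here, which models deleting it.
        chain_key, message_key = ratchet(chain_key)
        message_keys.append(message_key)
    return chain_key, message_keys
```

Calling `derive_message_keys(seed, 3)` yields three distinct message keys. An attacker who later obtains only the final chain key would have to invert HMAC-SHA256 to recover any of them.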
PFS specifically provides protection when the long-term keys of a device are compromised. Assuming the application properly manages keypairs, the only way this should occur is through full device access. It is this detail which limits the cases in which PFS is applicable. If an attacker has full device access, decrypted messages can be pulled directly from the Signal/Session database.
PFS would be effective in the case of a user using disappearing messages or manually deleting messages while an attacker was intercepting their messages at the network level. In this case, the attacker could not decrypt the messages collected at the network level by compromising the long-term key of the device. However, the usage of disappearing messages is low, and sophisticated attackers that have full network and device access are likely to perform easier and more damaging attacks, like accessing current contact information, reading future messages, and compromising device information. Neither the Signal protocol nor the Session Protocol is currently able to protect against these attacks.
Deniability
Deniability, also known as deniable authentication, allows for both parties in a conversation to authenticate the origin of the message(s) they receive during the conversation, but prevents either party from proving the origin of the message to a third party post-conversation. This property is provided in the Signal protocol through the derivation of per-conversation shared encryption keys, which are used for encryption and authentication. Messages are not signed with a long-term identifiable key.
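A small sketch of why authentication with a shared key is deniable. It assumes a 32-byte per-conversation key and HMAC-SHA256 as the authenticator; both are stand-ins for illustration rather than the Signal protocol's actual construction, and the function names are hypothetical. Because both parties hold the same key, either of them could have produced any valid tag, so a tag convinces the recipient but proves nothing to a third party.

```python
import hashlib
import hmac
import secrets


def make_tag(shared_key: bytes, message: bytes) -> bytes:
    """Authenticate a message with the key both chat partners hold."""
    return hmac.new(shared_key, message, hashlib.sha256).digest()


def check_tag(shared_key: bytes, message: bytes, tag: bytes) -> bool:
    """Constant-time check that the tag matches the message."""
    return hmac.compare_digest(make_tag(shared_key, message), tag)


# Stand-in for the per-conversation key derived by both parties.
shared_key = secrets.token_bytes(32)

# Alice authenticates a real message; Bob accepts it.
real = b"meet at noon"
alice_tag = make_tag(shared_key, real)
bob_accepts = check_tag(shared_key, real, alice_tag)

# Bob can produce an equally valid tag for words Alice never wrote.
forged = b"something Alice never said"
forged_tag = make_tag(shared_key, forged)
forgery_verifies = check_tag(shared_key, forged, forged_tag)
```

Both `bob_accepts` and `forgery_verifies` are true, which is the point: a signature made with a long-term identity key does not have this property, because only the key holder could have produced it.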
Deniability in the Signal protocol is a cryptographic property which prevents a cryptographic proof from being provided that a person signed a message with a particular key. In practice, cryptographic deniability is often disregarded when it comes to court cases or media reporting. For example, cryptographic deniability was used unsuccessfully as a defense in a court case involving the communications of Chelsea Manning and Adrian Lamo. Instead, courts often rely on screenshots of conversations from a dishonest chat partner or seized devices to establish the real world identities of the chat participants.
Additionally, deniability fails to provide protection if a chat partner or device is compromised during the sensitive conversation. In such a case, the compromised participant or device can prove which device or chat partner sent which messages post-conversation.
Self-healing
Self-healing is a property of the Signal protocol which ensures that if a ratchet key is leaked, no future message keys can be derived from it, and thus, no future messages will be compromised. This property is provided by ongoing Diffie-Hellman key exchanges which reset sending and receiving ratchets. Without these secrets, the exposure of a ratchet key only allows the attacker to read messages associated with the current ratchet.
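The effect of mixing a fresh Diffie-Hellman output into the key state can be sketched as follows. Everything here is an illustrative assumption: the group is a toy modular-exponentiation group (the Signal protocol uses X25519, which is not in the Python standard library), the parameters are not secure, and `mix` is a hypothetical stand-in for the real root-key derivation.

```python
import hashlib
import hmac
import secrets

# Toy Diffie-Hellman parameters for illustration only. NOT secure, and not
# what the Signal protocol uses (it uses X25519).
P = 2**255 - 19
G = 2


def dh_keypair() -> tuple[int, int]:
    """Return a fresh (private, public) pair in the toy group."""
    private = secrets.randbelow(P - 3) + 2  # value in [2, P - 2]
    return private, pow(G, private, P)


def dh(private: int, their_public: int) -> bytes:
    """Compute the shared DH output as 32 bytes."""
    return pow(their_public, private, P).to_bytes(32, "big")


def mix(root_key: bytes, dh_output: bytes) -> bytes:
    """Derive a new root key from the old one and a fresh DH output."""
    return hmac.new(root_key, dh_output, hashlib.sha256).digest()


# Suppose an attacker has learned this root key.
leaked_root = secrets.token_bytes(32)

# Both parties then exchange fresh DH public keys and mix in the result.
a_priv, a_pub = dh_keypair()
b_priv, b_pub = dh_keypair()
alice_root = mix(leaked_root, dh(a_priv, b_pub))
bob_root = mix(leaked_root, dh(b_priv, a_pub))
```

Alice and Bob arrive at the same new root key. The attacker, who knows `leaked_root` but neither fresh private key, lacks the DH output needed to compute it.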
If keys are being properly managed by the application, the only way a session key could be compromised is if the attacker gains full device access. If an attacker can gain access to the device, then they can read previous plaintext messages from the database. If full device access is gained, they can also use the long-term keys of the device to request a session reset, which will lead to all future messages being compromised by the attacker.
Sender Keys
Session uses a modified version of the Sender Keys component of the Signal protocol, which was developed by Signal but is not currently used in the Signal application (the most prominent current deployment of Sender Keys is in WhatsApp). Sender Keys provide all closed group conversations with PFS and deniability, but not self-healing. These protections are primarily provided through agreeing on a shared group key using the existing 1-1 chat mechanism and then establishing a Sender Key per participant which ratchets after each message is sent.
As a derivative feature of the Signal protocol, the Sender Keys system has many of the same properties, and thus the same practical drawbacks. If an attacker compromises any of the devices in the chat and participants do not have disappearing messages turned on, the attacker can simply read all messages straight from the database. If the attacker compromises the long-term keys of any of the devices in the chat, then all future messages can be read. Deniability is provided by enforcing the non-signing of messages with each device’s long-term keys. However, just as with cryptographic deniability for 1-1 messages when used as a defense in the media or court system, cryptographic deniability for group chats is largely impractical as a real-world defense in such situations.
The Session Protocol: Rollout and technical information
The Session Protocol will be rolled out in two stages. The first stage, which was released to mobile app stores and the desktop client on Friday Dec 11 AEDT, is a modified version of the Session Protocol which maintains backwards compatibility with older Session clients still relying on the Signal protocol. This version of the protocol is detailed below as “Session Protocol with backwards compatibility”. The second stage, which will be released at a later date, will strip out the backwards compatibility layer, fully realising our vision for a streamlined E2E encryption protocol. This version of the protocol is detailed below as “Session Protocol (full implementation)”.
Session Protocol with backwards compatibility: 1-to-1 chats
A long-term X25519 keypair is generated for each Session account at time of account creation, and the public part of this keypair is the account’s “Session ID”.
Consider a scenario where Alice wants to send Message M to Bob, whose Session ID is `IPK_2`. To do that, Alice first signs Message M using her private (identity) key `ISK_1`:
S = sign(M, ISK_1);
Alice then generates an ephemeral keypair `(ESK, EPK)`, where ESK and EPK are the private and public components, respectively. She then performs an Elliptic-Curve Diffie-Hellman (ECDH) exchange over `(ESK, IPK_2)` to generate a shared secret, which is then used to derive a shared key `SK` using the key derivation function HKDF. Alice can now use a symmetric cipher (AES-GCM) with `SK` to encrypt `(S || M)` and obtain the ciphertext `C`.
The ciphertext is sent to Bob, along with the public component of the ephemeral keypair `EPK`.
Bob performs ECDH over `(ISK_2, EPK)`, repeating the same steps as Alice to obtain the secret key `SK`. Bob can now use `SK` to decrypt the ciphertext `C` and obtain `S || M`. Bob can check the validity of the signature using Alice’s identity key, ignoring the message `M` if the signature is invalid. If the signature is valid, it is deleted and the message is stored in the device’s local database.
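The key agreement in this flow can be sketched with the standard library alone. Several things here are assumptions made for illustration: the toy modular-exponentiation group stands in for X25519 and is not secure, the HKDF salt and info values are left empty because the source does not specify them, and the signing and AES-GCM steps are omitted since they need a cryptography library. What the sketch shows is that Alice, using `(ESK, IPK_2)`, and Bob, using `(ISK_2, EPK)`, derive the same `SK`.

```python
import hashlib
import hmac
import secrets

# Toy Diffie-Hellman group standing in for X25519. Illustration only, NOT secure.
P = 2**255 - 19
G = 2


def keypair() -> tuple[int, int]:
    """Return a (private, public) pair in the toy group."""
    private = secrets.randbelow(P - 3) + 2  # value in [2, P - 2]
    return private, pow(G, private, P)


def shared_secret(private: int, their_public: int) -> bytes:
    """Diffie-Hellman output as 32 bytes."""
    return pow(their_public, private, P).to_bytes(32, "big")


def hkdf_sha256(ikm: bytes, length: int = 32, salt: bytes = b"", info: bytes = b"") -> bytes:
    """HKDF (RFC 5869) with SHA-256: extract, then expand."""
    if length > 255 * 32:
        raise ValueError("requested output is too long for HKDF-SHA256")
    # Extract: an empty salt defaults to a block of zero bytes.
    prk = hmac.new(salt or b"\x00" * 32, ikm, hashlib.sha256).digest()
    # Expand: T(i) = HMAC(PRK, T(i-1) || info || i)
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


isk_2, ipk_2 = keypair()  # Bob's long-term keypair
esk, epk = keypair()      # Alice's ephemeral keypair for this message

sk_alice = hkdf_sha256(shared_secret(esk, ipk_2))  # Alice: ECDH over (ESK, IPK_2)
sk_bob = hkdf_sha256(shared_secret(isk_2, epk))    # Bob: ECDH over (ISK_2, EPK)
```

Because `EPK` travels with the ciphertext, Bob needs no prior session state to derive `SK`, which is why this scheme cannot go out of sync.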
Session Protocol (full implementation): 1-to-1 chats
Under this scheme, a message (M) consists of the message being sent, the sender’s ED25519 public key, and the recipient’s X25519 public key. This message is then signed using the crypto_sign() function in libsodium with the sender’s ED25519 private key to produce S. The plaintext message, the sender’s ED25519 public key, and the signature are then encrypted to the recipient’s X25519 public key using libsodium’s crypto_box_seal() function to produce X. X is then sent to the recipient, along with the message tag (the recipient’s public X25519 key) which ensures that the message can be identified and requested from the swarm.
m = message || sender_pubkey_ed25519 || recipient_pubkey_x25519
s = signature(m, sender_privkey_ed25519)
x = crypto_box_seal(message || sender_pubkey_ed25519 || s, recipient_pubkey_x25519)
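The byte layout implied by these formulas can be sketched as below. The function names are hypothetical, and the sketch assumes raw 32-byte keys and a detached 64-byte Ed25519 signature. Signing and sealing are left out because they require libsodium. Since the variable-length message comes first and the fixed-size fields last, the recipient can split the decrypted bytes from the end.

```python
ED25519_PUBKEY_LEN = 32  # raw Ed25519 public key size in bytes
ED25519_SIG_LEN = 64     # detached Ed25519 signature size in bytes


def signing_input(message: bytes, sender_pubkey: bytes, recipient_pubkey: bytes) -> bytes:
    """m = message || sender_pubkey_ed25519 || recipient_pubkey_x25519"""
    return message + sender_pubkey + recipient_pubkey


def pack_plaintext(message: bytes, sender_pubkey: bytes, signature: bytes) -> bytes:
    """Bytes handed to the sealed box: message || sender_pubkey_ed25519 || s"""
    if len(sender_pubkey) != ED25519_PUBKEY_LEN:
        raise ValueError("sender public key must be 32 bytes")
    if len(signature) != ED25519_SIG_LEN:
        raise ValueError("signature must be 64 bytes")
    return message + sender_pubkey + signature


def unpack_plaintext(plaintext: bytes) -> tuple[bytes, bytes, bytes]:
    """Split decrypted bytes into (message, sender_pubkey, signature)."""
    trailer = ED25519_PUBKEY_LEN + ED25519_SIG_LEN
    if len(plaintext) < trailer:
        raise ValueError("plaintext too short to contain key and signature")
    message = plaintext[:-trailer]
    sender_pubkey = plaintext[-trailer:-ED25519_SIG_LEN]
    signature = plaintext[-ED25519_SIG_LEN:]
    return message, sender_pubkey, signature
```

After unpacking, a recipient would rebuild `m` with `signing_input(message, sender_pubkey, own_x25519_pubkey)` and verify the signature against it. Binding the recipient key into the signed bytes means a signed message cannot be re-sealed unchanged for a different recipient.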
Session Protocol with backwards compatibility: Closed groups
When a user wants to establish a closed group, they will generate a shared long-term symmetric key `SK` and share it with other members via pairwise channels, as is the case in the Sender Keys protocol. However, unlike under the Sender Keys system, users will not establish individual Sender Keys. Instead the messages will be encrypted directly with `SK` using a symmetric cipher, and messages will be signed with users’ identity keys. New entrants to a group will be notified of the current `SK` so they can begin sending and receiving messages. When a user leaves, or is kicked, the group will establish a new shared group key `SK’`.
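A minimal model of this group key lifecycle, under stated assumptions: the `ClosedGroup` class and its `outbox` are hypothetical, the pairwise encrypted channel is replaced by a plain dictionary, and the key is assumed to be 32 random bytes. It shows the two rules from the paragraph above: new entrants receive the current `SK`, and a departure triggers a fresh `SK’` that only the remaining members receive.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class ClosedGroup:
    members: set[str] = field(default_factory=set)
    shared_key: bytes = field(default_factory=lambda: secrets.token_bytes(32))
    # Hypothetical stand-in for pairwise delivery: member -> keys they were sent.
    outbox: dict[str, list[bytes]] = field(default_factory=dict)

    def _send_key(self, member: str) -> None:
        """Model sending the current key to one member over a pairwise channel."""
        self.outbox.setdefault(member, []).append(self.shared_key)

    def add_member(self, member: str) -> None:
        """New entrants are told the current SK."""
        self.members.add(member)
        self._send_key(member)

    def remove_member(self, member: str) -> None:
        """When someone leaves or is kicked, rotate to a fresh key SK'."""
        self.members.discard(member)
        self.shared_key = secrets.token_bytes(32)
        for remaining in self.members:
            self._send_key(remaining)


group = ClosedGroup()
for name in ("alice", "bob", "carol"):
    group.add_member(name)
old_key = group.shared_key
group.remove_member("bob")
```

After `remove_member("bob")`, the last key in Bob's outbox is still `old_key`, while Alice and Carol hold the new one, so Bob cannot decrypt messages sent after his removal.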
Session Protocol (full implementation): Closed groups
As an overview of the scheme, first the creator of the group will derive a shared ED25519 keypair for the group. This keypair will be shared with all group members via the 1-1 chat network. Once a member has this keypair, they can identify new messages destined for the group and encrypt new messages for the group.
A message (M) consists of the message being sent, the sender’s ED25519 public key, and the group’s shared X25519 public key. This message is signed using the crypto_sign() function in libsodium with the sender’s ED25519 private key to produce S. The plaintext message, the sender’s ED25519 public key, and the signature are then encrypted to the group’s X25519 public key using libsodium’s crypto_box_seal() function to produce X. X is then sent to the group’s defined swarm, along with the message tag (the group’s X25519 public key), which ensures that the message can be identified and requested from the swarm.
m = message || sender_pubkey_ed25519 || sharedgroup_pubkey_x25519
s = signature(m, sender_privkey_ed25519)
x = crypto_box_seal(message || sender_pubkey_ed25519 || s, sharedgroup_pubkey_x25519)
Session Protocol: Advantages
Simplicity and reliability
Using existing long-term keypairs in place of the Signal protocol massively simplifies 1-1 messaging. Complex key exchange processes and sharing of prekey bundles, which are particularly difficult to manage in a decentralised environment, are no longer required. This new protocol also makes it impossible for sessions to go out of sync or for messages to be encrypted for nonexistent sessions. Both are errors frequently encountered by Session users under the current protocol.
For closed groups, removing individual sender keys removes the need to progressively ratchet keys for each user, which can also cause messages to be encrypted for nonexistent sessions.
Under the Session Protocol, multi-device can be reimplemented by simply sharing a user’s long-term key pair to a new device and duplicating sent messages back into their own swarm. This is significantly less complex and intricate than our old approach to multi-device, which required that sessions be kept constantly in sync between multiple devices which could be online or offline, without relying on a centralised server.
The removal of ephemeral keys stored only on a local device will massively increase the portability of Session accounts. This means that when restoring your Session ID, you will be able to receive new messages immediately. Under the previous implementation (i.e., the Signal protocol), new messages would be encrypted for nonexistent sessions until the user who restored their account performed a session reset.
The Signal protocol requires three complex and separate implementations to be maintained across the Android, iOS and desktop clients. The simplicity of the Session Protocol allows the encryption and decryption pipelines to be implemented in less than a page worth of code, which can effectively be standardised across all platforms. This significantly reduces the difficulty of making future changes, and also makes the protocol easier to understand and analyse from a security perspective.
Session Protocol: Theoretical disadvantages
Switching to the Session Protocol means that Session will no longer have deniability and PFS in 1-1 chats, or PFS in closed groups. However, as previously discussed, these properties provide only limited protections in real-world scenarios. To demonstrate this, it is useful to explore situations in which the Session Protocol would be weaker than the Signal protocol and old closed group Sender Keys system.
Perfect Forward Secrecy
If an attacker has full device access, neither the Session Protocol nor the Signal protocol can prevent an attacker from reading all previous messages in the conversation by pulling plaintext message history directly from the device. If the attacker has full device access and the user has disappearing messages turned on, both the Session Protocol and the Signal protocol prevent the attacker from reading any previous messages.
If an attacker has both full device access and targeted network scraping capability, and the user has disappearing messages turned on, the Signal protocol can protect users’ previous messages from being read, while the Session Protocol will not. The situation for closed groups is similar. There are ways for Session to limit the scope of this attack by limiting the data which an attacker can scrape, which will be discussed later.
Deniability
All messages sent using the Session Protocol are signed with the long-term key of the sender, and thus a cryptographic proof can be provided to a third party proving that a particular keypair did indeed sign a message.
However, since signatures are immediately deleted after validation, this limits the Session Protocol’s divergence from the practical benefits of the Signal protocol in real-world scenarios. Practically speaking, an attacker or recipient would need to have compromised the device’s long-term key pair and be able to scrape messages off the network or off the device before the signature was deleted in order for this difference to have any meaningful security implications, and as with the scraping attack discussed above, there are protections which can be deployed against this type of scraping.
Self-healing
The Session Protocol has no session keys, and thus the self-healing property of the Signal protocol is not retained. If long-term keys are compromised in either protocol, an attacker can compromise future messages by requesting a session reset.
Addressing theoretical disadvantages
Scraping: Protections and mitigations
Since all messages in Session are onion-routed to their destination swarm, they cannot be scraped during transit by ISPs or the Service Nodes in the onion routing path due to the additional layers of encryption provided by onion routing. Messages can however be scraped from the swarm, either by any of the nodes in the recipient’s swarm (a swarm typically consists of 5-7 Service nodes) or by another entity which makes a request to a node in the recipient’s swarm.
Currently, message retrieval requests are unauthenticated, meaning any user can request any other user’s (encrypted) messages. To limit scraping, we will require a signature from the requester’s long-term key. This signature will be validated by the Service Node, which ensures that unauthorised parties cannot scrape data from swarms unless they can produce a signature from the long-term key that the stored messages are addressed to. This does not prevent scraping in every possible scenario; a sufficiently well-resourced and motivated attacker could run Service Nodes and pull messages directly from their own database. However, this would be extremely costly due to the staking requirement needed to run a Service Node (~US$15,000). Further, since Service Nodes cannot choose their swarms, targeting particular users would be extremely difficult to the point of practical impossibility.
On the loss of cryptographic deniability
As previously mentioned, cryptographic deniability is largely ignored by the court system and the media. If contextual information can be provided around screenshots, this is often enough to lead to a conviction or personal damages, regardless of the presence or absence of cryptographic deniability.
Instead of designing a cryptographic protection, Session will add the ability to edit other users’ messages locally, thus providing a way to completely forge conversations. Since signatures are deleted after messages are received, there will be no way to prove whether a screenshot of a conversation is real or edited, diminishing the value of screenshots as evidence.
There are certain other aspects of Session which further limit the need for use of the Signal protocol.
Ephemeral identities and anonymous account creation
Unlike in Signal, where accounts are tied to their phone numbers and tend to be established and maintained for a long period of time, Session is built on the concept of ephemeral identities. Session IDs can be created and abandoned on a per-conversation basis, weekly, monthly or as often as a user likes. If you want to destroy your account and create a new one, you can do so at any time, and there will be absolutely no cryptographic link between that account and your real world identity. This limits the likelihood of key compromise or message scraping.
Session’s use of onion routing for message sending and receiving increases deniability at the network level since no central server has a log of when messages are retrieved or sent, or which IP addresses are sending or receiving those messages. This is not the case in Signal; Signal’s central servers have a log of which IP address sends which message and which IP address retrieves that message.
The Session Protocol massively reduces application complexity, makes sessions easier to manage, and increases account portability, all while increasing reliability and making highly-requested features like multi-device, account restoration, and other upcoming improvements significantly easier to implement.
The practical protections provided by the Session Protocol and the Signal protocol in real-world scenarios are similar. Furthermore, features being added to the Session application, such as authenticated Service Node requests and recipient-side message editing, bring the practical benefits of the two protocols virtually in line with each other — all while massively increasing message sending and receiving efficiency, and enabling development of exciting new features.