Crypto features
Forward secrecy
Authenticity
(Deniability)
Security in general, and end-to-end encryption in particular, is always a tradeoff. You can't have everything, and you basically need to know what you want to be secure against.
However, when designing OMEMO we wanted it to have some specific cryptographic features:
First of all, we wanted it to have forward secrecy, which means that if the keys are compromised at any time, an attacker won't automatically gain access to past conversations.
Second, we wanted it to have authenticity, which of course means that messages can't be spoofed by a third party.
XEP-0027, for example, has neither of these features: it encrypts to long-lived OpenPGP keys and doesn't sign the messages.
Deniability is also a feature common in modern end-to-end encryption methods, but I'm putting this in brackets because I personally doubt that whoever breaks your encryption will care that mathematically someone else could have written those messages.
Usability features
Reliable
Multiple devices
Existing XMPP servers
Aside from these cryptographic features, we wanted OMEMO to have a few more features.
Most importantly, we wanted it to be reliable, and we wanted it to work with multiple devices as well as in situations where the recipient is offline.
OTR, which has been used in the past, did neither of these things.
Furthermore, we wanted OMEMO to work with existing servers and not require server admins to install an additional module or something like that.
An Introduction to Ratchets
Key exchange at beginning of ‘session’
Message keys within a ‘session’ are derived from each other
Message keys are destroyed after decryption
Our cryptographic requirements can be implemented with what is commonly known as a ratchet.
The basic concept is that there is a key exchange at the beginning of a session - I'll come to what a session is in a minute - and message keys within that session are all derived from each other. Each message has its own key, and that key is destroyed after successful decryption.
This means you can only decrypt each message once. The key derivation can only move forward. Like a ratchet.
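The forward-only derivation can be sketched as a simple hash chain. This is only an illustration of the idea, not Axolotl's actual key schedule: the HMAC-SHA256 construction, the constants `b"\x01"` and `b"\x02"`, and the name `ChainRatchet` are assumptions made for this example.

```python
import hashlib
import hmac


class ChainRatchet:
    """Toy symmetric ratchet: every step yields one message key and advances the chain."""

    def __init__(self, root_secret: bytes) -> None:
        # In a real session this secret would come from the initial key exchange.
        self._chain_key = root_secret

    def next_message_key(self) -> bytes:
        # Derive the message key and the next chain key from the current chain key.
        message_key = hmac.new(self._chain_key, b"\x01", hashlib.sha256).digest()
        next_chain_key = hmac.new(self._chain_key, b"\x02", hashlib.sha256).digest()
        # Replace the old chain key. HMAC is one-way, so earlier keys cannot
        # be recomputed from the state that remains.
        self._chain_key = next_chain_key
        return message_key


# Both sides start from the same secret and stay in step.
sender = ChainRatchet(b"shared secret from key exchange")
receiver = ChainRatchet(b"shared secret from key exchange")
first = sender.next_message_key()
second = sender.next_message_key()
```

Someone who steals the current chain key can derive future keys of this chain, but not `first` or `second`, which is the property the ratchet metaphor refers to.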
OTR vs Axolotl
Lifelong sessions
First part of key exchange stored on server (pre keys)
Resilient against lost messages, reordering, replay
OTR uses such a ratchet. When TextSecure came along, they introduced sort of an improved version of OTR called Axolotl that brought three important changes.
While sessions in OTR were rather short-lived, lasting basically as long as your chat window is open, sessions in Axolotl are used over the entire lifespan of the program.
The second improvement was that the first part of a key exchange is stored on the server, which means the recipient doesn't have to be online anymore to establish the session.
The most important change, however, is that Axolotl is resilient against message loss and reordering, which is vital when your transport isn't 100% reliable.
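One common way to get this resilience is to number the messages and keep the keys of skipped messages around until they arrive. The sketch below shows that idea on a toy hash chain. The class name `ReceivingChain`, the `max_skip` limit and the HMAC constants are assumptions for illustration and not the wire-level behaviour of any particular library.

```python
import hashlib
import hmac
from typing import Dict


class ReceivingChain:
    """Toy receiving chain that tolerates lost and reordered messages."""

    def __init__(self, chain_key: bytes, max_skip: int = 1000) -> None:
        self._chain_key = chain_key
        self._next_counter = 0
        self._skipped: Dict[int, bytes] = {}  # counter -> unused message key
        self._max_skip = max_skip

    def _step(self) -> bytes:
        # Advance the chain by one message and return that message's key.
        message_key = hmac.new(self._chain_key, b"\x01", hashlib.sha256).digest()
        self._chain_key = hmac.new(self._chain_key, b"\x02", hashlib.sha256).digest()
        self._next_counter += 1
        return message_key

    def key_for(self, counter: int) -> bytes:
        if counter < self._next_counter:
            # A late message: use its stored key exactly once.
            # A replayed message finds no key and raises KeyError.
            return self._skipped.pop(counter)
        if counter - self._next_counter > self._max_skip:
            raise ValueError("too many skipped messages")
        # Store the keys of every message we have not seen yet.
        while self._next_counter < counter:
            skipped_counter = self._next_counter
            self._skipped[skipped_counter] = self._step()
        return self._step()
```

Because each stored key is removed when it is used, a message that arrives twice cannot be decrypted a second time, which covers the replay case from the slide.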
Naming confusion
Axolotl renamed to Signal Protocol
Copyright claims by OWS led to several third party libraries with the same algorithm
Actual algorithm public domain (Double Ratchet)
Libraries differ in wire (binary) format
A quick note: after the introduction of OMEMO, Axolotl was renamed to Signal Protocol, and after copyright claims by Open Whisper Systems several third party libraries were introduced that basically did the same thing.
However, the wire format of those third party libraries might not be compatible with the original Axolotl, and thus the libraries aren't compatible with each other.
For simplicity I'm going to continue talking about Axolotl. Just know that from a technological standpoint it doesn't make a difference what library you choose.
History of OMEMO
Google Summer of Code 2015 (June - September)
Conversations release in September
protoXEP in October 2015
Gajim Plugin Christmas 2015
ChatSecure Beta November 2016
As of yesterday: XEP-0384
Just for a very brief history of OMEMO:
OMEMO - or rather a first proof of concept implementation - was implemented in Conversations during the Google Summer of Code 2015.
That implementation was first released to the public in September 2015.
A protoXEP was released in October 2015.
Since December 2015 there is a plugin for Gajim.
And since very recently the iOS client ChatSecure has basic support for OMEMO.
And of course since yesterday OMEMO is an experimental XEP with the number 384.
Integrating Axolotl in XMPP
Session for every device
Pre keys are stored in PEP for each device
Index node holds list with devices
Clients subscribe (+notify) to index node
How does one integrate Axolotl into XMPP?
Axolotl itself isn't capable of encrypting to multiple devices, so we have to create sessions for all devices involved. That's all of our devices and all of our contacts' devices.
We use PEP to store the pre keys. One node for every device.
And we have an index node that holds a list of all our devices. Clients subscribe to that node to get notified of new devices.
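What a client does when the index node changes can be sketched as follows. The node names here are hypothetical placeholders, not the names used by OMEMO, and `on_device_list_update` is an assumed helper. The only point is the bookkeeping: compare the announced device list with the sessions you already have and fetch the pre keys for the rest.

```python
from typing import List, Set

# Hypothetical node names, chosen for this example only.
DEVICE_LIST_NODE = "example.omemo.devicelist"
BUNDLE_NODE_PREFIX = "example.omemo.bundles:"


def bundle_node(device_id: int) -> str:
    """Name of the PEP node that holds the pre keys of one device."""
    return f"{BUNDLE_NODE_PREFIX}{device_id}"


def on_device_list_update(announced: List[int], have_session: Set[int]) -> List[str]:
    """Return the nodes to fetch for devices we have no session with yet."""
    missing = sorted(set(announced) - have_session)
    return [bundle_node(device_id) for device_id in missing]


# A contact announces two devices; we already have a session with 678.
to_fetch = on_device_list_update([12345, 678], {678})
```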
Integrating Axolotl in XMPP (2)
Individual node for every device (namespace contains device id)
Every node has >100 pre keys. Clients pick one at random
<body/> gets encrypted with random key (AES-GCM)
Random key gets encrypted in n sessions
Since every device has its own PEP node, we have to abuse namespaces to carry the device id.
Every device node holds about 100 pre keys, and your contact would pick one at random.
To send a message, we first encrypt the body with a random key using AES-GCM
and then encrypt that key with all of our sessions.
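The shape of that two-step encryption can be sketched like this. To keep the example free of third party dependencies, the actual ciphers are passed in as functions: `aead_encrypt` stands in for an AES-GCM implementation and each entry in `sessions` stands in for one Axolotl session. These names, the 128-bit key and the 96-bit IV are assumptions of this sketch.

```python
import os
from dataclasses import dataclass
from typing import Callable, Dict

# (key, iv, plaintext) -> ciphertext; stands in for an AES-GCM implementation.
AeadEncrypt = Callable[[bytes, bytes, bytes], bytes]
# key material -> ciphertext; stands in for encrypting within one session.
SessionEncrypt = Callable[[bytes], bytes]


@dataclass
class EncryptedMessage:
    iv: bytes
    payload: bytes                  # the body, encrypted once
    wrapped_keys: Dict[int, bytes]  # device id -> random key encrypted for that device


def encrypt_for_devices(
    body: bytes,
    sessions: Dict[int, SessionEncrypt],
    aead_encrypt: AeadEncrypt,
) -> EncryptedMessage:
    key = os.urandom(16)  # fresh random key for this message (128 bits assumed)
    iv = os.urandom(12)   # fresh IV for this message (96 bits assumed)
    payload = aead_encrypt(key, iv, body)
    # Only the small key is encrypted n times, once per session.
    wrapped_keys = {device_id: encrypt(key) for device_id, encrypt in sessions.items()}
    return EncryptedMessage(iv=iv, payload=payload, wrapped_keys=wrapped_keys)
```

The design choice this illustrates is that the cost per additional device is one short key encryption, while the message body is encrypted only once regardless of the number of devices.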
OMEMO over MUC
n * m sessions w/ n : average number of devices/participant, m : number of participants
Requires presence subscription with each participant
Reliably detect participants (retrieve member list) & changes (notification of affiliation changes even if participant is offline)
MUC MAM needs to communicate real JID
Using OMEMO over MUC works just the same way. You just encrypt to every device of every participant. Since there is no connection between resources and devices we have to encrypt to every device even though that device might not actually be in that conference at this time.
Of course, since OMEMO requires PEP, we need to have presence subscription with every participant, and we need a way to reliably detect the real JID of the participants in a MUC. This means the member list has to be retrievable by everybody, and the MUC should notify about affiliation changes.
If we want access to the history, MAM also needs to communicate the real JID.
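To make the n * m figure concrete, here is a small sketch of how the recipient set for a conference message could be assembled. The function name and the example JIDs are made up for illustration. The input is the member list keyed by real JID, with the device ids taken from each member's index node.

```python
from typing import Dict, List, Tuple


def muc_recipient_devices(members: Dict[str, List[int]]) -> List[Tuple[str, int]]:
    """List every (real JID, device id) pair the message key must be encrypted to.

    Presence in the room is deliberately ignored: resources and devices are not
    linked, so every device of every member gets a copy of the key.
    """
    return [
        (jid, device_id)
        for jid in sorted(members)
        for device_id in members[jid]
    ]


members = {
    "alice@example.org": [101, 102],
    "bob@example.org": [201],
    "carol@example.org": [301, 302, 303],
}
recipients = muc_recipient_devices(members)
# Three members with two devices on average: six key encryptions per message.
```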
OMEMO over MIX
Presence subscription no longer required
Improved method to detect participant changes
OMEMO over MIX will probably work very similarly to the way it works in MUC.
We probably don't need presence subscription anymore since we can just put a copy of the pre keys in a MIX node.
And it will probably be a little less hacky to watch participant changes.
Accomplishments
Hassle-free communication between multiple devices
Works with carbons and MAM
Even in (private, non-anonymous) conferences
Existing servers not so much. PEP was a rocky road
So what can we do right now? We set out to create an end-to-end encryption that is reliable enough to be always on without the user noticing it. We wanted the user to be able to switch between multiple devices. That's totally possible now if you use Conversations and Gajim. I'm using it on a daily basis with several of my contacts. Even in MUC.
One of our goals was to make it work with existing servers, and that led us to choose PEP as a storage mechanism despite downsides like delivering 100 pre keys even though we just need one.
That actually wasn't as successful as we hoped it would be, because a year ago PEP was very, very broken in ejabberd and people had to update their servers anyway. Openfire actually just released a fix for their PEP two weeks ago.
And when looking at how fast HTTP Upload got adopted we probably could have created our own server component.
Future
Provide a way to optionally encrypt XML
No plan to support large conferences
The next revision of OMEMO will probably contain a way to optionally encrypt an XML payload as well, but keep the plain text encryption as a fallback for clients that don't want to start an XML parser from their XML parser.
There are currently no plans to support larger conferences where the encrypting with every session approach doesn't work. Let us run into problems first before we fix them. We don't actually know yet how many users we can support this way.
Trust multiple devices
TOFU doesn’t work if you can’t judge plausibility
Signal/WhatsApp are migrating to trust everything
Conversations >1.15.0 had manual mode: let user decide for each and every new device
BTBV (Blind Trust Before Verification): trust everything by default. After first actual verification (scanning barcode) go to manual mode and let user decide for subsequent devices
OMEMO is capable of encryption, but OMEMO doesn't have an answer on how to verify the individual devices.
In general, Trust on First Use doesn't work very well because people change keys and you can't judge if they actually did or if you are being man-in-the-middled. Compare that to SSH, where you usually control the server as well and where you have a chance to judge if it was plausible that the server got new keys because you reinstalled the operating system.
For this reason the mainstream messengers are switching to Trust Everything and merely warn.
So Conversations introduced this new mode called Blind Trust Before Verification, where we trust everything by default but stop trusting everything after the first actual verification.
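The decision rule can be written down in a few lines. This is a sketch of the rule as stated here, not the code from Conversations: the names `Trust` and `ContactDevices` are made up, and leaving previously blindly trusted devices untouched after the first verification is an assumption of this example.

```python
from enum import Enum
from typing import Dict


class Trust(Enum):
    BLIND = "trusted blindly"
    VERIFIED = "verified"
    UNDECIDED = "waiting for a manual decision"


class ContactDevices:
    """Trust state for the devices of one contact under BTBV."""

    def __init__(self) -> None:
        self.states: Dict[int, Trust] = {}

    def _has_verified_device(self) -> bool:
        return Trust.VERIFIED in self.states.values()

    def on_new_device(self, device_id: int) -> Trust:
        if device_id in self.states:
            return self.states[device_id]
        # Before the first verification: trust blindly.
        # Afterwards: manual mode, the user has to decide.
        state = Trust.UNDECIDED if self._has_verified_device() else Trust.BLIND
        self.states[device_id] = state
        return state

    def verify(self, device_id: int) -> None:
        # Called after an actual verification, e.g. scanning a barcode.
        self.states[device_id] = Trust.VERIFIED
```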
I don't have time to go into the details, but I'd love to chat with you about this later.