The future of encryption is AI: Chat Bots for undetectable strong-encryption

The rise of state surveillance means that the very act of using secure communication services (Tor, Vuvuzela, proxy servers, etc.) can raise suspicion of nefarious activity. Government organizations have actively tried to infiltrate and monitor secure systems such as Tor and proxies, and have passed legislation to enable this.

This means that a secure communication system is only effective if it is widely adopted, since merely using one renders the user suspicious.

Unfortunately, the majority of people are fully comfortable using standard services such as Facebook Messenger, WhatsApp or iMessage despite the privacy implications. Such technologies do have encryption capabilities, but they are still susceptible to search and seizure without notification to the individual because of gag orders. Given the continuing historical precedent of irrational, nefarious and incompetent government, this is incredibly scary.

The ultimate solution, in my opinion, is to leverage existing social networks as “carriers” for encrypted message traffic. After all, encrypted traffic is just a series of bytes, which can easily be base64 encoded and sent over any medium.

The real issue is not that this is impossible, but rather that it is easy to detect and to track. Let’s take the example of the simple message “Hello World”, encrypted using a 256-bit AES cipher with the key “foobar” and then base64 encoded:

ODuX2B2GyhgwqlVbVJNLIOSyicYr7EJR8hS+MDhOCJM=


The encrypted message above is easy to transmit via Facebook Messenger or any other messaging technology. Unfortunately, it is also very easy to detect: it does not resemble regular human language and would raise suspicion.
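To make the detection problem concrete, here is a minimal sketch of how a carrier could flag such a message. The function name `looks_like_base64` and the `min_length` threshold are my own illustrative assumptions, not something any real service is known to use; the point is only that a few lines of code suffice.

```python
import base64
import binascii
import re

# Characters of the standard base64 alphabet, followed by optional padding.
BASE64_RE = re.compile(r"[A-Za-z0-9+/]+={0,2}")


def looks_like_base64(text, min_length=16):
    """Naive heuristic (an assumption for illustration): flag long,
    space-free strings that decode cleanly as base64."""
    if len(text) < min_length or len(text) % 4 != 0:
        return False
    if not BASE64_RE.fullmatch(text):
        return False
    try:
        base64.b64decode(text, validate=True)
    except binascii.Error:
        return False
    return True


ciphertext = "ODuX2B2GyhgwqlVbVJNLIOSyicYr7EJR8hS+MDhOCJM="
print(looks_like_base64(ciphertext))
print(looks_like_base64("Wanna watch a movie?"))
```

A real detector would be more sophisticated, but even this heuristic separates the ciphertext from an ordinary chat line.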

So how can we make this avoid detection?

By utilizing AI!

Let’s break up the above example using what I dub the “chatbot” encoding scheme. The high-level idea is that two AI chatbots talk to each other in such a way as to encode the base64 string in a manner that resembles real-world conversation and is statistically difficult to detect. To generate the encoding scheme, we take millions of real-world conversations and generate permutations of them by adding things like emojis or punctuation marks. Each “conversation” between two people then corresponds to a unique encoded transmission.
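As a rough illustration of the permutation step, the sketch below expands one conversation line into several surface variants. The `variants` helper and the particular emoji and ending lists are hypothetical; a real system would draw them from a large corpus rather than from hard-coded tuples.

```python
import itertools


def variants(line, endings=("", "!", "..."), emojis=("", " :)", " :D")):
    """Return distinct surface variants of one conversation line.

    The endings and emojis are illustrative assumptions only.
    """
    seen = []
    for ending, emoji in itertools.product(endings, emojis):
        candidate = f"{line}{ending}{emoji}"
        if candidate not in seen:
            seen.append(candidate)
    return seen


for text in variants("Hey there"):
    print(text)
```

Each variant can then stand for a different group of characters, which is what lets a small set of real conversations cover a large table.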

The scheme starts with the sender (“Alice”) sending the receiver (“Bob”) the first message over the carrier (“Facebook”). Alice takes the first N characters of her base64-encoded message and looks them up in the “sender” table.

Since her original string was ODuX2B2GyhgwqlVbVJNLIOSyicYr7EJR8hS+MDhOCJM=, she will look up “ODu” in the sender table (we will assume N=3). The sender table is keyed on two values: the characters to encode (“ODu”) and the last message received (the empty string, since she is starting the transmission):

SENDER[‘ODu’, ‘’] –> “Hey there…”
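In code, this first step might look like the sketch below. The `split_chunks` and `next_message` helpers are names I am introducing for illustration, and the table holds only the single entry shown above; a real table would have an entry for every group of characters and every possible previous message.

```python
CIPHERTEXT = "ODuX2B2GyhgwqlVbVJNLIOSyicYr7EJR8hS+MDhOCJM="
N = 3  # characters encoded per chat message, as assumed in the text


def split_chunks(text, n=N):
    """Split the base64 string into groups of n characters."""
    return [text[i:i + n] for i in range(0, len(text), n)]


# Toy sender table: (characters to encode, last message received) -> message.
SENDER = {
    ("ODu", ""): "Hey there…",
}


def next_message(chunk, last_received):
    """Look up the chat message that encodes this chunk in this context."""
    return SENDER[(chunk, last_received)]


chunks = split_chunks(CIPHERTEXT)
print(chunks[:2])
print(next_message(chunks[0], ""))
```

With N=3 the 44-character string splits into 15 groups, so Alice needs 15 chat messages to send it.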

Now Bob has to send a response to keep the transmission going. He looks up the possible responses to the message he received:

RECEIVER[‘Hey there…’] –> [“Hi :)”, “Yo”, “whatss up”]

Bob picks a response at random (let’s assume it’s “whatss up”). Alice receives the response and moves on to encoding the next 3 characters (“X2B”), keyed on the last message received (“whatss up”):

SENDER[‘X2B’, ‘whatss up’] –> “Wanna watch a movie?”

This scheme continues until the entire encrypted string has been transmitted. Bob, who shares the same tables, recovers each group of characters by reversing the lookup on Alice’s message and his own previous reply. This is powerful because the text is difficult to distinguish from normal conversation: it could very well be that these two people actually wanted to watch a movie, rather than transmitting the message “Hello World” in encrypted form.
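The whole exchange can be sketched end to end as follows. This is a toy under loud assumptions: the phrase lists, the `build_sender_table` helper, and the way phrases are assigned to chunks are all invented for illustration, the table only covers the chunks of this one message, and Bob always draws from the same three replies. It also assumes that both sides share the table, so that Bob can invert it.

```python
import itertools
import random

CIPHERTEXT = "ODuX2B2GyhgwqlVbVJNLIOSyicYr7EJR8hS+MDhOCJM="
N = 3
REPLIES = ["Hi :)", "Yo", "whatss up"]  # Bob's candidate replies (toy)


def split_chunks(text, n=N):
    return [text[i:i + n] for i in range(0, len(text), n)]


def build_sender_table(chunks, contexts):
    """Toy table: (chunk, last message received) -> cover phrase.

    Phrases are invented; within one context every chunk gets a
    distinct phrase, so the mapping can be inverted by the receiver.
    """
    openers = ["Hey", "So", "Btw", "Ok"]
    bodies = ["wanna watch a movie", "did you eat yet",
              "are you home", "call me later"]
    endings = ["?", "??", " :)", "!"]
    phrases = [f"{o}, {b}{e}"
               for o, b, e in itertools.product(openers, bodies, endings)]
    unique = sorted(set(chunks))
    if len(unique) > len(phrases):
        raise ValueError("not enough cover phrases for this message")
    table = {}
    for k, context in enumerate(contexts):
        for i, chunk in enumerate(unique):
            # Shift by context so the same chunk reads differently
            # depending on what Bob said last.
            table[(chunk, context)] = phrases[(i + 7 * k) % len(phrases)]
    return table


def transmit(ciphertext, sender, rng):
    """Alice encodes one chunk per message; Bob replies at random."""
    transcript, last_reply = [], ""
    for chunk in split_chunks(ciphertext):
        transcript.append(("Alice", sender[(chunk, last_reply)]))
        last_reply = rng.choice(REPLIES)
        transcript.append(("Bob", last_reply))
    return transcript


def recover(transcript, sender):
    """Bob inverts the shared table to rebuild the base64 string."""
    inverse = {(ctx, phrase): chunk
               for (chunk, ctx), phrase in sender.items()}
    recovered, last_reply = [], ""
    for speaker, message in transcript:
        if speaker == "Alice":
            recovered.append(inverse[(last_reply, message)])
        else:
            last_reply = message
    return "".join(recovered)


sender = build_sender_table(split_chunks(CIPHERTEXT), [""] + REPLIES)
transcript = transmit(CIPHERTEXT, sender, random.Random(0))
for speaker, message in transcript[:4]:
    print(f"{speaker}: {message}")
print(recover(transcript, sender) == CIPHERTEXT)
```

Note that the chat layer only hides the ciphertext; the confidentiality of “Hello World” still rests entirely on the AES step that produced the base64 string.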

Of course, detection still becomes possible as more and more messages are sent, but this can be combated using increasingly complex language models. Furthermore, this pushes detection into the realm of AI rather than code-breaking: messages first need to be identified as encrypted before any code-breaking strategy can even be attempted. To identify them, the computer must differentiate between human conversation and automated conversation, which is a much harder problem. Finally, a Tor-like network could be built on top of Facebook using this algorithm: Facebook would have to use complex graph algorithms to differentiate these “bots” from real accounts, and would almost certainly face a very high false positive rate.

My goal is to demonstrate the high-level idea that natural language paired with chatbot-like behavior can create a simple protocol that is very difficult to detect and still guarantees strong encryption. The example I have given purposefully skimps on the aspect of the algorithm that can be greatly improved (the language model). There are several extensions that would make the chat behavior more plausible (e.g. utilizing n-gram keys and more complex encoding schemes).
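One possible reading of “n-gram keys” is to key the sender table on the last n messages of the conversation instead of only the most recent one. The sketch below shows just that key construction; the `context_key` helper and the choice to pad with empty strings are my assumptions, not a fixed part of the scheme.

```python
def context_key(history, n=2):
    """Return the last n messages as a tuple, left-padded with ""
    so that the start of a conversation still yields a full key."""
    recent = list(history[-n:]) if n > 0 else []
    padding = [""] * (n - len(recent))
    return tuple(padding + recent)


# A table entry would then be keyed as (chunk, context_key(history)).
history = ["Hey there…", "whatss up"]
print(("X2B", context_key(history)))
```

A longer context lets the same group of characters map to different, more situation-appropriate messages, at the cost of a much larger table.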

The future of secure communication and encryption must leverage artificial intelligence.