This notebook serves as a technical introduction, demonstration, and collaboration baseline for this repository, PyCryption. It favors ‘fire-and-forget’ tooling so that you can write and test custom encryption systems.
The rigorous way to measure performance would be to analyze mathematical complexity, but this notebook provides a good baseline for understanding how your system performs, especially in prototyping and production.
Some good references I enjoyed learning from:
https://dangduongminhnhat.github.io
0.0 - Getting Started: Building a Basic Algorithm
In this section we’ll explore how to rapidly test new encryption algorithms without writing tedious test harnesses.
0.1 Understand the Mental Model
Define your encryption algorithm as a class with two endpoint functions, encrypt and decrypt, each of which accepts the data and the algorithm context.
Run quick_test to confirm that data flows through both endpoints.
Register other models and algorithms to compare.
Refine and/or repeat.
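The steps above can be sketched without the framework. This is a minimal illustration, not PyCryption's implementation: `ToyContext`, `ReverseXor`, and `round_trip_ok` are hypothetical names invented here, and the check only mirrors what a round-trip test would plausibly verify.

```python
from dataclasses import dataclass
import os


@dataclass
class ToyContext:
    """Hypothetical stand-in for the algorithm context: it only carries a key."""
    key: bytes


class ReverseXor:
    """Toy algorithm exposing the two endpoints, encrypt and decrypt."""

    def encrypt(self, data: bytes, ctx: ToyContext) -> bytes:
        # XOR every byte with the first key byte, then reverse the byte order
        return bytes(b ^ ctx.key[0] for b in data)[::-1]

    def decrypt(self, data: bytes, ctx: ToyContext) -> bytes:
        # Undo the steps in the opposite order: reverse, then XOR
        return bytes(b ^ ctx.key[0] for b in data[::-1])


def round_trip_ok(algo, ctx, samples) -> bool:
    """Return True if decrypt(encrypt(x)) == x for every sample."""
    return all(algo.decrypt(algo.encrypt(s, ctx), ctx) == s for s in samples)


ctx = ToyContext(key=os.urandom(16))
samples = [b"", b"a", b"hello world", os.urandom(1024)]
print(round_trip_ok(ReverseXor(), ctx, samples))
```

Including the empty input and a random buffer among the samples is a cheap way to catch off-by-one and encoding mistakes before comparing algorithms.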
0.2 Encryption Helpers and Algorithm Context
AlgorithmContext is a powerful component: it is the communication interface between your algorithm and the cryptographic components you add to it, such as keys, salts, nonces, and key derivation functions.
from lib.notebook import (
    algorithm,
    with_key,
    generate_key,
    AlgorithmContext,
    ComposerSession,
    ReportBuilder,
    quick_test,
    with_memory_profiling,
)

# Initialize report builder for styled output
report = ReportBuilder()

# -----------------------------------------------------------------------------
# Prototype Algorithm Development
# -----------------------------------------------------------------------------
# Use decorators for logistics (key injection, context, metrics).
# Write your own experimental crypto logic in the methods.

KEY = generate_key(32)


@algorithm("XOR-Prototype")
@with_key(KEY)
@with_memory_profiling()
class XORPrototype:
    """
    Simple, insecure XOR-based prototype for testing the framework.
    """

    def encrypt(self, data: bytes, ctx: AlgorithmContext) -> bytes:
        # Prototype: simple repeating-key XOR
        key = ctx.key
        return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

    def decrypt(self, data: bytes, ctx: AlgorithmContext) -> bytes:
        # XOR is symmetric
        key = ctx.key
        return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


# Verify the prototype works
quick_test(XORPrototype())
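To see why the prototype can reuse the same expression for both directions, and why it is labeled insecure, here is a standalone sketch that uses only the standard library. The helper `xor_bytes` and the sample key and message are illustrative assumptions, not part of PyCryption.

```python
def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Repeating-key XOR, the same operation as the prototype's encrypt/decrypt."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


key = bytes(range(1, 9))  # illustrative 8-byte key, not a real secret
plaintext = b"attack at dawn, attack at dusk"

ciphertext = xor_bytes(plaintext, key)

# Symmetry: applying the same key a second time restores the plaintext
recovered = xor_bytes(ciphertext, key)

# Known-plaintext weakness: plaintext XOR ciphertext leaks the keystream,
# and its first len(key) bytes are the key itself
leaked = bytes(p ^ c for p, c in zip(plaintext, ciphertext))[: len(key)]

print(recovered == plaintext, leaked == key)
```

Anyone who can guess a stretch of plaintext at least as long as the key recovers the key outright, which is why this construction is only suitable for exercising the test harness.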
The last example was a simple byte emitter, but new capabilities are easy to add. Let’s look at Salsa20, whose descendant, ChaCha, followed in 2008. Though dated, it uses much the same inputs and outputs as modern ciphers.
import hashlib

from Crypto.Cipher import Salsa20

from lib.notebook import with_metrics


def salsa_kdf(key: bytes, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", key, salt, 100000, dklen=32)


@algorithm("Salsa20-Prototype")
@with_key(KEY)
@with_metrics()
@with_memory_profiling()
class Salsa20Prototype:
    """
    A classic stream cipher; its successor, ChaCha, was published in 2008.
    """

    def encrypt(self, data: bytes, ctx: AlgorithmContext) -> bytes:
        # Register KDF and salt, derive key, cache result
        ctx.set_kdf("salsa-kdf", salsa_kdf)
        ctx.set_salt("salsa-salt")

        # Derive a key for this encryption from the original key (KDF + salt)
        # and store/cache it
        derived = ctx.derive("salsa-kdf", "salsa-salt", cache_as="salsa-derived")

        cipher = Salsa20.new(derived)
        ctx.set_nonce("salsa-nonce", cipher.nonce)
        return cipher.encrypt(data)

    def decrypt(self, data: bytes, ctx: AlgorithmContext) -> bytes:
        # Retrieve cached materials from registry
        derived = ctx.get_derived_key("salsa-derived")
        assert derived is not None, "Derived key not found in context"

        nonce = ctx.get_nonce("salsa-nonce")
        assert nonce is not None, "Nonce not found in context"

        # Create cipher and decrypt
        cipher = Salsa20.new(derived, nonce)
        return cipher.decrypt(data)


# Verify the prototype works
quick_test(Salsa20Prototype())

# NOTE: session, adapt, and Aes256GcmAlgorithm are not defined in the cells
# shown here; they are assumed to come from an earlier cell.

# Load the Salsa20 cipher with the password KDF into our active session
session.register(Salsa20Prototype(), "Salsa20-Prototype")

# Load the AES-256-GCM cipher with memory profiling
session.register(adapt(Aes256GcmAlgorithm, KEY, name="AES-256-GCM", profile_memory=True))
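The decrypt path depends on the derived key being reproducible. As a rough illustration using only `hashlib`, with the same PBKDF2 parameters as `salsa_kdf`, the sketch below shows that derivation is deterministic for a given key and salt and changes when the salt changes. The name `derive_key` and the sample values are assumptions made for this example.

```python
import hashlib
import os


def derive_key(password: bytes, salt: bytes, iterations: int = 100_000) -> bytes:
    """Derive a 32-byte key with PBKDF2-HMAC-SHA256 (same call as salsa_kdf)."""
    return hashlib.pbkdf2_hmac("sha256", password, salt, iterations, dklen=32)


master = b"example master key"  # illustrative value only
salt_a = os.urandom(16)
salt_b = os.urandom(16)

k1 = derive_key(master, salt_a)
k2 = derive_key(master, salt_a)  # same inputs, so the same derived key
k3 = derive_key(master, salt_b)  # different salt, so a different key

print(len(k1), k1 == k2, k1 == k3)
```

Because the derivation is deterministic, a decrypting party needs only the original key, the salt, and the nonce; caching the derived key in the context simply avoids paying for the 100,000 iterations a second time.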
Comparing Salsa20 and AES is inherently unfair; it is the equivalent of comparing a 2016 NASCAR with Steve McQueen’s Ferrari. Salsa20 is a stream cipher, and AES is a block cipher. Stream ciphers are much more lightweight and handle data of arbitrary size well, hence the ‘stream’ in the name.
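One concrete consequence of the stream/block distinction is output length. The sketch below uses a toy keystream built from SHA-256 in counter style (an illustration only, not Salsa20 and not secure) next to PKCS#7 padding, which padded block modes such as CBC require. All function names here are hypothetical.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key || counter. Illustration only."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "little")).digest()
        counter += 1
    return bytes(out[:length])


def stream_encrypt(data: bytes, key: bytes) -> bytes:
    """XOR the data with exactly as many keystream bytes as needed."""
    return bytes(d ^ k for d, k in zip(data, keystream(key, len(data))))


def pkcs7_pad(data: bytes, block_size: int = 16) -> bytes:
    """Pad to a whole number of blocks, as padded block modes require."""
    pad_len = block_size - (len(data) % block_size)
    return data + bytes([pad_len]) * pad_len


key = b"k" * 32
for n in (1, 15, 16, 17, 100):
    msg = b"x" * n
    # Columns: input length, stream output length, padded block input length
    print(n, len(stream_encrypt(msg, key)), len(pkcs7_pad(msg)))
```

The stream column always equals the input length, while the padded column rounds up to the next multiple of 16, so a 16-byte input grows to 32. AES-GCM runs the block cipher in counter mode and therefore does not pad either, which narrows the gap in this particular comparison.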
Let’s play along, despite the lack of realism, and go through the motions of selecting an algorithm based on research done with this composer. 256-bit AES-GCM clearly wins on scalability: it can encrypt gigabytes at a time while using about as much memory as, or even less than, both the Salsa20 cipher and the XOR prototype (a binary additive stream cipher) that we built, which wouldn’t even have stood the test of WWII.
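For readers who want to sanity-check memory figures like these outside the framework, the standard library's `tracemalloc` gives a comparable measurement. This is a sketch under stated assumptions: `peak_memory` and `xor_encrypt` are names invented here, and nothing in it describes how `with_memory_profiling` is actually implemented.

```python
import tracemalloc


def xor_encrypt(data: bytes, key: bytes) -> bytes:
    """Repeating-key XOR, standing in for any encrypt function under test."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def peak_memory(func, *args) -> int:
    """Return the peak number of bytes allocated while func(*args) runs."""
    tracemalloc.start()
    try:
        func(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


key = bytes(range(32))
for size in (1_000, 10_000, 100_000):
    payload = bytes(size)  # allocated before tracing starts, so not counted
    print(size, peak_memory(xor_encrypt, payload, key))
```

The reported peak grows with the payload because the whole ciphertext is built in memory; an implementation that processes data in fixed-size chunks would show a roughly flat peak instead.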