Full Disk Encryption and Integrity Protection
In network security, each message or packet is normally both encrypted and integrity protected. Although we usually just say "the packet is encrypted", in practice network security protocols (IPsec ESP and others) both encrypt the packet and append a keyed cryptographic hash, often known as the ICV (Integrity Check Value). The ICV protects the packet from malicious modification: the receiver can tell if the packet has been changed en route, and can reject it.
Whether the ICV should be computed over the plaintext or over the encrypted packet is a long-standing discussion. Quite possibly there is one true way, but for our purposes we will skip this problem and concentrate on whether or not integrity is provided at all. In other words, when the data is decrypted, is it possible to detect that it has been tampered with by an attacker?
When securing network packets, it is normally acceptable to add a few bytes. Packet fragmentation may be an issue, but most packets are smaller than the network's MTU (maximum transmission unit), so appending a hash is no big deal. A further optimization, taking IPsec ESP as an example again, is to truncate the hash: ESP commonly truncates the 160-bit HMAC-SHA-1 output to 96 bits to save space.
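As a rough illustration of the truncated ICV described above, here is a minimal Python sketch using only the standard library. The key, the payload and the function names are made up for the example; a real ESP implementation computes the ICV over specific packet fields and derives its keys through key exchange.

```python
import hashlib
import hmac

ICV_LEN = 12  # 96 bits: the truncated length of HMAC-SHA-1-96


def compute_icv(key: bytes, packet: bytes) -> bytes:
    """Keyed hash over the packet, truncated from 160 to 96 bits."""
    return hmac.new(key, packet, hashlib.sha1).digest()[:ICV_LEN]


def verify_icv(key: bytes, packet: bytes, icv: bytes) -> bool:
    """Recompute the ICV and compare it in constant time."""
    return hmac.compare_digest(compute_icv(key, packet), icv)


# Hypothetical key and (already encrypted) payload, for illustration only.
key = b"\x0b" * 20
packet = b"example encrypted payload"
icv = compute_icv(key, packet)

# A single changed byte en route makes verification fail.
tampered = b"Example encrypted payload"
print(len(icv), verify_icv(key, packet, icv), verify_icv(key, tampered, icv))
```

The receiver only needs the shared key and the 12 extra bytes to decide whether to accept or reject the packet.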
Disk encryption (important for storage in general, and a critical component of cloud security) is very different. In most cases we have a fixed disk, which we want to encrypt block for block. We simply cannot add any data. And even if we could, there are additional problems:
- You may think that there's always some more space on your disk. But large-scale storage costs real money. Physical disks cost money, and cloud storage also has a non-negligible cost. Adding a 32-byte SHA-256 hash to each 512-byte disk block means adding 6.25% to your storage costs.
- Since full-disk encryption is block-for-block, the integrity hashes would typically be stored separately. This means two physical disk reads for every block read by the application, both in the critical path. Although the hashes for successive blocks would normally be stored in the same block, even sequential reads suffer because the disk needs to constantly seek between the data and the hash blocks. And it's certainly a killer for random disk access.
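The arithmetic behind these two points is easy to check. The sketch below is only a back-of-the-envelope calculation; the function names are illustrative, and the 4096-byte block size is an extra data point that is not taken from the discussion above.

```python
def hash_overhead(block_size: int, digest_size: int) -> float:
    """Extra storage needed for one digest per data block, as a fraction."""
    return digest_size / block_size


def hashes_per_block(block_size: int, digest_size: int) -> int:
    """How many digests fit into one separately stored hash block."""
    return block_size // digest_size


SHA256_LEN = 32  # bytes

# 512 is the block size used in the text; 4096 is an assumed comparison point.
for block_size in (512, 4096):
    pct = 100 * hash_overhead(block_size, SHA256_LEN)
    n = hashes_per_block(block_size, SHA256_LEN)
    print(f"{block_size}-byte blocks: {pct:.2f}% extra space, "
          f"{n} hashes per hash block")
```

With 512-byte blocks, one hash block covers only 16 data blocks, which is why even a sequential read keeps returning to the hash area.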
But is there a real threat? Some serious people in the disk encryption community claim that there isn't one, because:
- The goal of disk encryption is to protect against someone stealing your disk, and once it's gone, it's gone.
- If someone can tamper with your disk at the underlying block level, they can also tamper with your disk encryption software and get access to the encryption keys and/or the plaintext data.
- Even if disk blocks are protected with cryptographic hashes, an attacker can always roll back a block to any previous state without being detected, because any previous state is valid as long as we're using the same integrity keys.
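The rollback argument in the last point can be sketched in a few lines of Python. This is a toy model, not a description of any real product: it assumes a per-block HMAC-SHA-256 tag bound to the block index, and the key, block contents and function names are invented for the example.

```python
import hashlib
import hmac


def block_tag(key: bytes, block_index: int, data: bytes) -> bytes:
    """Integrity tag over the block contents, bound to the block's position."""
    msg = block_index.to_bytes(8, "big") + data
    return hmac.new(key, msg, hashlib.sha256).digest()


def verify_block(key: bytes, block_index: int, data: bytes, tag: bytes) -> bool:
    return hmac.compare_digest(block_tag(key, block_index, data), tag)


key = b"k" * 32   # hypothetical integrity key
index = 7         # hypothetical block number

# The attacker snapshots the block and its tag at some point in time...
old_data = b"balance=1000".ljust(512, b"\0")
old_tag = block_tag(key, index, old_data)

# ...the legitimate owner later overwrites the block...
new_data = b"balance=0".ljust(512, b"\0")
new_tag = block_tag(key, index, new_data)

# ...and the attacker writes the old (data, tag) pair back.
replay_ok = verify_block(key, index, old_data, old_tag)      # still verifies
moved_ok = verify_block(key, index + 1, old_data, old_tag)   # wrong position
mixed_ok = verify_block(key, index, new_data, old_tag)       # mismatched pair
print(replay_ok, moved_ok, mixed_ok)
```

Binding the tag to the block index stops an attacker from moving blocks around, but nothing in a per-block tag says which version of the block is the current one.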
While the third point is definitely valid, the first two do not hold in two important cases: one is Storage Area Networks (SAN), and the other is cloud storage such as Amazon's EBS. These cases are nothing exotic; they are two common ways of architecting large data farms!
Porticor's Cloud Security solution provides very strong encryption for your data. For the reasons I outlined above, block-level integrity is not one of our features. So what is one to do?
- As usual, the first option is to do nothing. Since we're using the AES block cipher, any tampering with the encrypted disk will end up turning the affected block into random garbage when it is decrypted, which will likely be detected by either the file system or the application. In the former case, Linux will reward you with an I/O error. In the latter case, the application will likely complain.
- You can choose to use a checksumming file system. Unfortunately, neither ZFS nor btrfs is production-worthy yet on Linux (or at least on Debian/Ubuntu) at the time of writing. Although cryptographers would advise you to use a cryptographic hash at this level (above the encrypted disk itself), this is probably gross overkill, and a simple CRC or the higher-performance Fletcher or FNV will do.
- You can join two encrypted disks into a single mirrored RAID (Linux MD) device. The two disks are encrypted using different keys, so any attack will be detected because the attacker cannot force the blocks to decrypt into the same plaintext on both disks. Note that when reading a block from RAID, only one of the copies is read, so the mismatch will only be detected upon a periodic RAID consistency check. This is probably the simplest solution other than doing nothing.
- For the adventurous, this link has the code for a checksumming Linux RAID (thanks fnj). This solution can be layered on top of the encrypted file system, similarly to the file system-based solutions. But bear in mind that this is an academic project, so you're on your own.
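To make the non-cryptographic checksum option more concrete, here is a minimal Python sketch of a per-block check performed above the encrypted disk. It uses CRC-32 from the standard library and a hand-written 64-bit FNV-1a; the block contents and helper names are invented for the example, and the corrupted bytes merely stand in for the garbage that a tampered ciphertext block would decrypt to.

```python
import zlib

BLOCK_SIZE = 512

FNV64_OFFSET = 0xCBF29CE484222325  # standard FNV-1a 64-bit offset basis
FNV64_PRIME = 0x100000001B3        # standard FNV 64-bit prime


def fnv1a_64(data: bytes) -> int:
    """64-bit FNV-1a: XOR in each byte, then multiply by the FNV prime."""
    h = FNV64_OFFSET
    for byte in data:
        h ^= byte
        h = (h * FNV64_PRIME) & 0xFFFFFFFFFFFFFFFF
    return h


def checksum(block: bytes) -> int:
    """CRC-32 of a decrypted block; fnv1a_64 could be swapped in here."""
    return zlib.crc32(block)


def is_intact(block: bytes, stored: int) -> bool:
    return checksum(block) == stored


# Checksum recorded when the block was written (hypothetical contents).
good = b"application data".ljust(BLOCK_SIZE, b"\0")
stored = checksum(good)

# After tampering, the decrypted block no longer matches what was written.
corrupted = bytes(b ^ 0x5A for b in good[:4]) + good[4:]
print(is_intact(good, stored), is_intact(corrupted, stored))
```

Because the attacker only sees ciphertext and cannot steer what the block decrypts to, a plain checksum computed over the plaintext is usually enough to notice the damage, which is the reasoning behind calling a cryptographic hash overkill at this layer.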
To summarize, Porticor's cloud encryption solution provides you with world-class encryption and key management for your critical data. Data integrity in the face of malicious attacks may or may not be a problem for your particular project. If you decide that it is, there are several ways to deal with it. Feel free to contact us for additional information and help.
One last important comment: Porticor's S3 encryption (with keys in the cloud) does include integrity protection. Depending on the application, this may be another alternative to look at.