ZFS send & receive with encryption - how to retrieve data?
Good morning everyone,
I'm looking for some help to retrieve data in a scenario where I might have made a big mistake. I'm hoping to understand what I did wrong.
My system consists of two Solaris 11 Express servers (the old free evaluation version). The first is for data and the second is for backups.
On the first, I created zfs filesystems with encryption turned on (tank/Documents). To make things easy, I used "keysource=passphrase,file:///zfs_key", then I copied the file to the second (backup) server in the same path.
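For reference, the setup described above can be sketched as follows. This is a hypothetical reconstruction: the pool name tank, the dataset Documents, and the /zfs_key path come from the post, while the disk name and passphrase are made up.

```shell
# Hypothetical reconstruction of the primary server setup (Solaris 11 Express).
# c0t1d0 is an example disk name; adjust to the actual hardware.
echo "my-passphrase" > /zfs_key        # passphrase file (not very secure, as noted)
zpool create tank c0t1d0
zfs create -o encryption=on \
           -o keysource=passphrase,file:///zfs_key \
           tank/Documents
```

The same /zfs_key file is then copied to the backup server at the same path.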
In order to do my backups, I used zfs send with mbuffer to send the whole zpool (all the ZFS filesystems). Normally, this works fine for both encrypted and unencrypted filesystems. Except the last time I did this, I did not mount the encrypted filesystem, and I ran send & receive without getting any errors... That is, until I rebooted the backup server and tried to access the data (mount the filesystem).
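The replication step can be sketched like this. The host name, port, buffer sizes, backup pool name, and snapshot name are all assumptions; only the send/mbuffer/receive pattern comes from the post.

```shell
# Sketch of the backup pipeline (names and sizes are assumptions).
# On the backup server: listen on a port and receive into the "backup" pool.
mbuffer -I 9090 -s 128k -m 1G | zfs receive -dF backup

# On the primary server: snapshot recursively, then send a replication stream.
# Note: the stream carries the data unencrypted, so the network must be trusted.
zfs snapshot -r tank@backup-2012
zfs send -R tank@backup-2012 | mbuffer -O backuphost:9090 -s 128k -m 1G
```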
For some reason I do not understand, I always get an "invalid key" error. The weird thing is that the "keysource" on the backup system is still the same as on the source, and the "zfs_key" file is the same. I thought that when you send & receive an encrypted filesystem, the key was automatically regenerated on the receiving system using the "keysource" mentioned here, but there seems to be something fundamentally different when the filesystem is not mounted. (For example, a scrub of an encrypted ZFS filesystem gives errors when it is not mounted.)
I would like to know where the valid key is in such a scenario, and/or what happened.
Thank you for giving me your opinion on the subject.
Can you confirm that when everything is mounted on the primary location, the backup server receives the data and can then be rebooted and import the zpool in question? That is, a clean send/recv after everything is destroyed on the backup location.
Just to mention, unrelated: I think that when you use send/receive with mbuffer, you are sending the filesystems unencrypted over the network. Since you are using encryption on the endpoints, I guess security is important to you, so I hope you trust that network.
Thanks for your reply. Starting with the side comment: yes, I was aware that zfs send/receive of an encrypted filesystem is, in fact, unencrypted, and that using mbuffer for this needs to happen on a safe network. But it is a good reminder.
To answer the first question: when I had the original primary and backup servers, everything worked fine initially. After I messed up the backup server and re-created the whole zpool, all the unencrypted data was mounted and worked perfectly. The problem was with the encrypted ZFS filesystems: they would not mount anymore.
The reason is well explained in this mailing list thread, mainly in the last answer from Darren J Moffat: https://thr3ads.net/zfs-discuss/2012/02/1839530-Cannot-mount-encrypted-filesystems
Here is what he explains:
That should have failed because the keysource property is inherited from slice_2/base. So you have found a bug and I can reproduce it.

The reason that should have failed is that the source of where the keysource comes from is used to determine which dataset to look at for the hidden salt property. We know what that salt property should actually be in your case because it is set on slice_2/base.

Unfortunately 'zfs set salt' won't work because salt is read-only from userland (so it doesn't accidentally get overridden and cause the very same symptoms you have!).

In theory you would assume that you could go back to having the keysource inherited by running:

'zfs inherit keysource slice_2/base/bitsavers'

However that won't work because of a protection we have in place to again avoid yet another route into these same symptoms. It will fail with an error message something like this:

cannot inherit keysource for 'slice_2/base/bitsavers': use 'zfs key -c'

Using a hacked up libzfs that removes the check that 'zfs [...], so I can get out of the situation and make the datasets accessible again. So this is fixable, so don't abandon hope yet.

Darren J Moffat
In my case it is similar. On my primary server I had a comparable tree of zpool and encrypted ZFS filesystems, and the keysource was a plain passphrase file (not very secure, I know).
Here is the fun part: instead of creating the same tree of zpool and ZFS filesystems on the backup system, I only created the zpool. I expected everything would be created automatically when I ran "zfs recv" of the whole zpool on the backup server.
Also, to make this more fun, I had pre-created the folder and file:
/key/passfile <- a copy from the initial primary server
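The failure scenario can be sketched as commands. The disk name is hypothetical; the pool-only creation, the /key/passfile copy, and the receive step come from the post.

```shell
# Sketch of the re-created backup server: only the pool exists, no datasets.
zpool create backup c0t2d0              # example disk name
mkdir /key
# /key/passfile was copied over from the primary server beforehand.
# The encrypted datasets then arrive only via the replication stream
# (receive creates them, but the hidden salt property is not set correctly):
mbuffer -I 9090 -s 128k -m 1G | zfs receive -dF backup
```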
The fun actually lies in the bug this triggers. Normally, if you create the filesystem manually, you get prompted for a passphrase; you type it, and the system derives a key from your passphrase and a salt. The important bug here is that, done the way I did it, the salt ends up as a bunch of zeros instead of a random value. There are protections in the system to not accept salts that are all zeros. Hence, everything is encrypted with a known passphrase and salt, but unmountable because of this bug.
** This was my understanding of the bug, based on the mailing list **
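The effect of a wrong salt can be illustrated outside of ZFS. The sketch below uses OpenSSL's PBKDF2 as a stand-in for the key-derivation step (the algorithm parameters, passphrase, and salts are assumptions, not the actual Solaris implementation): the same passphrase with a different salt yields a different key, which is exactly an "invalid key" situation.

```shell
# Derive a key from the same passphrase with two different salts,
# using PBKDF2 via "openssl enc -P" (prints the derived key without encrypting).
PASS="my-passphrase"

# Random salt, as recorded when the dataset is created interactively:
openssl enc -aes-256-cbc -pbkdf2 -iter 1000 -P -pass pass:"$PASS" -S 1a2b3c4d5e6f7a8b

# All-zero salt, as left behind by the buggy receive path:
openssl enc -aes-256-cbc -pbkdf2 -iter 1000 -P -pass pass:"$PASS" -S 0000000000000000

# The printed "key=" lines differ, so data wrapped under the first key
# cannot be unwrapped with the second.
```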
So things worked for years, but I had forgotten that my first backup server had been set up manually and correctly. After I re-did it a second time, I did not take the time to double-check (I know, my bad). The worst part is that the primary server got destroyed and I erased all the hard drives. So I now only have the backup server... which means I have lost my data.
To make matters worse, I had also sent the files to another external hard drive, but my "photos" files got corrupted and I cannot restore that data. After a long while I get an error, and I cannot restore my files, even partially...
I really hope I can find a solution or get some help with this issue.