Restore a VM which was backed up with verticalbackup

Jumo · 12 August 2022 13:57

Hello,

I back up virtual machines (on-prem) with “verticalbackup” (cmd: vertical) and store all the backups via an AWS Storage Gateway in an S3 bucket. So far so good.
Because an ESXi host cannot run commands that need a lot of memory, I have to run the prune command from another Linux machine, in this case with duplicacy. That works.

But imagine a DR case: your server room melts down and you no longer have an ESXi server. How can you restore a virtual machine from a backup?

The video I saw and all the comments about restore don’t show me how it could work in my case.
So

cd path/to/repository1
duplicacy restore -r 1

Why can’t I give the name of the backup on this command line?

Imagine:

your ID is vmware5
the storage is mounted at /esxi-storage
the ESXi host backs up 6 different virtual machines (vm1,…vm6) every 12 hours
after a while, and after pruning, you will see a list like:

Snapshot vm1@vmware5 revision 1 created at 2022-03-20 19:30
Snapshot vm1@vmware5 revision 11 created at 2022-03-28 12:21
Snapshot vm1@vmware5 revision 21 created at 2022-04-04 12:27
…
Snapshot vm1@vmware5 revision 150 created at 2022-08-10 13:09
Snapshot vm1@vmware5 revision 151 created at 2022-08-11 13:10
Snapshot vm2@vmware5 revision 1 created at 2022-03-20 19:46
…
Snapshot vm2@vmware5 revision 149 created at 2022-08-09 22:26
Snapshot vm2@vmware5 revision 150 created at 2022-08-10 13:24
…
Snapshot vm4@vmware5 revision 149 created at 2022-08-10 13:00
Snapshot vm4@vmware5 revision 150 created at 2022-08-11 13:00
Snapshot vm5@vmware5 revision 1 created at 2022-03-21 14:24
Snapshot vm5@vmware5 revision 10 created at 2022-03-28 12:47
…
Snapshot vm5@vmware5 revision 147 created at 2022-08-05 12:38
Snapshot vm5@vmware5 revision 148 created at 2022-08-09 22:39
Snapshot vm5@vmware5 revision 149 created at 2022-08-10 13:38
Snapshot vm6@vmware2 revision 1 created at 2022-08-04 21:12
Snapshot vm6@vmware2 revision 2 created at 2022-08-05 01:12
Snapshot vm6@vmware2 revision 3 created at 2022-08-05 16:10
Snapshot vm6@vmware2 revision 4 created at 2022-08-09 16:11
Snapshot vm6@vmware2 revision 5 created at 2022-08-10 01:12
Snapshot vm6@vmware2 revision 6 created at 2022-08-11 01:13
Snapshot vm6@vmware5 revision 1 created at 2022-03-20 18:47

How can I tell duplicacy to restore "vm5@vmware5 revision 149"?
In your documentation I see only “duplicacy restore -r …” without a name.
Running the init command with vmware5 does not work; the result is that vmware5 cannot be found, which makes sense because this is not the name of the VM but an ID.

Could it be that duplicacy is not able to restore virtual machines which are backed up with verticalbackup?
I will also ask verticalbackup.

Thx and best,
J.

Droolio · 12 August 2022 22:55

While you can probably restore the raw files with Duplicacy, I’m not sure why you’d want to, since the backed up files are basically the .vmdk disk images and the other configuration files, such as .vmx. You wouldn’t be able to do much with them without further conversion.

Why not reinstall ESXi, and reinstall verticalbackup to do the actual restore?

You could even install ESXi into a VMware Workstation VM, and do the restore inside that. Perhaps later use VMware Converter on the running ESXi VM to convert it to another environment.

Anyway, if you want to do this with Duplicacy, you first have to init the storage and assign it the ID of one of the original machines - i.e. vm5@vmware5. Once initialised, you’d only need to reference the revision number. This is just one VM, though, so you’d need to reinitialise another directory to restore another VM.

sevimo · 12 August 2022 23:41

I am pretty sure you can directly mount raw images and .vmdk files in both Linux and Windows (may need some free tools like OSFMount).

gadget · 13 August 2022 02:20

Yes, on Linux there’s libvmdk for mounting VMDK files.

VirtualBox, and at least a few other hypervisors, can also work with them. In VirtualBox, .vmdk files can be added to a VM without conversion.

gadget · 13 August 2022 05:04

As @Droolio already alluded to, in a typical disaster recovery scenario it’s unusual not to provision a replacement VM host where a full restore would take place, unless a functioning VM guest is no longer required and only its file contents are needed. Also, depending on how the datastore was configured in ESXi, the VM disks aren’t necessarily in VMDK format.

However, I’ve had situations such as when a VM is a database server and the VM is no longer required because the data is going to be imported into a VM with a newer or different database engine. So…

I downloaded a copy of Vertical Backup to confirm that it’s an ELF binary and would run on a typical 64-bit Linux system (since ESXi is a Linux system turned into a hypervisor). It does, so in theory it should be possible to extract a VM image from a storage without an ESXi host. But the caveat is that Vertical Backup expects to talk to an ESXi host, so there’s no place to restore a VM to on a regular Linux system.

I’m assuming that you meant your S3 bucket is available via /esxi-storage. If not, then whatever is in /esxi-storage isn’t useful for the restoration process using Duplicacy.

Yes and no.

No, I don’t believe so based on a line in the Vertical Backup user guide:

Although the prune operation is not known to be resource consuming, it is even better if the prune operation is performed on a non-production ESXi host, or even by Duplicacy on a non-ESXi computer.

So if Duplicacy can be used to execute a prune operation, it must share the same backup format. However, Duplicacy and Vertical Backup don’t share the same configuration file format.

Unlike Vertical Backup, Duplicacy supports multiple storages per repository, i.e., the contents of a directory can be configured to back up to AWS, B2, an external drive, etc.

Duplicacy doesn’t require a snapshot ID be provided during a restore operation because there’s only one per repository and it’s found in the preferences file. If no storage name is provided, Duplicacy looks for a storage named “default”, so the command duplicacy restore -r 1 is equivalent to duplicacy restore -r 1 -storage default.

In contrast, Vertical Backup derives a snapshot ID from a combination of a virtual machine name in ESXi and the host ID that was assigned when the storage was initialized.

You’ll have to make adjustments appropriate for your environment (there are no guarantees this will work because I no longer have an ESXi host so I don’t have a need to use Vertical Backup):

1. Make an empty directory somewhere for depositing your restored file(s).
2. If your destination directory is MyVM, change directories into it.
3. Issue the following command: duplicacy init vmware5 s3://amazon.com/bucket/path/to/storage
4. If successful, Duplicacy will have created a subdirectory named .duplicacy and a preferences file .duplicacy/preferences in JSON format. Edit the preferences file, replacing “vmware5” with “vm5@vmware5”. This step is required because for portability reasons Duplicacy no longer allows snapshot ID names to be created with anything but alphanumeric (0-9,a-z,A-Z), hyphen (-) and underscore (_) characters, but it will still accept them from a preferences file.
5. If you’ve configured your storage URL, access key, and secret key properly, you should be able to restore snapshot revision 149: duplicacy restore -r 149
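The preferences edit described above can also be scripted. This is only a sketch: it assumes the preferences file is a JSON list of objects that each carry a "name" and an "id" field, and the function names and the ALLOWED_ID pattern are made up for illustration (the pattern just mirrors the character set mentioned above, not necessarily the exact rule inside Duplicacy).

```python
import json
import re
from pathlib import Path

# Character set as described in the steps above; the real check inside
# Duplicacy may differ (assumption).
ALLOWED_ID = re.compile(r"[0-9A-Za-z_-]+")


def with_snapshot_id(preferences_text, storage_name, new_id):
    """Return preferences JSON text with the id of one storage entry replaced.

    Assumes the file content is a JSON list of objects with "name" and "id".
    """
    entries = json.loads(preferences_text)
    for entry in entries:
        if entry.get("name") == storage_name:
            entry["id"] = new_id
            return json.dumps(entries, indent=4) + "\n"
    raise KeyError(f"no storage named {storage_name!r} in preferences")


def patch_preferences(path, storage_name, new_id):
    """Apply with_snapshot_id() to a preferences file on disk."""
    prefs = Path(path)
    updated = with_snapshot_id(prefs.read_text(encoding="utf-8"), storage_name, new_id)
    prefs.write_text(updated, encoding="utf-8")
```

From inside the restore directory, patch_preferences(".duplicacy/preferences", "default", "vm5@vmware5") would then stand in for the manual edit. ALLOWED_ID.fullmatch("vm5@vmware5") returns None, which illustrates why the ID has to be patched in after init rather than passed to it.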

I have no idea what’s going to be restored because on an ESXi host Vertical Backup passes the data to ESXi’s VM creation tools while with Duplicacy you’re just getting a bunch of bits that could be VMDK files, VMFS blocks, etc.

Droolio · 13 August 2022 12:38

Yea, you could, though in a DR scenario you’d probably want it in a runnable state. If you just need access to raw files, sure. Otherwise, nesting ESXi in another VM environment is easy.

Also bear in mind, Duplicacy might not restore .vmdk files sparsely if the disk files were in thin format.
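One way to see whether a restored disk image ended up sparse is to compare its apparent size with the space actually allocated on disk. A rough sketch for POSIX systems follows; the function names are invented here, and on filesystems with transparent compression the comparison can report a dense file as sparse.

```python
import os


def looks_sparse(apparent_size, allocated_bytes):
    """True when fewer bytes are allocated on disk than the file claims to hold."""
    return allocated_bytes < apparent_size


def is_sparse_file(path):
    """Check a file on a POSIX system.

    st_blocks counts 512-byte units and is not available on Windows.
    """
    info = os.stat(path)
    return looks_sparse(info.st_size, info.st_blocks * 512)
```

If is_sparse_file() returns False for a disk that was thin-provisioned on the ESXi side, the restore has written it out at full size, so plan the free space on the restore target accordingly.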

I’d honestly recommend restoring to ESXi, and not use Duplicacy. Though, it’s perfectly fine to use Duplicacy for prune, check and copy.

Jumo · 13 August 2022 16:58

Hi there,

Thank you for all the answers.
To clarify this special case:

You have only ONE server room. If it melts down you will not have an ESXi server anymore, so no, you cannot use another one.
Yes, you can order a new one, with enough storage, etc. But when will you have anything in place?
quote
order
delivery
By then a week or more may have passed. In times of cloud, and since the backups are already in the cloud, it would be much faster to restore the VMs to the cloud, wouldn’t it? VMware Cloud on AWS could help.

VMware Workstation: where? Did you mean locally on a laptop? Hm, that could work if you only need to restore one VM, but in a DR case you need all VMs restored… well, it’s a question of the storage and memory on the machine running VMware Workstation.

The format of the backup:

You have lots of chunks with different, long keys in their names.
The bucket (one storage) holds different backups, so not just one VM but six. You cannot see this in the chunk file names.

So what if you have three different VM backups and all three have a revision 149? What should duplicacy restore? Yes, if the ID works, all is fine.

I will test changing the ID in the preferences file. My fingers are crossed…

The whole thing is more theoretical (but I must test it!), because sooner or later all VMs (out of a total of five ESXi hosts and SAN) will be migrated to AWS. (Without VMware Cloud on AWS, because it is not necessary and too expensive. But in the meantime it could be a valid solution.)

Thx and best
J

gadget · 13 August 2022 19:23

All valid concerns…

One big unknown is what is the structure of the data after it’s been restored by Duplicacy, which is only responsible for restoring whatever is contained in a backup – not to know what to do with it (unlike Vertical Backup + ESXi). Whether it’s VMware Workstation, VirtualBox, Azure, AWS, or some other hypervisor that’s not ESXi, there’s a good chance it’ll require additional time and resources for the migration.

For example, if a VM was configured to pass thru the host CPU instead of paravirtualizing it, migrating to a different host/platform with a different CPU – even if it’s the same architecture, worse if it’s not – will create additional delays and might not always even be doable (e.g. moving a Windows 11 VM to a host without TPM support).

One of the big selling points of ESXi is low starting cost (free with some features disabled) and fairly flexible hardware requirements (a compatible HP ProLiant MicroServer can be purchased for a few hundred dollars or less), so it could be faster than migrating to the cloud unless there’s already an active cloud subscription on standby ready to receive transfers.

Given the limits for the free edition of ESXi (e.g., maximum of 8 virtual CPUs + 128GB RAM per guest VM in a fault tolerant configuration), finding another suitable host to install ESXi on shouldn’t be too difficult. And from personal experience, figuring out how much a cloud-hosted VM is going to cost per month can sometimes take longer than provisioning a replacement for the one that melted into a pile of goo.

Yes, that’s because any given chunk could be shared by multiple backups as a result of the deduplication process. The hashing process that generates the long key names makes it efficient and fast when looking for a particular chunk (potentially as fast as an index without relying on an index file which would be a single point of failure).
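To make the deduplication point concrete, here is a toy sketch of content-addressed storage. It is not Duplicacy’s actual chunking or hashing scheme: SHA-256 and the dict standing in for the bucket are assumptions chosen for illustration, as is the store_chunks name.

```python
import hashlib


def store_chunks(chunks, storage):
    """Store each chunk under a name derived from its content.

    `storage` is a plain dict standing in for the bucket. Returns the list
    of chunk names that make up this backup, in order.
    """
    names = []
    for chunk in chunks:
        name = hashlib.sha256(chunk).hexdigest()
        storage.setdefault(name, chunk)  # identical content is stored only once
        names.append(name)
    return names


bucket = {}
vm1_chunks = store_chunks([b"boot sector", b"shared OS files", b"vm1 data"], bucket)
vm2_chunks = store_chunks([b"boot sector", b"shared OS files", b"vm2 data"], bucket)
shared = set(vm1_chunks) & set(vm2_chunks)
print(len(bucket), len(shared))  # prints "4 2"
```

Six chunks were written but only four are stored, and nothing in a chunk name says which VM it belongs to. That mapping lives in the snapshot, which lists the chunk names it needs.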

As long as each of your ESXi hosts backing up to the same S3 bucket has a unique host ID, it’s not a problem because Vertical Backup generates unique snapshot IDs from a concatenation of virtual machine name + host ID (e.g., virtual machine name = “vm5”, host ID = “vmware5”, then the snapshot ID is “vm5@vmware5”).

Duplicacy looks up the snapshot ID via the specified storage name from the repository preferences file:

        "name": "default",
        "id": "vm5@vmware5",

Because Vertical Backup increments the revision number for every snapshot taken, there is always a unique combination of virtual machine name + host ID + revision number.
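A small sketch of how the listing from the first post breaks down into snapshot ID + revision. The regular expression is written against the lines as they were pasted in this thread, so treat the exact format as an assumption; LINE and parse_listing are names made up for this example.

```python
import re
from collections import defaultdict

# Matches lines shaped like the ones pasted earlier in the thread.
LINE = re.compile(
    r"Snapshot (?P<vm>[^@\s]+)@(?P<host>\S+) revision (?P<rev>\d+) "
    r"created at (?P<created>\d{4}-\d{2}-\d{2} \d{2}:\d{2})"
)


def parse_listing(text):
    """Return {snapshot_id: sorted list of revisions} from listing output."""
    revisions = defaultdict(set)
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match:  # blank lines and elided "…" lines are skipped
            snapshot_id = f"{match['vm']}@{match['host']}"
            revisions[snapshot_id].add(int(match["rev"]))
    return {sid: sorted(revs) for sid, revs in revisions.items()}


listing = """\
Snapshot vm5@vmware5 revision 148 created at 2022-08-09 22:39
Snapshot vm5@vmware5 revision 149 created at 2022-08-10 13:38
Snapshot vm6@vmware2 revision 1 created at 2022-08-04 21:12
Snapshot vm6@vmware5 revision 1 created at 2022-03-20 18:47
"""
snapshots = parse_listing(listing)
print(snapshots["vm5@vmware5"][-1])  # prints 149
```

Revision 1 appears under both vm6@vmware2 and vm6@vmware5 without conflict, because a revision number only has meaning within its snapshot ID.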

Therefore, the command duplicacy restore -r 149 -storage default is synonymous with “Restore revision 149 of snapshot ID vm5@vmware5 from the storage named default”. (Because Duplicacy assumes the storage name is default if not specified, the command can be shortened to just duplicacy restore -r 149.)

So to restore revision 149 of vm2@vmware5, you simply have to update the id field, changing it from “vm5@vmware5” to “vm2@vmware5”, and reissue the command duplicacy restore -r 149. While the snapshot ID is set to “vm2@vmware5”, duplicacy restore -r 1 would restore revision 1. Rinse and repeat as needed.

Alternatively, you could initialize multiple repositories, one for each unique snapshot ID. The overall technique is the same.
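For the one-repository-per-snapshot-ID variant, a sketch that only builds the plan and runs nothing. The directory naming, the base_dir default and the restore_plan name are hypothetical; the init and restore commands are the same ones used earlier in the thread.

```python
def restore_plan(snapshot_revisions, host_id, storage_url, base_dir="restore"):
    """Build, without running anything, the steps for one repository per snapshot ID.

    snapshot_revisions maps a snapshot ID such as "vm5@vmware5" to the
    revision to restore. base_dir and the directory naming are hypothetical.
    """
    plan = []
    for snapshot_id, revision in snapshot_revisions.items():
        directory = f"{base_dir}/{snapshot_id.replace('@', '_')}"
        plan.append({
            "directory": directory,
            # init with an ID that Duplicacy accepts on the command line
            "init": ["duplicacy", "init", host_id, storage_url],
            # value to write into the "id" field of .duplicacy/preferences afterwards
            "preferences_id": snapshot_id,
            "restore": ["duplicacy", "restore", "-r", str(revision)],
        })
    return plan


plan = restore_plan(
    {"vm5@vmware5": 149, "vm2@vmware5": 149},
    host_id="vmware5",
    storage_url="s3://amazon.com/bucket/path/to/storage",
)
for step in plan:
    print(step["directory"], step["preferences_id"], " ".join(step["restore"]))
```

Each entry would be carried out inside its own directory: run init, edit the id in the preferences file, then run restore. Keeping the plan as data makes it easy to review before touching the storage.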