vMotion Fails At 14% – with at least one solution

Ever found yourself with an issue, spending hours trying to find a solution, while none of the Google (or Bing) search results fixed your problem? Well, I did just that today. While trying to update our VMware clusters, I noticed that some VMs were not willing to vMotion to another node and that the task stalled at 14%. With, of course, the all-clarifying error: “Operation timed out”, and from Tasks & Events: “Cannot migrate <VM> from host X, datastore X to host Y, datastore X”.

My solution:
It turns out that, in my case, there was a vmx-***.vswp file left over from a failed DRS migration. During a DRS migration the VMX is started on both nodes, so each VMX process creates its own process swap file, with either -1 or -2 in the name. After a successful DRS migration, one of the two is removed.

So when you browse the datastore your failing VM is located on and open the folder of the VM, you should see two of those vmx-***.vswp files. If you only see one file, sorry, then this solution is not the one you’re looking for.
NOTE: There could also be a .vswp file for the VM itself (one without the vmx- prefix). Leave that one alone!

The oldest one is most likely the one you need to delete. You can do this while the VM is running, and most of the time straight from the datastore browser. If you try to delete the wrong one, or if it’s not possible to delete the file from the datastore browser, you’ll get an error.

If you’re unsure which one to delete, then the only way is to power down the VM; the file that remains afterwards is the one you need to delete.

If you’re unable to delete the file from the datastore browser, then you need to start the SSH daemon on the host on which the VM is registered and log in through SSH. Navigate to the right datastore and folder and delete the file from there:
# cd /vmfs/volumes/<datastore_name>
# cd <VM folder>
# ls -lash *.vswp                        (just to verify the timestamps and locate the right file)
# rm vmx-<VM name>-<1 or 2>.vswp         (only the stale file; typing [1-2] literally is a shell glob that matches both)
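
The timestamp check from the listing above can be sketched as a small shell function. This is a minimal sketch, not an official VMware tool: the function name oldest_vmx_swap is made up for illustration, and it assumes a shell whose test builtin supports -ot (bash and BusyBox ash do). It only prints the oldest vmx-*.vswp file in a folder and never deletes anything, so you still decide yourself whether that file is really the stale one.

```shell
# Hypothetical helper: print the oldest vmx-*.vswp file in a VM folder.
# It only reports the candidate; it never deletes anything.
oldest_vmx_swap() {
    dir=$1
    oldest=
    count=0
    for f in "$dir"/vmx-*.vswp; do
        [ -e "$f" ] || continue          # the glob matched nothing
        count=$((count + 1))
        # Keep the file with the oldest modification time seen so far.
        if [ -z "$oldest" ] || [ "$f" -ot "$oldest" ]; then
            oldest=$f
        fi
    done
    # With fewer than two process swap files, this fix does not apply.
    if [ "$count" -lt 2 ]; then
        echo "found $count vmx-*.vswp file(s); nothing to clean up" >&2
        return 1
    fi
    printf '%s\n' "$oldest"
}
```

Called as oldest_vmx_swap /vmfs/volumes/<datastore_name>/<VM folder>, it prints one file name, or returns a non-zero status when fewer than two vmx-*.vswp files exist. The VM's own swap file does not match the vmx-* pattern, so it is never reported.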

That’s it, now you should be able to migrate the VM again while it’s running.


6 Responses to vMotion Fails At 14% – with at least one solution

  1. Steve says:

    Hey, I couldn’t reply to your VMWare post, but this helped me! Thanks so much for your solution!

  2. Ron says:

    Thanks for the great write-up. I wasn’t able to delete the second .vswp file from my SAN, so I renamed it to .log and it now migrates.

  3. Michael says:

    Bravo! The “vMotion fails at 14%” issue just occurred in my lab, and deleting the stale vswp file for each VM was the exact solution.

  4. vXav says:

    Nice article. I actually encountered that and fixed it with a simple storage vMotion.

    For those looking for it the error was:
    “Unable to generate userworld swap file in directory ‘/vmfs/volumes/545ba82f-7233f8c8-0ec5-a41f72d33041/*******’ for ‘********’.”

    Once it finishes the destination datastore only has one vmx-***.vswp.
    Then the vMotion works like a charm :).
