
Unlock the VMware VM vmdk file



Kill -9 PID


Sometimes a file or set of files on a VMFS datastore becomes locked, and any attempt to edit or delete them fails with a "device or resource busy" error even though the VM associated with the files is not running. (If the VM is running, you must stop it before you can manipulate its files.) Once you know the VM is stopped, you need to find the ESX server that holds the lock and then stop the process that is locking the file(s).
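For example, trying to delete a locked flat vmdk from the ESX service console typically fails with something like the following (the datastore and file names here are placeholders, not from the original output):

  rm /vmfs/volumes/myvolume/myvm/myvm-flat.vmdk
  rm: cannot remove `/vmfs/volumes/myvolume/myvm/myvm-flat.vmdk': Device or resource busy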




  1. Log on to the ESX host where the VM was last known to be running.
  2. Run vmkfstools -D /vmfs/volumes/path/to/file to dump information about the file into /var/log/vmkernel.
  3. Run less /var/log/vmkernel and scroll to the bottom; you will see output like the following:
     a. Nov 29 15:49:17 vm22 vmkernel: 2:00:15:18.435 cpu6:1038)FS3: 130: <START vmware-16.log>
     b. Nov 29 15:49:17 vm22 vmkernel: 2:00:15:18.435 cpu6:1038)Lock [type 10c00001 offset 30439424 v 21, hb offset 4154368
     c. Nov 29 15:49:17 vm22 vmkernel: gen 66493, mode 1, owner 46c60a7c-94813bcf-4273-0017a44c7727 mtime 8781867]
     d. Nov 29 15:49:17 vm22 vmkernel: 2:00:15:18.435 cpu6:1038)Addr <4, 588, 7>, gen 20, links 1, type reg, flags 0x0, uid 0, gid 0, mode 644
     e. Nov 29 15:49:17 vm22 vmkernel: 2:00:15:18.435 cpu6:1038)len 23973, nb 1 tbz 0, zla 2, bs 65536
     f. Nov 29 15:49:17 vm22 vmkernel: 2:00:15:18.435 cpu6:1038)FS3: 132: <END vmware-16.log>
  4. The owner of the lock is shown on line 3c; only the last segment of the owner UUID matters, in this case 0017a44c7727.
  5. esxcfg-info | grep -i 'system uuid' | awk -F '-' '{print $NF}' displays the last segment of the system UUID of an ESX server. Run this command on each ESX server in the cluster until you find the one whose output matches the owner value from step 4 (see the first sketch after this list).
  6. When you have found the ESX server that matches the lock owner, log on to that server and run ps -elf | grep vmname, where vmname is the problem VM. Example output:
     a. 4 S root 7570 1 0 65 -10 - 435 schedu Nov27 ? 00:00:02 /usr/lib/vmware/bin/vmkload_app /usr/lib/vmware/bin/vmware-vmx -ssched.group=host/user/pool2 -@ pipe=/tmp/vmhsdaemon-0/vmxf7fb85ef5d8b3522;vm=f7fb85ef5d8b3522 /vmfs/volumes/470e25b6-37016b37-a2b3-001b78bedd4c/iu-lsps-vstest/iu-lsps-vstest.vmx
  7. Since there is a process still running (PID 7570 in the example), you need to kill it by following steps 5-12 of the "stopping a VM" procedure above (see the second sketch after this list).
  8. Once the kill is complete, the files should be released.
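As a rough sketch of steps 2-5, the commands below (run from the ESX service console; the datastore path and file name are placeholders) dump the lock information and then let you compare the owner value against each host's system UUID:

  # Steps 2-4: dump lock info for the stuck file, then read the owner field from the vmkernel log
  vmkfstools -D /vmfs/volumes/myvolume/myvm/myvm-flat.vmdk
  tail -20 /var/log/vmkernel | grep -i owner
  # Only the last 12 hex digits of the owner value matter (0017a44c7727 in the example above).

  # Step 5: repeat on each ESX host in the cluster until the output matches that owner value
  esxcfg-info | grep -i 'system uuid' | awk -F '-' '{print $NF}'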
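And a minimal sketch of steps 6-8, run on the host that holds the lock (the VM name and PID are examples only; the full "stopping a VM" procedure referenced in step 7 is the safer route):

  # Step 6: find the vmkload_app / vmware-vmx process for the problem VM
  ps -elf | grep iu-lsps-vstest
  # Steps 7-8: kill the PID that ps reported; the lock should be released once the process exits
  kill -9 7570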

Comments

  1. Thank you for your article.
    Can you explain step 5 in more detail? How do I run the esxcfg-info command on each server? For example, on my ESXi host, esxcfg-info shows two servers: 9440c9ed81ca and 9110ba24c577. What command should I run on 9440c9ed81ca, and what information is the key?
    Once I know which server matches the UUID owner, how do I log on to that ESX server?
    Finally, where can I find steps 5-12 for stopping a VM?

