
Recreating a missing VMFS datastore partition in VMware vSphere 5.x and 6.x

 


 Symptoms
  • A datastore has become inaccessible.
  • A VMFS partition table is missing.
 Purpose
The partition table is required only during a rescan. This means that the datastore may become inaccessible on a host during a rescan if the VMFS partition was deleted after the last rescan. The partition table is physically located on the LUN, so all vSphere hosts that have access to this LUN can see the change has taken place. However, only the hosts that do a rescan will be affected.
 
This article provides information on:
  • Determining whether this is the same problem
  • Resolving the problem
 Cause
This issue occurs when the VMFS partition is deleted. Deleting the datastore from the vSphere Client deletes the partition, although the software prevents this while the datastore is in use. The partition can also be overwritten if a physical server has access to the LUN on the SAN and, for example, runs an operating system installation on it.
 Resolution
To resolve this issue:

Run the partedUtil getptbl command on the affected host:
 
partedUtil getptbl /vmfs/devices/disks/naa.6006016045502500c20a2b3ccecfe011


If the partition is intact, the output is similar to:

gpt
52216 255 63 838860800
1 2048 838850039 AA31E02A400F11DB9590000C2911D1B8 vmfs 0



If your output appears similar to the following, it indicates the partition is missing:

gpt
52216 255 63 838860800
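
The comparison above can also be scripted. The following is a minimal sketch, assuming the getptbl output always begins with the label line and the geometry line, as in the samples in this article. The name has_partition_entry is a hypothetical helper, not part of partedUtil.

```shell
# Hypothetical helper: returns 0 when the text printed by
# "partedUtil getptbl <device>" contains at least one partition entry.
has_partition_entry() {
    # $1: full getptbl output.
    # Line 1 is the label (gpt or msdos) and line 2 is the disk geometry.
    # Any non-empty line from line 3 onward is treated as a partition entry.
    printf '%s\n' "$1" | awk 'NR > 2 && NF > 0 { found = 1 } END { exit !found }'
}

# Sample outputs copied from this article.
intact='gpt
52216 255 63 838860800
1 2048 838850039 AA31E02A400F11DB9590000C2911D1B8 vmfs 0'

missing='gpt
52216 255 63 838860800'

if has_partition_entry "$missing"; then
    echo "partition entry present"
else
    echo "partition entry missing"
fi
```

On a host, the same function could be called as has_partition_entry "$(partedUtil getptbl /vmfs/devices/disks/naa.6006016045502500c20a2b3ccecfe011)".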



If the partition is missing, you must recreate it. To recreate the partition:
  1. Find the beginning and end blocks of the VMFS partition. To find the beginning of the partition, run this command (one line script) on the host:


    # offset="128 2048"; for dev in `esxcfg-scsidevs -l | grep "Console Device:" | awk {'print $3'}`; do disk=$dev; echo $disk; partedUtil getptbl $disk; { for i in `echo $offset`; do echo "Checking offset found at $i:"; hexdump -n4 -s $((0x100000+(512*$i))) $disk; hexdump -n4 -s $((0x1300000+(512*$i))) $disk; hexdump -C -n 128 -s $((0x130001d + (512*$i))) $disk; done; } | grep -B 1 -A 5 d00d; echo "---------------------"; done


    Note: The preceding script checks all of the storage devices and the list may be lengthy. This script is not applicable for local disks. 

    You see output similar to:

    /vmfs/devices/disks/naa.60060160455025009839a9ed4cfee011
    msdos
    78325 255 63 1258291200
    1 128 1258291124 251 0
    Checking offset found at 128:
    0110000 d00d c001 
    0110004
    1310000 f15e 2fab
    1310004
    0131001d 46 43 5f 53 68 61 72 65 64 00 45 76 65 72 5f 47 |old_VMFS3.......|
    0131002d 65 74 74 69 6e 67 5f 55 70 00 00 00 00 00 00 00 |................|
    ---------------------
    /vmfs/devices/disks/naa.6006016045502500c20a2b3ccecfe011
    gpt
    52216 255 63 838860800
    Checking offset found at 2048:
    0200000 d00d c001 
    0200004
    1400000 f15e 2fab 
    1400004

    0140001d 4a 55 50 48 41 4d 5f 53 52 4d 35 00 00 00 00 00 |new_VMFS5.......|
    0140002d 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
    ---------------------


    The preceding output has two example storage devices. The first example was created on an ESXi host prior to version 5 and it reports: 

    Checking offset found at 128.

    Where 128 is the beginning block.

    The second storage device was created on vSphere 5 or later and reports: 

    Checking offset found at 2048. 

    Note: In this example, you are using the second device, so the beginning of the partition is 2048.
     
  2. To get the end block for the partition, run this command:

    # partedUtil getUsableSectors /vmfs/devices/disks/naa.6006016045502500c20a2b3ccecfe011 

    You see this output:

    34 838860766

    Notes
     
    • If you do not see this output and you get an Unknown partition table on disk error, run this command to label the table as a GPT partition table:

      # partedUtil mklabel /vmfs/devices/disks/naa.6006016045502500c20a2b3ccecfe011 gpt

      Rerun the partedUtil getUsableSectors command. If you still do not get the expected output of two numbers, also run the command in the next bullet.
       
    • If you do not see the specified output and instead receive an error message stating partition table invalid, unable to satisfy all constraints on the partition, or a similar error, run this command:

      partedUtil setptbl /vmfs/devices/disks/naa.6006016045502500c20a2b3ccecfe011 gpt "1 2048 4123456 AA31E02A400F11DB9590000C2911D1B8 0"

      This creates a temporary partition, which allows the disk information to be read. Rerun the partedUtil getUsableSectors command. It should now return the expected output, from which you can determine the correct last usable block.

      The partition type identifies the purpose of a partition, and may be represented by either a decimal identifier (for example, 251) or a GUID (for example, AA31E02A400F11DB9590000C2911D1B8). Partitions created on ESXi 5.x and higher with the gpt disklabel must be specified using the GUID.
       
  3. Run this command to temporarily turn off Storage IO Control:

    # /etc/init.d/storageRM stop
     
  4. Run this command to set the correct values for the partition table:

    Note: Ensure that you use values appropriate to your environment in this command.

    # partedUtil setptbl /vmfs/devices/disks/naa.6006016045502500c20a2b3ccecfe011 gpt "1 2048 838860766 AA31E02A400F11DB9590000C2911D1B8 0"

    The number immediately before the GUID (838860766 in this example) is the last usable block, so the end of the partition cannot be any higher. It is unknown whether this was the number used when the datastore was created, so you can try it and adjust if necessary.
     
  5. Run this command to attempt to mount the VMFS datastore:

    vmkfstools -V


    Note: If the datastore mounts, the numbers are correct and you need not adjust the value.
     
  6. If the datastore does not mount, you may see a message in /var/log/vmkernel.log similar to:

    ... cpu0:44828)LVM: 2891: [naa.6006016045502500c20a2b3ccecfe011:1] Device expanded (actual size 838858719 blocks, stored size 838847992 blocks)


    In this case, add the offset value, minus one, to the stored size to get the actual end block.

    For example:

    838847992 + 2047 = 838850039

    Run the command with the new end value:

    # partedUtil setptbl /vmfs/devices/disks/naa.6006016045502500c20a2b3ccecfe011 gpt "1 2048 838850039 AA31E02A400F11DB9590000C2911D1B8 0"


    Now you have the correct partition. Run the VMFS rescan again:

    # vmkfstools -V

     
  7. Run this command to turn Storage IO Control back on:

    # /etc/init.d/storageRM start

After the datastore is successfully mounted on one host, you can expect that the same VMFS rescan command will mount the VMFS datastore when run on other hosts that have access to this LUN. 

Alternatively, you can run a full cluster rescan from the vCenter Server using the vSphere Client.
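
The end-block values used in steps 2, 4, and 6 come from simple arithmetic, which can be sketched in the ESXi shell as follows. This is an illustration only: the function names are hypothetical, and the inputs are the sample values from this article.

```shell
# Hypothetical helpers. The names are illustrative and are not part of partedUtil.

# Last usable sector: the second field of "partedUtil getUsableSectors" output.
last_usable_sector() {
    # $1: output text such as "34 838860766"
    printf '%s\n' "$1" | awk '{ print $2 }'
}

# End block derived from a "Device expanded" log entry:
# stored size + start block - 1.
end_block_from_stored_size() {
    # $1: start block (128 or 2048)
    # $2: stored size in blocks, as reported in vmkernel.log
    echo $(( $2 + $1 - 1 ))
}

last_usable_sector "34 838860766"          # first end value to try in step 4
end_block_from_stored_size 2048 838847992  # corrected end value for step 6
```

With the sample values, the first call prints 838860766 and the second prints 838850039, matching the end blocks used in the partedUtil setptbl commands above.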
 Related Information
In VMware vSphere 5.x and later, newly created VMFS datastores use GPT partition tables instead of MBR partition tables.
 
The benefit of using GPT partition tables is that more than one copy of the partition table is kept on the LUN. If a physical Windows host has access to the LUN on the SAN, by default it automatically assigns a drive letter to the LUN, which destroys an MBR partition table. This type of problem does not occur with GPT, because vSphere uses the backup partition table.



To avoid the lengthy delay of scanning every device, use this additional script to interrogate a single device:
 
disk="/vmfs/devices/disks/naa.....xxxxx"; offset="128 2048"; echo $disk; partedUtil getptbl $disk; { for i in `echo $offset`; do echo "Checking offset found at $i:"; hexdump -n4 -s $((0x100000+(512*$i))) $disk; hexdump -n4 -s $((0x1300000+(512*$i))) $disk; hexdump -C -n 128 -s $((0x130001d + (512*$i))) $disk; done; } | grep -B 1 -A 5 d00d; echo "---------------------"
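
The byte offsets that the script passes to hexdump can be worked out by hand, which helps when matching them against the addresses in the sample output. The following is a minimal sketch: probe_offsets is a hypothetical helper name, and the constants are copied from the script itself.

```shell
# Print the three byte offsets the script passes to hexdump for a candidate
# start sector, formatted as 7-digit hexadecimal.
probe_offsets() {
    # $1: candidate start sector of the VMFS partition (128 or 2048)
    printf '%07x %07x %07x\n' \
        $(( 0x100000  + 512 * $1 )) \
        $(( 0x1300000 + 512 * $1 )) \
        $(( 0x130001d + 512 * $1 ))
}

probe_offsets 128
probe_offsets 2048
```

For a start sector of 128 this prints 0110000 1310000 131001d, and for 2048 it prints 0200000 1400000 140001d. These are the addresses shown at the start of the hexdump lines in the sample output in step 1 (hexdump -C pads the third address to eight digits).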

