Posts Tagged ‘Vmware’

Create a bootable VMware ESXi installer on a USB memory stick as usual.

Save the script below as ks.cfg in the root of the USB stick, replacing the values in square brackets first. Then overwrite isolinux.cfg with the version further down, changing the menu options to suit your environment.

ks.cfg

# +-----------------------------------+
# | Begin default install    |
# +-----------------------------------+
# VMWare License options accepting EULA
vmaccepteula
# Partitioning
clearpart --firstdisk=hpsa --overwritevmfs
install --firstdisk=hpsa --overwritevmfs
 
# root Password encrypted ( use openssl passwd -1 to generate)
rootpw --iscrypted [ENCRYPTEDPASSWORD]
 
# Network install type
network --device=vmnic0 --bootproto=dhcp
# Set the serial number for ESXi (please change it)
# serialnum and reboot are kickstart commands, so they go before the script sections
serialnum --esx=[SERIAL NUMBER]
 
# Reboot after copying image to disk
reboot
 
%post --interpreter=busybox
echo Installing ESXi
 
%firstboot --interpreter=busybox
 
# +---------------------------------------------------------------------------+
# | Creating Networks                            |
# +---------------------------------------------------------------------------+
 
# Remove vSwitch0
sleep 30
esxcli network ip interface remove -i vmk0
esxcli network vswitch standard portgroup remove -p 'Management Network' -v vSwitch0
esxcli network vswitch standard remove -v vSwitch0
 
# Create management switch
esxcli network vswitch standard add --vswitch-name vSwitch1
# Add nics
esxcli network vswitch standard uplink add --uplink-name vmnic0 --vswitch-name vSwitch1
esxcli network vswitch standard uplink add --uplink-name vmnic1 --vswitch-name vSwitch1
esxcli network vswitch standard uplink add --uplink-name vmnic2 --vswitch-name vSwitch1
 
# Add Port Groups  
 
esxcli network vswitch standard portgroup add --portgroup-name "Management Network" --vswitch-name vSwitch1  
esxcli network vswitch standard portgroup set --portgroup-name "Management Network" --vlan-id 4  
 
esxcli network vswitch standard portgroup add --portgroup-name "Server Network" --vswitch-name vSwitch1  
 
# Configure vmkNIC
esxcli network ip interface add -i vmk0 -p 'Management Network'
# Set IP settings ([HOSTIP] is resolved dynamically from the template)
esxcli network ip interface ipv4 set --interface-name=vmk0 --ipv4=[HOSTIP] -N 255.255.255.0 -t static
# Set default gateway
esxcfg-route -a default [HOSTGW]
# Put management nics to active (one call, because each call replaces the active list)
esxcli network vswitch standard policy failover set --active-uplinks vmnic0,vmnic1,vmnic2 --vswitch-name vSwitch1
 
echo Create vMotion and Management network
# +---------------------------------------------------------------------+
# | Creating vMotion and Management Network                                       |
# +---------------------------------------------------------------------+
 
# Create vMotion vSwitch
esxcli network vswitch standard add --vswitch-name vSwitch0
# Add nics
esxcli network vswitch standard uplink add --uplink-name vmnic3 --vswitch-name vSwitch0
# Add portgroups
esxcli network vswitch standard portgroup add --portgroup-name vMotionNetwork --vswitch-name vSwitch0
 
# Configure vmkNIC
esxcli network ip interface add -i vmk1 -p 'vMotionNetwork'
esxcli network ip interface ipv4 set --interface-name=vmk1 --ipv4=[VMOTIONIP] --netmask=255.255.255.0 --type=static
 
# +---------------------------------------------------------------------------+
# | enable VMotion                                          |
# +---------------------------------------------------------------------------+
vim-cmd hostsvc/vmotion/vnic_set vmk1
vim-cmd internalsvc/refresh_network
 
# Set DNS and hostname
esxcli system hostname set --fqdn=[HOSTNAME]
esxcli network ip dns server add --server=[DNS1]
esxcli network ip dns server add --server=[DNS2]
#echo add DNS configuration
echo search cotton-on.local  > /etc/resolv.conf
echo nameserver 10.0.0.8  >> /etc/resolv.conf
echo nameserver 10.0.0.5 >> /etc/resolv.conf
 
echo Configure NTP
# +--------------------------------------------------------------------+
# | Add NTP Settings                                                   |
# +--------------------------------------------------------------------+
# Backup
mv /etc/ntp.conf /etc/ntp.conf.bak
# ntp.conf creation
cat > /etc/ntp.conf << __NTP_CONFIG__
restrict default kod nomodify notrap noquery nopeer
restrict 127.0.0.1
server au.pool.ntp.org
__NTP_CONFIG__
/sbin/chkconfig --level 345 ntpd on
echo "driftfile /etc/ntp.drift" >> /etc/ntp.conf
 
echo Configure Syslog
# +--------------------------------------------------------------------+
# | Add syslog configuration to ESX host                                  |
# +--------------------------------------------------------------------+
# No Remote Syslog server
# vim-cmd hostsvc/advopt/update Syslog.Remote.Hostname string telesto
#vim-cmd hostsvc/advopt/update Syslog.Local.DatastorePath string "[datastore] /logfiles/$(hostname -s).log"
 
#Disable MOB
vim-cmd proxysvc/remove_service "/mob" "httpsWithRedirect"
# +--------------------------------------------------------------------+
# | SNMP Trap                                                            |
# +--------------------------------------------------------------------+
echo "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><config><snmpSettings><enable>true</enable><syscontact></syscontact><syslocation></syslocation><EnvEventSource>indications</EnvEventSource><communities></communities><port>161</port><targets>[SNMPIP]@161 [SNMPTRAP]</targets><loglevel>info</loglevel><authProtocol></authProtocol><privProtocol></privProtocol></snmpSettings></config>" > /etc/vmware/snmp.xml
 
echo Rename local datastore
# +---------------------------------------------------------------------------+
# | Rename local datastore if --novmfsondisk is not used                      |
# +---------------------------------------------------------------------------+
vim-cmd hostsvc/datastore/rename datastore1 "local-[hostname]"
 
 
# backup ESXi configuration to persist changes
/sbin/auto-backup.sh
 
#enter maintenance mode
esxcli system maintenanceMode set -e true
 
# Needed for configuration changes that could not be performed in esxcli
esxcli system shutdown reboot -d 60 -r "Rebooting after host configurations"
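
A comment in the script says [HOSTIP] is resolved from a template, but the post doesn't show how. One possible way to fill in the bracketed values is a small sed wrapper like the sketch below. The function name render_ks and the file name ks.cfg.template are made up for illustration, and it only handles three of the placeholders.

```shell
#!/bin/sh
# Fill the bracketed placeholders in a ks.cfg template read from stdin.
# Hypothetical helper, not part of the original script.
# Usage: render_ks HOSTIP HOSTGW HOSTNAME < ks.cfg.template > ks.cfg
# Assumption: the values contain no "|", "&" or "\" characters, which are
# special inside this sed substitution.
render_ks() {
    sed -e "s|\[HOSTIP]|$1|g" \
        -e "s|\[HOSTGW]|$2|g" \
        -e "s|\[HOSTNAME]|$3|g"
}
```

The other placeholders ([VMOTIONIP], [DNS1], [SNMPIP] and so on) could be handled the same way with extra -e expressions.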

 

isolinux.cfg

DEFAULT menu.c32
MENU TITLE ESXi Boot menu
NOHALT 1
PROMPT 0
TIMEOUT 300
 
LABEL install 
 KERNEL mboot.c32 
 APPEND -c boot.cfg 
 MENU LABEL ^ESXi interactive install
 
Label ESXi USB install scripted 
 KERNEL mboot.c32 
 APPEND -c boot.cfg ks=usb:/ks.cfg
 MENU LABEL ^USB install Scripted install 
 IPAPPEND 1
 
Label ESXi NFS install scripted 
 KERNEL mboot.c32 
 APPEND -c boot.cfg ks=nfs:uncpath/of/share/ks.cfg
 MENU LABEL ^NFS Scripted install 
 IPAPPEND 1
 
Label ESXi http install scripted 
 KERNEL mboot.c32 
 APPEND -c boot.cfg ks=http://www.website.com/ks.cfg
 MENU LABEL ^HTTP Scripted install 
 IPAPPEND 1
 
LABEL hddboot 
 LOCALBOOT 0x80 
 MENU LABEL ^Boot from local disk

 

 


Uncorrectable Machine Check Exception (Board 0, Processor 1, APIC ID 0x00000001, Bank 0x00000005, Status 0xB2000000’00800400, Address 0x00000000’00000000, Misc 0x00000000’00000000)

Server reboots randomly and the above is displayed in the iLO.

 

This is an HP problem. Go into the BIOS and change the HP Power Profile to Maximum Performance.


I recently spun up a new Windows 7 virtual machine on an ESX cluster. After installing VMware Tools and rebooting the machine, the NIC would stay disconnected. You could tick the Connected box and restart the machine, and the box would untick itself again.

The default port limit on a vSwitch, if you don’t use VLANs to segment traffic, is 120. When you hit this limit, the above happens. I doubled the port limit on each host. The vSphere Client says you need to reboot the host to apply this, but I just vMotioned the machines from one host to another, which seemed to fix the problem.
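
To see how close a vSwitch is to its limit before this bites, something like the sketch below may help. It assumes the output of esxcli network vswitch standard list contains "Name:", "Num Ports:" and "Used Ports:" lines for each switch; the function name vswitch_port_usage is made up for illustration.

```shell
#!/bin/sh
# Summarise port usage per standard vSwitch.
# Reads `esxcli network vswitch standard list` output on stdin and prints
# one line per switch in the form "<name> <used>/<total>".
# Assumption: each switch block has "Name:", "Num Ports:" and "Used Ports:" lines.
vswitch_port_usage() {
    awk -F': ' '
        function emit() { if (name != "") print name, used "/" total }
        /^ *Name: /       { emit(); name = $2; used = ""; total = "" }
        /^ *Num Ports: /  { total = $2 }
        /^ *Used Ports: / { used = $2 }
        END               { emit() }
    '
}

# On a host this would be run as:
#   esxcli network vswitch standard list | vswitch_port_usage
```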


When provisioning a new VMware host, these are the important things to configure:

1) NTP Server
2) IP of Management with DNS Name
3) Portgroup Names
4) Syslog Server
5) SNMP Trap for monitoring
6) Renaming the local datastore from datastore1 to something that includes the server name
7) NIC teaming

Another important thing to set up is the path selection policy. The SAN manufacturer will tell you the recommended policy for their arrays, e.g. https://vstorage.wordpress.com/2013/03/28/optimising-vsphere-path-selection-for-nimble-storage/

1) Fixed
2) Round Robin
3) Most Recently Used

Most of the time it will be Round Robin

This can be changed via Command Line

esxcli storage nmp satp set -s VMW_SATP_ALUA -P VMW_PSP_RR

or through the VMware client:

Go to the Configuration tab on the host, go to Storage, open the Properties of the datastore and click Manage Paths.

When changing the policy, make sure no VMs are running on the host, as the change could disrupt them.
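
The satp command above changes the default policy for that SATP. As far as I know, devices that are already claimed keep their current policy until they are reclaimed or the host is rebooted, so you may also want to set it per device. The sketch below only prints the commands so they can be checked before being run; the function name psp_rr_commands and the device ID are made up for illustration.

```shell
#!/bin/sh
# Print (do not run) the esxcli command that would set Round Robin on each
# device ID passed as an argument. Hypothetical helper for review purposes.
psp_rr_commands() {
    for dev in "$@"; do
        echo "esxcli storage nmp device set --device $dev --psp VMW_PSP_RR"
    done
}

# Example with a made-up device ID. On a host, take the real IDs from
# `esxcli storage nmp device list`.
psp_rr_commands naa.0000000000000001
```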

New-VIProperty -Name ToolsVersion -ObjectType VirtualMachine -ValueFromExtensionProperty 'Config.tools.ToolsVersion' -Force
 
New-VIProperty -Name ToolsVersionStatus -ObjectType VirtualMachine -ValueFromExtensionProperty 'Guest.ToolsVersionStatus' -Force
 
Get-VM | Select Name, Version, ToolsVersion, ToolsVersionStatus | Export-Csv -NoTypeInformation -UseCulture -Path C:\VMHWandToolsInfo.csv

When the usage reported on a LUN does not match the usage of the datastore on it, you need to perform a reclaim on the LUN. On a NetApp, the LUN must be thin provisioned with Space Reservation disabled. To check that your LUN is supported, log in to an ESX host and run

esxcli storage core device vaai status get

The device behind the volume needs to show:

Delete Status: supported

The reclaim can be done manually from a PuTTY session on a VMware host attached to the datastore

 

cd /vmfs/volumes/datastorename/
 
vmkfstools -y %percenttoclear%

 

It is recommended to increase %percenttoclear% gradually, from 10% up to 80% in steps of 10%.
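
As a rough sketch of that stepping, the function below prints the commands for one datastore, from 10% to 80% in steps of 10%. It is a dry run only and nothing is executed; the function name reclaim_commands is made up for illustration.

```shell
#!/bin/sh
# Print the reclaim commands for one datastore, stepping the percentage
# from 10 to 80 in blocks of 10. Dry run only: nothing is executed.
reclaim_commands() {
    datastore="$1"
    echo "cd /vmfs/volumes/$datastore/"
    pct=10
    while [ "$pct" -le 80 ]; do
        echo "vmkfstools -y $pct"
        pct=$((pct + 10))
    done
}

# Example: print the command list for a datastore called datastorename
reclaim_commands datastorename
```

The printed list could be saved to a file and run on the host once you are happy with it.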

There’s a nice script to automate the increase here: https://kallesplayground.wordpress.com/2014/04/03/storage-reclamation-part1/

****WARNING**** Do only one reclaim at a time. It sharply increases the IOPS on the SAN, so it is best done out of hours.

We had to do a few of these out of hours at around 3am, so I automated it as a scheduled task using PuTTY’s plink.exe, which meant it ran in the very early hours of the morning! I couldn’t use the kallesplayground .sh script, as it didn’t work when sent through plink. The script would have to be on the local ESX box, which isn’t practical across the multiple hosts our current cluster runs.

FYI, as you will see below, the password has to be entered in the command. A better solution would be to use public key authentication with PuTTY:

https://www.virten.net/2014/02/howto-esxi-ssh-public-key-authentication/

 

plink.exe %esxboxwithsshenabled% -ssh -batch -l root -pw %rootpasswordofesxbox% -m C:\Scripts\filewithabovelinesin.sh > c:\log.txt

 

**** Update: in 5.5 this has changed to

esxcli storage vmfs unmap -l datastorename -n unit

where unit is the number of VMFS blocks to unmap per iteration.

 

 


Recently, during a Veeam backup, HA kicked in due to a problem with vCenter and marked two machines as Invalid:

  • Veeam started the backup and created a snapshot on the affected VMs
  • HA kicked in after the vCenter Server failure
  • vCenter tried to move the machines after the HA event, but they were locked by the snapshot Veeam had taken (the hypervisor holds a lock because of the snapshot)
  • The machines couldn’t move, but VMware removed them from the inventory and tried to register them on another host, without success
  • The machines were marked as Invalid; removing them from the inventory and registering them manually did not work
  • The machines were locked but still powered on and responsive

Fix to get the VMs registered in vCenter

  • Stop any tasks (i.e. Veeam) that might trigger a snapshot on the affected machines
  • Temporarily disable DRS in the cluster (when we power up a machine we don’t want vSphere to power it up on another host, otherwise the lock will persist)
  • Power the affected VMs off gracefully from within the OS
  • Log in to vCenter, select the host which failed and unregister the “Unknown” VMs listed (these should be the actual VMs). Unregistering should remove the lock.
  • Register the virtual machines using the .vmx files in the datastore
  • Check that no snapshots are present; if there are, delete or consolidate them
  • Power on the VMs
  • Re-enable DRS

SRMError – Error creating test bubble image from group instance ‘RGID-xxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx’ of group ‘GID-xxxxxx-xxxx-xxxx-xxxx-xxxxxx’ on VR Server ‘localhost.localdom’ (address ‘xxx.xxx.xxx.xxx’). VRM Server generic error. Please check the documentation for any troubleshooting information. The detailed exception is: ‘java.lang.NullPointerException’

This happens because of a problem with the RGID, which needs to be recreated as follows for each affected virtual machine:

1) Rename the datastore folder of the problem VM on the remote side to something else, like vmname.old

2) Wait for the machine’s Replication Status to show as not active in vSphere Replication (you can use Pause Replication then Resume to force this)

3) Stop and remove the replication

4) Rename the VM folder back to its previous name

5) Reconfigure replication and use the existing data as a seed


When trying to test a Recovery Plan (DR bubble), two of the machines produced the following error on startup:

Error – Unable to copy the configuration file ‘MachineName.vmx’ from the host to ‘C:\Windows\TEMP\vmware-SYSTEM\machinename.vmx324-0’  – No file exists for given path.

We checked the replication status manually, as the GUI is sometimes out of date, by SSHing into the host running the virtual machine and using:

vim-cmd vmsvc/getallvms | grep -i vmname

This gives you the ID of the machine, which you can then use here:

vim-cmd hbrsvc/vmreplica.getState vmnameid

The status of the machine was just IDLE instead of Inactive, which is incorrect.

We found the cause by running this command in /var/log/:

grep -i hbr vmkernel.log | grep " LWD delta transfer terminated (aborted)" | awk '{print $10}' | sort -u

This brought up a load of disk IDs that had issues, and one of them belonged to our VM. In the end we had to remove replication and reseed again.
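
The two-step ID lookup used earlier in this post can be wrapped into one small helper. The sketch below assumes the getallvms output has a header line followed by rows with the Vmid in the first column and the name in the second, and that the VM name contains no spaces. The function name vmid_for_name is made up for illustration.

```shell
#!/bin/sh
# Print the Vmid of the VM whose Name column matches $1 (case-insensitive).
# Reads `vim-cmd vmsvc/getallvms` output on stdin and skips the header line.
# Assumption: the VM name has no spaces, so it is the second field.
vmid_for_name() {
    awk -v want="$1" 'NR > 1 && tolower($2) == tolower(want) { print $1 }'
}

# On a host this would be used as:
#   vmid=$(vim-cmd vmsvc/getallvms | vmid_for_name vmname)
#   vim-cmd hbrsvc/vmreplica.getState "$vmid"
```

Unlike the grep, this matches the whole name, so a search for web01 will not also return web01-old.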
