ESXi won’t boot from USB stick

I was installing ESXi 5.5 onto a USB stick today, something I have done hundreds of times. The install went fine, however no matter what I tried in the BIOS it just wouldn’t boot. It turns out the server (in this case an HP DL380 G8) had no support for UEFI, and since ESXi 5.5 installs to a GPT partition table by default it was never going to boot. The solution is to force the ESXi installer to use MBR (we’re not going to make use of the GPT advantages on a 4GB USB stick anyway):

  • Boot from ESXi install media
  • Hit SHIFT+O when prompted
  • Enter "runweasel formatwithmbr" then hit enter to continue with the installer. (Note: the word runweasel is already there, you just need to type a space followed by formatwithmbr.)
  • Carry on with the installer as you normally would…
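
If you want to double-check which partition table the stick ended up with, partedUtil from the ESXi shell will tell you – the device path below is just an example, yours will differ – and the first line of the output should read "msdos" for MBR or "gpt" for GPT:

partedUtil getptbl /vmfs/devices/disks/mpx.vmhba32:C0:T0:L0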

Veeam as a vSphere backup solution

I’ve recently had the pleasure of implementing Veeam Backup & Replication v8 as a direct replacement for BackupExec 2012/2014 vRay edition. BackupExec 2012 was a terrible product, which is such a shame considering how good its Veritas predecessor was. BackupExec 2014 was just as bad; its low reliability and poor usability create a huge support overhead and make RTO/RPO goals unachievable. I think Symantec approached VM backups from the wrong angle: by indexing and cataloguing all data within the VM at the point of backup, jobs take longer and are more prone to failure, so it becomes difficult to fit everything within your backup windows. Veeam keeps things simple and just uses the inbuilt ESXi snapshot capabilities – backups are quick and the danger of agent/application-based failures is removed. Here is a quick summary of my experience with Veeam:

Pros:

  • VEEAM: “It just works!” I can’t count the number of times those words have been shouted across my office. Seriously, this software is reliable. I have yet to see a backup job fail other than when hardware was at fault.
  • It is truly agentless.  There is nothing to install on the VM which makes roll-out much quicker and reduces ongoing support effort.

Distributed Monitoring in Nagios with check_mk multisite

For some time now, I’ve been exploring the best ways of configuring a distributed Nagios setup. With the “federated” configuration that Nagios recommends, you can pass data from remote Nagios instances back to a central Nagios server using passive checks combined with NSCA or NRDP. Whilst this works well, the duplicate configuration on each server soon becomes tedious and unmanageable. There are other alternatives such as mod_gearman, but in my opinion these lack the intelligence to be effective.

The ability to have centralised configuration in a distributed setup isn’t currently supported by Nagios, so I have shifted my focus towards centralised reporting, where data is aggregated from several independent Nagios instances into a central location. This provides the benefits of multiple Nagios instances at remote sites but without the overhead and complexity associated with duplicating the configuration. There are quite a few tools offering centralised reporting, such as Nagios Fusion and Thruk, but my favourite by far is check_mk multisite.
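
For a flavour of what the central side looks like, here is a minimal sites definition for check_mk’s multisite.mk – purely illustrative, the site names, hostname and port are my own – where each remote entry is simply a Livestatus socket to query:

sites = {
    "central": {
        "alias": "Central site",
    },
    "siteA": {
        "alias": "Remote site A",
        "socket": "tcp:nagios-sitea.example.com:6557",
    },
}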

vSphere ESXi firmware & driver alignment

Recently, whilst troubleshooting a vSphere cluster issue, I had to align the firmware and drivers on each ESXi host in the cluster. The following commands help to gather the required information; of course, you could easily create a PowerCLI script to cycle through each host and run these commands if you are so inclined:

BIOS version:
smbiosDump | grep "Primary Ver"

NIC driver type:
esxcfg-nics -l

NIC driver & firmware version (replace vmnicX with the NIC in question, e.g. vmnic0):
esxcli network nic get -n vmnicX

HBA driver type:
esxcfg-scsidevs -a

HBA driver & firmware version (bfa is the module for this particular Brocade HBA – substitute your own):
vmkload_mod -s bfa | grep Version
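
To save typing that out for every NIC, something along these lines works straight from the ESXi shell – a rough sketch, and the grep pattern may need adjusting for your driver’s output:

for nic in $(esxcfg-nics -l | tail -n +2 | awk '{print $1}'); do
  echo "== $nic =="
  esxcli network nic get -n $nic | grep -E 'Driver:|Firmware Version:|Version:'
done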

check_cab Nagios plugin for Hawk-I rack monitoring

If you use Sinetica Hawk-I or RacKMS to manage/monitor your datacentre cabinets, I’ve created a Nagios plugin to monitor each of the sensors (temperature, humidity, etc.). You can grab check_cab over at Nagios Exchange.

Configuring ESXi network coredump settings

It’s useful to configure your VCSA as a network dump collector for when your ESXi hosts experience a PSOD; just make sure that the “ESXi Dump Collector” service is running on your VCSA. The host configuration via ESXCLI is as follows:

esxcli system coredump network set --interface-name vmk0 --server-ipv4 xx.xx.xx.xx --server-port 6500
esxcli system coredump network set --enable true
esxcli system coredump network get
esxcli system coredump network check
/sbin/auto-backup.sh
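
On the VCSA side, the Dump Collector runs as the vmware-netdumper service on the newer appliances (the exact service name and tooling may differ between versions), so something like this should confirm it is up:

service-control --status vmware-netdumper
service-control --start vmware-netdumper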

You can also configure this via host profiles if they are in use within your infrastructure.

Moving from SysV to Systemd

Goodbye SysV init, systemd is here! Systemd brings a lot of benefits such as parallel service startup and enhanced troubleshooting, but it sure does take some getting used to when you have been working in a completely different way for your entire Linux life! Systemd now ships with CentOS 7 and most of the other major distros, so it’s time to learn!

This table should help with transitioning the commands:

Sysvinit Cmd                   Systemd Cmd
service httpd start            systemctl start httpd
service httpd stop             systemctl stop httpd
service httpd restart          systemctl restart httpd
service httpd reload           systemctl reload httpd
service httpd condrestart      systemctl condrestart httpd
service httpd status           systemctl status httpd
chkconfig httpd on             systemctl enable httpd
chkconfig httpd off            systemctl disable httpd
chkconfig --list               systemctl list-unit-files --type=service
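
The enhanced troubleshooting largely comes from the journal; a couple of examples to get you started:

systemctl status httpd                # unit state plus the last few log lines
journalctl -u httpd --since today     # everything the unit has logged today
journalctl -b -p err                  # errors from the current boot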

Export Java Keystore Certificate & Private Key to PEM

I always find Java keystores a total ballache to work with; I’d rather manage individual PEM files any day of the week. If you need to export the contents for use with something else, you can use the following commands:

Export from JKS to PKCS #12.
keytool -importkeystore -srckeystore oldkeystore.jks -destkeystore cert.p12 -deststoretype PKCS12 -srcalias tomcat -deststorepass <password> -destkeypass <password>

Export certificate.
openssl pkcs12 -in cert.p12 -nokeys -out cert.pem

Export unencrypted private key.
openssl pkcs12 -in cert.p12 -nodes -nocerts -out key.pem
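
A quick sanity check on what you have just exported never hurts (the second command assumes an RSA key):

openssl x509 -in cert.pem -noout -subject -dates
openssl rsa -in key.pem -noout -check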

Gluster active/passive cluster

Gluster is a nice distributed file system which offers some management benefits over block-level replication systems like DRBD. By design, Gluster works in an active/active cluster configuration; however, for applications where millisecond-precision data replication is essential, an active/passive configuration is preferable. This is how to do it (based on SLES 11 with HAE):

First, the usual SLES HA cluster creation:

  • Create the cluster key: corosync-keygen
  • Add the nodes to corosync.conf
  • Ensure all nodes are in the hosts file
  • chkconfig openais on
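
From there, the active/passive behaviour comes from Pacemaker owning the resources. Purely as an illustrative sketch (the resource name, volume and mount point are made up, and this is not the whole picture), a glusterfs mount can be tied to whichever node is active using the crm shell:

crm configure primitive p_fs_gv0 ocf:heartbeat:Filesystem \
    params device="localhost:/gv0" directory="/mnt/gv0" fstype="glusterfs" \
    op monitor interval="20s"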

Destroy gluster brick

If you wish to destroy or recreate a Gluster brick, you will find that deleting the volume leaves some traces behind on the brick filesystem (extended attributes and the .glusterfs directory) which stop the brick from being reused. This is the procedure to blat it:

umount the volume from any clients
gluster volume stop gv0
gluster volume delete gv0
attr -lq /data/gv0/brick1
setfattr -x trusted.glusterfs.volume-id /data/gv0/brick1/
setfattr -x trusted.gfid /data/gv0/brick1/
rm -rf /data/gv0/brick1/.glusterfs
/etc/init.d/glusterd restart
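
With the old attributes gone, the brick path can be reused; recreating a replicated volume then looks something like this (hostnames are illustrative):

gluster volume create gv0 replica 2 node1:/data/gv0/brick1 node2:/data/gv0/brick1
gluster volume start gv0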