Environment
- HP DL380 G7, 96GB RAM, 2 Xeon L5630 CPUs, 8 total cores available
- ESXi 6.0U3, HPE version
- ESXi freshly installed onto internal SD card, no other VMs present
- 3TB of RAID-10 datastore available (empty)
Problem
When installing the vCenter Server Appliance using the CLI, after a 45-minute pause at "RPM Install: Progress: 95% Configuring the machine" I get a final fatal error: "Failed to authenticate with the guest operating system using the supplied credentials."
Details
First of all, hats off to VMware for providing this article: https://kb.vmware.com/s/article/2106760 ("Triaging a vCenter Server Appliance 6.0 installation Failure"). Articles like that really help: they explain what the software is trying to do and let us use our own brains to work out why it isn't doing what's intended. (Contrast that with guides that try to list a specific solution for every possible problem, which is an impossible goal.)
The first error seen in the /tmp/vcsaCliInstaller-<datestamp>/vcsa-cli-installer.log file seems to suggest a possible problem:
2018-01-11 21:01:38,004 - vCSACliInstallLogger - TRACE - Cannot download file /var/log/firstboot/rpmInstall.json from esx60dl380.mydomain.net. Got error: Failed to download file from guest
with error '(vim.fault.InvalidGuestLogin) {
dynamicType = <unset>,
dynamicProperty = (vmodl.DynamicProperty) [],
msg = 'Failed to authenticate with the guest operating system using the supplied credentials.',
faultCause = <unset>,
faultMessage = (vmodl.LocalizableMessage) []
}'
The odd part about this message is that the download fails against THE ESXi SERVER, yet the file looks like one that would be generated on THE VCSA GUEST.
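To see every fault in the installer log at a glance, a short script can pair each "(vim.fault.…)" mention with the timestamped entry it belongs to. This is only a minimal sketch: it assumes the log line layout shown in the excerpt above ("timestamp - logger - LEVEL - message", with the fault body on untimestamped continuation lines), and the name find_faults is made up for illustration.

```python
import re

# One timestamped installer log entry, e.g.
# "2018-01-11 21:01:38,004 - vCSACliInstallLogger - TRACE - Cannot download ..."
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) - "
    r"(?P<logger>\S+) - (?P<level>[A-Z]+) - (?P<msg>.*)$"
)
# A fault type as printed in the log, e.g. "(vim.fault.InvalidGuestLogin)"
FAULT_RE = re.compile(r"\(vim\.fault\.(\w+)\)")


def find_faults(lines):
    """Yield (timestamp, level, fault_name, message) for each vim.fault mention.

    Lines without a timestamp are treated as continuations of the most
    recent timestamped entry, which matches the multi-line fault dumps
    in the excerpt.
    """
    current = None
    for raw in lines:
        line = raw.rstrip("\n")
        entry = LINE_RE.match(line)
        if entry:
            current = entry
        if current is None:
            continue  # text before the first timestamped entry
        fault = FAULT_RE.search(line)
        if fault:
            yield (current["ts"], current["level"], fault.group(1), current["msg"])
```

Passing an open file object for vcsa-cli-installer.log to find_faults() and printing each tuple gives a compact timeline of when each fault first appeared and at which log level.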
Downloading the logs manually from https://<VCSA hostname>/appliance/support-bundle works, though it takes about 3 minutes to complete. The bundle appears to be dynamically generated by /bin/vc-support.cgi, which, when run from the bash command line, takes about 2 minutes to complete. The logs show that the long steps are things like "rpm -qa --verify" (58s) and "/sbin/service --status-all" (8s). No surprises there.
The final errors in the log file:
2018-01-11 21:47:49,075 - vCSACliInstallLogger - INFO - Gathering VC support log bundle. This can take a few minutes.
2018-01-11 21:47:49,365 - vCSACliInstallLogger - WARNING - Collecting the support bundle from the deployed appliance...
2018-01-11 21:47:49,753 - vCSACliInstallLogger - ERROR - Cannot collect the support bundle from the deployed appliance: Failed to run command in guest with error '(vim.fault.InvalidGuestLogin) {
dynamicType = <unset>,
dynamicProperty = (vmodl.DynamicProperty) [],
msg = 'Failed to authenticate with the guest operating system using the supplied credentials.',
faultCause = <unset>,
faultMessage = (vmodl.LocalizableMessage) []
}'
2018-01-11 21:47:49,754 - vCSACliInstallLogger - ERROR - Got error while running OVF Tool command: Failed to download file from guest with error '(vim.fault.InvalidGuestLogin) {
dynamicType = <unset>,
dynamicProperty = (vmodl.DynamicProperty) [],
msg = 'Failed to authenticate with the guest operating system using the supplied credentials.',
faultCause = <unset>,
faultMessage = (vmodl.LocalizableMessage) []
}'
2018-01-11 21:47:49,756 - vCSACliInstallLogger - DEBUG - The vCenter Server Appliance installer log file is at: /tmp/vcsaCliInstaller-2018-01-12-04-59-U0CFN0/vcsa-cli-installer.log
2018-01-11 21:47:49,756 - vCSACliInstallLogger - DEBUG - The vCenter Server Appliance installer result file is at: /tmp/vcsaCliInstaller-2018-01-12-04-59-U0CFN0/vcsa-cli-installer.json
These messages fail to provide basic info. Which host (the "deployed appliance") was the installer trying to download from? Which credentials did it use? (Yeah, that's not always something good to put in log files, but it sure is helpful for troubleshooting!)
I've checked that the hostnames of the ESXi server, CLI source server, and DHCP reservation all resolve in DNS, both forward and reverse. I've checked the network throughput using the VCSA shell and netcat, and it's excellent (100+ MB/s). I've checked DNS from within the VCSA shell, and every host I could think of that might be related (source, ESXi host, etc.) resolves perfectly and pings fine.
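For anyone who wants to repeat the forward/reverse DNS check in a scriptable way, here is a minimal sketch using only Python's standard socket module. The function name check_dns and its injectable resolver parameters are made up for illustration, and the consistency test is a simple case-insensitive name comparison that ignores a trailing dot.

```python
import socket


def check_dns(hostname, forward=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Return (ip, reverse_name, consistent) for hostname.

    forward resolves a name to an IPv4 address string; reverse resolves an
    address to a (name, aliases, addresses) tuple, as socket.gethostbyaddr
    does. Both can be swapped out, e.g. for testing without a network.
    """
    ip = forward(hostname)
    reverse_name = reverse(ip)[0]
    # Compare names case-insensitively and ignore a trailing root dot.
    consistent = reverse_name.rstrip(".").lower() == hostname.rstrip(".").lower()
    return ip, reverse_name, consistent
```

Calling check_dns("esx60dl380.mydomain.net") on the machine running the installer, and again from the VCSA shell if Python is available there, shows whether both sides agree on the name and address. A resolution failure surfaces as the usual socket exception rather than being swallowed.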
I've simplified the CLI deployment template to the bare essentials. Here's what I'm using (with the temp passwords intact because, hey, they're throwaway anyhow):
{
    "__version": "1.2.0",
    "__comments": "Deploy a vCenter 6.0 instance to manage zircon and ruby. Will live on zircon.",
    "target.vcsa": {
        "appliance": {
            "deployment.network": "VM Network",
            "deployment.option": "tiny-lstorage",
            "name": "vcenter",
            "thin.disk.mode": false
        },
        "esxi": {
            "hostname": "esx60dl380.mydomain.net",
            "username": "vcsa-deploy",
            "password": "Temp2_ChangeL8R",
            "datastore": "HP P410i RAID10"
        },
        "network": {
            "ip.family": "ipv4",
            "mode": "dhcp"
        },
        "os": {
            "password": "/Osburl9",
            "ntp.servers": [
                "172.29.55.25"
            ],
            "ssh.enable": true
        },
        "sso": {
            "password": "5-fronLi",
            "domain-name": "vsphere.local",
            "site-name": "TESTING"
        }
    }
}
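Since the log never says which credentials were used, it can help to list what the template actually supplies for each layer. The sketch below is only an illustration: it reads the "esxi", "os", and "sso" sections of a template shaped like the one above and masks each password. The helper names summarize_credentials and mask are made up, and which of these credentials the installer really uses for the guest login is an assumption to verify, not something the logs confirm.

```python
import json


def mask(password):
    """Keep the first character and the length, hide everything else."""
    return password[:1] + "*" * (len(password) - 1) if password else ""


def summarize_credentials(template_text):
    """Return (section, username, masked_password) for each credential block.

    Only the "esxi", "os", and "sso" sections under "target.vcsa" are
    inspected; sections without a "password" key are skipped.
    """
    vcsa = json.loads(template_text)["target.vcsa"]
    rows = []
    for section in ("esxi", "os", "sso"):
        block = vcsa.get(section, {})
        if "password" not in block:
            continue
        # The "os" and "sso" sections above carry no explicit username.
        user = block.get("username", "(not set in template)")
        rows.append((section, user, mask(block["password"])))
    return rows
```

Feeding the template file's contents to summarize_credentials() makes it easy to compare, side by side, the account used to log in to the ESXi host with the passwords set for the appliance OS and SSO, without pasting the real values anywhere.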
I'm wondering if the installer is getting confused about what's the guest and what's the ESXi server. Has anyone else seen this situation where it looks like the installer is trying to grab guest logs from the ESXi host? How would this installation work for anyone if that's the case?