This discussion has been locked.

XG on ESXi - monitor panic stops the VM (suspected bug in VMware Tools)

Hello

Every few days (roughly every 24 to 48 hours), my XG VM, which is hosted on ESXi 6.7 U2, ends up in a stopped state.

Reading the VM's vmware.log, I found the following:

2020-10-21T11:40:20.622Z| vmx| I125: GuestRpcSendTimedOut: message to toolbox-dnd timed out.
2020-10-21T11:40:32.498Z| vcpu-0| I125: VLANCE: Returning 0x0 for LANCE_EXTINT IN
2020-10-21T11:40:32.498Z| vcpu-0| I125: VLANCE: Ignoring LANCE_EXTINT OUT of 0x1
2020-10-21T11:40:32.700Z| vcpu-0| I125: VLANCE: IN on LANCE_MODE while not stopped: 0x433
2020-10-21T11:40:32.700Z| vcpu-0| I125: VLANCE: OUT on LANCE_MODE while not stopped: 0x433, word: 0x80
2020-10-21T11:40:32.701Z| vcpu-0| I125: VLANCE: OUT on LANCE_LADRF0 while not stopped: 0x433, word: 0x0
2020-10-21T11:40:32.701Z| vcpu-0| I125: VLANCE: OUT on LANCE_LADRF1 while not stopped: 0x433, word: 0x80
2020-10-21T11:40:32.701Z| vcpu-0| I125: VLANCE: OUT on LANCE_LADRF2 while not stopped: 0x433, word: 0x0
2020-10-21T11:40:32.701Z| vcpu-0| I125: VLANCE: OUT on LANCE_LADRF3 while not stopped: 0x433, word: 0x2
2020-10-21T11:40:49.363Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
2020-10-21T11:41:30.243Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
2020-10-21T11:41:45.685Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
2020-10-21T11:41:49.337Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
2020-10-21T11:42:41.189Z| vcpu-0| I125: Guest: toolbox: Version: 10.3.10.10540 (build-12406962)
2020-10-21T11:42:41.264Z| vcpu-0| W115: GuestRpc: application toolbox, changing channel 65535 -> 0
2020-10-21T11:42:41.264Z| vcpu-0| I125: GuestRpc: Channel 0, guest application toolbox.
2020-10-21T11:42:41.264Z| vcpu-0| I125: DEPLOYPKG: ToolsDeployPkg_Begin: state=0 err=0, msg=null
2020-10-21T11:42:41.273Z| vcpu-0| I125: Tools: Changing running status: 0 => 2.
2020-10-21T11:42:41.273Z| vcpu-0| I125: Tools: Removing Tools inactivity timer.
2020-10-21T11:42:41.274Z| vcpu-0| I125: TOOLS Received tools.set.version rpc call, version = TOOLS_VERSION_UNMANAGED, type is unknown
2020-10-21T11:42:41.274Z| vcpu-0| I125: TOOLS Setting toolsVersionStatus = TOOLS_STATUS_UNMANAGED
2020-10-21T11:42:41.274Z| vcpu-0| I125: Tools_SetVersionAndType did nothing; new tools version (2147483647) and type (0) match old Tools version and type
2020-10-21T11:42:41.298Z| vcpu-0| I125: Vix: [guestCommands.c:1967]: Error VIX_E_UNRECOGNIZED_COMMAND_IN_GUEST in VMAutomationTranslateGuestRpcError(): Tools failed to recognize guest command op code=62, result string="Unknown Command".
2020-10-21T11:42:41.646Z| vcpu-0| I125: TOOLS state change 3 returned status 1
2020-10-21T11:42:41.646Z| vcpu-0| I125: Vix: [mainDispatch.c:4156]: VMAutomationReportPowerStateChange: Reporting power state change (opcode=2, err=0).
2020-10-21T11:42:41.646Z| vcpu-0| I125: Tools: Changing running status: 2 => 1.
2020-10-21T11:42:44.759Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
2020-10-21T11:42:47.558Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
2020-10-21T11:43:55.606Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
2020-10-21T11:44:45.516Z| svga| I125: MKSScreenShotMgr: Taking a screenshot
2020-10-24T07:29:47.322Z| vcpu-1| W115: MONITOR PANIC: vcpu-0:VERIFY devices/net/vlance_shared.c:1198
2020-10-24T07:29:47.322Z| vcpu-1| I125: Core dump with build build-13006603
2020-10-24T07:29:47.322Z| vcpu-0| I125: Exiting vcpu-0
2020-10-24T07:29:47.327Z| vcpu-1| I125: Writing monitor file `vmmcores.gz`
2020-10-24T07:29:47.332Z| vcpu-1| W115: Dumping core for vcpu-0
2020-10-24T07:29:47.332Z| vcpu-1| I125: VMK Stack for vcpu 0 is at 0x451a10793000
2020-10-24T07:29:47.332Z| vcpu-1| I125: Beginning monitor coredump
2020-10-24T07:29:47.529Z| mks| W115: Panic in progress... ungrabbing
2020-10-24T07:29:47.529Z| mks| I125: MKS: Release starting (Panic)
2020-10-24T07:29:47.529Z| mks| I125: MKS: Release finished (Panic)
2020-10-24T07:29:48.505Z| vcpu-1| I125: End monitor coredump
2020-10-24T07:29:48.505Z| vcpu-1| W115: Dumping core for vcpu-1
2020-10-24T07:29:49.507Z| vcpu-1| I125: VMK Stack for vcpu 1 is at 0x451a06b93000
2020-10-24T07:29:49.507Z| vcpu-1| I125: Beginning monitor coredump
2020-10-24T07:29:50.640Z| vcpu-1| I125: End monitor coredump
2020-10-24T07:29:56.339Z| vcpu-1| W115: A core file is available in "/vmfs/volumes/5ce57fb2-89f54b15-46e4-001999989301/Sophos/vmx-zdump.001"
2020-10-24T07:29:56.339Z| vcpu-1| I125: Msg_Post: Error
2020-10-24T07:29:56.339Z| vcpu-1| I125: [msg.log.error.unrecoverable] VMware ESX unrecoverable error: (vcpu-1)
2020-10-24T07:29:56.339Z| vcpu-1| I125+ vcpu-0:VERIFY devices/net/vlance_shared.c:1198
2020-10-24T07:29:56.340Z| vcpu-1| I125: [msg.panic.haveLog] A log file is available in "/vmfs/volumes/5ce57fb2-89f54b15-46e4-001999989301/Sophos/vmware.log".
2020-10-24T07:29:56.340Z| vcpu-1| I125: [msg.panic.requestSupport.withoutLog] You can request support.
2020-10-24T07:29:56.340Z| vcpu-1| I125: [msg.panic.requestSupport.vmSupport.vmx86]
2020-10-24T07:29:56.340Z| vcpu-1| I125+ To collect data to submit to VMware technical support, run "vm-support".
2020-10-24T07:29:56.340Z| vcpu-1| I125: [msg.panic.response] We will respond on the basis of your support entitlement.
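
For anyone who wants to pull the panic lines out of a long vmware.log instead of scrolling, here is a minimal Python sketch. The line layout (timestamp, thread, level, message separated by "| ") is only inferred from the excerpt above, and the function names `parse_line` and `find_panics` are my own, not part of any VMware tool. It reports each MONITOR PANIC line together with the source file and line number named after VERIFY, which in this log is devices/net/vlance_shared.c:1198.

```python
import re
from datetime import datetime

# Assumed line shape, inferred from the excerpt above:
#   <timestamp>Z| <thread>| <level>: <message>
# Continuation lines use "+" instead of ":" after the level (e.g. "I125+").
LINE_RE = re.compile(r"^(\S+Z)\| ([^|]+)\| ([A-Z]\d+)[:+] (.*)$")

# The panic message names a source file and line after "VERIFY".
VERIFY_RE = re.compile(r"VERIFY (\S+):(\d+)")


def parse_line(line):
    """Split one log line into (time, thread, level, message), or None."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    stamp, thread, level, message = match.groups()
    when = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%fZ")
    return when, thread, level, message


def find_panics(lines):
    """Return one dict per MONITOR PANIC line found in `lines`."""
    panics = []
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        when, thread, _level, message = parsed
        if "MONITOR PANIC" not in message:
            continue
        verify = VERIFY_RE.search(message)
        panics.append({
            "time": when,
            "thread": thread,
            "source_file": verify.group(1) if verify else None,
            "source_line": int(verify.group(2)) if verify else None,
        })
    return panics
```

To use it, open the log (for example `open("vmware.log", errors="replace")`) and pass the file object to `find_panics`. Comparing the reported time with the earlier "Tools: Changing running status" entries gives a rough idea of how long the VM ran before the panic.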

From what I have read on the web, this happens because Sophos has used a buggy version of the open-source VMware Tools in their build. (I used their VMware image, not the disc install.)

I have upgraded to 18 MR3.

Can you double-check my analysis, and is there a workaround?

Thank you

Nitin


