This discussion has been locked.
You can no longer post new replies to this discussion. If you have a question you can start a new discussion

UPS gets lost after a few days

Hi to all!

I am facing an annoying issue.

I am using an Eaton 5S1000 UPS connected via USB to my Sophos UTM.

It works great, alerts and all, but the UTM loses the connection to the UPS after a few days. This happens at random, not at any specific interval.

Below is the normal appearance:

When it loses the connection, I only see the UPS icon with no percentage (the bar is completely empty).

The only fix is to unplug the UPS and plug it back in; then everything is back to normal.

Any ideas about what may be wrong?

Do you know of any shell command I can use to avoid physically unplugging and replugging the UPS (for example, when I am away)?



  • Hi ChriZ,

    I'm curious if running, as root, the following brings it back.

    /etc/init.d/upsd -c reload

    Cheers - Bob

     
    Sophos UTM Community Moderator
    Sophos Certified Architect - UTM
    Sophos Certified Engineer - XG
    Gold Solution Partner since 2005
    MediaSoft, Inc. USA
  • Hello Bob!

    Thanks a lot for your help.

    Next time it happens (as I said, it is random and may take a while) I will give this command a try.

    Cheers and thanks again!

     
    Sophos XG Home Licence.

    Machine: Checkpoint 3100 appliance (Intel Atom C2558 CPU, 6GB Ram, 250GB sata SSD)

  • Hello again, Bob!

    It has happened again today.

    Running the above command returns a usage error, and restarting the service fails:

    utm:/root # /etc/init.d/upsd -c reload
    Usage: /etc/init.d/upsd {start|stop|status|try-restart|restart|force-reload|reload|probe|powerdown|try-powerdown}
    utm:/root # /etc/init.d/upsd -c force-reload
    Usage: /etc/init.d/upsd {start|stop|status|try-restart|restart|force-reload|reload|probe|powerdown|try-powerdown}
    utm:/root # /etc/init.d/upsd force-reload
    Reload service NUT UPS                                               failed
    utm:/root # /etc/init.d/upsd restart
    Shutting down NUT UPS monitor                                        done
    Shutting down NUT UPS server                                         done
    Shutting down NUT UPS drivers                                        done
    Starting NUT UPS drivers                                             failed
    utm:/root #
    

    Any other ideas, please? I am starting to think that there is something wrong with the connection; perhaps I should change the USB cable. It seems that upsd is not starting because it sees no UPS connected. Can anyone confirm that this is the standard behavior?

    Thanks a lot again!

     

  • I'm just guessing, too, but I'd definitely change the cable after seeing that.

     
  • Well, I was using a 2 m cable from an old Powerware UPS. I found the 1 m cable that shipped with the Eaton UPS and replaced the 2 m one with it. I'll see how it goes.

    Perhaps the problem was the cable length...

     

  • Well, I have an update on this...

    It seems that the problem remains, even after replacing the cable with the one that came with the Eaton UPS.

    It does not happen very often, maybe once a week, but it still happens.

     

  • Well, a few months have passed...

    The problem is still there...

    In the meantime I have acquired a second UPS of the same model (Eaton 5S1000).

    This second UPS is connected to a Linux server running Ubuntu 14.04.

    I have seen the same thing happen on Ubuntu a few times: the connection is lost.

    On Ubuntu, however, the connection is re-established after a few seconds.

    That is not the case with the Sophos UTM, where the only way to re-establish the connection is to physically remove and reinsert the USB cable.

    Any ideas on how to reset the USB connection the way Ubuntu does? (I am not even sure what it does, to be honest.)
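
One thing that might stand in for pulling the cable is asking the kernel to detach and re-attach the device through sysfs. The following is a minimal sketch, not a confirmed fix for the UTM: `usb_rebind` is a hypothetical helper, the bus ID passed to it (something like `3-1`) is an assumption that has to be read from the kernel messages, and it must run as root. If the device has dropped off the bus entirely, the unbind step will fail and this will not help.

```shell
#!/bin/sh
# Sketch: software "replug" of a USB device through sysfs (needs root).
# Assumption: the caller knows the device's USB bus ID (for example 3-1).

usb_rebind() {
    dev="$1"                                  # USB bus ID of the UPS
    drv="${2:-/sys/bus/usb/drivers/usb}"      # sysfs driver directory (overridable for dry runs)
    echo "$dev" > "$drv/unbind" || return 1   # detach the device from the usb driver
    sleep "${USB_REBIND_DELAY:-2}"            # give the kernel a moment to settle
    echo "$dev" > "$drv/bind"                 # re-attach the device so it is probed again
}
```

For example, `usb_rebind 3-1 && /etc/init.d/upsd restart` would re-attach the device and then restart the NUT services.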

     

    Later edit: Not sure if it is worth mentioning, but although I have enabled notifications for UPS connection/disconnection, I get no emails when this happens.

    The notification email itself does work: if the UTM is connected normally to the UPS and I unplug the USB cable, I do get an email.

    But when the miscommunication with the ups happens "automatically" I get nothing

     

     

  • "But when the miscommunication with the ups happens "automatically" I get nothing."

    So it sounds like the Linux system has some sort of keep-alive mechanism that should be added to the UTM for that brand of UPS.  Can you identify anything in the System messages, Fallback, Middleware and Self monitoring logs?

    Cheers - Bob

     
  • Hello, Bob!

    Unfortunately, I am not sure what timeframe to use for the searches. Since I get no email on disconnect and I don't log in to the web GUI frequently, I don't know when the communication was lost.

    But now I have this post, and I know I physically disconnected and reconnected the USB cable on the 20th of April.

    Today is the 22nd and I just logged in and the UPS is connected.

    I will keep an eye on it (I will try to log in daily and check), so next time I will have a timeframe to search through.
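
Instead of logging in daily, a small script run from cron could record the moment the UPS stops answering. This is only a sketch under assumptions: it relies on NUT's `upsc` client being available on the box, the UPS address `ups@localhost` and the log path are placeholders, and `ups_state` and `log_ups_state` are hypothetical helper names.

```shell
#!/bin/sh
# Sketch: record a timestamped UPS state so a lost connection can be dated.
# Assumptions: NUT's `upsc` client is installed; UPS address and log path
# are supplied by the caller.

UPSC="${UPSC:-upsc}"   # overridable so the check can be exercised without a UPS

ups_state() {
    # Prints ups.status (e.g. "OL") or "LOST" if upsc fails or returns nothing.
    status=$("$UPSC" "$1" ups.status 2>/dev/null)
    if [ -n "$status" ]; then
        echo "$status"
    else
        echo "LOST"
    fi
}

log_ups_state() {
    # Appends "<timestamp> <state>" to the log file given as second argument.
    echo "$(date '+%Y:%m:%d-%H:%M:%S') $(ups_state "$1")" >> "$2"
}
```

Calling `log_ups_state ups@localhost /var/log/ups-watch.log` once a minute would leave a trail in which the first `LOST` line marks the time to search for in the other logs.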

    Thanks!

     

  • OK, here I am again!

    It has been working perfectly with no disconnects recently.

    Saturday was a maintenance day, when I shut down all my machines, blew out the dust, etc.

    Well, yesterday I noticed that communication was lost again.

    I searched the logs for the timeframe from the 29th of April to the 2nd of May.

    Here we go:

     

     
    System Messages
    /var/log/system/2017/04/system-2017-04-29.log.gz:2017:04:29-15:27:21 utm upsmon[22821]: Signal 15: exiting
    /var/log/system/2017/04/system-2017-04-29.log.gz:2017:04:29-15:27:21 utm upsd[22817]: User upsmon@127.0.0.1 logged out from UPS [ups]
    /var/log/system/2017/04/system-2017-04-29.log.gz:2017:04:29-15:27:21 utm upsmon[22820]: upsmon parent: read
    /var/log/system/2017/04/system-2017-04-29.log.gz:2017:04:29-15:27:21 utm upsd[22817]: mainloop: Interrupted system call
    /var/log/system/2017/04/system-2017-04-29.log.gz:2017:04:29-15:27:21 utm upsd[22817]: Signal 15: exiting
    /var/log/system/2017/04/system-2017-04-29.log.gz:2017:04:29-21:38:00 utm upsmon[3766]: UPS running on battery<29>Apr 29 21:39:05 upsmon[3766]: UPS running on line
    /var/log/system/2017/04/system-2017-04-29.log.gz:2017:04:29-21:40:50 utm upsmon[3766]: UPS running on battery
    /var/log/system/2017/04/system-2017-04-29.log.gz:2017:04:29-21:41:25 utm upsmon[3766]: UPS running on line
    /var/log/system/2017/04/system-2017-04-29.log.gz:2017:04:29-21:44:20 utm upsmon[3766]: UPS running on battery
    /var/log/system/2017/04/system-2017-04-29.log.gz:2017:04:29-21:44:35 utm upsmon[3766]: UPS running on line
    /var/log/system/2017/04/system-2017-04-29.log.gz:2017:04:29-21:45:25 utm upsmon[3766]: UPS running on battery
    /var/log/system/2017/04/system-2017-04-29.log.gz:2017:04:29-21:46:00 utm upsmon[3766]: UPS running on line
    /var/log/system/2017/04/system-2017-04-30.log.gz:2017:04:30-20:18:40 utm upsmon[3766]: Signal 15: exiting
    /var/log/system/2017/04/system-2017-04-30.log.gz:2017:04:30-20:18:40 utm upsmon[3765]: upsmon parent: read
    /var/log/system/2017/04/system-2017-04-30.log.gz:2017:04:30-20:18:40 utm upsd[3762]: User upsmon@127.0.0.1 logged out from UPS [ups]<27>Apr 30 20:18:40 upsd[3762]: mainloop: Interrupted system call
    /var/log/system/2017/04/system-2017-04-30.log.gz:2017:04:30-20:18:40 utm upsd[3762]: Signal 15: exiting
     
    After I unplug the USB cable and plug it back in, this is what I get in addition:
    /var/log/system.log:2017:05:03-21:42:39 utm upsd[5265]: listening on 127.0.0.1 port 3493
    /var/log/system.log:2017:05:03-21:42:39 utm upsd[5265]: Connected to UPS [ups]: usbhid-ups-ups
    /var/log/system.log:2017:05:03-21:42:39 utm upsd[5266]: Startup successful
    /var/log/system.log:2017:05:03-21:42:39 utm upsmon[5269]: Startup successful
    /var/log/system.log:2017:05:03-21:42:39 utm upsd[5266]: User upsmon@127.0.0.1 logged into UPS [ups]
     

    Fallback 

    /var/log/fallback/2017/04/fallback-2017-04-29.log.gz:2017:04:29-15:27:21 utm [daemon:info] usbhid-ups[22813]: Signal 15: exiting
    /var/log/fallback/2017/04/fallback-2017-04-30.log.gz:2017:04:30-20:18:39 utm [daemon:debug] usbhid-ups[3758]: libusb_get_interrupt: error submitting URB: No such device<31>Apr 30 20:18:39 usbhid-ups[3758]: libusb_get_report: error sending control message: No such device
    /var/log/fallback/2017/04/fallback-2017-04-30.log.gz:2017:04:30-20:18:40 utm [daemon:info] usbhid-ups[3758]: Signal 15: exiting
     
    After I unplug the USB cable and plug it back in, this is what I get in addition:
    /var/log/fallback.log:2017:05:03-21:42:39 utm [daemon:info] usbhid-ups[5262]: Startup successful
     
    Middleware & Selfmonitoring logs return nothing... :(
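
For anyone repeating the searches above, a minimal sketch of pulling the NUT-related lines out of both current and rotated (gzipped) logs follows. `nut_log_lines` is a hypothetical helper; the pattern simply matches the three process names that appear in the excerpts (upsmon, upsd, usbhid-ups), and `zgrep` is assumed to be installed.

```shell
#!/bin/sh
# Sketch: filter NUT-related lines from plain or gzipped log files.
# zgrep reads both compressed and uncompressed files.

nut_log_lines() {
    # Arguments: one or more log files, plain or .gz.
    zgrep -E 'upsmon|upsd|usbhid-ups' "$@"
}
```

For example, `nut_log_lines /var/log/system/2017/04/system-2017-04-29.log.gz /var/log/fallback/2017/04/fallback-2017-04-30.log.gz` would search two of the files quoted above in one pass.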
     
    In case it helps, this is what the kernel messages live log shows after I plug the cable back in:
     
    2017:05:03-21:42:26 utm kernel: [360117.135508] usb 3-1: USB disconnect, device number 4
    2017:05:03-21:42:30 utm kernel: [360120.655024] usb 3-1: new low-speed USB device number 5 using xhci_hcd
    2017:05:03-21:42:31 utm kernel: [360121.269399] usb 3-1: New USB device found, idVendor=0463, idProduct=ffff
    2017:05:03-21:42:31 utm kernel: [360121.269402] usb 3-1: New USB device strings: Mfr=1, Product=2, SerialNumber=0
    2017:05:03-21:42:31 utm kernel: [360121.269403] usb 3-1: Product: 5S
    2017:05:03-21:42:31 utm kernel: [360121.269405] usb 3-1: Manufacturer: EATON
    2017:05:03-21:42:31 utm kernel: [360121.269445] usb 3-1: ep 0x81 - rounding interval to 128 microframes, ep desc says 160 microframes
    2017:05:03-21:42:32 utm kernel: [360123.134718] hid-generic 0003:0463:FFFF.0003: hiddev0,hidraw0: USB HID v1.10 Device [EATON 5S] on usb-0000:00:14.0-1/input0
    2017:05:03-21:42:36 utm kernel: [360126.416096] usb 3-1: ep 0x81 - rounding interval to 128 microframes, ep desc says 160 microframes
    2017:05:03-21:46:33 utm kernel: [360363.953185] net_ratelimit: 7 callbacks suppressed
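
The kernel lines above identify the UPS as `idVendor=0463` on `usb 3-1`. Since the bus ID can change after a replug or a move to another port, a sketch for looking it up from sysfs by vendor ID may be handy. `find_usb_by_vendor` is a hypothetical helper, and the second argument exists only so the function can be tried against a test directory.

```shell
#!/bin/sh
# Sketch: list USB bus IDs whose idVendor matches the given value.
# Each device directory under /sys/bus/usb/devices has an idVendor file.

find_usb_by_vendor() {
    vendor="$1"                          # e.g. 0463 for the Eaton UPS in this thread
    root="${2:-/sys/bus/usb/devices}"    # sysfs root (overridable for testing)
    for f in "$root"/*/idVendor; do
        [ -r "$f" ] || continue          # skip when nothing matches the glob
        if [ "$(cat "$f")" = "$vendor" ]; then
            basename "$(dirname "$f")"   # print the bus ID, e.g. 3-1
        fi
    done
}
```

An empty result from `find_usb_by_vendor 0463` while the UPS is unreachable would suggest that the device has dropped off the bus rather than that NUT alone has lost track of it.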
     
    Any ideas welcome!
     
     
    EDIT: Actually, now that I read my own post and saw this:
    new low-speed USB device number 5 using xhci_hcd
     
    I thought it might be worth checking the BIOS for an option to disable xHCI (not sure if it has such a setting, but it might be worth trying).
     
    EDIT2: No such setting. I did notice, though, that I had connected the USB cable to a USB 3 port.
    Connecting it to a USB 2 port yielded the same xhci_hcd message in the log, but let's see how it goes.
     
     
     