The Postgres WAL archive_command runs approximately every minute and is attempted three times in succession. All three attempts fail (I can see this in /var/log/system.log). If I run the failing command manually (/usr/local/bin/postgres_cloud_backup store pg_xlog/000000010000000000000005 000000010000000000000005, in my instance), I get the following output:
I, [2017-10-24T12:34:15.497969 #2846] INFO -- : Celluloid 0.17.3 is running in BACKPORTED mode. [ http://git.io/vJf3J ]
2017:10:24-12:34:15.539 INFO root: Application postgres_backup started
root ............................................ *info -T
- <Appenders::Stderr name="stderr">
- <Appenders::Syslog name="postgres_backup">
AWS ........................................... info +A -T
Celluloid ..................................... info +A -T
Logging ....................................... *off -A -T
2017:10:24-12:34:15.547 INFO CLI::PostgresCloudBackup: HA/AWS Call failed (retry later): Failed to complete request
2017:10:24-12:34:15.547 INFO root: Application postgres_backup ended
I've traced the error by modifying /usr/lib/ruby/gems/2.2.0/gems/sophos-iaas-1.0.0/lib/sophos/iaas/cli/postgres_cloud_backup.rb so that calls to logger.debug are replaced with logger.info, and then re-running the command, which gives me the following output:
I, [2017-10-24T12:39:40.987493 #3882] INFO -- : Celluloid 0.17.3 is running in BACKPORTED mode. [ http://git.io/vJf3J ]
2017:10:24-12:39:41.050 INFO root: Application postgres_backup started
root ............................................ *info -T
- <Appenders::Stderr name="stderr">
- <Appenders::Syslog name="postgres_backup">
AWS ........................................... info +A -T
Celluloid ..................................... info +A -T
Logging ....................................... *off -A -T
2017:10:24-12:39:41.059 INFO CLI::PostgresCloudBackup: JSON RPC -> func_store_wal("/var/storage/pgsql92/data/pg_xlog/pg_xlog/000000010000000000000005", "000000010000000000000005")
2017:10:24-12:39:41.061 INFO CLI::PostgresCloudBackup: JSON RPC <- {"error"=>{"code"=>-32901, "message"=>"Service temporary unavailable. Readonly mode"}, "jsonrpc"=>"2.0", "id"=>6281}
2017:10:24-12:39:41.061 INFO CLI::PostgresCloudBackup: HA/AWS Call failed (retry later): Failed to complete request
2017:10:24-12:39:41.061 INFO root: Application postgres_backup ended
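To make the failure easier to reason about, here is a minimal Ruby sketch of how a client could read a JSON-RPC 2.0 error object like the one in the log above. It is not taken from the sophos-iaas gem: classify_rpc_response and RETRYABLE_CODES are hypothetical names, and treating -32901 as a "retry later" condition is an assumption based only on the "(retry later)" wording in the log.

```ruby
require "json"

# Hypothetical helper, not part of the sophos-iaas gem.
# Assumption: -32901 (the code seen in the log) signals a condition worth retrying.
RETRYABLE_CODES = [-32901].freeze

# Parse a raw JSON-RPC 2.0 response and report whether the call
# succeeded, should be retried later, or failed outright.
def classify_rpc_response(raw)
  response = JSON.parse(raw)
  error = response["error"]
  return { status: :ok, result: response["result"] } if error.nil?

  {
    status: RETRYABLE_CODES.include?(error["code"]) ? :retry_later : :failed,
    code: error["code"],
    message: error["message"]
  }
end

# The error payload from the log, rewritten as JSON text.
raw = '{"error":{"code":-32901,"message":"Service temporary unavailable. Readonly mode"},' \
      '"jsonrpc":"2.0","id":6281}'

outcome = classify_rpc_response(raw)
puts "#{outcome[:status]}: #{outcome[:code]} #{outcome[:message]}"
```

The point of the sketch is only that the command-line tool appears to be reporting a refusal from the service it talks to, so the cause is more likely on the server side of the RPC call than in the archive_command itself.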
There are only two places in the Sophos code that I can see that raise error code -32901, and they both live in /usr/lib/ruby/gems/2.2.0/gems/sophos-iaas-1.0.0/lib/sophos/iaas/cloud-manager/infrastructure/data_service/postgres_wal.rb; only one of them uses the message "Service temporary unavailable," and that's on line 45.
Looking at this section of code, I spy an @readonly variable, which apparently is set to true.
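For anyone following along without access to the file, this is roughly the shape of guard I mean. It is a hypothetical reconstruction for illustration only, not the code from postgres_wal.rb: the class names, the method name store_wal, and the way @readonly gets set are all assumptions. Only the error code and message are taken from the log.

```ruby
# Hypothetical reconstruction for illustration; NOT the actual Sophos code.
# Only the code (-32901) and the message text come from the log output.
class ReadonlyError < StandardError
  attr_reader :code

  def initialize(message = "Service temporary unavailable. Readonly mode", code = -32901)
    super(message)
    @code = code
  end
end

class WalStoreSketch
  # Assumption: the real service flips a readonly flag somewhere else;
  # here it is simply passed in so the guard can be exercised.
  def initialize(readonly:)
    @readonly = readonly
  end

  # Refuse to store a WAL segment while the service is in readonly mode.
  def store_wal(path, name)
    raise ReadonlyError if @readonly

    "stored #{name} from #{path}"
  end
end

begin
  WalStoreSketch.new(readonly: true)
                .store_wal("pg_xlog/000000010000000000000005", "000000010000000000000005")
rescue ReadonlyError => e
  puts "#{e.code}: #{e.message}"
end
```

If the real code follows this pattern, the question becomes what puts the service into readonly mode in the first place, which is the part I have not been able to find.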
Looking through the interface, nothing stands out to me as incorrect. This was happening in 9.411 and continues in 9.501. Can someone nudge me in the right direction to fix this backup failure in my logs, so that I can stop thinking about it?