Commit ee0352eb authored by Nicolas Wavrant's avatar Nicolas Wavrant

pbs: promises redirect ssh output to /dev/null

parent bb8b4569
......@@ -40,9 +40,11 @@ from slapos.recipe.librecipe import shlex
def promise(args):
# Redirect output to /dev/null
with open("/dev/null") as _dev_null:
ssh = subprocess.Popen(
[args['ssh_client'], '%(user)s@%(host)s' % args, '-p', '%(port)s' % args],
stdin=subprocess.PIPE, stdout=None, stderr=None
stdin=subprocess.PIPE, stdout=_dev_null, stderr=_dev_null
)
# Rdiff Backup protocol quit command
......
  • @Nicolas are you sure this is good idea? If a promise has some info to complain / print we don't want to loose it. Or am I missing something.

    /cc @rafael, @iv

  • Hello @kirr ,

    I'm not really proud of this commit, but still we pushed it because the subprocess is writing some binary on the output. As the output is processed afterwards by other script, I prefered hiding the output (and because I couldn't find a way to just clean it). Still, the promise is doing its advertising job, and most of the time the explanation of why this promise was failing can be found somewhere else (ie: "pbs promised failed" -> on runner0, "resilient-sshd" promise tells me that the ssh server is down).

    The binary outputs appears on ssh.stdin.write(quitcommand) and displays : b^@^@^@^@^@^@^@^Hquitting. I tried to pipe the output of the subprocess to then apply some decoding function to it, but nothing I tried worked.

    If you have some idea on how to handle this problem, I'll be glad to improve this commit.

  • @Nicolas ok, I see, thanks for feedback. The promise here talks to rdiff-backup server just giving to it its "QUIT" command and the server actually replies via rdiff-backup protocol probably about some header and then that it is quiting. So in this particular case hiding stdout maybe makes sense. But:

    • we should not hide stderr, because if there are errors they could be emitted there. And normally stderr is expected to be empty.

    • hiding details, or forcing people to indirectly see what is going on is not really helpful in my experience. e.g. in this case if this promise does not work can mean several things:

      1. sshd is not working
      2. rdiffbackup server behind this sshd is not working
      3. probably something else (e.g. all configured fine, but e.g. due to low memory situation rdiff-backup server just cannot be started or starts and is immediately killed by OOM killer or something like that).

    I think we want to see the details about why promise failed in the promise stderr output itself.

    P.S. here are some links about rdiff-backup protocol I've briefly grasped:

    in general if one wants to decode what happens with rdiff-backup on the wire, I think it is possible to call _get() in a loop and then you have all objects sent in one direction. Both directions could be probably handled too, if needed, with timestamps and proper intermixing, I think.

    If you want to decode just one reply I think you can call _get() on the output and check whether it is expected "quitting" reply.

    Kirill

  • Hello @kirr ,

    Indeed hiding stderr was a very poor decision from myself. As a temporary solution, I've decided to only hide stdout, as for the moment it's the one which causes problems. I'll try to put some logs in the targeted functions to see what's causing this binary output (I still don't know if it's rdiff-backup related, or openssh related).

    Thank you for the pointers,

    Nicolas

Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment