Commit d5d4f7c8 authored by Kirill Smelkov's avatar Kirill Smelkov

.

parent e1481152
......@@ -25,9 +25,9 @@
# encode converts string s into form suitable to be used in names of buildout sections.
#
# Such encoding is needed because buildout forbids to use spaces and many other
# characters in section names, which, in turn, leads to inability to use
# arbitrary strings for sections generated based on e.g. instance references
# retrieved from SlapOS Master.
# characters in section names, which, in turn, leads to inability to directly
# use arbitrary strings for sections generated based on e.g. instance
# references retrieved from SlapOS Master.
#
# With encoding it becomes possible to use arbitrary names for references
# without leading to instantiation failures like
......
  • Hello @kirr

    That's good to know. At first, this reminded me of nexedi/slapos.buildout@4e13dcb9 - but this is different, here it is a mechanism to escape keys, not values.

  • @jerome, thanks for feedback. Yes, it is used to be able to accept arbitrary reference for a shared instance, and use that reference in generated buildout sections, e.g. like this:

    https://lab.nexedi.com/kirr/slapos/blob/e11a0fbe/software/ors-amarisoft/ru/lopcomm/libinstance.jinja2.cfg#L23-50

    For sections that are wrapped with part and promise usage of B (= xbuildout.encode) is automatic:

    https://lab.nexedi.com/kirr/slapos/blob/e11a0fbe/software/ors-amarisoft/ru/libinstance.jinja2.cfg#L27-50
    https://lab.nexedi.com/kirr/slapos/blob/e11a0fbe/software/ors-amarisoft/ru/libinstance.jinja2.cfg#L224-225

    but referencing such sections would still require to use B around the name.

    We can use B in part unconditionally because encoding has the invariant that for the names, that buildout allows, encoding is identity if there is no _ symbol in the name. But in buildout profiles we usually prefer to use - instead of _ so that does not seem to create a problem in practice.

    Thanks once again for feedback. I think eventually this might be moved to a more generic place to be used everywhere instead of using things like

    {%   set slave_reference = orig_slave_reference.replace(' ', '_') -%}
    {%-     set domain_prefix = slave_instance.get('slave_reference').replace("-", "").replace("_", "").lower() %}

    https://lab.nexedi.com/nexedi/slapos/blob/1e10b3e7/software/backupserver/instance-pullrdiffbackup.cfg.in#L53
    https://lab.nexedi.com/nexedi/slapos/blob/1e10b3e7/software/rapid-cdn/instance-slave-list.cfg.in#L224

    and other similar places, because it will break for any other character, and also if there are two slaves, e.g. "abc def" and "abc_def" the replace will map them to the same identifier thus mixing in those two entities into one.

    There is no such problem with encoding that does not loose information because it will always map different references to different encoded strings.

  • I also realized that without this protection it is possible to inject buildout code via specially crafted shared-instance reference.

  • I also realized that without this protection it is possible to inject buildout code via specially crafted shared-instance reference.

    that's true, this might have been one reason for the introduction of dumps. In practice this seems extremely hard to correctly prevent injection.

    Anyway, this seems to make sense to make this more usable, one question is do we need two encodings (dumps and this). I guess we do, because dumps also preserves the type, so it's different.

  • @jerome, thanks for feedback and for mentioning dumps. I've added "see also" to dumps to encode description, and I also had to use dumps myself to serialize values as you desribe. Not only lists/dicts/etc but also strings - for the same reason to avoid buildout treating $ and other symbols as control ones.

    I agree it is better to avoid duplicating things and use only one encoding. I thought about it a bit and it looks like B and dumps complement each other. B can be thought as 'utf-8' or base64, while dumps is similar to pickle. After pickle does its jobs it is possible to encode the result via base64. Similarly in theory we could organize dumps to do its safe_repr thing and to later encode the result via B (instead of replacing $ and ; manually).

    However in theory it could be also possible to enhance buildout.encode, so that if given non-string it finds a way to also correctly encode it and further decode. This way it will be only B to remain, but the encoding of python list and dict will be less readable in .installed.cfg compared to what it is today.

    In practice this seems extremely hard to correctly prevent injection.

    After hitting breakage of my SR for real here and there I've changed its testsuite to inject buildout control characters into every reference of shared instance:

    https://lab.nexedi.com/kirr/slapos/blob/1a532ff5/software/ors-amarisoft/test/test.py#L123-127

    This automatically forced me to fix 99% of issues because otherwise running tests just fail. Of course this implicitly depends on the knowledge which places in the SR needs attention, and it will break if it is possible to pass in strings via different parameters.

    Currently the SR will also break if one abuses input parameters, and e.g. instead of "integer", as requested by input schema, passes in a string. This way it is still possible to inject code and there is no protection on the SR side from this attack. For such scenario, I believe, the proper protection is to require passed in parameters to be valid wrt SR schema and reject them otherwise.

    For completeness it is also possible to do something in spirit of MarkupSafe but do buildout-kind of escaping instead of html/xml.

  • P.S. is nexedi/slapos.buildout@d42b790b enough if a set option will contain \n, e.g.

    options['option'] = '${buildout_syntax_should_be_escaped}\n[ccc]\n'

    ?

  • Thanks for explaining, that's right it's too different things and the analogy with B is like base64 and dumps is like pickle is correct.

    If we generally worry about injection, we should also consider that injection can happen later, for example if we generate a shell scripts ( in the case of shell script, the wrapper recipe takes care of escaping but we don't always use it because sometimes we need more ) or config files. Modern software tend to use some format like json, yaml or toml for configuration, in that case if instead of generating config file with string substitution we generate the config with a proper json (we can use json for yaml) or toml serializer it should be OK.

    In any case, it seems good to have some utility to safely generate buildout sections or option names as well, but this battle of "safe buildout" seems really hard. In some cases, "brutally" refusing options with dollar, new lines and spaces would also do the job, but if we want to support, the approach of injecting options with $ \n or spaces during test and fix all problem is the way to go.

    nexedi/slapos.buildout@d42b790b seems not enough for \n if I remember correctly, it was made for a problem that $ was not escaped properly in installed.cfg

  • Jérome, thanks for feedback and you are welcome.

    Indeed rejecting special characters in references could be useful, but for me this story started from test failures because slapos.testing uses 'testing partition 0' (note the spaces) as partition reference by default. And I remember people using references with spaces for real deployments. I also tried to request instances on real slapos master with all space, \n, $ etc, and it really worked without master rejecting this and so with further breaking things for real on my instance:

    https://lab.nexedi.com/kirr/slapos/blob/a6183a42/software/ors-amarisoft/k/krequest-cb5.enb+#L84-95

    I originally hit this problem with just generated buildout becoming incorrect during the tests due to space, and it only later occurred to me that there is possibility of injection.

    I agree it all could be safer if we do not generate code at all and put config parameters into some JSON/YAML/TOML file, load that data in the receipe code - preferrably python code - and further handle it there. It could be also more robust, as you say. to generate configs for the software via this way. In my particular case the configuration for Amarisoft software is that YAML file: https://lab.nexedi.com/kirr/slapos/blob/a6183a42/software/ors-amarisoft/config/enb.jinja2.cfg which is currently generated via Jinja2 and string substitutions because I hesitated to break the current way of doing things via Jinja2 instead of doing it with proper programming language. But maybe in the longer term switching back to py recipes, working on data structures there and then generating the config file via yaml.dumps might be a good idea.

    I see about nexedi/slapos.buildout@d42b790b. So if a recipe generates an option with \n in its value then .installed.cfg will probably become invalid buildout, or valid buildout but with incorrect data.

    All this is indeed a huge attack surface. I always wonder is our CDN safe from this attack, and now I understand why for premium CDN we do not accept slapos requests from outside and require customers to create tickets for our operators to create or adjust frontends for them instead of using slapos directly. Still it is currently possible to break whole regular CDN with requesting just one tricky frontend instance, which, I think will be a huge DOS at the minimum.

  • Hello @kirr

    Ah that's good that the default partition reveal problems, because it's real that the instance name often have spaces. Now I remember that we sometimes intentionnaly put "harder" values: https://lab.nexedi.com/nexedi/slapos/blob/cd75648d9ab0d3986e9b136510c70d439933e8e3/software/theia/test/test.py#L281

    I see about nexedi/slapos.buildout@d42b790b. So if a recipe generates an option with \n in its value then .installed.cfg will probably become invalid buildout, or valid buildout but with incorrect data.

    I remember the "%(buildout_space)s" that we sometimes see in .installed.cfg , maybe an option with \n is supported actually :

    https://lab.nexedi.com/nexedi/slapos.buildout/blob/d42b790b4ba40a0807c432f39d6271d0c4b95e75/src/zc/buildout/buildout.py#L1695-1732

    ( and this code is probably relevant )

    That's good point that this is dangerous for CDN. I did not see it this way. Actually I looked at this mostly thinking at ERP5, where we don't care about injection because from an ERP5 instance, admin can execute arbitrary code anyway.

  • mentioned in commit nexedi/slapos@9c297883

    Toggle commit list
  • mentioned in commit 64092de9

    Toggle commit list
  • mentioned in commit 9f33419d

    Toggle commit list
  • mentioned in commit c83fa3cb

    Toggle commit list
  • Hello @jerome. Thanks for all the feedback back then and it is funny to see characters like 💥. I do not have much to add to the discussion - I just wanted to say that the xbuildout patch is now part of nexedi/slapos!1533 (merged) - it is patch named "software/ors-amarisoft: enb/generic: Protect from buildout code injection" there.

Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment