web_renderjs_ui: use lxml to extract data-i18n messages

The previous regular-expression-based approach sometimes failed to extract
messages properly. Using an XML parser simplifies the code and fixes several
messages that were not extracted properly, such as messages containing ", []
or {}.
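
As a rough illustration of the parser-based approach (not the actual script
from this change), the sketch below collects data-i18n attribute values with
lxml. The function name extract_i18n_messages and the sample markup are
hypothetical and only serve the example.

```python
import lxml.html


def extract_i18n_messages(html_source):
    """Return the set of data-i18n attribute values found in html_source.

    Hypothetical helper for illustration; the real extraction script may
    differ.
    """
    root = lxml.html.fromstring(html_source)
    # The parser decodes attribute values, so quotes, brackets and braces
    # need no special handling here.
    return {
        element.get('data-i18n')
        for element in root.xpath('//*[@data-i18n]')
    }


# Made-up sample markup with the kinds of characters mentioned above.
SAMPLE = '''<html><body>
<span data-i18n="Save">Save</span>
<span data-i18n='Say "hello"'>Say "hello"</span>
<input data-i18n="[placeholder]Search" />
<p data-i18n="{count} items">{count} items</p>
</body></html>'''

messages = extract_i18n_messages(SAMPLE)
```

Because the parser handles quoting and nesting itself, the extraction does
not depend on which characters appear inside the attribute value.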

This also fixes some problems when looking for message sources:
  - archived web pages were sometimes used instead of published ones
  - messages from gadgets implemented as page templates/OFS files were not
    extracted.

A few more unit tests for the scripts involved in this process are added.