Synchronisation.txt 6.89 KB
Newer Older
Jean-Paul Smets's avatar
Jean-Paul Smets committed
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203
Problem

 General

   It's really difficult to synchronise different zope databases systems.
  (there is a product by Zope Corp which does that but is
  very expensive and not so public)

  There are two different problems

  - synchronising the same ODB (object database) on multiple servers / sites

  - synchronising some data stored on different ODB on multiple servers / sites

  The first case is the case of one company with a global ODB

  The second case is the case of 2 companies who want to share their
mutual orders but not other orders

  If case 2 is solved, then case 1 too. We shall solve case 2
  by using an intermediate XML format specific to each application

 Specific problems

   Most of synchronisations systems needs a database wich acts like a master, and
  all slaves are like a copy of the main database (case 1)
   Thoses systems don't allow to copy only a part of the data, they copy all
  objects from the master database to all slaves databases.


Proposed Solution

 <img src="simple_schema.jpg">

 This project proposes to create a Zope Product that have to synchronise
 databases. This synchronisation have to update only the part of the database
 needed.

  Specifically we want to achieve the following goals :

  o manage a list of subscribers who can take a subscription to a query.

  o manage queries applied to a database of objects.

  o generate XML in order to communicate with several databases.

  o use filters, so that generated XML only contains part of the attributes of
  objects in the ODB (1 XML file per Zope Objet / Document (ex. 1 XML file per
  order)). For example, private comments should not be exported in the generated
  XML file

  o synchronise databases with the content of the XML.

  There's already a generic synchronisation system for Zope : ZSyncer. But this system
  is too limited for all our needs.

Details


 Objects

  All objects will have the following attributes

  - id : the standard object id

  - uid : a global object id wich is unique to each ZODB

  - rid : the standard object id in the master ODB the object
        was subscribed from (remote id).

  - sid : the id of the subsription/synchronisation object wich this
        object was generated from.

  A mapping from id to rid and rid to id is available through the
  subscription sid



 First step : initialisation :

        <img src="steps.jpg">

        1 We create a subscription inside portal_synchronisations in the slave.
          We define for that subscription

            - id, title, description

            - the URL of the master (which contains an index of all
              files we can sync) - this URL defines a public
              synchronisation handle in the master server

            - a local query (SQL Method) to define the local realm of
              synchronisation

            - an import Conduit in order to validate data,
              calculate differences and import differences. This is where
              we define the format of the data we receive from the master

            - a local PageTemplate/DTML to converte local objects to XML.
              This is where we define both the format we send to the master
              and the realm of attributes we synchronise (ie. the mapping)


        2 The slave sends the "initialisation package" (SyncML specification), in this package, 
        appears several elements :

            - the SynchHdr element with several informations about the protocol used. We will
            put the query in this element, and also authentification informations.

            - the Alert element, in order to specify what kind of synchronisation we want,
            for us it will be the Alert 200 wich specifies a client-initiated, two way
            synchronisation.

        3 We create a publication in the master database. We define

            - id, title, description

            - a local query (SQL Method) to define the local realm of
              publication (which objects do we publish)

            - a local PageTemplate/DTML to convert local objects to XML
              (what do we publish in each object and with what format)

            - an import Conduit in order to be able to commit changes
              sent by subscribed databases into the master database (OPTION)

            	A this step, we produce all XML files from master ODB

        4 The master sends the "Initialisation package" to the client with :

            - the SynchHdr element with several informations about the protocol used, 
            and also authentification informations.

            - the Status element, in order to respond to the alert command sent by the client.
            A code for the status is returned, it will be typically 212 if the authentification
            is fine for the client.

            - differences between the two databases.

            - the Alert element, in order to specify what kind of synchronisation we want,
            for us it will be 201 wich specifies a client-initiated, two-way slow-synchronisation.
            This just means that both the server and the client sends all their data. Each
            data must specify the DTD used.
        
		
        //4 The master updates is database with the objects from the client. Then
        //the master generate an xml file of differences between the two databases (tricky if
        //at the beginning the client don't have any data), differences only all objects 
        //corresponding to the query.
        At this step, we may eventually record the subscriber in the publication
        of the master database for... (OPTION). This is like
        the subscribtion is becoming member of mail list to be informed of
        changes of a publication

            //- if the slave already have some data (for example an addressbook) corresponding
            //to the query (for example all people from the company Nexedi), then thoses data
            //must be sent by the same time. Each data must specify the DTD used.


        //6 Finally the slave import in his databases all objects from the master.



 Second step : synchronisation :

    General idea

        M master

        S slave

        C conflicts

        t continuous time

        tn discrete time

        1 calculate the difference on the slave DS(tn) = S(tn) - S(tn-1)

        1 upload the differences to the master or send by email or whatever

        1 calculate the difference on the master DM(tn) = M(tn) - M(tn-1)

        1 update the master with differences from the slave M(tn) <- M(tn) + DS(tn)

        1 Manage conflicts and if possible solve them on the master C(tn)

        1 upload the differences and conflicts resolution to the slave

        1 update the slave S(tn) <- S(tn) + DM(tn) + C(tn)

    Detail implementation
        
        blabla


References

 XMLDiff - http://www.garshol.priv.no/download/xmltools/prod/xmldiff.html

 SyncML - http://www.syncml.org
	
 ZSyncer - http://www.zope.org/Members/andym/ZSyncer