Overview
========

Morango is a pure-Python database replication engine for Django that supports peer-to-peer syncing of data. It is structured as a Django app that can be included in projects to make specific application models syncable.

Developed in support of the Kolibri product ecosystem, Morango includes several important features:

- A certificate-based authentication system to protect the privacy and integrity of data
- A change-tracking system to support calculation of differences between databases across low-bandwidth connections
- A set of constructs to support data partitioning

Motivating user story
---------------------

Imagine a scenario where we have four instances of Kolibri:

- *Home* is a tablet used at home by a learner with no internet access
- *Facility* is a laptop at a nearby school, also with no internet access
- *City* is a laptop in a nearby city
- *Cloud* is a server online in the cloud

On *Facility*, a coach assigns resources to a learner's user account. The learner brings *Home* to the school and syncs with *Facility*, getting only their assignments and no private data about other learners.

The learner uses *Home* for a week, engaging with the assigned resources. They (and other learners) bring their tablets back to school and sync again with *Facility*. The coach can now see the recent user engagement data for their class.

An admin user wants to get the recent user engagement data from the *Facility* device onto their *City* device. To do so, the admin brings *City* to the remote area where the school is located. Once *City* arrives, *Facility* and *City* can sync over the school's local network.

Finally, the admin brings *City* back to the city and syncs with *Cloud* over the internet. At this point, *Facility*, *City*, and *Cloud* all have the same data.

Now, imagine a second admin in another city syncs their own laptop (*City 2*) with *Cloud*. They too would then have the recent data from *Facility*.

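The user story above can be sketched as a toy simulation. This is only an illustration of partitioned, peer-to-peer syncing and eventual convergence; it is not Morango's API. The ``Device`` class, the ``sync`` function, the record layout, and the simple "highest version wins" merge rule are all hypothetical simplifications made for this example.

```python
# Toy model of the user story; NOT Morango's API.
# Device, sync(), and the (owner, version, payload) layout are hypothetical.
from dataclasses import dataclass, field


@dataclass
class Device:
    name: str
    # Maps record id -> (owner, version, payload)
    records: dict = field(default_factory=dict)

    def write(self, record_id, owner, payload):
        """Create or update a record locally, bumping its version."""
        _, version, _ = self.records.get(record_id, (owner, 0, None))
        self.records[record_id] = (owner, version + 1, payload)


def sync(a, b, owner=None):
    """Two-way sync. If owner is given, only that user's records move."""
    for src, dst in ((a, b), (b, a)):
        for record_id, (rec_owner, version, payload) in list(src.records.items()):
            if owner is not None and rec_owner != owner:
                continue  # outside the requested partition
            current = dst.records.get(record_id)
            # Assumed merge rule: the higher version number wins.
            if current is None or current[1] < version:
                dst.records[record_id] = (rec_owner, version, payload)


home, facility = Device("Home"), Device("Facility")
city, cloud, city2 = Device("City"), Device("Cloud"), Device("City 2")

# The coach assigns resources on Facility.
facility.write("assign-1", "learner_a", "read chapter 1")
facility.write("assign-2", "learner_b", "read chapter 2")

# The learner syncs Home with Facility and receives only their own data.
sync(home, facility, owner="learner_a")

# A week of offline use, then a second sync at school.
home.write("log-1", "learner_a", "finished chapter 1")
sync(home, facility, owner="learner_a")

# The admin carries the data onward: Facility -> City -> Cloud -> City 2.
sync(facility, city)
sync(city, cloud)
sync(cloud, city2)
```

After the final sync, *Facility*, *City*, *Cloud*, and *City 2* hold identical records in this model, while *Home* holds only the records belonging to its learner.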
Objectives
----------

- **User experience:** Streamline the end-user syncing process as much as possible
- **Privacy:** Only sync data to devices and users authorized to access that data
- **Flexibility:** Afford the ability to sync only a subset of the data
- **Efficiency:** Minimize storage, bandwidth, and processing power
- **Integrity:** Protect data from accidental and malicious data corruption
- **Peer-to-peer:** Devices should be able to communicate without a central server
- **Eventual consistency:** Eventually all devices will converge to the same data

Usage in Kolibri
----------------

Morango is not the only way that Kolibri instances communicate with each other and with other services. Kolibri also communicates by:

- Discovering other Kolibri instances with Zeroconf
- Calling REST APIs for getting meta-information about discovered Kolibri instances
- Calling REST APIs for sending anonymous usage statistics to an LE telemetry server
- Calling REST APIs for browsing content available for import from Studio and other Kolibri instances
- Downloading static channel database and media files from Studio and other Kolibri instances

Morango's certificate, change-tracking, and partitioning features are especially useful in situations that call for diff-based updates and guarantees about the consistency and coherence of distributed data.