I have a question about
https://crickapi.docs.apiary.io/#reference/crick-api-for-watson/frames/push-(not-yet-synchronized)-user's-frames
POST is proposed there for synchronization, but POST is dedicated to creating resources.
To both update and create resources, PUT should be used.
Full synchronization also requires answers to the following questions:
Let A be a client (for example Watson CLI)
Let B be a server (for example the app.crick.io API)
Question 1. Which participant in the synchronization holds the source of truth?
a) both, at the same level
b) client
c) server
if a - both sides have equal weight
Question 2. Which behavior should be considered correct when two resources have the same id but different values? The data model of a time frame does not contain a last modification time. Even if it did, a correct synchronization would also need the background: the previous common version. This is an open question, related to the lack of documentation about synchronization.
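To show why the previous common version matters for Question 2, here is a minimal sketch of a field-by-field three-way merge. It is only an illustration: the function name `merge_frame`, the field names and the "keep local on conflict" policy are my assumptions, not anything Watson or crick implements.

```python
def merge_frame(base, local, remote):
    """Three-way merge of one frame.

    base   -- the last version both sides agreed on (previous common version)
    local  -- the frame as it is now on the client
    remote -- the frame as it is now on the server
    Returns (merged_frame, list_of_conflicting_fields).
    """
    merged, conflicts = {}, []
    for key in base.keys() | local.keys() | remote.keys():
        b, l, r = base.get(key), local.get(key), remote.get(key)
        if l == r:            # both sides agree (or neither changed)
            merged[key] = l
        elif l == b:          # only the server changed this field
            merged[key] = r
        elif r == b:          # only the client changed this field
            merged[key] = l
        else:                 # both changed it differently: a real conflict
            conflicts.append(key)
            merged[key] = l   # hypothetical policy: keep the local value
    return merged, sorted(conflicts)
```

Without `base`, the first branch is the only one that can be decided. Every other difference looks like a conflict, because there is no way to tell which side made the change.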
Question 3. Should data deletion be allowed? If A has a resource but B does not, does synchronization mean that the resource should be added to B, or deleted from A? If we choose the adding strategy, how do we remove a resource? If we choose the deleting strategy, how do we add one?
These are not all the questions, but I do not have infinite time, so let's go to the next possibility.
if b - the client is master (Watson CLI)
Question 4. Then it should start synchronization by getting the data from the server, compare it with the data in Watson, and then POST only the frames that do not exist on the server yet (not synchronized yet).
This is the source of my question, because we have endpoints both to GET all frames
and to POST the missing frames,
but this is an incomplete approach. What about updating frames that changed (PUT / PATCH) and removing frames that were deleted (DELETE)? And how does putting the synchronization logic into Watson CLI relate to the recommendation of @SpotlightKid from
jazzband/Watson#40
who wrote in 2015:
To not bloat the Watson distribution with too many sync backends (and their dependencies), I propose to use a plugin framework to load backend implementations and to specify the API that they have to support.
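The complete client-as-master flow from Question 4 could be sketched like this. It assumes that frames are available as dictionaries keyed by id on both sides. The function name `plan_sync` and the mapping of results to HTTP methods are my assumptions; the crick API currently documents only GET and POST for frames.

```python
def plan_sync(local, remote):
    """Decide what a client-is-master sync would have to send.

    local, remote -- dicts mapping frame id -> frame (any comparable value)
    Returns the ids grouped by the HTTP method that would be needed.
    """
    return {
        # frames that exist only on the client must be created on the server
        "POST": sorted(i for i in local if i not in remote),
        # frames that exist on both sides but differ must be overwritten
        "PUT": sorted(i for i in local if i in remote and local[i] != remote[i]),
        # frames that exist only on the server were deleted on the client
        "DELETE": sorted(i for i in remote if i not in local),
    }
```

With only GET and POST available, just the first group can be synchronized. The other two groups are the frames that silently stay out of sync.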
Question 5. What is the scenario when two clients are connected to one backend, one with data and the second without? Should synchronizing the first one create the data on the server, and synchronizing the second one remove it? Taking into account that only GET and POST are implemented, I suspect that the first synchronization creates the data on the server and the second copies it to the second client. But when I remove a frame from the first client and synchronize again, the frame will rather reappear on that client than be removed from the server. Should this be considered a bug?
Actually, see "Watson deleted frames do not sync with crick"
#111
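The scenario from Question 5 can be replayed with a small simulation. This is a sketch under the assumption that a sync round does nothing more than POST the frames the server lacks and GET the frames the client lacks; the function names and the frame `f1` are made up for the example.

```python
def sync(client, server):
    """One sync round limited to GET and POST (no PUT, no DELETE)."""
    for frame_id, frame in client.items():
        server.setdefault(frame_id, frame)   # POST only frames the server lacks
    for frame_id, frame in server.items():
        client.setdefault(frame_id, frame)   # GET and keep frames the client lacks


def two_client_scenario():
    server, first, second = {}, {"f1": {"project": "watson"}}, {}
    sync(first, server)    # the first client creates f1 on the server
    sync(second, server)   # the second client receives f1
    del first["f1"]        # the user deletes the frame on the first client
    sync(first, server)    # and synchronizes again
    return first, second, server
```

After the last call, `f1` is back on the first client and still present on the server and on the second client. The deletion is lost because nothing in this flow can tell "deleted here" apart from "never seen here".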
if c - the server is master and the CLI is slave
This is rather improbable, because synchronization would then mean that data can be created only on the server. But when I saw issue
jazzband/Watson#171
I decided to add the next question.
Question 6. Who has the deciding voice on this topic? @jmaupetit wrote:
We must re-consider our synchronization strategy which —at the time of writing— overrides local changes between two sync events.
It is related to my question about integration with external data sources that use their own identifiers:
jazzband/Watson#190
It is also related to the unfinished discussion about the synchronization logic here:
Syncing with server overrides local changes #171
And to the missing documentation here:
jazzband/Watson#165
I can send my proposals. What should I do?
- Research synchronization protocols [today]
- Propose a protocol [today]
- Wait for answers to the questions [1 month]
- Wrap everything together and publish a draft of the synchronization specification [1 week]
- Wait for fixes and opinions from the community [1 month]
- Learn Go + React; I know C, C++, Python and Vue, so it will be easy [1 month]
- Implement this specification [1 month]
- Wait for the pull request to be accepted [1 month]
If everything goes well, we will have working synchronization by mid-2019, and many issues connected with it will be closed.
So let's start.
- Research on synchronization:
https://en.wikipedia.org/wiki/Data_synchronization
We have
- file synchronization
- version control
- distributed filesystems
- mirroring
I propose version control.
The set reconciliation problem can be solved by
- Wholesale transfer
- Timestamp synchronization
- Mathematical synchronization
I propose mathematical synchronization.
In the "Error handling" paragraph there is a sentence:
The simplest approach is to have a single master instance that is the sole source of truth.
But I propose another approach: accept any modification and store a list of modifications. When two modifications overlap, merge them with the "mathematical synchronization" that I will describe later.
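As a rough sketch of what "store a list of modifications" could mean: each side keeps an append-only log of field changes, and the current state is obtained by replaying it. The `Change` record, its fields and both function names are hypothetical. The merge step itself is not shown here; the sketch only finds where two logs overlap.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    frame_id: str   # which frame was modified
    field: str      # which field of the frame
    value: object   # the new value of that field


def replay(log):
    """Rebuild the frames by applying the changes in order."""
    state = {}
    for change in log:
        state.setdefault(change.frame_id, {})[change.field] = change.value
    return state


def overlapping(log_a, log_b):
    """(frame_id, field) pairs modified in both logs since the common point."""
    touched_a = {(c.frame_id, c.field) for c in log_a}
    touched_b = {(c.frame_id, c.field) for c in log_b}
    return touched_a & touched_b
```

Changes that do not overlap can simply be appended to the shared log. Only the pairs returned by `overlapping` need a merge decision.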
Proposed tools
http://thesecretlivesofdata.com/raft/
There is also a PDF
raft.pdf
and finally a list of implementations
https://raft.github.io/#implementations
So pros:
- it has many implementations and is widely known
- it works in a distributed network of nodes
Questions:
Should we consider Watson CLI a Raft node or a client?
Answer:
It could be a node only if it had a public address, so that requests can be sent to it, but this is hard to achieve.
So Watson CLI should be a client in this model.
Cons:
- it seems to be overengineered.
- it needs a cluster of servers to work efficiently
- we are rather looking for a simple solution like "storage everywhere", "server -> serverless"
I researched some solutions and finally ended up on Stack Overflow asking this question
https://stackoverflow.com/questions/54385016/simple-synchronization-protocol-for-array-of-objects
This is a first draft of my proposal for solving the synchronization problem. In this model, a serverless lambda plus a text file stored anywhere can be replaced by the crick backend and Postgres. Still, the vision of serverless functions (which are free today for a small number of requests) and static file storage (which is also free for personal users) is more attractive to me than a backend that must be hosted.