# Swarm

A *swarm* (group chat) is a set of participants capable of resilient, decentralized communication. For example, if two participants lose connectivity with the rest of the group (e.g., during an Internet outage) but can still reach each other over a LAN or subnetwork, they can exchange messages locally and then synchronize with the rest of the group once connectivity is restored.

A *swarm* is defined by the following properties:

1. Ability to split and merge based on network connectivity.
2. History synchronization. Every participant must be able to send a message to the entire group.
3. No central authority. It cannot rely on any server.
4. Non-repudiation. Devices must be able to verify past messages' validity and to replay the entire history.
5. Perfect Forward Secrecy (PFS) is provided on the transport channels. Storage is handled by each device.

The main idea is to get a synchronized Merkle tree with the participants.

We identified four modes for swarms that we want to implement:

* **ONE_TO_ONE**: A private conversation between two endpoints, either between two users or with yourself.
* **ADMIN_INVITES_ONLY**: A swarm in which only the administrator can invite members (for example, a teacher-managed classroom).
* **INVITES_ONLY**: A closed swarm that admits members strictly by invitation; no one may join without explicit approval.
* **PUBLIC**: A public swarm that anyone can join without prior invitation (for example, a forum).

## Scenarios

### Create a Swarm

*Bob wants to create a new swarm*

1. Bob creates a local Git repository.
2. Then, he creates an initial signed commit with the following:
    * His public key in `/admins`
    * His device certificate in `/devices`
    * His CRL in `/crls`
3. The hash of the first commit becomes the **ID** of the conversation.
4. *Bob* announces to his other devices that he created a new conversation. This is done via an invite to join the group sent through the DHT to other devices linked to that account.

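
Step 2 of the creation scenario above can be illustrated with a small check on the file list of a candidate initial commit. This is only a sketch under stated assumptions: the function name `looksLikeInitialCommit`, the helper `startsWith`, and the relative path layout (`admins/`, `devices/`, `crls/` without a leading slash) are hypothetical and are not taken from the Jami daemon.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: true if `path` begins with `prefix`.
static bool startsWith(const std::string& path, const std::string& prefix)
{
    return path.compare(0, prefix.size(), prefix) == 0;
}

// Sketch: accept a file list only if it contains at least one entry under
// admins/, devices/ and crls/, and nothing outside those three directories.
// The directory names mirror the scenario above; file names are assumptions.
bool looksLikeInitialCommit(const std::vector<std::string>& paths)
{
    bool hasAdmin = false;
    bool hasDevice = false;
    bool hasCrl = false;
    for (const auto& p : paths) {
        if (startsWith(p, "admins/"))
            hasAdmin = true;
        else if (startsWith(p, "devices/"))
            hasDevice = true;
        else if (startsWith(p, "crls/"))
            hasCrl = true;
        else
            return false; // any other file is unexpected in the first commit
    }
    return hasAdmin && hasDevice && hasCrl;
}
```

A caller would pass the paths touched by the first commit; signature verification and the computation of the conversation ID (the commit hash) are left to Git and are outside this sketch.
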
### Adding someone

*Bob adds Alice*

1. *Bob* adds Alice to the repo:
    * Adds the invited URI in `/invited`
    * Adds the CRL into `/crls`
2. *Bob* sends a request on the DHT.

### Receiving an invite

*Alice gets the invite to join the previously created swarm*

1. *Alice* accepts the invite (if she declines, nothing happens; she will remain in the "invited" list, and will never receive any messages).
2. A peer-to-peer connection is established between *Alice* and *Bob*.
3. *Alice* pulls the Git repository from *Bob*. **WARNING: this means that messages require a connection, not from the DHT as it is today.**
4. *Alice* validates the commits from *Bob*.
5. To validate that *Alice* is a member, she removes the invite from the `/invited` directory, then adds her certificate to the `/members` directory.
6. Once all commits are validated and synchronized to her device, *Alice* discovers other members of the group. With these peers, she will then construct the **DRT** with *Bob* as a bootstrap.

### Sending a message

*Alice sends a message to Bob*

1. *Alice* creates a commit message. She constructs a JSON payload containing the MIME type and message body. For example:

    ```json
    { "type": "text/plain", "body": "hello" }
    ```

2. *Alice* ensures her device credentials are present. If *Alice*’s device certificate or its associated CRL isn’t already stored in the repository, she adds them so that other participants can verify the commit.
3. *Alice* commits to the repository. (Because Jami relies primarily on commit-message metadata rather than file contents, merge conflicts are rare; the only potential conflicts would involve CRLs or certificates, which are versioned in a dedicated location.)
4. *Alice* announces the commit via the **DRT** with a service message and pings the DHT for mobile devices (they must receive a push notification).

```{note}
To notify other devices, the sender transmits a SIP message with `type: application/im-gitmessage-id`.
The JSON payload includes the deviceId (the sender’s), the conversationId, and the reference (hash) of the new commit.
```

### Receiving a message

*Bob receives a message from Alice*

1. *Bob* performs a Git pull on *Alice*'s repository.
2. All incoming commits MUST be verified by a hook.
3. If all commits are valid, commits are stored and displayed. *Bob* then announces the message via the DRT for other devices.
4. If any commit is invalid, the pull is aborted. *Alice* must restore her repository to a correct state before retrying.

### Validating a commit

To prevent users from pushing unwanted commits (with conflicts, false messages, etc.), each commit (from the oldest to the newest one) MUST be validated as follows before merging a remote branch:

```{note}
1. If the validation fails, the fetch is ignored and we do not merge the branch (and remove the data), and the user should be notified.
2. If a fetch is too big, it's not merged.
```

+ For each incoming commit, ensure that the sending device is currently authorized and that the issuer’s certificate exists under `/members` or `/admins`, and the device’s certificate under `/devices`.
+ Then handle one of three cases, based on the commit’s parent count:
    + Merge Commit (2 parents). No further validation is required; merges are always accepted.
    + Initial Commit (0 parents). Validate that this is the very first repository snapshot:
        + Admin certificate is added.
        + Device certificate is added.
        + CRLs (Certificate Revocation Lists) are added.
        + No other files are present.
    + Ordinary Commit (1 parent). The commit message must be JSON with a top‑level `type` field. Handle each `type` as follows:
        + If `text` (or any non–file‑modifying MIME type)
            + Signature is valid against the author’s certificate in the repo.
            + No unexpected files are added or removed.
        + If `vote`
            + `voteType` is one of the supported values (e.g. "ban", "unban").
            + The vote matches the signing user.
            + The signer is an admin, their device is present, and they are not themselves banned.
            + No unexpected files are added or removed.
        + If `member`
            + If `adds`
                + Properly signed by the inviter.
                + New member’s URI appears under `/invited`.
                + No unexpected files are added or removed.
                + If ONE_TO_ONE, ensure exactly one admin and one member.
                + If ADMIN_INVITES_ONLY, the inviter must be an admin.
            + If `joins`
                + Properly signed by the joining device.
                + Device certificate added under `/devices`.
                + Corresponding invite removed from `/invited` and certificate added to `/members`.
                + No unexpected files are added or removed.
            + If `banned`
                + Vote is valid per the `vote` rules above.
                + Ban is issued by an admin.
                + Target’s certificate moved to `/banned`.
                + Only files related to the ban vote are removed.
                + No unexpected files are added or removed.
        + Fallback. If the commit’s type or structure is **unrecognized**, reject it and notify the peer (or user) that they may be running an outdated version or attempting unauthorized changes.

### Ban a device

```{important}
Jami source code tends to use the terms **(un)ban**, while the user interface uses the terms **(un)block**.
```

*Alice, Bob, Carla, Denys are in a swarm. Alice issues a ban against Denys.*

In a fully peer‑to‑peer system with no central authority, this simple action exposes three core challenges:

1. Untrusted timestamps: Commit timestamps cannot be relied upon for ordering ban events, as any device can forge or replay commits with arbitrary dates.
2. Conflicting bans: In cases where multiple admin devices exist, network partitions can result in conflicting ban decisions. For instance, if Alice can communicate with Bob but not with Denys and Carla, while Carla can communicate with Denys, conflicting bans may occur. If Denys bans Alice while Alice bans Denys, the group’s state becomes unclear when all members eventually reconnect and merge their conversation histories.
3. Compromised or expired devices: Devices can be compromised, stolen, or have their certificates expire. The system must allow banning such devices and ensure they cannot manipulate their certificate or commit timestamps to send unauthorized messages or falsify their expiration status.

There are not many similar systems (with distributed groups), but here are some examples:

+ [mpOTR doesn't define how to ban someone](https://www.cypherpunks.ca/~iang/pubs/mpotr.pdf)
+ Signal, without any central server for group chat (EDIT: they recently changed that point), doesn't give the ability to ban someone from a group.

This voting system needs a human action to ban someone or must be based on the CRL info from the repository (because we cannot trust external CRLs).

### Remove a device from a conversation

This is the only part that MUST have a consensus to avoid splitting the conversation. For example, if two members kick each other from the conversation, what will the third one see?

This is needed to detect revoked devices, or simply to avoid having unwanted people present in a public room. The process is pretty similar between a member and a device:

*Alice removes Bob*

```{important}
Alice **MUST** be an admin to vote.
```

+ First, she votes for banning Bob. To do that, she creates the file in `/votes/ban/members/uri_bob/uri_alice` (`members` can be replaced by `devices` for a device, `invited` for invites, or `admins` for admins) and commits.
+ Then she checks if the vote is resolved. This means that >50% of the admins agree to ban Bob (if she is the only admin, this is necessarily more than 50%).
+ If the vote is resolved, files in `/votes/ban` can be removed, all files for Bob in `/members`, `/admins`, `/invited`, `/CRLs`, `/devices` can be removed (or only in `/devices` if it's a device that is banned), and Bob's certificate can be placed into `/banned/members/bob_uri.crt` (or `/banned/devices/uri.crt` if a device is banned) and committed to the repo.
+ Then, Alice informs other users (except Bob).

*Alice (admin) re-adds Bob (banned member)*

+ First, she votes for unbanning Bob. To do that, she creates the file in `/votes/unban/members/uri_bob/uri_alice` (`members` can be replaced by `devices` for a device, `invited` for invites, or `admins` for admins) and commits.
+ Then she checks if the vote is resolved. This means that >50% of the admins agree to unban Bob (if she is the only admin, this is necessarily more than 50%).
+ If the vote is resolved, files in `/votes/unban` can be removed, and all files for Bob in `/members`, `/admins`, `/invited`, `/CRLs` can be re-added (or only in `/devices` if it's a device that is unbanned) and committed to the repo.

### Remove a conversation

1. Save `removed=time::now()` in convInfos (like removeContact saves in contacts) to record that the conversation is removed, and sync with the user's other devices.
2. From now on, if a new commit is received for this conversation, it's ignored.
3. From now on, if Jami starts up and the repo is still present, the conversation is not announced to clients.
4. Two cases:
    a. If there is no other member in the conversation, we can immediately remove the repository.
    b. If there are still other members, commit that we leave the conversation, and wait until at least one other device syncs this message. This prevents other members from still detecting the user as a valid member and still sending new message notifications.
5. When we are sure that someone is synched, remove erased=time::now() and sync with other user's devices
6. All devices owned by the user can now erase the repository and related files.

## How to specify a mode

Modes cannot be changed over time; otherwise, it is another conversation.
So, this data is stored in the initial commit message. The commit message will be the following:

```json
{
    "type": "initial",
    "mode": 0
}
```

For now, "mode" accepts values 0 (ONE_TO_ONE), 1 (ADMIN_INVITES_ONLY), 2 (INVITES_ONLY), 3 (PUBLIC).

### Processes for 1:1 chats

The goal here is to keep the old API (addContact/removeContact, sendTrustRequest/acceptTrustRequest/discardTrustRequest) to create a chat with a peer and its contact. This still implies some changes that we cannot ignore:

The process is still the same: an account can add a contact via addContact, then send a TrustRequest via the DHT. But two changes are necessary:

1. The TrustRequest embeds a "conversationId" to inform the peer which conversation to clone when accepting the request.
2. TrustRequests are retried when the contact comes back online. This is not the case today (as we don't want to generate a new TrustRequest if the peer discards the first). So, if an account receives a trust request, it will be automatically ignored if the request with a related conversation is declined (as convRequests are synched).

Then, when a contact accepts the request, a period of sync is necessary, because the contact now needs to clone the conversation.

removeContact() will remove the contact and related 1:1 conversations (with the same process as "Remove a conversation"). The only note here is that if we ban a contact, we don't wait for sync; we just remove all related files.

#### Tricky scenarios

There are some cases where two conversations can be created. Here are at least two of those scenarios:

1. Alice adds Bob.
2. Bob accepts.
3. Alice removes Bob.
4. Alice adds Bob.

or

1. Alice adds Bob and Bob adds Alice at the same time, but they are not connected to each other.

In this case, two conversations are generated. We don't want to remove messages from users or choose one conversation here. So, sometimes two conversations between the same members will be shown.
It will generate some bugs during the transition time (as we don't want to break the API, the inferred conversation will be one of the two shown conversations), but for now it's "ok-ish", and it will be fixed when clients fully handle conversationId for all APIs (calls, file transfer, etc.).

```{important}
After accepting a conversation's request, there is a time the daemon needs to retrieve the distant repository. During this time, clients MUST show a syncing view to give information to the user. While syncing:

* ConfigurationManager::getConversations() will return the conversation's id even while syncing.
* ConfigurationManager::conversationInfos() will return {{"syncing": "true"}} if syncing.
* ConfigurationManager::getConversationMembers() will return a map of two URIs (the current account and the peer who sent the request).
```

### Conversations requests specification

Conversation requests are represented by a **Map** with the following keys:

+ id: the conversation ID
+ from: URI of the sender
+ received: timestamp
+ title: (optional) name for the conversation
+ description: (optional)
+ avatar: (optional)

### Conversation's profile synchronization

To be identifiable, a conversation generally needs some metadata, like a title (e.g. Jami), a description (e.g. some links, what the project is, etc.), and an image (the logo of the project). This metadata is optional but shared across all members, so it needs to be synced and incorporated in the requests.

#### Storage in the repository

The profile of the conversation is stored in a classic vCard file at the root (`/profile.vcf`) like:

```
BEGIN:VCARD
VERSION:2.1
FN:TITLE
DESCRIPTION:DESC
END:VCARD
```

#### Synchronization

To update the vCard, a user with enough permissions (by default: =ADMIN) needs to edit `/profile.vcf` and commit the file with the MIME type `application/update-profile`. The new message is sent via the same mechanism and all peers will receive the **MessageReceived** signal from the daemon.
The branch is dropped if the commit contains other files, is too big, or is made by a non-authorized member (by default:

```cpp
& prefs);
// Retrieve preferences
std::map getConversationPreferences(const std::string& accountId, const std::string& conversationId);
// Emitted when preferences are updated (via setConversationPreferences or by syncing with other devices)
struct ConversationPreferencesUpdated
{
    constexpr static const char* name = "ConversationPreferencesUpdated";
    using cb_type = void(const std::string& /*accountId*/,
                         const std::string& /*conversationId*/,
                         std::map /*preferences*/);
};
```

### Merge conflicts management

Because two admins can change the description at the same time, a merge conflict can occur on `profile.vcf`. In this case, the commit with the higher hash (e.g. ffffff > 000000) will be chosen.

#### APIs

The user has two methods to get and set the conversation's metadata:

```xml
Update conversation's infos (supported keys: title, description, avatar)
Get conversation's infos (mode, title, description, avatar)
```

where `infos` is a `map` with the following keys:

+ mode: READ-ONLY
+ title
+ description
+ avatar

#### Re-import an account (link/export)

The archive MUST contain conversationId to be able to retrieve conversations on new commits after a re-import (because there is no invite at this point). If a commit comes for a conversation that is not present, there are two possibilities:

+ The conversationId is there; in this case, the daemon is able to re-clone this conversation.
+ The conversationId is missing, so the daemon asks (via a message `{{"application/invite", conversationId}}`) for a new invite that the user needs to (re)accept.

```{important}
A conversation can only be retrieved if a contact or another device is there, else it will be lost. There is no magic.
```

## Used protocols

### Git

#### Why this choice

Each conversation will be a Git repository. This choice is motivated by:

1. We need to sync and order messages.
   The Merkle Tree is the perfect structure to do that and can be linearized by merging branches. Moreover, because it's massively used by Git, it's easy to sync between devices.
2. Distributed by nature. Massively used. Lots of backends and pluggable.
3. Can verify commits via hooks and massively used crypto.
4. Can be stored in a database if necessary.
5. Conflicts are avoided by using commit messages, not files.

#### What we have to validate

+ Performance? `git.lock` can be slow
+ Hooks in libgit2
+ Multiple pulls at the same time?

#### Limits

History cannot be deleted. To delete a conversation, the device has to leave the conversation and create another one.

However, non-permanent messages (like messages readable only for some minutes) can be sent via a special message via the DRT (like Typing or Read notifications).

#### Structure

```
/
 - invited
 - admins (public keys)
 - members (public keys)
 - devices (certificates of authors to verify commits)
 - banned
   - devices
   - invited
   - admins
   - members
 - votes
   - ban
     - members
       - uri
         - uriAdmin
     - devices
       - uri
         - uriAdmin
   - unban
     - members
       - uri
         - uriAdmin
 - CRLs
```

### File transfer

This new system overhauls file sharing: the entire history is now kept in sync, so any device in the conversation can instantly access past files. Rather than forcing the sender to push files directly (an approach that was fragile in the face of connection drops and often required manual retries), devices simply download files when they need them. Moreover, once one device has downloaded a file, it can act as a host for others, ensuring files remain available even if the original sender goes offline.

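
The idea that any device holding a copy can serve a file can be sketched as a simple selection step. This is a minimal illustration, not the daemon's logic: the function name `candidateHosts` and both inputs (a list of devices assumed to hold the file and a set of devices assumed to be reachable) are hypothetical, and how a device would actually learn either piece of information is not specified here.

```cpp
#include <set>
#include <string>
#include <vector>

// Sketch: keep only the devices that are assumed to hold a copy of the file
// AND are currently reachable. The result preserves the order of `holders`,
// so a caller could list the original sender first if desired.
std::vector<std::string> candidateHosts(const std::vector<std::string>& holders,
                                        const std::set<std::string>& online)
{
    std::vector<std::string> result;
    for (const auto& device : holders) {
        if (online.count(device) != 0)
            result.push_back(device);
    }
    return result;
}
```

With such a list, a receiver could try each candidate in turn, so the download still succeeds when the original sender is offline but another holder is reachable.
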
#### Protocol

The sender adds a new commit in the conversation with the following format:

```
value["tid"] = "RANDOMID";
value["displayName"] = "DISPLAYNAME";
value["totalSize"] = "SIZE OF THE FILE";
value["sha3sum"] = "SHA3SUM OF THE FILE";
value["type"] = "application/data-transfer+json";
```

and creates a link in `${data_path}/conversation_data/${conversation_id}/${file_id}` where `file_id=${commitid}_${value["tid"]}.${extension}`.

Then, the receiver can download the files by contacting the devices hosting the file, opening a channel with `name="data-transfer://" + conversationId + "/" + currentDeviceId() + "/" + fileId`, and storing the information that the file is waiting in `${data_path}/conversation_data/${conversation_id}/waiting`.

The device receiving the connection will accept the channel after verifying that the file can be sent (if the sha3sum is correct and if the file exists). The receiver will keep the first opened channel, close the others, and write all incoming data into a file (with the same path as the sender: `${data_path}/conversation_data/${conversation_id}/${file_id}`).

When the transfer is finished or the channel is closed, the sha3sum is verified to validate that the file is correct (else it's deleted). If valid, the file will be removed from the waiting list.

In case of failure, when a device of the conversation is back online, we will ask for all waiting files in the same way.

### Call in Swarm

#### Idea

A swarm conversation can have multiple rendez-vous. A rendez-vous is defined by the following URI:

"accountUri/deviceId/conversationId/confId" where accountUri/deviceId describes the host.

The host can be determined in two ways:

+ In the swarm metadata, where it's stored like the title/desc/avatar of the room
+ Or the initial caller.

When starting a call, the host will add a new commit to the repository, with the URI to join (accountUri/deviceId/conversationId/confId).
This will be valid until the end of the call (announced by a commit with the duration to show).

So every participant will receive the information that a call has started and will be able to join it by calling it.

#### Attacks?

* Avoid Git bombs

#### Notes

The timestamp of a commit cannot be trusted because it's editable. Only the user's timestamp can be trusted.

### TLS

Git operations, control messages, files, and other things will use a p2p TLS v1.3 link with only ciphers that guarantee PFS. So each key is renegotiated for each new connection.

### DHT (UDP)

Used to send messages for mobiles (to trigger push notifications) and to initiate TCP connections.

### Network activity

#### Process to invite someone

Alice wants to invite Bob:

1. Alice adds Bob to a conversation.
2. Alice generates an invite: `{ "application/invite+json" : { "conversationId": "$id", "members": [{...}] }}`
3. Two possibilities for sending the message:
    a. If not connected, via the DHT
    b. Else, Alice sends on the SIP channel
4. Two possibilities for Bob:
    a. Receives the invite, a signal is emitted for the client
    b. Not connected, so will never receive the request because Alice must not know if Bob just ignored or blocked Alice. The only way is to regenerate a new invite via a new message (cf. next scenario)

#### Process to send a message to someone

Alice wants to send a message to Bob:

1. Alice adds a message in the repo, giving an ID.
2. Alice gets a message received (from herself) if successful.
3. Two possibilities: Alice and Bob are connected, or not. In both cases a message is crafted: `{ "application/im-gitmessage-id" : "{"id":"$convId", "commit":"$commitId", "deviceId": "$alice_device_hash"}"}`.
    a. If not connected, via the DHT
    b. Else, Alice sends on the SIP channel
4. Four possibilities for Bob:
    a. Bob is not connected to Alice, so if he trusts Alice, ask for a new connection and go to b.
    b. If connected, fetch from Alice and announce new messages
    c. Bob doesn't know that conversation.
    Ask through the DHT to get an invite first to be able to accept that conversation ({"application/invite", conversationId})
    d. Bob is disconnected (no network, or just closed). He will not receive the new message but will try to sync when the next connection occurs.

### Implementation

![Diagram: swarm chat classes](images/swarm-chat-classes-diagram.jpg)

### Supported messages

#### Initial message

```json
{
    "type": "initial",
    "mode": 0,
    "invited": "URI"
}
```

Represents the first commit of a repository and contains the mode:

```cpp
enum class ConversationMode : int {
    ONE_TO_ONE = 0,
    ADMIN_INVITES_ONLY,
    INVITES_ONLY,
    PUBLIC
};
```

and `invited` if mode = 0.

#### Text message

```json
{
    "type": "text/plain",
    "body": "content",
    "react-to": "id (optional)"
}
```

Or for an edit:

```json
{
    "type": "application/edited-message",
    "body": "content",
    "edit": "id of the edited commit"
}
```

#### Calls

Show the end of a call (duration in milliseconds):

```json
{
    "type": "application/call-history+json",
    "to": "URI",
    "duration": "3000"
}
```

Or for hosting a call in a group (when it starts):

```json
{
    "type": "application/call-history+json",
    "uri": "host URI",
    "device": "device of the host",
    "confId": "hosted confId"
}
```

A second commit with the same JSON + `duration` is added at the end of the call when hosted.

#### Add a file

```json
{
    "type": "application/data-transfer+json",
    "tid": "unique identifier of the file",
    "displayName": "File name",
    "totalSize": "3000",
    "sha3sum": "a sha3 sum"
}
```

`totalSize` is in bits.

#### Updating profile

```json
{
    "type": "application/update-profile"
}
```

#### Member event

```json
{
    "type": "member",
    "uri": "member URI",
    "action": "add/join/remove/ban"
}
```

Generated when a member is invited, joins, leaves, or is kicked from a conversation.

#### Vote event

Generated by administrators to add a vote for kicking or un-kicking someone.

```json
{
    "type": "vote",
    "uri": "member URI",
    "action": "ban/unban"
}
```

---------------

**!!
OLD DRAFT !!**

```{note}
Following notes are not organized yet. Just some line of thoughts.
```

## Crypto improvements.

For a serious group chat feature, we also need serious crypto. With the current design, if a certificate is stolen as the previous DHT values of a conversation, the conversation can be decrypted. Maybe we need to go to something like **Double ratchet**.

```{note}
A lib might exist to implement group conversations.
```

Needs ECC support in OpenDHT

## Usage

### Add Roles?

There are two major use cases for group chats:

1. Something like a Mattermost in a company, with private channels, and some roles (admin/spectator/bot/etc.) or for education (where only a few are active).
2. Horizontal conversations like a conversation between friends.

Which one will Jami be for?

#### Implementation idea

A certificate for a group that signs users with a flag for a role. Adding or revoking can also be done.

### Join a conversation

+ Only via a direct invite
+ Via a link/QR Code/whatever
+ Via a room name? (a **hash** on the DHT)

## What we need

+ Confidentiality: members outside of the group chat should not be able to read messages in the group
+ Forward secrecy: if any key from the group is compromised, previous messages should remain confidential (as much as possible)
+ Message ordering: There is a need to have messages in the right order
+ Synchronization: There is also a need to be sure to have all messages as soon as possible.
+ Persistence: Currently, a message on the DHT lives only 10 minutes, because it's the best timing calculated for this kind of DHT. To persist data, the node must re-put the value on the DHT every 10 minutes. Another way to do this when the node is offline is to let nodes re-put the data. But, if after 10 minutes, 8 nodes are still here, they will do 64 requests (and it's exponential). The current way to avoid spamming for that is queried. This will still do 64 requests but limit the max redundancy to 8 nodes.

## Other distributed ways

+ IPFS: Need some investigation
+ BitMessage: Need some investigation
+ Maidsafe: Need some investigation

### Based on current work we have

Group chat can be based on the same work we already have for multi-devices (but here, with a group certificate).

Problems to solve:

1. History sync. This needs to move the database from the client into the daemon.
2. If nobody is connected, the synchronization cannot be done, and the person will never see the conversation.

### Another dedicated DHT

Like a DHT with a superuser. (Not convinced)

## File transfer

Currently, the file transfer algorithm is based on a TURN connection (See {doc}`file-transfer`). In the case of a big group, this will be bad. We first need a p2p implementation for the file transfer. Implement the RFC for p2p transfer.

Other problem: currently there is no implementation for TCP support for ICE in PJSIP. This is mandatory for this point (in PJSIP or homemade).

## Resources

+ Robust distributed synchronization of networked linear systems with intermittent information (Sean Phillips and Ricardo G. Sanfelice)