diff --git a/README.md b/README.md
index e5a3aca..d893399 100644
--- a/README.md
+++ b/README.md
@@ -10,22 +10,23 @@ An advanced abuse management tool. It's a Synapse module that allows server owne

 This software attempts to resolve the complex problem of how to share pointers to rooms containing abusive content in order to block or report activity. These room lists are sensitive and sharing them can not only aid people in blocking this content but also direct bad actors to said content.

-The goal of this tool is to simply block this content as efficiently as possible across many small-medium sized servers.
+The goal of this tool is to block abusive content across many small to medium sized servers without exposing the sensitive room_ids in the "source list" to those servers/users. It will also "freeze" abusive accounts and provide real-time alerts about their activity to homeserver owners.

 ## Features

  Easy, setup for homeserver owners via Synapse Module

- Private, hashing is used to prevent redlight clients from sharing room_ids with redlight servers
+ Private, hashing is used to prevent redlight clients from sharing user_ids and room_ids with redlight servers

  Decentralised, many people can run redlight servers with their own blocking policies, redlight clients are free to pick a provider

- Safe, access restrictions and ratelimiting are used to guard the content of Redlight list
+ Safe, access restrictions and ratelimiting are used to guard the sensitive content of the source list
+ Real-time alerting, attempts to access abusive rooms and account freezes are sent to an "alert room" on the redlight client's homeserver

 ## How it Works

-"Redlight servers" will be trusted homeservers that are modified, they'll cache Redlight list in memory while providing an API interface to "Redlight clients". Redlight servers will pick their own "content tags" that they are filtering, which by extension will allow clients to pick a level of filtering that suits them.
+"Redlight servers" will be trusted homeservers that are modified, they'll cache the source list in memory while providing an API interface to "Redlight clients". Redlight servers will pick their own "content tags" that they are filtering, which by extension will allow clients to pick a level of filtering that suits them.

-Redlight clients will be untrusted homeservers that are whitelisted by their desired Redlight server. When a user on a client homeserver attempts to join a room, the hash of the room_id will be sent to the redlight server, which will confirm or deny if the room is abusive, the client then denies the user entry to that room if it is flagged.
+Redlight clients will be untrusted homeservers that are whitelisted by their desired Redlight server. When a user on a client homeserver attempts to join a room, the hashes of the user_id and room_id will be sent to the redlight server, which will confirm or deny if the room is abusive, the client then denies the user entry to that room if it is flagged. If a user attempts to access abusive rooms multiple times, their account will be "frozen" and attempts to access any further rooms will be denied. Real-time alerts about users accessing abusive rooms or account freezes will also be sent to an alert room on the redlight client's server.

 For a more detailed description of how it will work please consult the [Technical Specification](./technical_spec.md).
diff --git a/technical_spec.md b/technical_spec.md
index 4e3752e..b983aec 100644
--- a/technical_spec.md
+++ b/technical_spec.md
@@ -17,13 +17,13 @@ Redlight Client - Will be untrusted homeservers that are whitelisted by their de

 ## The Core Issue

-You might be wondering, why not just release a list of hashes of these room_ids and openly let people filter them?
+You might be wondering, why not just release a list of these room_ids (or their hashes) and openly let people filter them?
-Anything that can be used to identify abusive content can be used to identify abusive content ultimately. It's why access to these tools is typically so restricted. The problem isn’t that the hashes could be reversed, its that the hashes can be used to identify the abuse content.
+Ultimately anything that can be used to identify abusive content can be used to identify abusive content. It's why access to these tools is typically so restricted. The problem isn’t that the hashes could be reversed, it's that the hashes can be used to identify the abuse content.

 Imagine if you have 100 room_ids and you know 1 is abusive, well you could us an openly distributed hash-list to find that content and do the right thing and block it. Or you could use it to locate that content a lot faster to consume it and break the law with it.

-This repository attempts to be a solution to this problem.
+The redlight system attempts to be a solution to this problem.

 ## Chain of Trust

@@ -39,19 +39,35 @@ This creates a chain of trust where each party using this system must be account

 The following methods will be used to secure the source list:

-- Avoid writing the source list to disk, redlight servers will simply pull the latest copy and store it in memory only.
-- Whitelisting clients, redlight servers will only serve approved clients.
-- Ratelimiting the amount of requests, if a client homeservers user is finding too many rooms they could be limited?
-- Ratelimiting the amount of hits, if a client is finding too many abusive rooms in a specified timeframe their access will be automatically cut-off, forcing them to ask their redlight server to re-enable them.
+- Avoid writing the source list to disk, redlight servers will simply pull the latest copy of the source list and store it in memory only.
+- Whitelisting clients, redlight servers will only serve approved redlight clients, filtering requests by their IP.
+- Ratelimiting the amount of requests, if a client homeserver or user is making too many requests the server or user in question will be cut off from the redlight server. (It will start throwing errors with every request.)
+- Ratelimiting the amount of hits, if a client's user is finding too many abusive rooms in a specified timeframe that account will be "frozen", this means the redlight server will return an error (deny access) to every future join request that user account makes.

-# Other Design Goals
+## Account Freezes

-Blocking not monitoring, to avoid scope creep the point of this system will only be to block access to known abusive rooms.
+If a user_id attempts to access an abusive room 3 or more times, the redlight server will start throwing errors for every future join request made by that account. This effectively "freezes" the user account and prevents any further illicit activity. It also prevents the user from reverse engineering the source list by attempting to enter many abusive rooms.

-Client homeserver privacy, by double hashing room_ids before sending them to redlight servers analysis and collection about the rooms accessed by a redlight clients users becomes unfeasible.
+Freezes on accounts are performed by the redlight server, to unfreeze an account a request needs to be made by the redlight client to their redlight server.

-Config-driven and stateless, ideally redlight clients and servers will be "stateless", so no data will persist between reboots and their behaviour will be defined entirely in configuration files.
+
+## Real-time Alerting
+
+The redlight client module will alert the homeserver owners via an "alert room", where notifications of the following will be sent:
+- If a user attempts to enter an abusive room and is denied access.
+- If a user's account has been frozen by the redlight server.
+
+This allows homeserver owners and moderators to act quickly in response to these incidents.
+
+## Other Design Goals
+
+Blocking not monitoring, to avoid scope creep the point of this system will only be to block access to known abusive rooms. Further monitoring and reporting of any users entering abusive rooms should be performed by the owners of that homeserver. (The redlight client.)
+
+Client homeserver privacy, by double hashing the user_ids and room_ids before sending them to the redlight server, analysis and collection about the rooms accessed by a redlight client's users is limited.
+
+~~Config-driven and stateless, ideally redlight clients and servers will be "stateless", so no data will persist between reboots and their behaviour will be defined entirely in configuration files.~~

 ## LJS-4.draft: Abuse Lookup API V1

@@ -87,3 +103,18 @@ return either `200 OK` to signify a match or `204 No Content` to signify no matc
 "report_id": "b973d82a-6932-4cad-ac9f-f647a3a9d204",
 }
 ```
+
+
+## Unanswered Questions
+
+- How many join requests from a user is abnormal?
+
+More than 10 join requests in a minute might seem suspicious... (What about bots?)
+
+- How many join requests from a redlight client homeserver is abnormal?
+
+- How many join requests can a redlight server even process?
+
+- What other methods (besides IP) could be used to restrict requests from redlight client homeservers?
+
+- What other methods could be used to secure the source list and prevent interception/leaking?
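
Review note on the technical_spec.md changes: the double hashing and account-freeze rules may be easier to reason about with a small model. The following is only a sketch under stated assumptions, not the project's implementation. SHA-256 is assumed for both hash rounds (the spec does not name a hash function), the server is assumed to compare against double hashes of the source list, and `double_hash`, `RedlightServer`, `check_join`, and `unfreeze` are hypothetical names. Only the threshold of 3 abusive join attempts comes from the spec text.

```python
import hashlib


def double_hash(identifier: str) -> str:
    """Hash an identifier twice. SHA-256 is an assumption, not from the spec."""
    first = hashlib.sha256(identifier.encode("utf-8")).hexdigest()
    return hashlib.sha256(first.encode("utf-8")).hexdigest()


class RedlightServer:
    """Hypothetical in-memory model of the freeze rule, not the real server."""

    FREEZE_THRESHOLD = 3  # "3 or more times", taken from the spec

    def __init__(self, abusive_room_ids):
        # Assumption: the server matches on double hashes of the source list,
        # which it keeps in memory only.
        self.abusive_hashes = {double_hash(r) for r in abusive_room_ids}
        self.hits = {}       # user hash -> count of abusive join attempts
        self.frozen = set()  # user hashes whose join requests are all denied

    def check_join(self, user_hash: str, room_hash: str) -> bool:
        """Return True if the join request should be denied."""
        if user_hash in self.frozen:
            return True  # frozen accounts are denied every join
        if room_hash not in self.abusive_hashes:
            return False
        self.hits[user_hash] = self.hits.get(user_hash, 0) + 1
        if self.hits[user_hash] >= self.FREEZE_THRESHOLD:
            self.frozen.add(user_hash)
        return True

    def unfreeze(self, user_hash: str) -> None:
        """Clear a freeze, standing in for the client's unfreeze request."""
        self.frozen.discard(user_hash)
        self.hits.pop(user_hash, None)
```

In this model the client only ever sends `double_hash(user_id)` and `double_hash(room_id)`, so the server can count hits per account without seeing raw identifiers. Whether hit counts should expire after the "specified timeframe" mentioned in the ratelimiting bullet is left open here, as it is in the spec.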