# FediBlockHole

A tool for keeping a Mastodon instance blocklist synchronised with remote lists.

## Features

 - Read block lists from multiple remote instances
 - Read block lists from multiple URLs, including local files
 - Write a unified block list to a local CSV file
 - Push unified blocklist updates to multiple remote instances
 - Control import and export fields

## Installing

FediBlockHole installs using `pip`.

Clone the repo and install from source like this:

```
python3 -m pip install .
```

Installation adds a commandline tool: `fediblock-sync`

Once things stabilise a bit more, I'll upload the package to PyPI.

Instance admins who want to use this tool will need to add an Application at
`https://<instance-domain>/settings/applications/` so they can authorize the
tool to create and update domain blocks with an OAuth token. 

### Reading remote instance blocklists

If a remote instance makes its domain blocks public, you don't need
a token to read them.

If a remote instance only shows its domain blocks to local accounts
you'll need to have a token with `read:blocks` authorization set up.
If you have an account on that instance, you can get a token by setting up a new
Application at `https://<instance-domain>/settings/applications/`.

To read admin blocks from a remote instance, you'll need to ask the instance
admin to add a new Application at
`https://<instance-domain>/settings/applications/` and then tell you the access
token.

The application needs the `admin:read:domain_blocks` OAuth scope, but
unfortunately this scope isn't available in the current application screen
(v4.0.2 of Mastodon at time of writing). There is a way to do it with scopes,
but it's really dangerous, so I'm not going to tell you what it is here.

A better way is to ask the instance admin to connect to the PostgreSQL database
and add the scope there, like this:

```
UPDATE oauth_access_tokens
    SET scopes='admin:read:domain_blocks'
    WHERE token='<your_app_token>';
```

When that's done, FediBlockHole should be able to use its token to authorise
reading domain blocks via the admin API.

### Writing instance blocklists

Writing domain blocks to an instance requires both the
`admin:read:domain_blocks` and `admin:write:domain_blocks` OAuth scopes. The
`read` scope is used to read the current list of domain blocks so the tool
updates ones that already exist, rather than trying to add them all as new ones
and cluttering up the instance.

Again, there's no way to do this (yet) on the application admin
screen, so we need to ask our destination admins to update the application
permissions in the database, as for reading domain blocks:

```
UPDATE oauth_access_tokens
    SET scopes='admin:read:domain_blocks admin:write:domain_blocks'
    WHERE token='<your_app_token>';
```

When that's done, FediBlockHole should be able to use its token to authorise
adding or updating domain blocks via the API.

## Configuring

Once you have your applications and tokens and scopes set up, create a
configuration file for FediBlockHole to use. You can put it anywhere and use the
`-c <configfile>` commandline parameter to tell FediBlockHole where it is.

Or you can use the default location of `/etc/default/fediblockhole.conf.toml`.

As the filename suggests, FediBlockHole uses TOML syntax.

There are 3 key sections:
 
 1. `blocklist_urls_sources`: A list of URLs to read CSV formatted blocklists from
 1. `blocklist_instance_sources`: A list of instances to read blocklists from via API
 1. `blocklist_instance_destinations`: A list of instances to write blocklists to via API

### URL sources

The URL sources are a list of URLs to fetch CSV formatted blocklists from.

The required fields are `domain` and `severity`.

Optional fields that the tool understands are `public_comment`, `private_comment`, `obfuscate`, `reject_media` and `reject_reports`.
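
As a rough illustration of the CSV format described above, this sketch reads a
blocklist with Python's standard `csv` module and checks for the required
columns. The helper name `parse_blocklist_csv` and the sample rows are made up
for this example; this is not FediBlockHole's own parser.

```python
import csv
import io

REQUIRED_FIELDS = ("domain", "severity")
OPTIONAL_FIELDS = ("public_comment", "private_comment", "obfuscate",
                   "reject_media", "reject_reports")


def parse_blocklist_csv(text):
    """Parse CSV text into a list of dicts, keeping only known fields.

    Hypothetical helper for illustration only.
    """
    reader = csv.DictReader(io.StringIO(text))
    missing = [f for f in REQUIRED_FIELDS if f not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    known = REQUIRED_FIELDS + OPTIONAL_FIELDS
    return [{k: row[k] for k in known if k in row} for row in reader]


# Made-up sample data using the field names listed above.
sample = (
    "domain,severity,public_comment\n"
    "example.org,suspend,Spam\n"
    "example.net,silence,\n"
)
blocks = parse_blocklist_csv(sample)
```

A file without a `domain` or `severity` column is rejected up front, and any
columns outside the known field names are dropped.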

### Instance sources

The tool can also read domain_blocks from instances directly.

The configuration is a list of dictionaries of the form:
```
{ domain = '<domain_name>', token = '<BearerToken>', admin = false }
```

The `domain` is the fully-qualified domain name of the API host for an instance
you want to read or write domain blocks to/from. 

The `token` is an optional OAuth token for the application that's configured in
the instance to allow you to read/write domain blocks, as discussed above.

`admin` is an optional field that tells the tool to use the more detailed admin
API endpoint for domain_blocks, rather than the more public API endpoint that
doesn't provide as much detail. You will need a `token` that's been configured to
permit access to the admin domain_blocks scope, as detailed above.
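
To make the effect of the `admin` flag concrete, here is a minimal sketch of
how a request for one source entry could be put together. It assumes the
Mastodon 4.x REST API paths `/api/v1/instance/domain_blocks` (public) and
`/api/v1/admin/domain_blocks` (admin). The function `build_request` is
hypothetical and only builds the URL and headers; FediBlockHole's own request
code may differ.

```python
def build_request(domain, token=None, admin=False):
    """Return (url, headers) for fetching domain blocks from an instance.

    Assumption: endpoint paths follow the Mastodon 4.x REST API.
    Hypothetical helper for illustration only.
    """
    if admin:
        if not token:
            raise ValueError("the admin endpoint needs a token")
        path = "/api/v1/admin/domain_blocks"
    else:
        path = "/api/v1/instance/domain_blocks"
    # The token is sent as an OAuth bearer token when one is configured.
    headers = {"Authorization": f"Bearer {token}"} if token else {}
    return f"https://{domain}{path}", headers


public_url, public_headers = build_request("example.org")
admin_url, admin_headers = build_request("example.org", token="abc", admin=True)
```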

### Instance destinations

The tool supports pushing a unified blocklist to multiple instances.

Configure the list of instances you want to push your blocklist to in the
`blocklist_instance_destinations` list. Each entry is of the form:

```
{ domain = '<domain_name>', token = '<BearerToken>', max_followed_severity = 'silence' }
```

The fields `domain` and `token` are required. `max_followed_severity` is optional.

The `domain` is the hostname of the instance you want to push to. The `token` is
an application token with both `admin:read:domain_blocks` and
`admin:write:domain_blocks` authorization.

The optional `max_followed_severity` setting sets a per-instance limit on the
severity of a domain_block if there are accounts on the instance that follow
accounts on the domain to be blocked. If `max_followed_severity` isn't set, it
defaults to 'silence'.

This setting exists to give people time to move off an instance that is about to
be defederated and bring their followers from your instance with them. Without
it, if a new Suspend block appears in any of the blocklists you subscribe to (or
a block level increases from Silence to Suspend) and you're using the default
`max` mergeplan, the tool would immediately suspend the instance, cutting
everyone on the blocked instance off from their existing followers on your
instance, even if they move to a new instance. If you actually want that
outcome, you can set `max_followed_severity = 'suspend'` and use the `max`
mergeplan.

Once the follow count drops to 0, the tool will automatically use the highest severity it finds again (if you're using the `max` mergeplan).
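
The capping behaviour described above can be sketched roughly like this. The
function `effective_severity` and its `follow_count` argument are hypothetical
names for this example; how the tool actually counts follows is not shown here.

```python
# Severity levels as used by the tool: noop < silence < suspend.
SEVERITY_LEVEL = {"noop": 0, "silence": 1, "suspend": 2}


def effective_severity(wanted, follow_count, max_followed_severity="silence"):
    """Cap the severity while local accounts still follow the domain.

    Hypothetical helper for illustration only.
    """
    too_severe = SEVERITY_LEVEL[wanted] > SEVERITY_LEVEL[max_followed_severity]
    if follow_count > 0 and too_severe:
        return max_followed_severity
    return wanted


capped = effective_severity("suspend", follow_count=3)    # still followed
uncapped = effective_severity("suspend", follow_count=0)  # no follows left
```

With the default limit, a `suspend` is held at `silence` while follows remain
and applied in full once the follow count reaches 0.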


## Using the tool

Once you've configured the tool, run it like this:

```
fediblock-sync -c <configfile_path>
```

If you put the config file in `/etc/default/fediblockhole.conf.toml` you don't need to pass in the config file path.

## More advanced configuration

For a list of possible configuration options, check the `--help` and read the
sample configuration file in `etc/sample.fediblockhole.conf.toml`.

### save_intermediate

This option tells the tool to save the unmerged blocklists it fetches from
remote instances and URLs into separate files. This is handy for debugging, or
just to have a non-unified set of blocklist files.

Works with the `savedir` setting to control where to save the files.

These are parsed blocklists, not the raw data, and so will be affected by `import_fields`.

The filename is based on the URL or domain used so you can tell where each list came from.

### savedir

Sets where to save intermediate blocklist files. Defaults to `/tmp`.

### no_push_instance

Defaults to False.

When set, the tool won't actually try to push the unified blocklist to any
configured instances.

If you want to see what the tool would try to do, but not actually apply any
updates, use `--dryrun`.

### no_fetch_url

Skip the fetching of blocklists from any URLs that are configured.

### no_fetch_instance

Skip the fetching of blocklists from any remote instances that are configured.

### mergeplan

If two (or more) blocklists define blocks for the same domain, but they're
different, `mergeplan` tells the tool how to resolve the conflict.

`max` is the default. It uses the _highest_ severity block it finds as the one
that should be used in the unified blocklist.

`min` does the opposite. It uses the _lowest_ severity block it finds as the one
to use in the unified blocklist.

A full discussion of severities is beyond the scope of this README, but here is
a quick overview of how it works for this tool.

The severities are:

 - **noop**, level 0: This is essentially an 'unblock' but you can include a
   comment.
 - **silence**, level 1: A silence adds friction to federation with an instance.
 - **suspend**, level 2: A full defederation with the instance.

With `mergeplan` set to `max`, _silence_ would take precedence over _noop_, and
_suspend_ would take precedence over both.

With `mergeplan` set to `min`, _silence_ would take precedence over _suspend_,
and _noop_ would take precedence over both.

Use `max` to ensure that you always block at whatever level your harshest
fellow admin thinks is appropriate.

Use `min` to ensure that your blocks do what your most lenient fellow admin
thinks should happen.
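
A minimal sketch of the two merge plans, using the severity levels listed
above. The function name `merge_severity` is made up for this example and is
not the tool's actual merge code.

```python
# Severity levels as listed above: noop < silence < suspend.
SEVERITY_LEVEL = {"noop": 0, "silence": 1, "suspend": 2}


def merge_severity(severities, mergeplan="max"):
    """Pick one severity for a domain that appears in several blocklists.

    Hypothetical helper for illustration only.
    """
    if mergeplan not in ("max", "min"):
        raise ValueError(f"unknown mergeplan: {mergeplan}")
    pick = max if mergeplan == "max" else min
    # Compare by numeric level, but return the severity name.
    return pick(severities, key=SEVERITY_LEVEL.__getitem__)


found = ["silence", "noop", "suspend"]  # the same domain in three lists
harshest = merge_severity(found, "max")
mildest = merge_severity(found, "min")
```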

### import_fields

`import_fields` controls which fields will be imported from remote
instances and URL blocklists, and which fields are pushed to instances from the
unified blocklist.

The fields `domain` and `severity` are always included, so you only need to
define any extra fields you want.

You can't export fields you haven't imported, so `export_fields` should be a
subset of `import_fields`, but you can run the tool multiple times. You could,
for example, include lots of fields for an initial import to build up a
comprehensive list for export, combined with the `--no-push-instance` option so
you don't actually apply the full list anywhere.

Then you could use a different set of options when importing so you have all the
detail in a file, but only push `public_comment` to instances.

### export_fields

`export_fields` controls which fields will get saved to the unified blocklist
file, if you export one.

The fields `domain` and `severity` are always included, so you only need to
define any extra fields you want.
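
The way the two field settings interact can be sketched as a simple filter
applied twice. The helper `select_fields` and the sample block are hypothetical
and only illustrate the idea that `domain` and `severity` always survive, and
that a field must be imported before it can be exported.

```python
ALWAYS_INCLUDED = ("domain", "severity")


def select_fields(block, extra_fields):
    """Keep the always-included fields plus any extra ones asked for.

    Hypothetical helper for illustration only.
    """
    wanted = ALWAYS_INCLUDED + tuple(extra_fields)
    return {k: v for k, v in block.items() if k in wanted}


# Made-up block as it might arrive from a remote source.
remote_block = {
    "domain": "example.org",
    "severity": "suspend",
    "public_comment": "Spam",
    "private_comment": "Reported by three admins",
    "reject_media": True,
}

import_fields = ["public_comment", "private_comment"]
export_fields = ["public_comment"]

imported = select_fields(remote_block, import_fields)
exported = select_fields(imported, export_fields)
```

Here `reject_media` is dropped on import, so it could not be exported even if
`export_fields` listed it.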