Merge pull request #9 from eigenmagic/gentleblock
Block instances 'gently' so people on them have time to escape.
commit
bb84f1e239
99
README.md
|
@@ -13,17 +13,28 @@ A tool for keeping a Mastodon instance blocklist synchronised with remote lists.
|
||||||
## Installing
|
## Installing
|
||||||
|
|
||||||
Instance admins who want to use this tool will need to add an Application at
|
Instance admins who want to use this tool will need to add an Application at
|
||||||
`https://<instance-domain>/settings/applications/` they can authorise with an
|
`https://<instance-domain>/settings/applications/` so they can authorize the
|
||||||
OAuth token. For each instance you connect to, add this token to the config file.
|
tool to create and update domain blocks with an OAuth token.
|
||||||
|
|
||||||
### Reading remote instance blocklists
|
### Reading remote instance blocklists
|
||||||
|
|
||||||
To read admin blocks from a remote instance, you'll need to ask the instance admin to add a new Application at `https://<instance-domain>/settings/applications/` and then tell you the access token.
|
If a remote instance makes its domain blocks public, you don't need
|
||||||
|
a token to read them.
|
||||||
|
|
||||||
The application needs the `admin:read:domain_blocks` OAuth scope, but unfortunately this
|
If a remote instance only shows its domain blocks to local accounts,
|
||||||
scope isn't available in the current application screen (v4.0.2 of Mastodon at
|
you'll need to have a token with `read:blocks` authorization set up.
|
||||||
time of writing). There is a way to do it with scopes, but it's really
|
If you have an account on that instance, you can get a token by setting up a new
|
||||||
dangerous, so I'm not going to tell you what it is here.
|
Application at `https://<instance-domain>/settings/applications/`.
|
||||||
|
|
||||||
|
To read admin blocks from a remote instance, you'll need to ask the instance
|
||||||
|
admin to add a new Application at
|
||||||
|
`https://<instance-domain>/settings/applications/` and then tell you the access
|
||||||
|
token.
|
||||||
|
|
||||||
|
The application needs the `admin:read:domain_blocks` OAuth scope, but
|
||||||
|
unfortunately this scope isn't available in the current application screen
|
||||||
|
(v4.0.2 of Mastodon at time of writing). There is a way to do it with scopes,
|
||||||
|
but it's really dangerous, so I'm not going to tell you what it is here.
|
||||||
|
|
||||||
A better way is to ask the instance admin to connect to the PostgreSQL database
|
A better way is to ask the instance admin to connect to the PostgreSQL database
|
||||||
and add the scope there, like this:
|
and add the scope there, like this:
|
||||||
|
@@ -68,20 +79,74 @@ Or you can use the default location of `/etc/default/fediblockhole.conf.toml`.
|
||||||
|
|
||||||
As the filename suggests, FediBlockHole uses TOML syntax.
|
As the filename suggests, FediBlockHole uses TOML syntax.
|
||||||
|
|
||||||
There are 2 key sections:
|
There are 3 key sections:
|
||||||
|
|
||||||
|
1. `blocklist_url_sources`: A list of URLs to read CSV formatted blocklists from
|
||||||
|
1. `blocklist_instance_sources`: A list of instances to read blocklists from via API
|
||||||
|
1. `blocklist_instance_destinations`: A list of instances to write blocklists to via API
|
||||||
|
|
||||||
1. `blocklist_instance_sources`: A list of instances to read blocklists from
|
### URL sources
|
||||||
1. `blocklist_instance_destinations`: A list of instances to write blocklists to
|
|
||||||
|
|
||||||
Each is a list of dictionaries of the form:
|
The URL sources are a list of URLs to fetch CSV formatted blocklists from.
|
||||||
|
|
||||||
|
The required fields are `domain` and `severity`.
|
||||||
|
|
||||||
|
Optional fields that the tool understands are `public_comment`, `private_comment`, `obfuscate`, `reject_media` and `reject_reports`.
|
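As a rough illustration of the required and optional fields listed above, the sketch below parses CSV text into block dictionaries. The function name `parse_blocklist_csv` and the decision to skip rows that lack a required field are assumptions made for this example, not a description of how the tool itself reads URL sources.

```python
import csv
import io

# Field names taken from the README text above
REQUIRED_FIELDS = ("domain", "severity")
OPTIONAL_FIELDS = ("public_comment", "private_comment", "obfuscate",
                   "reject_media", "reject_reports")


def parse_blocklist_csv(text: str) -> list:
    """Parse CSV text into block dicts (hypothetical helper).

    Rows missing a required field are skipped; unknown columns are ignored.
    """
    blocks = []
    for row in csv.DictReader(io.StringIO(text)):
        # Skip rows where a required column is absent or empty
        if not all(row.get(field) for field in REQUIRED_FIELDS):
            continue
        block = {field: row[field] for field in REQUIRED_FIELDS}
        # Copy only the optional fields that actually carry a value
        for field in OPTIONAL_FIELDS:
            if row.get(field) not in (None, ""):
                block[field] = row[field]
        blocks.append(block)
    return blocks


sample = (
    "domain,severity,public_comment\n"
    "example.org,suspend,spam\n"
    ",silence,row without a domain\n"
)
print(parse_blocklist_csv(sample))
```

Values stay as strings here; converting `obfuscate`, `reject_media` and `reject_reports` to booleans is left out to keep the sketch short.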
||||||
|
|
||||||
|
### Instance sources
|
||||||
|
|
||||||
|
The tool can also read domain_blocks from instances directly.
|
||||||
|
|
||||||
|
The configuration is a list of dictionaries of the form:
|
||||||
```
|
```
|
||||||
{ domain = '<domain_name>', token = '<BearerToken>' }
|
{ domain = '<domain_name>', token = '<BearerToken>', admin = false }
|
||||||
```
|
```
|
||||||
|
|
||||||
The `domain` is the fully-qualified domain name of the API host for an instance
|
The `domain` is the fully-qualified domain name of the API host for an instance
|
||||||
you want to read or write domain blocks to/from. The `BearerToken` is the OAuth
|
you want to read or write domain blocks to/from.
|
||||||
token for the application that's configured in the instance to allow you to
|
|
||||||
read/write domain blocks, as discussed above.
|
The `token` is an optional OAuth token for the application that's configured in
|
||||||
|
the instance to allow you to read/write domain blocks, as discussed above.
|
||||||
|
|
||||||
|
`admin` is an optional field that tells the tool to use the more detailed admin
|
||||||
|
API endpoint for domain_blocks, rather than the more public API endpoint that
|
||||||
|
doesn't provide as much detail. You will need a `token` that's been configured to
|
||||||
|
permit access to the admin domain_blocks scope, as detailed above.
|
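The optional fields described above can be pictured as defaults applied to each source entry. The following is a minimal sketch operating on plain dictionaries (the form a TOML inline table takes once parsed); `normalise_source` is a hypothetical helper, and rejecting `admin = true` without a token is an assumption drawn from the README's statement that the admin endpoint needs a suitably scoped token.

```python
from typing import Optional


def normalise_source(entry: dict) -> dict:
    """Fill in defaults for one blocklist_instance_sources entry (hypothetical helper)."""
    if "domain" not in entry:
        raise ValueError("every source entry needs a 'domain'")

    token: Optional[str] = entry.get("token")   # optional: public blocklists need no token
    admin: bool = entry.get("admin", False)     # optional: defaults to the public endpoint

    # Assumption: the admin endpoint is only usable with a token
    if admin and not token:
        raise ValueError(f"{entry['domain']}: admin = true requires a token")

    return {"domain": entry["domain"], "token": token, "admin": admin}


sources = [
    {"domain": "public.example"},
    {"domain": "admin.example", "token": "<BearerToken>", "admin": True},
]
for source in sources:
    print(normalise_source(source))
```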
||||||
|
|
||||||
|
### Instance destinations
|
||||||
|
|
||||||
|
The tool supports pushing a unified blocklist to multiple instances.
|
||||||
|
|
||||||
|
Configure the list of instances you want to push your blocklist to in the
|
||||||
|
`blocklist_instance_destinations` list. Each entry is of the form:
|
||||||
|
|
||||||
|
```
|
||||||
|
{ domain = '<domain_name>', token = '<BearerToken>', max_followed_severity = 'silence' }
|
||||||
|
```
|
||||||
|
|
||||||
|
The fields `domain` and `token` are required. `max_followed_severity` is optional.
|
||||||
|
|
||||||
|
The `domain` is the hostname of the instance you want to push to. The `token` is
|
||||||
|
an application token with both `admin:read:domain_blocks` and
|
||||||
|
`admin:write:domain_blocks` authorization.
|
||||||
|
|
||||||
|
The optional `max_followed_severity` setting sets a per-instance limit on the
|
||||||
|
severity of a domain_block if there are accounts on the instance that follow
|
||||||
|
accounts on the domain to be blocked. If `max_followed_severity` isn't set, it
|
||||||
|
defaults to 'silence'.
|
||||||
|
|
||||||
|
This setting exists to give people time to move off an instance that is about to
|
||||||
|
be defederated and bring their followers from your instance with them. Without
|
||||||
|
it, if a new Suspend block appears in any of the blocklists you subscribe to (or
|
||||||
|
a block level increases from Silence to Suspend) and you're using the default
|
||||||
|
`max` mergeplan, the tool would immediately suspend the instance, cutting
|
||||||
|
everyone on the blocked instance off from their existing followers on your
|
||||||
|
instance, even if they move to a new instance. If you actually want that
|
||||||
|
outcome, you can set `max_followed_severity = 'suspend'` and use the `max`
|
||||||
|
mergeplan.
|
||||||
|
|
||||||
|
Once the follow count drops to 0, the tool will automatically use the highest severity it finds again (if you're using the `max` mergeplan).
|
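The capping behaviour described above can be sketched as a small pure function. This is an illustration only: the numeric ordering in `SEVERITY` is an assumption (the script refers to a `SEVERITY` mapping whose definition is not shown in this diff), and `capped_severity` is a hypothetical name.

```python
# Assumed ordering, from least to most severe; not taken from the diff
SEVERITY = {"noop": 0, "silence": 1, "suspend": 2}


def capped_severity(severity: str, follows: int,
                    max_followed_severity: str = "silence") -> str:
    """Return the severity to apply, limited while local accounts still follow the domain."""
    if follows > 0 and SEVERITY[severity] > SEVERITY[max_followed_severity]:
        # Someone on this instance still follows accounts there: apply the cap
        return max_followed_severity
    # No followers, or the requested severity is already within the limit
    return severity


print(capped_severity("suspend", follows=3))             # capped while followers remain
print(capped_severity("suspend", follows=0))             # full severity once follows reach 0
print(capped_severity("suspend", follows=3,
                      max_followed_severity="suspend"))  # cap effectively disabled
```

With the default cap, a `suspend` block is applied as `silence` until the follow count reaches 0, which matches the behaviour the README describes for the `max` mergeplan.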
||||||
|
|
||||||
|
|
||||||
## Using the tool
|
## Using the tool
|
||||||
|
|
||||||
|
@@ -91,14 +156,14 @@ Once you've configured the tool, run it like this:
|
||||||
fediblock_sync.py -c <configfile_path>
|
fediblock_sync.py -c <configfile_path>
|
||||||
```
|
```
|
||||||
|
|
||||||
If you put the config file in `/etc/default/fediblockhole.conf.toml` you don't need to pass the config file path.
|
If you put the config file in `/etc/default/fediblockhole.conf.toml` you don't need to pass in the config file path.
|
||||||
|
|
||||||
## More advanced configuration
|
## More advanced configuration
|
||||||
|
|
||||||
For a list of possible configuration options, check the `--help` and read the
|
For a list of possible configuration options, check the `--help` and read the
|
||||||
sample configuration file in `etc/sample.fediblockhole.conf.toml`.
|
sample configuration file in `etc/sample.fediblockhole.conf.toml`.
|
||||||
|
|
||||||
### keep_intermediate
|
### save_intermediate
|
||||||
|
|
||||||
This option tells the tool to save the unmerged blocklists it fetches from
|
This option tells the tool to save the unmerged blocklists it fetches from
|
||||||
remote instances and URLs into separate files. This is handy for debugging, or
|
remote instances and URLs into separate files. This is handy for debugging, or
|
||||||
|
|
|
@@ -108,7 +108,8 @@ def sync_blocklists(conf: dict):
|
||||||
for dest in conf.blocklist_instance_destinations:
|
for dest in conf.blocklist_instance_destinations:
|
||||||
domain = dest['domain']
|
domain = dest['domain']
|
||||||
token = dest['token']
|
token = dest['token']
|
||||||
push_blocklist(token, domain, merged.values(), conf.dryrun, import_fields)
|
max_followed_severity = dest.get('max_followed_severity', 'silence')
|
||||||
|
push_blocklist(token, domain, merged.values(), conf.dryrun, import_fields, max_followed_severity)
|
||||||
|
|
||||||
def merge_blocklists(blocklists: dict, mergeplan: str='max') -> dict:
|
def merge_blocklists(blocklists: dict, mergeplan: str='max') -> dict:
|
||||||
"""Merge fetched remote blocklists into a bulk update
|
"""Merge fetched remote blocklists into a bulk update
|
||||||
|
@@ -125,7 +126,7 @@ def merge_blocklists(blocklists: dict, mergeplan: str='max') -> dict:
|
||||||
domain = newblock['domain']
|
domain = newblock['domain']
|
||||||
# If the domain has two asterisks in it, it's obfuscated
|
# If the domain has two asterisks in it, it's obfuscated
|
||||||
# and we can't really use it, so skip it and do the next one
|
# and we can't really use it, so skip it and do the next one
|
||||||
if '**' in domain:
|
if '*' in domain:
|
||||||
log.debug(f"Domain '{domain}' is obfuscated. Skipping it.")
|
log.debug(f"Domain '{domain}' is obfuscated. Skipping it.")
|
||||||
continue
|
continue
|
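The effect of relaxing the check from `'**'` to `'*'` can be seen in a small standalone sketch. `usable_blocks` is a hypothetical helper and the sample domains are made up; the only behaviour taken from the diff is that any asterisk now marks a domain as obfuscated.

```python
def usable_blocks(blocks: list) -> list:
    """Drop blocks whose domain contains an asterisk, i.e. is obfuscated."""
    return [block for block in blocks if "*" not in block["domain"]]


blocks = [
    {"domain": "example.org", "severity": "suspend"},
    {"domain": "ex*mple.net", "severity": "silence"},    # a single asterisk is now skipped too
    {"domain": "e**mple.com", "severity": "suspend"},
]
print([block["domain"] for block in usable_blocks(blocks)])
```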
||||||
|
|
||||||
|
@@ -177,7 +178,7 @@ def apply_mergeplan(oldblock: dict, newblock: dict, mergeplan: str='max') -> dic
|
||||||
blockdata['severity'] = newblock['severity']
|
blockdata['severity'] = newblock['severity']
|
||||||
|
|
||||||
# If obfuscate is set and is True for the domain in
|
# If obfuscate is set and is True for the domain in
|
||||||
# any blocklist then obfuscate is set to false.
|
# any blocklist then obfuscate is set to True.
|
||||||
if newblock.get('obfuscate', False):
|
if newblock.get('obfuscate', False):
|
||||||
blockdata['obfuscate'] = True
|
blockdata['obfuscate'] = True
|
||||||
|
|
||||||
|
@@ -253,7 +254,7 @@ def fetch_instance_blocklist(host: str, token: str=None, admin: bool=False,
|
||||||
url = urlstring.strip('<').rstrip('>')
|
url = urlstring.strip('<').rstrip('>')
|
||||||
|
|
||||||
log.debug(f"Found {len(domain_blocks)} existing domain blocks.")
|
log.debug(f"Found {len(domain_blocks)} existing domain blocks.")
|
||||||
# Remove fields not in import list
|
# Remove fields not in import list.
|
||||||
for row in domain_blocks:
|
for row in domain_blocks:
|
||||||
origrow = row.copy()
|
origrow = row.copy()
|
||||||
for key in origrow:
|
for key in origrow:
|
||||||
|
@@ -274,18 +275,98 @@ def delete_block(token: str, host: str, id: int):
|
||||||
)
|
)
|
||||||
if response.status_code != 200:
|
if response.status_code != 200:
|
||||||
if response.status_code == 404:
|
if response.status_code == 404:
|
||||||
log.warn(f"No such domain block: {id}")
|
log.warning(f"No such domain block: {id}")
|
||||||
return
|
return
|
||||||
|
|
||||||
raise ValueError(f"Something went wrong: {response.status_code}: {response.content}")
|
raise ValueError(f"Something went wrong: {response.status_code}: {response.content}")
|
||||||
|
|
||||||
|
def fetch_instance_follows(token: str, host: str, domain: str) -> int:
|
||||||
|
"""Fetch the followers of the target domain at the instance
|
||||||
|
|
||||||
|
@param token: the Bearer authentication token for OAuth access
|
||||||
|
@param host: the instance API hostname/IP address
|
||||||
|
@param domain: the domain to search for followers of
|
||||||
|
@returns: int, number of local followers of remote instance accounts
|
||||||
|
"""
|
||||||
|
api_path = "/api/v1/admin/measures"
|
||||||
|
url = f"https://{host}{api_path}"
|
||||||
|
|
||||||
|
key = 'instance_follows'
|
||||||
|
|
||||||
|
# This data structure only allows us to request a single domain
|
||||||
|
# at a time, which limits the load on the remote instance of each call
|
||||||
|
data = {
|
||||||
|
'keys': [
|
||||||
|
key
|
||||||
|
],
|
||||||
|
key: { 'domain': domain },
|
||||||
|
}
|
||||||
|
|
||||||
|
# The Mastodon API only accepts JSON formatted POST data for measures
|
||||||
|
response = requests.post(url,
|
||||||
|
headers={
|
||||||
|
'Authorization': f"Bearer {token}",
|
||||||
|
},
|
||||||
|
json=data,
|
||||||
|
)
|
||||||
|
if response.status_code != 200:
|
||||||
|
if response.status_code == 403:
|
||||||
|
log.error(f"Cannot fetch follow information for {domain} from {host}: {response.content}")
|
||||||
|
|
||||||
|
raise ValueError(f"Something went wrong: {response.status_code}: {response.content}")
|
||||||
|
|
||||||
|
# Get the total returned
|
||||||
|
follows = int(response.json()[0]['total'])
|
||||||
|
return follows
|
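To make the request and response shapes used by `fetch_instance_follows` easier to see, here is a network-free sketch. The helper names are hypothetical, and `sample_response` is a trimmed, made-up payload that assumes the API returns a list whose first element carries a `total` value, as the `int(response.json()[0]['total'])` line implies.

```python
def build_follows_query(domain: str) -> dict:
    """Build the JSON body for one 'instance_follows' measure, for a single domain."""
    key = "instance_follows"
    return {
        "keys": [key],
        key: {"domain": domain},
    }


def parse_follows(payload: list) -> int:
    """Extract the follow count from a decoded measures response."""
    # 'total' is cast with int() in case it arrives as a string
    return int(payload[0]["total"])


print(build_follows_query("example.org"))

# Hypothetical, trimmed response for illustration only
sample_response = [{"key": "instance_follows", "total": "12"}]
print(parse_follows(sample_response))
```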
||||||
|
|
||||||
|
def check_followed_severity(host: str, token: str, domain: str,
|
||||||
|
severity: str, max_followed_severity: str='silence'):
|
||||||
|
"""Check an instance to see if it has followers of a to-be-blocked instance"""
|
||||||
|
|
||||||
|
# If the instance has accounts that follow people on the to-be-blocked domain,
|
||||||
|
# limit the maximum severity to the configured `max_followed_severity`.
|
||||||
|
follows = fetch_instance_follows(token, host, domain)
|
||||||
|
if follows > 0:
|
||||||
|
log.debug(f"Instance {host} has {follows} followers of accounts at {domain}.")
|
||||||
|
if SEVERITY[severity] > SEVERITY[max_followed_severity]:
|
||||||
|
log.warning(f"Instance {host} has {follows} followers of accounts at {domain}. Limiting block severity to {max_followed_severity}.")
|
||||||
|
return max_followed_severity
|
||||||
|
else:
|
||||||
|
return severity
|
||||||
|
|
||||||
|
def is_change_needed(oldblock: dict, newblock: dict, import_fields: list):
|
||||||
|
"""Compare block definitions to see if changes are needed"""
|
||||||
|
# Check if anything is actually different and needs updating
|
||||||
|
change_needed = []
|
||||||
|
|
||||||
|
for key in import_fields:
|
||||||
|
try:
|
||||||
|
oldval = oldblock[key]
|
||||||
|
newval = newblock[key]
|
||||||
|
log.debug(f"Compare {key} '{oldval}' <> '{newval}'")
|
||||||
|
|
||||||
|
if oldval != newval:
|
||||||
|
log.debug("Difference detected. Change needed.")
|
||||||
|
change_needed.append(key)
|
||||||
|
break
|
||||||
|
|
||||||
|
except KeyError:
|
||||||
|
log.debug(f"Key '{key}' missing from block definition so cannot compare. Continuing...")
|
||||||
|
continue
|
||||||
|
|
||||||
|
return change_needed
|
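`is_change_needed` stops at the first differing field, so the returned list holds at most one key from the comparison loop. If a full list of differences were wanted (for logging, say), a variant along these lines would work. `changed_fields` is a hypothetical name and is not part of this commit.

```python
def changed_fields(oldblock: dict, newblock: dict, import_fields: list) -> list:
    """Return every field in import_fields whose value differs between the two blocks."""
    changed = []
    for key in import_fields:
        # Fields missing from either block cannot be compared, so skip them
        if key not in oldblock or key not in newblock:
            continue
        if oldblock[key] != newblock[key]:
            changed.append(key)
    return changed


old = {"domain": "example.org", "severity": "silence", "public_comment": "spam"}
new = {"domain": "example.org", "severity": "suspend", "public_comment": "spam and abuse"}
print(changed_fields(old, new, ["domain", "severity", "public_comment", "obfuscate"]))
```

An empty list is falsy, so a caller that tests `if change_needed:` would behave the same way with either version.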
||||||
|
|
||||||
def update_known_block(token: str, host: str, blockdict: dict):
|
def update_known_block(token: str, host: str, blockdict: dict):
|
||||||
"""Update an existing domain block with information in blockdict"""
|
"""Update an existing domain block with information in blockdict"""
|
||||||
api_path = "/api/v1/admin/domain_blocks/"
|
api_path = "/api/v1/admin/domain_blocks/"
|
||||||
|
|
||||||
id = blockdict['id']
|
try:
|
||||||
blockdata = blockdict.copy()
|
id = blockdict['id']
|
||||||
del blockdata['id']
|
blockdata = blockdict.copy()
|
||||||
|
del blockdata['id']
|
||||||
|
except KeyError:
|
||||||
|
log.error(f"Cannot update block: 'id' is missing from {blockdict}")
|
||||||
|
raise
|
||||||
|
|
||||||
url = f"https://{host}{api_path}{id}"
|
url = f"https://{host}{api_path}{id}"
|
||||||
|
|
||||||
|
@@ -308,12 +389,20 @@ def add_block(token: str, host: str, blockdata: dict):
|
||||||
headers={'Authorization': f"Bearer {token}"},
|
headers={'Authorization': f"Bearer {token}"},
|
||||||
data=blockdata
|
data=blockdata
|
||||||
)
|
)
|
||||||
if response.status_code != 200:
|
if response.status_code == 422:
|
||||||
raise ValueError(f"Something went wrong: {response.status_code}: {response.content}")
|
# A stricter block already exists. Probably for the base domain.
|
||||||
|
err = json.loads(response.content)
|
||||||
|
log.warning(err['error'])
|
||||||
|
|
||||||
|
elif response.status_code != 200:
|
||||||
|
|
||||||
|
raise ValueError(f"Something went wrong: {response.status_code}: {response.content}")
|
||||||
|
|
||||||
def push_blocklist(token: str, host: str, blocklist: list[dict],
|
def push_blocklist(token: str, host: str, blocklist: list[dict],
|
||||||
dryrun: bool=False,
|
dryrun: bool=False,
|
||||||
import_fields: list=['domain', 'severity']):
|
import_fields: list=['domain', 'severity'],
|
||||||
|
max_followed_severity='silence',
|
||||||
|
):
|
||||||
"""Push a blocklist to a remote instance.
|
"""Push a blocklist to a remote instance.
|
||||||
|
|
||||||
Merging the blocklist with the existing list the instance has,
|
Merging the blocklist with the existing list the instance has,
|
||||||
|
@@ -326,47 +415,42 @@ def push_blocklist(token: str, host: str, blocklist: list[dict],
|
||||||
"""
|
"""
|
||||||
log.info(f"Pushing blocklist to host {host} ...")
|
log.info(f"Pushing blocklist to host {host} ...")
|
||||||
# Fetch the existing blocklist from the instance
|
# Fetch the existing blocklist from the instance
|
||||||
# Force use of the admin API
|
# Force use of the admin API, and add 'id' to the list of fields
|
||||||
|
if 'id' not in import_fields:
|
||||||
|
import_fields.append('id')
|
||||||
serverblocks = fetch_instance_blocklist(host, token, True, import_fields)
|
serverblocks = fetch_instance_blocklist(host, token, True, import_fields)
|
||||||
|
|
||||||
# Convert serverblocks to a dictionary keyed by domain name
|
# Convert serverblocks to a dictionary keyed by domain name
|
||||||
knownblocks = {row['domain']: row for row in serverblocks}
|
knownblocks = {row['domain']: row for row in serverblocks}
|
||||||
|
|
||||||
for newblock in blocklist:
|
for newblock in blocklist:
|
||||||
|
|
||||||
log.debug(f"applying newblock: {newblock}")
|
log.debug(f"Applying newblock: {newblock}")
|
||||||
oldblock = knownblocks.get(newblock['domain'], None)
|
oldblock = knownblocks.get(newblock['domain'], None)
|
||||||
if oldblock:
|
if oldblock:
|
||||||
log.debug(f"Block already exists for {newblock['domain']}, checking for differences...")
|
log.debug(f"Block already exists for {newblock['domain']}, checking for differences...")
|
||||||
|
|
||||||
# Check if anything is actually different and needs updating
|
change_needed = is_change_needed(oldblock, newblock, import_fields)
|
||||||
change_needed = False
|
|
||||||
|
|
||||||
for key in import_fields:
|
|
||||||
try:
|
|
||||||
oldval = oldblock[key]
|
|
||||||
newval = newblock[key]
|
|
||||||
log.debug(f"Compare {key} '{oldval}' <> '{newval}'")
|
|
||||||
|
|
||||||
if oldval != newval:
|
|
||||||
log.debug("Difference detected. Change needed.")
|
|
||||||
change_needed = True
|
|
||||||
break
|
|
||||||
|
|
||||||
except KeyError:
|
|
||||||
log.debug(f"Key '{key}' missing from block definition so cannot compare. Continuing...")
|
|
||||||
continue
|
|
||||||
|
|
||||||
if change_needed:
|
if change_needed:
|
||||||
log.info(f"Change detected. Updating domain block for {oldblock['domain']}")
|
# Change might be needed, but let's see if the severity
|
||||||
blockdata = oldblock.copy()
|
# needs to change. If not, maybe no changes are needed?
|
||||||
blockdata.update(newblock)
|
newseverity = check_followed_severity(host, token, oldblock['domain'], newblock['severity'], max_followed_severity)
|
||||||
if not dryrun:
|
if newseverity != oldblock['severity']:
|
||||||
update_known_block(token, host, blockdata)
|
newblock['severity'] = newseverity
|
||||||
# add a pause here so we don't melt the instance
|
change_needed.append('severity')
|
||||||
time.sleep(1)
|
|
||||||
else:
|
# Change still needed?
|
||||||
log.info("Dry run selected. Not applying changes.")
|
if change_needed:
|
||||||
|
log.info(f"Change detected. Updating domain block for {oldblock['domain']}")
|
||||||
|
blockdata = oldblock.copy()
|
||||||
|
blockdata.update(newblock)
|
||||||
|
if not dryrun:
|
||||||
|
update_known_block(token, host, blockdata)
|
||||||
|
# add a pause here so we don't melt the instance
|
||||||
|
time.sleep(1)
|
||||||
|
else:
|
||||||
|
log.info("Dry run selected. Not applying changes.")
|
||||||
|
|
||||||
else:
|
else:
|
||||||
log.debug("No differences detected. Not updating.")
|
log.debug("No differences detected. Not updating.")
|
||||||
|
@@ -385,6 +469,9 @@ def push_blocklist(token: str, host: str, blocklist: list[dict],
|
||||||
'reject_reports': newblock.get('reject_reports', False),
|
'reject_reports': newblock.get('reject_reports', False),
|
||||||
'obfuscate': newblock.get('obfuscate', False),
|
'obfuscate': newblock.get('obfuscate', False),
|
||||||
}
|
}
|
||||||
|
|
||||||
|
# Make sure the new block doesn't clobber a domain with followers
|
||||||
|
blockdata['severity'] = check_followed_severity(host, token, newblock['domain'], newblock['severity'], max_followed_severity)
|
||||||
log.info(f"Adding new block for {blockdata['domain']}...")
|
log.info(f"Adding new block for {blockdata['domain']}...")
|
||||||
if not dryrun:
|
if not dryrun:
|
||||||
add_block(token, host, blockdata)
|
add_block(token, host, blockdata)
|
||||||
|
@@ -514,4 +601,4 @@ if __name__ == '__main__':
|
||||||
args = augment_args(args)
|
args = augment_args(args)
|
||||||
|
|
||||||
# Do the work of syncing
|
# Do the work of syncing
|
||||||
sync_blocklists(args)
|
sync_blocklists(args)
|
||||||
|
|
|
@@ -17,11 +17,11 @@ blocklist_url_sources = [
|
||||||
|
|
||||||
# List of instances to write blocklist to
|
# List of instances to write blocklist to
|
||||||
blocklist_instance_destinations = [
|
blocklist_instance_destinations = [
|
||||||
# { domain = 'eigenmagic.net', token = '<read_write_token>' },
|
# { domain = 'eigenmagic.net', token = '<read_write_token>', max_followed_severity = 'silence'},
|
||||||
]
|
]
|
||||||
|
|
||||||
## Store a local copy of the remote blocklists after we fetch them
|
## Store a local copy of the remote blocklists after we fetch them
|
||||||
#keep_intermediate = true
|
#save_intermediate = true
|
||||||
|
|
||||||
## Directory to store the local blocklist copies
|
## Directory to store the local blocklist copies
|
||||||
# savedir = '/tmp'
|
# savedir = '/tmp'
|
||||||
|
|