Add ability to set max severity level if an instance has followers of accounts on a to-be-blocked domain.
Refactored the change detection code. Fixed a bug in the config option for saving intermediate blocklists. Updated README documentation. Updated sample config. Addresses #5
parent a134870f14
commit 55184210d4
99
README.md
|
@ -13,17 +13,28 @@ A tool for keeping a Mastodon instance blocklist synchronised with remote lists.
|
|||
## Installing
|
||||
|
||||
Instance admins who want to use this tool will need to add an Application at
|
||||
`https://<instance-domain>/settings/applications/` they can authorise with an
|
||||
OAuth token. For each instance you connect to, add this token to the config file.
|
||||
`https://<instance-domain>/settings/applications/` so they can authorize the
|
||||
tool to create and update domain blocks with an OAuth token.
|
||||
|
||||
### Reading remote instance blocklists
|
||||
|
||||
To read admin blocks from a remote instance, you'll need to ask the instance admin to add a new Application at `https://<instance-domain>/settings/applications/` and then tell you the access token.
|
||||
If a remote instance makes its domain blocks public, you don't need
|
||||
a token to read them.
|
||||
|
||||
The application needs the `admin:read:domain_blocks` OAuth scope, but unfortunately this
|
||||
scope isn't available in the current application screen (v4.0.2 of Mastodon at
|
||||
time of writing). There is a way to do it with scopes, but it's really
|
||||
dangerous, so I'm not going to tell you what it is here.
|
||||
If a remote instance only shows its domain blocks to local accounts
|
||||
you'll need to have a token with `read:blocks` authorization set up.
|
||||
If you have an account on that instance, you can get a token by setting up a new
|
||||
Application at `https://<instance-domain>/settings/applications/`.
|
||||
|
||||
To read admin blocks from a remote instance, you'll need to ask the instance
|
||||
admin to add a new Application at
|
||||
`https://<instance-domain>/settings/applications/` and then tell you the access
|
||||
token.
|
||||
|
||||
The application needs the `admin:read:domain_blocks` OAuth scope, but
|
||||
unfortunately this scope isn't available in the current application screen
|
||||
(v4.0.2 of Mastodon at time of writing). There is a way to do it with scopes,
|
||||
but it's really dangerous, so I'm not going to tell you what it is here.
|
||||
|
||||
A better way is to ask the instance admin to connect to the PostgreSQL database
|
||||
and add the scope there, like this:
|
||||
|
@ -68,20 +79,74 @@ Or you can use the default location of `/etc/default/fediblockhole.conf.toml`.
|
|||
|
||||
As the filename suggests, FediBlockHole uses TOML syntax.
|
||||
|
||||
There are 2 key sections:
|
||||
There are 3 key sections:
|
||||
|
||||
1. `blocklist_instance_sources`: A list of instances to read blocklists from
|
||||
1. `blocklist_instance_destinations`: A list of instances to write blocklists to
|
||||
1. `blocklist_url_sources`: A list of URLs to read CSV formatted blocklists from
|
||||
1. `blocklist_instance_sources`: A list of instances to read blocklists from via API
|
||||
1. `blocklist_instance_destinations`: A list of instances to write blocklists to via API
|
||||
|
||||
Each is a list of dictionaries of the form:
|
||||
### URL sources
|
||||
|
||||
The URL sources setting is a list of URLs to fetch CSV formatted blocklists from.
|
||||
|
||||
The required fields are `domain` and `severity`.
|
||||
|
||||
Optional fields that the tool understands are `public_comment`, `private_comment`, `obfuscate`, `reject_media` and `reject_reports`.
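
As a rough illustration of the field rules above, here is a minimal sketch of reading a CSV blocklist. The helper name `parse_csv_blocklist` and the sample data are hypothetical and are not taken from the tool's source; only the field names come from this README.

```python
import csv
import io

# Field names as listed in the README text above.
REQUIRED_FIELDS = ("domain", "severity")
OPTIONAL_FIELDS = ("public_comment", "private_comment", "obfuscate",
                   "reject_media", "reject_reports")


def parse_csv_blocklist(text: str) -> list[dict]:
    """Parse CSV text into a list of block dicts (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    missing = [field for field in REQUIRED_FIELDS if field not in header]
    if missing:
        raise ValueError(f"CSV blocklist is missing required fields: {missing}")

    blocks = []
    for row in reader:
        # Keep only the fields the tool understands and drop empty values.
        block = {key: row[key]
                 for key in REQUIRED_FIELDS + OPTIONAL_FIELDS
                 if row.get(key) not in (None, "")}
        blocks.append(block)
    return blocks


# Made-up sample data for illustration only.
sample = ("domain,severity,public_comment\n"
          "example.org,suspend,spam\n"
          "example.net,silence,\n")
blocks = parse_csv_blocklist(sample)
```

Columns outside the known field list are ignored in this sketch; whether the real tool does the same is an assumption.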
|
||||
|
||||
### Instance sources
|
||||
|
||||
The tool can also read domain_blocks from instances directly.
|
||||
|
||||
The configuration is a list of dictionaries of the form:
|
||||
```
|
||||
{ domain = '<domain_name>', token = '<BearerToken>' }
|
||||
{ domain = '<domain_name>', token = '<BearerToken>', admin = false }
|
||||
```
|
||||
|
||||
The `domain` is the fully-qualified domain name of the API host for an instance
|
||||
you want to read or write domain blocks to/from. The `BearerToken` is the OAuth
|
||||
token for the application that's configured in the instance to allow you to
|
||||
read/write domain blocks, as discussed above.
|
||||
you want to read or write domain blocks to/from.
|
||||
|
||||
The `token` is an optional OAuth token for the application that's configured in
|
||||
the instance to allow you to read/write domain blocks, as discussed above.
|
||||
|
||||
`admin` is an optional field that tells the tool to use the more detailed admin
|
||||
API endpoint for domain_blocks, rather than the more public API endpoint that
|
||||
doesn't provide as much detail. You will need a `token` that's been configured to
|
||||
permit access to the admin domain_blocks scope, as detailed above.
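
To make the effect of `token` and `admin` concrete, the sketch below builds the URL and headers for one source entry. The function name `domain_blocks_request` is hypothetical. The admin path matches the `/api/v1/admin/domain_blocks` path used elsewhere in this commit; the public path `/api/v1/instance/domain_blocks` is assumed here to be the "more public API endpoint" the text refers to.

```python
from typing import Optional


def domain_blocks_request(domain: str, token: Optional[str] = None,
                          admin: bool = False) -> tuple[str, dict]:
    """Return the (url, headers) pair for reading domain blocks (hypothetical helper)."""
    if admin:
        if not token:
            # The admin endpoint is never public, so a token is mandatory.
            raise ValueError("admin = true requires a token")
        api_path = "/api/v1/admin/domain_blocks"
    else:
        # Assumed public endpoint; it returns less detail than the admin one.
        api_path = "/api/v1/instance/domain_blocks"

    # Only send an Authorization header when a token is configured.
    headers = {"Authorization": f"Bearer {token}"} if token else {}
    return f"https://{domain}{api_path}", headers


public_url, public_headers = domain_blocks_request("example.org")
admin_url, admin_headers = domain_blocks_request("example.org", "s3cret", admin=True)
```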
|
||||
|
||||
### Instance destinations
|
||||
|
||||
The tool supports pushing a unified blocklist to multiple instances.
|
||||
|
||||
Configure the list of instances you want to push your blocklist to in the
|
||||
`blocklist_instance_destinations` list. Each entry is of the form:
|
||||
|
||||
```
|
||||
{ domain = '<domain_name>', token = '<BearerToken>', max_followed_severity = 'silence' }
|
||||
```
|
||||
|
||||
The fields `domain` and `token` are required. `max_followed_severity` is optional.
|
||||
|
||||
The `domain` is the hostname of the instance you want to push to. The `token` is
|
||||
an application token with both `admin:read:domain_blocks` and
|
||||
`admin:write:domain_blocks` authorization.
|
||||
|
||||
The optional `max_followed_severity` setting sets a per-instance limit on the
|
||||
severity of a domain_block if there are accounts on the instance that follow
|
||||
accounts on the domain to be blocked. If `max_followed_severity` isn't set, it
|
||||
defaults to 'silence'.
|
||||
|
||||
This setting exists to give people time to move off an instance that is about to
|
||||
be defederated and bring their followers from your instance with them. Without
|
||||
it, if a new Suspend block appears in any of the blocklists you subscribe to (or
|
||||
a block level increases from Silence to Suspend) and you're using the default
|
||||
`max` mergeplan, the tool would immediately suspend the instance, cutting
|
||||
everyone on the blocked instance off from their existing followers on your
|
||||
instance, even if they move to a new instance. If you actually want that
|
||||
outcome, you can set `max_followed_severity = 'suspend'` and use the `max`
|
||||
mergeplan.
|
||||
|
||||
Once the follow count drops to 0, the tool will automatically use the highest severity it finds again (if you're using the `max` mergeplan).
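
The capping rule described above can be sketched as a small pure function. This is an illustration, not the tool's code: the name `limit_severity` is hypothetical, and the numeric ordering in `SEVERITY` (`noop` below `silence` below `suspend`) is an assumption, since the values of the tool's own `SEVERITY` table are not shown here.

```python
# Assumed ordering of Mastodon block severities, lowest to highest.
SEVERITY = {"noop": 0, "silence": 1, "suspend": 2}


def limit_severity(severity: str, follows: int,
                   max_followed_severity: str = "silence") -> str:
    """Cap a block's severity while local accounts still follow the domain."""
    if follows > 0 and SEVERITY[severity] > SEVERITY[max_followed_severity]:
        return max_followed_severity
    # No followers, or the requested severity is already within the limit.
    return severity


capped = limit_severity("suspend", follows=3)        # followers exist, so cap
uncapped = limit_severity("suspend", follows=0)      # no followers, full severity
explicit = limit_severity("suspend", follows=3, max_followed_severity="suspend")
```

Returning the requested severity on every path that does not hit the cap means the caller always gets a usable value, including when followers exist but the severity is already at or below the limit.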
|
||||
|
||||
|
||||
## Using the tool
|
||||
|
||||
|
@ -91,14 +156,14 @@ Once you've configured the tool, run it like this:
|
|||
fediblock_sync.py -c <configfile_path>
|
||||
```
|
||||
|
||||
If you put the config file in `/etc/default/fediblockhole.conf.toml` you don't need to pass the config file path.
|
||||
If you put the config file in `/etc/default/fediblockhole.conf.toml` you don't need to pass in the config file path.
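
One common way to get this behaviour is a default value on the command-line option. The sketch below assumes `argparse`; only the `-c` flag and the default path come from this README, and the `dest` name and help text are illustrative.

```python
import argparse

DEFAULT_CONFIG = "/etc/default/fediblockhole.conf.toml"


def parse_args(argv=None) -> argparse.Namespace:
    """Parse command-line arguments, falling back to the default config path."""
    parser = argparse.ArgumentParser(description="Sync domain blocklists.")
    parser.add_argument("-c", dest="config", default=DEFAULT_CONFIG,
                        help="Path to the TOML config file.")
    return parser.parse_args(argv)


default_args = parse_args([])
custom_args = parse_args(["-c", "/tmp/custom.conf.toml"])
```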
|
||||
|
||||
## More advanced configuration
|
||||
|
||||
For a list of possible configuration options, check the `--help` and read the
|
||||
sample configuration file in `etc/sample.fediblockhole.conf.toml`.
|
||||
|
||||
### keep_intermediate
|
||||
### save_intermediate
|
||||
|
||||
This option tells the tool to save the unmerged blocklists it fetches from
|
||||
remote instances and URLs into separate files. This is handy for debugging, or
|
||||
|
|
|
@ -108,7 +108,8 @@ def sync_blocklists(conf: dict):
|
|||
for dest in conf.blocklist_instance_destinations:
|
||||
domain = dest['domain']
|
||||
token = dest['token']
|
||||
push_blocklist(token, domain, merged.values(), conf.dryrun, import_fields)
|
||||
max_followed_severity = dest.get('max_followed_severity', 'silence')
|
||||
push_blocklist(token, domain, merged.values(), conf.dryrun, import_fields, max_followed_severity)
|
||||
|
||||
def merge_blocklists(blocklists: dict, mergeplan: str='max') -> dict:
|
||||
"""Merge fetched remote blocklists into a bulk update
|
||||
|
@ -125,7 +126,7 @@ def merge_blocklists(blocklists: dict, mergeplan: str='max') -> dict:
|
|||
domain = newblock['domain']
|
||||
# If the domain has two asterisks in it, it's obfuscated
|
||||
# and we can't really use it, so skip it and do the next one
|
||||
if '**' in domain:
|
||||
if '*' in domain:
|
||||
log.debug(f"Domain '{domain}' is obfuscated. Skipping it.")
|
||||
continue
|
||||
|
||||
|
@ -177,7 +178,7 @@ def apply_mergeplan(oldblock: dict, newblock: dict, mergeplan: str='max') -> dic
|
|||
blockdata['severity'] = newblock['severity']
|
||||
|
||||
# If obfuscate is set and is True for the domain in
|
||||
# any blocklist then obfuscate is set to false.
|
||||
# any blocklist then obfuscate is set to True.
|
||||
if newblock.get('obfuscate', False):
|
||||
blockdata['obfuscate'] = True
|
||||
|
||||
|
@ -253,7 +254,7 @@ def fetch_instance_blocklist(host: str, token: str=None, admin: bool=False,
|
|||
url = urlstring.strip('<').rstrip('>')
|
||||
|
||||
log.debug(f"Found {len(domain_blocks)} existing domain blocks.")
|
||||
# Remove fields not in import list
|
||||
# Remove fields not in import list.
|
||||
for row in domain_blocks:
|
||||
origrow = row.copy()
|
||||
for key in origrow:
|
||||
|
@ -274,18 +275,98 @@ def delete_block(token: str, host: str, id: int):
|
|||
)
|
||||
if response.status_code != 200:
|
||||
if response.status_code == 404:
|
||||
log.warn(f"No such domain block: {id}")
|
||||
log.warning(f"No such domain block: {id}")
|
||||
return
|
||||
|
||||
raise ValueError(f"Something went wrong: {response.status_code}: {response.content}")
|
||||
|
||||
def fetch_instance_follows(token: str, host: str, domain: str) -> int:
|
||||
"""Fetch the followers of the target domain at the instance
|
||||
|
||||
@param token: the Bearer authentication token for OAuth access
|
||||
@param host: the instance API hostname/IP address
|
||||
@param domain: the domain to search for followers of
|
||||
@returns: int, number of local followers of remote instance accounts
|
||||
"""
|
||||
api_path = "/api/v1/admin/measures"
|
||||
url = f"https://{host}{api_path}"
|
||||
|
||||
key = 'instance_follows'
|
||||
|
||||
# This data structure only allows us to request a single domain
|
||||
# at a time, which limits the load on the remote instance of each call
|
||||
data = {
|
||||
'keys': [
|
||||
key
|
||||
],
|
||||
key: { 'domain': domain },
|
||||
}
|
||||
|
||||
# The Mastodon API only accepts JSON formatted POST data for measures
|
||||
response = requests.post(url,
|
||||
headers={
|
||||
'Authorization': f"Bearer {token}",
|
||||
},
|
||||
json=data,
|
||||
)
|
||||
if response.status_code != 200:
|
||||
if response.status_code == 403:
|
||||
log.error(f"Cannot fetch follow information for {domain} from {host}: {response.content}")
|
||||
|
||||
raise ValueError(f"Something went wrong: {response.status_code}: {response.content}")
|
||||
|
||||
# Get the total returned
|
||||
follows = int(response.json()[0]['total'])
|
||||
return follows
|
||||
|
||||
def check_followed_severity(host: str, token: str, domain: str,
|
||||
severity: str, max_followed_severity: str='silence'):
|
||||
"""Check an instance to see if it has followers of a to-be-blocked instance"""
|
||||
|
||||
# If the instance has accounts that follow people on the to-be-blocked domain,
|
||||
# limit the maximum severity to the configured `max_followed_severity`.
|
||||
follows = fetch_instance_follows(token, host, domain)
|
||||
if follows > 0:
|
||||
log.debug(f"Instance {host} has {follows} followers of accounts at {domain}.")
|
||||
if SEVERITY[severity] > SEVERITY[max_followed_severity]:
|
||||
log.warning(f"Instance {host} has {follows} followers of accounts at {domain}. Limiting block severity to {max_followed_severity}.")
|
||||
return max_followed_severity
|
||||
else:
|
||||
return severity
|
||||
|
||||
def is_change_needed(oldblock: dict, newblock: dict, import_fields: list):
|
||||
"""Compare block definitions to see if changes are needed"""
|
||||
# Check if anything is actually different and needs updating
|
||||
change_needed = []
|
||||
|
||||
for key in import_fields:
|
||||
try:
|
||||
oldval = oldblock[key]
|
||||
newval = newblock[key]
|
||||
log.debug(f"Compare {key} '{oldval}' <> '{newval}'")
|
||||
|
||||
if oldval != newval:
|
||||
log.debug("Difference detected. Change needed.")
|
||||
change_needed.append(key)
|
||||
break
|
||||
|
||||
except KeyError:
|
||||
log.debug(f"Key '{key}' missing from block definition so cannot compare. Continuing...")
|
||||
continue
|
||||
|
||||
return change_needed
|
||||
|
||||
def update_known_block(token: str, host: str, blockdict: dict):
|
||||
"""Update an existing domain block with information in blockdict"""
|
||||
api_path = "/api/v1/admin/domain_blocks/"
|
||||
|
||||
id = blockdict['id']
|
||||
blockdata = blockdict.copy()
|
||||
del blockdata['id']
|
||||
try:
|
||||
id = blockdict['id']
|
||||
blockdata = blockdict.copy()
|
||||
del blockdata['id']
|
||||
except KeyError:
|
||||
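# Fail loudly instead of dropping into the debugger
|
||||
log.error(f"Cannot update a domain block without an 'id': {blockdict}")
|
||||
raise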
|
||||
|
||||
url = f"https://{host}{api_path}{id}"
|
||||
|
||||
|
@ -308,12 +389,20 @@ def add_block(token: str, host: str, blockdata: dict):
|
|||
headers={'Authorization': f"Bearer {token}"},
|
||||
data=blockdata
|
||||
)
|
||||
if response.status_code != 200:
|
||||
if response.status_code == 422:
|
||||
# A stricter block already exists. Probably for the base domain.
|
||||
err = json.loads(response.content)
|
||||
log.warning(err['error'])
|
||||
|
||||
elif response.status_code != 200:
|
||||
|
||||
raise ValueError(f"Something went wrong: {response.status_code}: {response.content}")
|
||||
|
||||
def push_blocklist(token: str, host: str, blocklist: list[dict],
|
||||
dryrun: bool=False,
|
||||
import_fields: list=['domain', 'severity']):
|
||||
import_fields: list=['domain', 'severity'],
|
||||
max_followed_severity='silence',
|
||||
):
|
||||
"""Push a blocklist to a remote instance.
|
||||
|
||||
Merging the blocklist with the existing list the instance has,
|
||||
|
@ -326,47 +415,42 @@ def push_blocklist(token: str, host: str, blocklist: list[dict],
|
|||
"""
|
||||
log.info(f"Pushing blocklist to host {host} ...")
|
||||
# Fetch the existing blocklist from the instance
|
||||
# Force use of the admin API
|
||||
# Force use of the admin API, and add 'id' to the list of fields
|
||||
if 'id' not in import_fields:
|
||||
import_fields.append('id')
|
||||
serverblocks = fetch_instance_blocklist(host, token, True, import_fields)
|
||||
|
||||
# Convert serverblocks to a dictionary keyed by domain name
|
||||
# # Convert serverblocks to a dictionary keyed by domain name
|
||||
knownblocks = {row['domain']: row for row in serverblocks}
|
||||
|
||||
for newblock in blocklist:
|
||||
|
||||
log.debug(f"applying newblock: {newblock}")
|
||||
log.debug(f"Applying newblock: {newblock}")
|
||||
oldblock = knownblocks.get(newblock['domain'], None)
|
||||
if oldblock:
|
||||
log.debug(f"Block already exists for {newblock['domain']}, checking for differences...")
|
||||
|
||||
# Check if anything is actually different and needs updating
|
||||
change_needed = False
|
||||
|
||||
for key in import_fields:
|
||||
try:
|
||||
oldval = oldblock[key]
|
||||
newval = newblock[key]
|
||||
log.debug(f"Compare {key} '{oldval}' <> '{newval}'")
|
||||
|
||||
if oldval != newval:
|
||||
log.debug("Difference detected. Change needed.")
|
||||
change_needed = True
|
||||
break
|
||||
|
||||
except KeyError:
|
||||
log.debug(f"Key '{key}' missing from block definition so cannot compare. Continuing...")
|
||||
continue
|
||||
change_needed = is_change_needed(oldblock, newblock, import_fields)
|
||||
|
||||
if change_needed:
|
||||
log.info(f"Change detected. Updating domain block for {oldblock['domain']}")
|
||||
blockdata = oldblock.copy()
|
||||
blockdata.update(newblock)
|
||||
if not dryrun:
|
||||
update_known_block(token, host, blockdata)
|
||||
# add a pause here so we don't melt the instance
|
||||
time.sleep(1)
|
||||
else:
|
||||
log.info("Dry run selected. Not applying changes.")
|
||||
# Change might be needed, but let's see if the severity
|
||||
# needs to change. If not, maybe no changes are needed?
|
||||
newseverity = check_followed_severity(host, token, oldblock['domain'], newblock['severity'], max_followed_severity)
|
||||
if newseverity != oldblock['severity']:
|
||||
newblock['severity'] = newseverity
|
||||
change_needed.append('severity')
|
||||
|
||||
# Change still needed?
|
||||
if change_needed:
|
||||
log.info(f"Change detected. Updating domain block for {oldblock['domain']}")
|
||||
blockdata = oldblock.copy()
|
||||
blockdata.update(newblock)
|
||||
if not dryrun:
|
||||
update_known_block(token, host, blockdata)
|
||||
# add a pause here so we don't melt the instance
|
||||
time.sleep(1)
|
||||
else:
|
||||
log.info("Dry run selected. Not applying changes.")
|
||||
|
||||
else:
|
||||
log.debug("No differences detected. Not updating.")
|
||||
|
@ -385,6 +469,9 @@ def push_blocklist(token: str, host: str, blocklist: list[dict],
|
|||
'reject_reports': newblock.get('reject_reports', False),
|
||||
'obfuscate': newblock.get('obfuscate', False),
|
||||
}
|
||||
|
||||
# Make sure the new block doesn't clobber a domain with followers
|
||||
blockdata['severity'] = check_followed_severity(host, token, newblock['domain'], newblock['severity'], max_followed_severity)
|
||||
log.info(f"Adding new block for {blockdata['domain']}...")
|
||||
if not dryrun:
|
||||
add_block(token, host, blockdata)
|
||||
|
|
|
@ -17,11 +17,11 @@ blocklist_url_sources = [
|
|||
|
||||
# List of instances to write blocklist to
|
||||
blocklist_instance_destinations = [
|
||||
# { domain = 'eigenmagic.net', token = '<read_write_token>' },
|
||||
# { domain = 'eigenmagic.net', token = '<read_write_token>', max_followed_severity = 'silence'},
|
||||
]
|
||||
|
||||
## Store a local copy of the remote blocklists after we fetch them
|
||||
#keep_intermediate = true
|
||||
#save_intermediate = true
|
||||
|
||||
## Directory to store the local blocklist copies
|
||||
# savedir = '/tmp'
|
||||
|
|