README: Update readme

Now that we can also purge local media (optional!), we need to adapt the README.
Sebastian Spaeth 2023-09-18 14:01:38 +02:00
parent c1ee679f9e
commit 7b62b49df4


@@ -1,19 +1,20 @@
-Cleanmedia
-==========
+# Cleanmedia
 A poor man's data retention policy for dendrite servers.
-USAGE
-=====
+## USAGE
 Check the command line options with --help. You mainly pass it the dendrite
 configuration file as a means to find a) the media directory and b) the postgres
 credentials for the dendrite data base.
-You can also pass in the number of days you want to keep remote media.
+You can also pass in the number of days you want to keep remote
+media. Optionally, you may also purge media from local users on the
+homeserver.
-How it works:
--------------
+### How it works
+#### Purge remote media (default)
 cleanmedia scours the database for all entries in the media repository
 where user_id is an empty string (that is, the media was not uploaded
@@ -24,40 +25,55 @@ configurable via command line and a default of 30 days)
 This includes a number of remote media that we might want to keep
 (e.g. avatar images of users on remote home servers).
-But the main idea behind focusing on remote media is that a server
-should be able to refetch remote media in case it is needed. It would
-also make sense to delete local media, but that is more
-complicated. (possible scenarios: local media older than Y days, rooms
-that have been left by all users and are thus "unreachable", rooms that
-have been upgraded but have users left in it, media that has not been "accessed" the last Y days,....)
+The main idea behind focusing on remote media is that a server
+should be able to refetch remote media in case it is needed.
-But finding out these things and setting all these policies will be
-way more difficult and in some cases we do not have the information
-we'd need (e.g. if a media is part of an avatar image, or when media
-has been accessed the last time).
+#### Purging "local" media (optional)
-In addition it performs some sanity checks and warns if inconsistencies occur:
-1) Are there thumbnails in the db that do not have
-corresponding media file entries (in the db)?
-Requirements
------
+It also makes sense to delete local media, and it is possible using the
+option -l, but that is more complicated. (Local means, originating by
+users on our homeserver.)
+a) we might be the only source of our user's media, so any local media
+that we purge might not be retrievable by anyone anymore - ever.
+b) it is not easy to decide which local media are safe to purge.
+Possible scenarios: local media older than Y days, rooms that have been
+left by all users and are thus "unreachable", rooms that have been
+upgraded but have users left in it, media that has not been "accessed"
+the last Y days, ....
+Finding out these things and setting all these policies is way more
+difficult and in some cases we do not have the information we'd need
+(e.g. when media has been accessed the last time).
+Right now, we purge all older local media, except for user avatar
+images.
+#### Sanity checks
+In addition, we perform some sanity checks and warns if inconsistencies
+occur:
+1) Are there thumbnails in the db that do not have corresponding media
+file entries (in the db)?
+## Requirements
 - Python >= 3.8
 - psycopg2
 - yaml
-Todo
------
+## Todo
 - Sanity checks: Are files on the file system that the db does not
 know about?
-LICENSE
--=======
+## LICENSE
 This code is released under the GNU GPL v3 or any later version.
-There is no warranty for correctness or data that might be
-accidentally deleted. Assume the worst and hope for the best!
+**There is no warranty for correctness or data that might be
+accidentally deleted. Assume the worst and hope for the best!**
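
For readers of this change, the policy the updated README describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not cleanmedia's actual code: `MediaEntry`, `select_purgeable`, `orphaned_thumbnails` and all field names are hypothetical, the records are plain in-memory objects rather than rows from the dendrite database, and the `is_avatar` flag is assumed to be known already (how avatars are detected is not covered here).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable, List, Set


@dataclass(frozen=True)
class MediaEntry:
    """Hypothetical stand-in for one entry of the media repository."""
    media_id: str
    user_id: str               # empty string: media was fetched from a remote server
    created: datetime
    is_avatar: bool = False    # assumption: avatar status is already known


def select_purgeable(entries: Iterable[MediaEntry], days: int, now: datetime,
                     purge_local: bool = False) -> List[MediaEntry]:
    """Return the entries older than `days` that the described policy would purge."""
    cutoff = now - timedelta(days=days)
    selected = []
    for entry in entries:
        if entry.created >= cutoff:
            continue  # recent enough, keep it
        if entry.user_id == "":
            # Remote media is purged by default (remote avatars included),
            # since it can be refetched from its origin server.
            selected.append(entry)
        elif purge_local and not entry.is_avatar:
            # Local media is purged only on request; user avatars are kept.
            selected.append(entry)
    return selected


def orphaned_thumbnails(thumbnail_media_ids: Iterable[str],
                        media_ids: Set[str]) -> List[str]:
    """Sanity check: media ids referenced by thumbnails but missing from the media entries."""
    return sorted(set(thumbnail_media_ids) - media_ids)


if __name__ == "__main__":
    now = datetime(2023, 9, 18, tzinfo=timezone.utc)
    old = now - timedelta(days=40)
    sample = [
        MediaEntry("remote_old", "", old),
        MediaEntry("local_old", "@alice:example.org", old),
        MediaEntry("local_avatar", "@alice:example.org", old, is_avatar=True),
    ]
    print([e.media_id for e in select_purgeable(sample, 30, now)])
    print([e.media_id for e in select_purgeable(sample, 30, now, purge_local=True)])
```

Keeping the selection separate from any deletion step makes the irreversible part (removing local media that nobody else may hold) easy to review or dry-run before anything is touched.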