From 7b62b49df494b6fcca3d84f43d5abea5823eaa36 Mon Sep 17 00:00:00 2001
From: Sebastian Spaeth <Sebastian@SSpaeth.de>
Date: Mon, 18 Sep 2023 14:01:38 +0200
Subject: [PATCH] README: Update readme

Now that we can also purge local media (optional!), we need to adapt the README.
---
 README.md | 72 +++++++++++++++++++++++++++++++++----------------------
 1 file changed, 44 insertions(+), 28 deletions(-)

diff --git a/README.md b/README.md
index d7eb871..261c75b 100644
--- a/README.md
+++ b/README.md
@@ -1,19 +1,20 @@
-Cleanmedia
-==========
+# Cleanmedia
 
 A poor man's data retention policy for dendrite servers.
 
-USAGE
-=====
+## USAGE
 
 Check the command line options with --help. You mainly pass it the dendrite
 configuration file as a means to find a) the media directory and b) the postgres
 credentials for the dendrite data base.
 
-You can also pass in the number of days you want to keep remote media.
+You can also pass in the number of days you want to keep remote
+media. Optionally, you may also purge media from local users on the
+homeserver.
 
-How it works:
--------------
+### How it works
+
+#### Purge remote media (default)
 
 cleanmedia scours the database for all entries in the media repository
 where user_id is an empty string (that is, the media was not uploaded
@@ -24,40 +25,55 @@ configurable via command line and a default of 30 days)
 This includes a number of remote media that we might want to keep
 (e.g. avatar images of users on remote home servers).
 
-But the main idea behind focusing on remote media is that a server
-should be able to refetch remote media in case it is needed. It would
-also make sense to delete local media, but that is more
-complicated. (possible scenarios: local media older than Y days, rooms
-that have been left by all users and are thus "unreachable", rooms that
-have been upgraded but have users left in it, media that has not been "accessed" the last Y days,....)
+The main idea behind focusing on remote media is that a server
+should be able to refetch remote media in case it is needed.
 
-But finding out these things and setting all these policies will be
-way more difficult and in some cases we do not have the information
-we'd need (e.g. if a media is part of an avatar image, or when media
-has been accessed the last time).
+#### Purging "local" media (optional)
 
-In addition it performs some sanity checks and warns if inconsistencies occur:
- 1) Are there thumbnails in the db that do not have
-    corresponding media file entries (in the db)?
+It also makes sense to delete local media, and it is possible using the
+option -l, but that is more complicated. (Local means, originating by
+users on our homeserver.)
 
-Requirements
-----
+a) we might be the only source of our user's media, so any local media
+that we purge might not be retrievable by anyone anymore - ever.
+
+b) it is not easy to decide which local media are safe to purge.
+
+Possible scenarios: local media older than Y days, rooms that have been
+left by all users and are thus "unreachable", rooms that have been
+upgraded but have users left in it, media that has not been "accessed"
+the last Y days, ....
+
+Finding out these things and setting all these policies is way more
+difficult and in some cases we do not have the information we'd need
+(e.g. when media has been accessed the last time).
+
+Right now, we purge all older local media, except for user avatar
+images.
+
+#### Sanity checks
+
+In addition, we perform some sanity checks and warns if inconsistencies
+occur:
+
+1) Are there thumbnails in the db that do not have corresponding media
+    file entries (in the db)?
+
+## Requirements
 
  - Python >= 3.8
  - psycopg2
  - yaml
 
 
-Todo
-----
+## Todo
 
 - Sanity checks: Are files on the file system that the db does not
   know about?
 
-LICENSE
-=======
+## LICENSE
 
 This code is released under the GNU GPL v3 or any later version.
 
-There is no warranty for correctness or data that might be
-accidentally deleted. Assume the worst and hope for the best!
\ No newline at end of file
+**There is no warranty for correctness or data that might be
+accidentally deleted. Assume the worst and hope for the best!**
\ No newline at end of file