Update: The social.tchncs.de maintenance started around 7:30 pm CEST.
Update 2: Almost there... the final rsync to the new minio instance is running. Go rsync, go!!
Update 3: Mastodon has been moved.
In Short
Today I will move social.tchncs.de from t2 to the t2 replacement machine, as t2 has a failing SSD.
This will take between half an hour and two hours.
Later today or tomorrow I will start migrating Illuna Minetest and GitLab.
In Long
What happened?!
One of the two disks in the RAID 1 array shows errors. Our hoster (online.net) is for some reason unable to replace it and instead offered a replacement machine.
I have a limited window of time to do the migration (until 2 July).
Why such a long maintenance?
As I don't have the budget for a failover machine that I could have switched to and promoted to master, I have never really attempted such an operation. Of course it is possible to replicate the database and then do a very clean move to the replacement machine.
I actually even prepared a test setup for exactly this at home.
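For the curious: the core of that test setup is a PostgreSQL hot standby bootstrapped from the primary, so a switchover would only need a short write freeze instead of a full dump and restore. A minimal sketch of the idea; hostname, replication role, and data directory are placeholders, not production values:

```python
#!/usr/bin/env python3
"""Sketch of the replication-based approach from my test setup at home.
Hostname, role, and data directory are placeholders, not production values."""
import subprocess

PRIMARY = "t2.example.org"   # hypothetical address of the current (failing) machine

# Clone the primary's data directory onto the standby and let
# pg_basebackup write the standby configuration for us (-R).
subprocess.run([
    "pg_basebackup",
    "-h", PRIMARY,
    "-U", "replicator",                 # assumed replication role on the primary
    "-D", "/var/lib/postgresql/data",   # placeholder data directory on the standby
    "-R",                               # generate standby/recovery settings
    "-X", "stream",                     # stream WAL while the base backup runs
    "-P",                               # show progress
], check=True)
```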
But as my network connection is quite unpredictable, something that normally takes about two minutes can take up to half a day.
And yes, this literally happened multiple times in recent days.
$image-83312a53-ace0-4ae6-811b-e0d474a6ce8d
For this reason I decided to set the test setup aside and not try what I figured out in production yet;
instead I'll do it the classic way and import a database backup on the new machine.
Everything else is of course already prepared, so the only things that will be done during the maintenance are the following (a rough sketch in code follows the list):
- making a final backup of the database and importing it on the new server
- doing a final sync of the uploads and related files to the (new) minio server (you may notice that images will be loaded from f2.tchncs.de and may want to unblock this domain if you use strict browser addons like uMatrix)
- waiting up to 600 seconds (the DNS TTL) for the new IP address of the social.tchncs.de domain to propagate
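For the curious, here is a rough Python sketch of those steps. The hostname, paths, and database name are placeholders (Mastodon's default database name is assumed), not the actual production values:

```python
#!/usr/bin/env python3
"""Rough sketch of the maintenance steps -- hostname, paths, and the
database name are placeholders, not the real production values."""
import subprocess

NEW_HOST = "t2-new.example.org"                  # hypothetical replacement machine
DB_NAME = "mastodon_production"                  # assumed Mastodon default
UPLOADS = "/home/mastodon/live/public/system/"   # assumed uploads path

def run(*cmd: str) -> None:
    """Run a command, echo it, and abort the migration if it fails."""
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# 1. Final database backup, copied over and imported on the new server.
run("pg_dump", "-Fc", DB_NAME, "-f", "/tmp/mastodon.dump")
run("rsync", "-av", "/tmp/mastodon.dump", f"{NEW_HOST}:/tmp/")
run("ssh", NEW_HOST, f"pg_restore --clean -d {DB_NAME} /tmp/mastodon.dump")

# 2. Final sync of the uploads to the storage behind the (new) minio server.
run("rsync", "-av", "--delete", UPLOADS, f"{NEW_HOST}:{UPLOADS}")

# 3. After the DNS record is switched, resolvers may keep serving the
#    old address for up to the record's TTL (600 seconds here).
run("dig", "+short", "social.tchncs.de")
```

The classic dump-and-restore keeps the procedure simple at the cost of a longer write freeze, which is why the maintenance window is announced as up to two hours.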
I am on this instance now; will this happen again soon?
Well, of course a defective component does not ask for a convenient time to fail.
However, you can normally expect a hoster to be able to replace defective parts,
so I plan to move to a different hoster later this year.
Also, I will continue running tests with the current replication setup, so the next time such a server migration happens, it will be done in no time (on the user side).
It's just bad luck that my network connection is this unstable and that I lost a ton of time.
Wait, if your connection is as unstable as written above, how do you ensure that you can do it in two hours at most?
I will sit out in nature with my smartphone as a hotspot...
$image-5864bd2c-8a2b-4c19-89ab-500ff9124242
There are still a few days left; why don't you keep experimenting with the replication approach?
Because I want a few days as a buffer to finish everything up and to be able to react to anything I may have forgotten before they take the old server away.
Besides that, this externally defined time window of ten days is already impacting high-priority things in my real life.
Why don't you do this when Mastodon is less busy?
That would mean doing it between 2am and 4am in my timezone, and I want to have a clear mind when I do it.
Also, as written above, real-life constraints don't allow me to do such things right now.