How to update the servers the right way

December 23, 2021


At Flussonic we are constantly working to improve our products and frequently release updates to correct bugs and to add new features to the software.

However, we know that updates can be stressful for some of our customers, because things may stop working the way they did before. For this reason, we have written this article with tips to make the update process less painful. Follow our advice step by step and your updates should go much more smoothly.

TL;DR

  • Always read the changelog. It is very useful to know in advance what changes the new version introduces. We publish a changelog with every version.
  • Make backups periodically.
  • Don’t update on a Friday at the end of the workday. If you run into a problem, no one from technical support will be available to help you.
  • If there is something you don’t understand, don’t panic. Contact our technical support requesting clarification and we will gladly help you. We recommend asking any questions you may have to clarify doubts before updating.


Read the Changelog

The changelog contains a list of changes introduced in the new version, with a short description explaining each one of them.

Please read carefully and pay special attention to the new features. We always include links to our knowledge base with detailed instructions.

For example, in the Flussonic 21.12 changelog, we mentioned WebRTC 9 times. Take a few minutes to learn how the new settings can help you improve your service, and practice working with WebRTC on a staging server.

Upgrade staging first

Most operators do not use staging for cost reasons. However, one or two servers should be sufficient for small projects and a short downtime should not translate into losses for the business.

When there are no dedicated staging servers, production servers end up being used for configuration experiments and can go down during software updates, operating system updates, and hardware maintenance. A backup server is an excellent alternative to dedicated staging. It is always on standby and takes load only during peak hours. While the service is running normally, you can disable the backup server and use it to familiarize yourself with the new version.

We recommend updating the staging server first, then opening the web interface and visually checking that the server has started and has begun capturing streams.

Then move on to monitoring based on actual metrics.

Monitoring

Monitoring is an important part of any service. Without adequate and constant monitoring, we can only guess that things are working correctly. It basically amounts to saying “well, everything seems to be working correctly”.

With detailed monitoring, we can talk about actual metrics and understand exactly what is going on: “none of the N metrics deviated, all Z channels have been captured, the input bitrate has not changed”, etc.

Flussonic has a built-in Prometheus exporter and a Grafana Dashboard. These are modern monitoring tools you should be using.

Learn more at https://flussonic.com/doc/api/monitoring-flussonic-with-prometheus/

Enable monitoring on all servers, configure notifications, study the peak load hours, schedule server updates around them, and watch the metrics. This may save you a call to technical support.
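
As a rough illustration of checking "actual metrics" before and after an update, the Python sketch below parses two snapshots in the Prometheus text format and lists the series that disappeared or moved by more than a tolerance. The metric names (`streams_alive`, `input_kbps`), the 10% tolerance, and the assumption that samples carry no trailing timestamp are illustrative only; they are not taken from the Flussonic exporter.

```python
def parse_metrics(text):
    """Parse Prometheus text exposition format into {series: value}.

    Assumes samples have no trailing timestamp, so the value is the
    last space-separated token on each line.
    """
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        series, _, value = line.rpartition(" ")
        if series:
            metrics[series.strip()] = float(value)
    return metrics


def deviations(before, after, tolerance=0.10):
    """Return {series: (old, new)} for series that vanished (new is None)
    or whose value changed by more than `tolerance`, relative to the old value."""
    report = {}
    for series, old in before.items():
        new = after.get(series)
        if new is None:
            report[series] = (old, None)
            continue
        base = abs(old) if old != 0 else 1.0  # avoid division by zero
        if abs(new - old) / base > tolerance:
            report[series] = (old, new)
    return report


# Hypothetical snapshots taken before and after an update.
BEFORE = """\
# TYPE streams_alive gauge
streams_alive 120
input_kbps{stream="news"} 4000
"""
AFTER = """\
streams_alive 118
input_kbps{stream="news"} 2500
"""

for series, (old, new) in deviations(parse_metrics(BEFORE), parse_metrics(AFTER)).items():
    print(f"{series}: {old} -> {new}")
```

In this made-up example only the `input_kbps` series is reported, because the stream count moved by less than 10%. In practice an alerting rule in Prometheus or Grafana would do this job continuously; the script only shows the idea.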

The worst time to update

If you only have one server, there is no way to transfer the load from one node to another. So the most tempting option is to update the server when user demand is lowest: usually late at night, early in the morning, or on weekends.

This is the worst time to update because it is very likely that you will have to work overtime and the technical support service will not be available. If an error occurs, it is likely that you will panic when you see that you cannot solve the problem immediately and when you realize that most of your users are about to return to your service.

We recommend starting an additional server so that you can perform updates during business hours and can rehearse the update in Staging.

So what is the best time to update?

It is best to plan the update for the beginning of the working day, after making sure that you have completed every preparation step:

  • You have studied the changelog in detail
  • You have tried the new version on staging
  • You have configured monitoring
  • You have made backups
  • You have a clear understanding of how to return the system to its original state, if necessary
  • The load has been distributed among the other nodes
  • Subscribers and other departments have been notified of the work
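
The checklist and the timing advice above can be turned into a small pre-flight check. The sketch below is only an illustration: the step names, the Monday-to-Friday rule, and the 09:00 to 12:00 window are assumptions chosen for the example, not values prescribed by Flussonic.

```python
from datetime import datetime

# Hypothetical names for the preparation steps listed above.
CHECKLIST = (
    "changelog_read",
    "staging_tested",
    "monitoring_configured",
    "backups_made",
    "rollback_plan_known",
    "load_redistributed",
    "stakeholders_notified",
)


def missing_steps(done):
    """Return the checklist steps that are not marked True in `done`."""
    return [step for step in CHECKLIST if not done.get(step, False)]


def is_good_update_window(moment, workday_start=9, latest_start=12):
    """True on a weekday (Mon-Fri) between workday_start and latest_start hours.

    This excludes nights, weekends, and the end of the workday, Friday included.
    """
    is_weekday = moment.weekday() < 5  # Monday is 0, Saturday is 5
    return is_weekday and workday_start <= moment.hour < latest_start


def ready_to_update(done, moment):
    """Ready only when every step is done and the time window is sensible."""
    return not missing_steps(done) and is_good_update_window(moment)


if __name__ == "__main__":
    progress = {step: True for step in CHECKLIST}
    progress["backups_made"] = False
    print("Missing:", missing_steps(progress))
    print("Ready now:", ready_to_update(progress, datetime.now()))
```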

Flussonic runs on Linux, so the update should be performed by a system administrator with solid skills in package management, reading logs, and editing text files. It is important to be able to work with systemd and to read other system logs in addition to the Flussonic logs.

If you are unsure of your Linux skills, please contact support and confirm that the engineers will be online during your update process.

Simply put, the best time to upgrade is when you are completely sure you are ready.

In practice, regular updates take no more than 15 minutes, including the time spent reading the changelog.


Don’t forget about backups

A common mistake is not backing up Flussonic. The reasoning goes that Flussonic Media Server does not store personal data, does not store the results of people’s work (e.g. code or text), and is not a file-sharing service.

Flussonic can be set up once and forgotten: it obediently moves bytes from input to output and does not remind you of its existence. Until a “crash” occurs:

  • the administrator accidentally modified the configuration
  • the server’s file system is corrupted
  • the current configuration is not compatible with the previous version of Flussonic
  • the server is physically out of service, you must deploy it from scratch on new hardware

If there is no backup, re-adding even a few dozen streams with their names and sources from memory is neither quick nor pleasant.

It is fairly simple if you have a .txt or Excel file with the provider sources, and very complicated if you need to scan the network, re-add several hundred IP cameras, and redistribute access rights.

That is why it is necessary to back up every day:

  1. Flussonic: copy flussonic.conf off the server on a schedule using crontab. In addition, it is a good idea to collect the configuration through the API.
  2. Watcher: make a backup of the PostgreSQL database and upload it to reliable storage or another server.
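
A daily configuration backup of this kind can be sketched in a few lines of Python. This is a minimal example under stated assumptions: the function name `backup_config`, the retention of 30 copies, and the paths mentioned afterwards are illustrative, and the backup directory is assumed to live on separate storage (for example a network mount) rather than on the same disk.

```python
import shutil
import time
from pathlib import Path


def backup_config(source, backup_dir, keep=30, stamp=None):
    """Copy `source` into `backup_dir` under a timestamped name.

    Keeps only the newest `keep` copies (keep must be at least 1).
    `stamp` can be passed explicitly, mainly for testing.
    Returns the path of the copy that was just written.
    """
    source = Path(source)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)

    # A sortable timestamp, so alphabetical order equals chronological order.
    stamp = stamp or time.strftime("%Y%m%d-%H%M%S")
    target = backup_dir / f"{source.name}.{stamp}"
    shutil.copy2(source, target)  # copy2 also preserves file metadata

    # Prune the oldest copies beyond the retention limit.
    copies = sorted(backup_dir.glob(f"{source.name}.*"))
    for old in copies[:-keep]:
        old.unlink()
    return target
```

For instance, a script that calls `backup_config("/etc/flussonic/flussonic.conf", "/mnt/backup/flussonic")` could be run once a day from a crontab entry; the configuration path is assumed to be the default install location and the mount point is hypothetical. The Watcher database needs a separate PostgreSQL dump, which this sketch does not cover.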

Try to restore the service from scratch from backups on another “clean” server. This will ensure that the backup contains all the necessary data.

Reverting a service

Did something go wrong? Did the package fail to install? Has monitoring shown a drop in traffic? Have some sources stopped working? Have subscribers started calling for assistance?

Don’t panic. Calmly go to the Flussonic web interface and describe the situation on the upload debugging page.

Send the ID you receive to support@flussonic.com or create a ticket.

If the situation is not critical, wait for a response from the technical support team. Our experts will help you stabilize the service or update the configuration for a new version.

Is the situation critical? Install the previous version of Flussonic: the one immediately before the new one, not a version from a year ago.

Be prepared: you will also need to roll back the dependencies, and the older version may not accept your flussonic.conf or Watcher database, because their structure has been updated for the new version.

This is exactly why we made backups and practiced on the backup and staging servers. More importantly, Flussonic technical support will be available to you. Try not to perform actions that you do not understand, and do not copy commands from the Internet. Contact support, tell us where your backups are stored, and describe the steps you have already completed.

Don’t update everything at once

- Hi support, please help me roll back, I have updated and my service is down!
- Let’s see … Your config file contained options that are no longer used. I have rolled back your server. Please read the changelog and prepare for the update.
- And what should I do with the other 11 servers?

Cases like this also happen: the commands to update the server are copied to the clipboard and pasted one after another into a dozen terminals. After all the servers are updated, it becomes clear that none of them are working due to a small bug.

Never update all servers at the same time. Upgrade one server, wait a bit, then repeat with the next one. If you have a lot of servers, it is worth automating this with Ansible (we can teach you how to do it, or you may enable extended support).
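
The "one server, wait, then the next" rule is easy to express in code. The sketch below is plain Python rather than Ansible, and it deliberately leaves the real work to two callables that you would supply yourself: `update` (for example, running your package upgrade over SSH) and `is_healthy` (for example, checking your monitoring). Both names, and the five-minute settle time, are assumptions made for the illustration.

```python
import time


def rolling_update(servers, update, is_healthy, settle_seconds=300, sleep=time.sleep):
    """Update servers one at a time and stop at the first unhealthy one.

    Returns (updated, failed): the servers that were updated and passed the
    health check, and the server that failed it (or None if all passed).
    Servers after a failure are left untouched on the old version.
    """
    updated = []
    for server in servers:
        update(server)
        sleep(settle_seconds)  # give streams time to reconnect before judging
        if not is_healthy(server):
            return updated, server
        updated.append(server)
    return updated, None


if __name__ == "__main__":
    # Dry run with stand-in callables; "edge-2" is made to fail on purpose.
    done, failed = rolling_update(
        ["edge-1", "edge-2", "edge-3"],
        update=lambda s: print("updating", s),
        is_healthy=lambda s: s != "edge-2",
        settle_seconds=0,
    )
    print("updated:", done, "failed:", failed)
```

The point of the design is that a bad release costs you one server instead of a dozen: the loop never touches the next host until the previous one has been checked.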

Author:
Maksim Klyushkov
Keywords:
backup upgrade