How We Created Our Own Video Storage System
June 25, 2020
Developing and running a video surveillance service means solving many problems. One of the key ones is where and how to store an immense amount of video data. There are many options, but most of them are unprofitable for a telecom operator providing a video surveillance service. With simple DVRs, video is stored on only one disk, and if that disk is damaged, the video is irretrievably lost. You can make the storage more durable by adding redundancy or a backup archive, but that costs more, and RAID does not solve every backup problem.
This is our story about how we chose the optimal video storage scheme for our customers, which options we tried, and why we became disappointed in them. As a result, we came to the idea of creating another, this time our own, storage system.
The very first question to address before choosing where and how to store video is how much space is required. A single stream of 1 to 8 megabits per second takes up roughly 10 to 86 GB of storage per day. In the case of our customers, telecom operators offering video surveillance services, we are not talking about one channel, but about hundreds and thousands. How do you manage the storage of such a huge amount of data while avoiding both data loss and unreasonably high costs?
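As a rough check of these figures, the sketch below converts a constant bitrate into daily storage. It assumes the bitrate is meant in megabits per second, that the totals are per day of continuous recording, and that gigabytes are decimal; the function name is purely illustrative.

```python
def daily_storage_gb(bitrate_mbps: float, cameras: int = 1) -> float:
    """Decimal gigabytes needed for one day of continuous recording."""
    seconds_per_day = 24 * 60 * 60
    megabits = bitrate_mbps * seconds_per_day * cameras
    return megabits / 8 / 1000  # megabits -> megabytes -> gigabytes


for mbps in (1, 8):
    print(f"{mbps} Mbit/s: {daily_storage_gb(mbps):.1f} GB per day per camera")

# A thousand cameras at 2 Mbit/s, in terabytes per day
print(f"{daily_storage_gb(2, cameras=1000) / 1000:.1f} TB per day")
```

At a thousand channels, even a modest bitrate adds up to tens of terabytes per day, which is why the rest of this article is mostly about cost.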
How Do You Reduce Video Size?
The first logical question is how can you reduce the volume of the video stream so that less storage space is required? There are options - some were offered by us, some were offered to us.
Option 1: Storing Video in JPEG Frames
In the past, it was normal to store video as a series of JPEG frames (this format is called Motion JPEG, or MJPEG). Video made up of independent JPEG images is about 10 times larger than the same video compressed with the H.264 codec. Today this type of storage is used only by professional storage systems for original digital content (MxPEG, etc.). Their users accept the volume occupied by the frames because their priority is to preserve the maximum original video quality. We, however, are dealing with bulk video delivery, so this type of storage does not fit our tasks and requirements. MJPEG is an outdated format, suited only to cameras that lack the processing power to encode H.264. It was used only for very small resolutions (640x480 and less) and low frame rates (1 or even 0.5 fps). Today, MJPEG is still used where the communication channel is so narrow that one frame every few seconds is enough, for example in security video surveillance or a live broadcast from a ship.
This approach has nothing to do with reducing volume, but since we were asked about it, it makes sense to mention it. In terms of size, the only thing it beats is storing raw video.
Option 2: Transcoding
We can use another method - transcoding. The idea is that the processor in the camera is weak and the server is strong, so the server can spend more processor cycles on compression and produce a more tightly compressed video. In practice, with a good image (no snow or clouds), transcoding on the server CPU can make the video up to 5 times smaller than what the camera produces.
Alas, this method only works if you can dedicate one computer to every 10-40 cameras, which is very expensive. Today, only video analytics can justify such expenses, not regular video storage.
Option 3: Motion Recording
If we cannot shrink the video that comes from the cameras, we can at least store only part of it.
You can configure recording to start on motion or other triggers. In this case, the cameras record only when something happens in the frame, which saves a lot of space. But when saving space this way, you can lose frames that contain essential information, for example the face of a robber during a break-in at the place where the camera is installed. Then the whole point of video surveillance is lost. Pre-recording mechanisms address this, for example by also saving the minute before the sensor was triggered. They save disk space while still preserving essential footage such as an intruder’s face.
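A pre-recording mechanism of this kind can be sketched as a ring buffer that always holds the most recent segments and is flushed to storage when a trigger fires. This is only an illustration under assumed names (`PreRecordBuffer`, `push`) and an assumed fixed segment length; it does not describe how any particular camera or server implements it, and it leaves out post-trigger recording.

```python
from collections import deque


class PreRecordBuffer:
    """Holds the last `pre_seconds` of video so it can be saved on a trigger."""

    def __init__(self, pre_seconds: float, segment_seconds: float):
        # Number of segments that cover the pre-recording window
        slots = max(1, int(pre_seconds // segment_seconds))
        self.buffer = deque(maxlen=slots)
        self.saved = []  # stands in for the on-disk archive

    def push(self, segment, motion: bool) -> None:
        if motion:
            # Trigger fired: keep the buffered pre-roll, then the live segment
            self.saved.extend(self.buffer)
            self.buffer.clear()
            self.saved.append(segment)
        else:
            # No trigger: remember the segment; the oldest one falls out
            self.buffer.append(segment)


recorder = PreRecordBuffer(pre_seconds=6, segment_seconds=3)
for name, motion in [("s1", False), ("s2", False), ("s3", False), ("s4", True)]:
    recorder.push(name, motion)
print(recorder.saved)
```

Only the segments inside the pre-recording window and the triggering segment reach the archive; everything older is silently dropped, which is exactly the risk described above if the sensor misses an event.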
We offer our customers another method: everything is written to disk, and after a few days only the fragments that contain motion are kept. This reduces the risk posed by a sensor that fails to fire (on movement, door opening, or sound), and it lets you keep the video for much longer, which may be necessary, for example, to find out when the robber visited the building before the crime.
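The retention rule described here can be sketched as a simple predicate over archived segments. The thresholds (`full_days`, `motion_days`) and the `Segment` type are assumptions made for the example; the article only says "a few days".

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86400


@dataclass
class Segment:
    start: float       # Unix timestamp of the segment start, in seconds
    has_motion: bool   # whether motion was detected in this segment


def keep_segment(seg: Segment, now: float,
                 full_days: float = 3, motion_days: float = 30) -> bool:
    """Keep everything for `full_days`, then only motion until `motion_days`."""
    age_days = (now - seg.start) / SECONDS_PER_DAY
    if age_days <= full_days:
        return True
    return seg.has_motion and age_days <= motion_days


now = 100 * SECONDS_PER_DAY
archive = [
    Segment(now - 1 * SECONDS_PER_DAY, has_motion=False),   # recent, kept
    Segment(now - 5 * SECONDS_PER_DAY, has_motion=False),   # old and quiet, dropped
    Segment(now - 5 * SECONDS_PER_DAY, has_motion=True),    # old with motion, kept
]
print([keep_segment(s, now) for s in archive])
```

Because motion is evaluated after the fact on footage that is already on disk, a missed real-time trigger does not cost any video during the first few days.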
Historically, DVRs were one of the first ways of storing video, and telecom operators often start with them. By now, video storage has advanced so far that choosing DVRs is unprofitable and not always convenient. Installing and maintaining DVRs is a significant expense. Another point to consider is the physical protection of the DVR. If it is stolen, it disappears along with all the recordings stored on it, unless the archive is also uploaded to a separate server in the cloud. And that is just one DVR, whose single disk is designed to record 12 to 32 cameras. The more cameras the telecom operator serves, the more DVRs need to be installed, and the higher the cost of servicing them.
We have seen how painful it is for telecom operators to build a service on DVRs. The installers constantly travel in circles repairing them, so we really do not recommend storing video exclusively this way.
Servers and RAID
It is much more reliable to record the footage on a server’s hard disk, but a question arises: how will we serve the recorded video to viewers? Serving it straight from the HDD spindle is not an option, because that would quickly wear the disk out, so we would need a cache in front of the video storage. A cache is feasible if we are talking about a small archive for a limited number of channels. For this, you can try to duplicate data.
What hard drives can you use for this? It would seem that the larger the drive, the better, but 12-terabyte drives are poorly suited to a short archive: to fill such a drive at a small storage depth, you need to write a lot of streams to it at the same time, which the drive can’t really handle well. Drives of this size often use shingled ("tiled") magnetic recording, and the software has to work very carefully with it: try not to overwrite but to append, and then erase the entire hard drive. The disk space can be enough for 1000 cameras, but at a typical sustained recording speed of 150-200 megabits per second, we can only cover about 100 cameras at a time. So the larger the hard drive, the smaller the share of it that is actually used, and the less economical it is.
To duplicate data, many of our customers want to use RAID on their servers. RAID duplicates data between hard drives so that the system can safely survive the failure of one of them. That sounds good, but we really don’t recommend it. In our experience, many users have never tested recovery under high load, for example by pulling a drive out of a running server. This is where the specifics of video come in: the load is very smooth, which makes it very easy to run the system close to its limit. If a drive fails while the RAID is at 90% load, the load rises above 100%. When we then try to restore the data, things get even worse. If for some reason a frame is not written in time, a queue of frames builds up, the recording speed drops, and the slightest delay leads to the loss of entire fragments of video.
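The gap between capacity and write throughput can be put into numbers. The sketch below uses assumed inputs (a 12 TB drive, 1.5 Mbit/s per camera, a one-day archive, 150 Mbit/s of sustained multi-stream writing) to show which limit is hit first; all names and values are illustrative.

```python
def cameras_by_capacity(disk_tb: float, bitrate_mbps: float, depth_days: float) -> int:
    """How many cameras fit on the disk for the given archive depth."""
    gb_per_camera = bitrate_mbps * 86400 * depth_days / 8 / 1000
    return int(disk_tb * 1000 / gb_per_camera)


def cameras_by_throughput(write_mbps: float, bitrate_mbps: float) -> int:
    """How many streams the disk can absorb at the same time."""
    return int(write_mbps / bitrate_mbps)


by_space = cameras_by_capacity(disk_tb=12, bitrate_mbps=1.5, depth_days=1)
by_speed = cameras_by_throughput(write_mbps=150, bitrate_mbps=1.5)
usable = min(by_space, by_speed)
print(f"capacity allows {by_space} cameras, throughput allows {by_speed}")
print(f"share of the disk actually used: {usable / by_space:.0%}")
```

With these assumptions the drive is throughput-bound long before it is full, so most of the capacity that was paid for stays empty unless the archive depth is increased.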
There are even worse situations: not only the first, but also the second hard drive may fail due to increased load, and this is a fairly common problem for RAID (especially if hard drives from the same batch were purchased).
Enterprise Storage and Hard Drives
Here enterprise-class storage can come into play - many of our customers turn to them, but the results are mixed.
Full-fledged enterprise-class storage has undoubted advantages: scalability (although it is rarely clear what that means in practice; it is often more of a buzzword) and a certain fault tolerance (for a lot of money). However, we ran into an obvious disadvantage: a disk shelf is, in fact, a separate computer, and quite an expensive one.
Disk shelves, or SANs, are designed to serve a changing number of virtual environments where it is not known in advance how much space will be needed. For video storage, they are too expensive and inconvenient, because everything tends to end up in a "one video server per shelf" configuration. We end up buying one computer to receive the video and putting another computer next to it to copy and record it. The number of computers doubles for no particular purpose.
Our customers do not want to complicate their task by buying additional servers and external storage, and we can understand that. An alternative is the Amazon S3 object storage. Its advantages are a simple and intuitive interface, extremely rare data loss, and excellent fault tolerance. But its one disadvantage outweighs all the advantages at once: it is incredibly expensive.
It would seem that Amazon S3 charges ridiculously little for one write operation. But if we record video in an indexed archive format, we get at least one write operation per segment. Say we have 20 segments of video (3 seconds each) per minute. That makes 1,200 segments per hour and 28,800 segments per day of continuous recording. If we record 1000 cameras in 3-second segments, we get about 29 million segments per day, or roughly 864 million per month.
The tiny cost of $0.005 per 1,000 requests (at the time of writing this article) turns into about $4,300 per month for just one write per segment, or about $4.3 per camera. Such a price tag makes a subscription video surveillance service unprofitable, because on top of it come the cost of storage, the cost of channels, and the cost of servers.
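The request arithmetic can be reproduced in a few lines. The sketch assumes 3-second segments, a 30-day month, one write request per segment, and the per-request price quoted in this article; current S3 pricing may differ, and the function name is illustrative.

```python
SEGMENT_SECONDS = 3
PRICE_PER_1000_REQUESTS = 0.005  # USD, the figure quoted in this article


def monthly_request_cost(cameras: int, days: int = 30,
                         writes_per_segment: int = 1) -> float:
    """Cost in USD of write requests alone, excluding storage and traffic."""
    segments_per_day = 24 * 3600 // SEGMENT_SECONDS  # 28,800 per camera
    requests = cameras * segments_per_day * days * writes_per_segment
    return requests / 1000 * PRICE_PER_1000_REQUESTS


total = monthly_request_cost(cameras=1000)
print(f"total: ${total:,.0f} per month, ${total / 1000:.2f} per camera")

# An indexed archive that also rewrites an index object doubles the bill
print(f"with two writes per segment: ${monthly_request_cost(1000, writes_per_segment=2):,.0f}")
```

The cost scales linearly with the number of writes per segment, so any extra index or metadata update per segment multiplies the bill.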
Ceph object storage can be an alternative to Amazon S3. Its benefits are the ability to back up data and to record in segments; its drawback is a hidden cost of ownership. Some will say that Ceph is free, but Amazon’s rivals are springing up like mushrooms after rain, so the prices they offer are not prohibitive, while running your own storage cluster is not so cheap in itself. To operate Ceph, you need a good administrator with the expertise to repair it. We encourage customers to work with Ceph only if they are ready to conduct system tests that consist of abruptly disconnecting a hard drive from the server or disconnecting a whole server. If such tests are not carried out, the team will simply not have the expertise to handle a real hard drive failure.
Our Way Out
In the end, having tried all these options, we settled on the following requirements for the storage system:
- fault tolerance
- relatively low cost - the video surveillance service should pay off
- reliability - if one hard drive slows down or fails, this should not affect the rest of the system
- the ability to not only save, but also to distribute video
- maximum simplicity of design: we have no desire, ability, or need to get involved in writing our own file system
As a result, we developed our own system under the working title Flussonic RAID. The server administrator mounts all the hard drives separately, there are no dependencies between them, and metadata is placed on a separate SSD. Video is distributed evenly across these disks in separate fragments, so in the event of a failure the most we lose is the data on one particular disk, that is, individual fragments of a video. For us, the most important thing is that even if 9 out of 10 disks fail, the 10th will still work, and the data on it will not disappear. The system is scalable: you can buy a server with 20 slots for hard drives and expand flexibly, buying additional drives as needed.
In fact, we moved the task of managing a pack of hard drives to the application level. We built a solution that is more specialized, simpler, and more effective for our specific task than general-purpose RAID5, which is good for everyone but ideal for no one. The main reason this is possible is that video is never overwritten! Once it is written to the disk, it can only be erased. This is a very convenient append-only data pattern.
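The failure behaviour of such a layout can be illustrated with a toy model: independent disks, a separate index that maps each segment to its disk, and no redundancy inside the server. This is a simplified sketch with invented names (`DiskPool`, `write`, `fail`) and plain round-robin placement; it is not the actual Flussonic RAID code.

```python
class DiskPool:
    """Toy model of independently mounted disks plus a segment index."""

    def __init__(self, disk_names):
        self.disks = {name: {} for name in disk_names}  # disk -> {segment_id: data}
        self.index = {}   # segment_id -> disk name (the metadata kept on an SSD)
        self.writes = 0

    def write(self, segment_id, data) -> str:
        alive = list(self.disks)
        if not alive:
            raise RuntimeError("no working disks left")
        disk = alive[self.writes % len(alive)]  # round-robin over working disks
        self.disks[disk][segment_id] = data     # append-only: never rewritten
        self.index[segment_id] = disk
        self.writes += 1
        return disk

    def fail(self, disk: str) -> list:
        """Drop a disk and return the segment ids that were lost with it."""
        del self.disks[disk]
        lost = [seg for seg, name in self.index.items() if name == disk]
        for seg in lost:
            del self.index[seg]
        return lost

    def read(self, segment_id):
        disk = self.index.get(segment_id)
        return None if disk is None else self.disks[disk][segment_id]


pool = DiskPool(["disk0", "disk1", "disk2"])
for i in range(9):
    pool.write(i, f"segment-{i}")
print("lost with disk1:", pool.fail("disk1"))
print("still readable:", [i for i in range(9) if pool.read(i) is not None])
```

Losing a disk removes only the fragments stored on it; every other disk keeps serving its own segments, and new writes simply continue on the disks that remain.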
We decided to abandon data duplication inside one server.
- Firstly, it introduces all the problems that a regular RAID5 has (degraded state during disk recovery)
- Secondly, it does not protect against server failure. To back up data today it’s cheaper and easier to duplicate data between servers.
Even such a simple mechanism held a lot of difficulties, and one of the most subtle details is choosing the disk for the next piece of data. On the one hand, the disks must be loaded evenly; on the other hand, in a system with 70 disks we want the bulk of them to stay idle and only a few active ones to be used for writing (to save energy).
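One possible way to reconcile these two goals is sketched below: only the first few disks that still have room form the active set, and within that set the disk with the most free space receives the next segment. The policy, the `Disk` type, and the `active_count` parameter are assumptions made for illustration, not the selection algorithm actually used in Flussonic RAID.

```python
from dataclasses import dataclass


@dataclass
class Disk:
    name: str
    capacity: int   # in arbitrary units, e.g. segments
    used: int = 0

    @property
    def free(self) -> int:
        return self.capacity - self.used


def pick_disk(disks, active_count: int, segment_size: int) -> Disk:
    """Choose a disk for the next segment and account for the write."""
    # Active set: the first `active_count` disks that can still take the segment.
    # Disks further down the list stay idle until an active one fills up.
    active = [d for d in disks if d.free >= segment_size][:active_count]
    if not active:
        raise RuntimeError("all disks are full")
    target = max(active, key=lambda d: d.free)  # even load inside the active set
    target.used += segment_size
    return target


disks = [Disk("disk0", 10), Disk("disk1", 10), Disk("disk2", 10)]
order = [pick_disk(disks, active_count=2, segment_size=5).name for _ in range(6)]
print(order)
```

In this sketch the third disk receives no writes until one of the first two is full, so idle disks can stay spun down while the active ones share the load evenly.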
We plan to improve the system further. We are currently working on an intuitive interface for monitoring migration status, graceful shutdown of failed disks, an event API, SMART monitoring, and the ability to monitor all the hard drives in a cluster at the same time.
Our customers are already using the new system and are very pleased that its speed and responsiveness have increased significantly. A local slowdown of one hard drive no longer slows down the entire system. At the same time, the cost of storage is relatively low, which makes the video surveillance service cost-effective for the operator.