The recent series of unprecedented terabit-scale DDoS attacks has unsettled even those who are quite used to such things. The situation is likely to repeat itself shortly and may turn out to be much more destructive, since the main causes behind the rise of a botnet as powerful as Mirai have not been eliminated.
The poor security of IP cameras is an open secret, and it has many causes: long-established usage practices, the difficulty of accessing the devices, and the passivity of manufacturers.
I’d like to describe a few aspects of this issue and consider some measures that can make the problem less acute.
The current situation
Millions of IP cameras are sold worldwide. The very first link found on the Internet mentions 200 million surveillance cameras, and this number looks quite realistic.
Not all CCTV cameras are IP cameras; there are still lots of analog cameras on the market. Analog cameras are simple, convenient in their own way, and absolutely predictable, since they contain no buggy software, just good old hardware. However, despite the recent surge of AHD and HDCVI, IP cameras keep replacing analog ones thanks to better image quality and greater flexibility.
An IP camera consists of a sensor, a chip for video processing and encoding, a general-purpose processor, peripherals such as GPIO, an SD card slot and Ethernet, the software inside, a lens and a body.
Despite the seemingly vast variety of IP cameras on the market (it may look as if there are thousands of manufacturers or even more), there are actually not that many companies that really make something. If we lower the price bar and leave aside major players such as Axis, it turns out that in the price band up to, let’s say, $150 per camera there are just a few companies.
To be exact, there are 3-4 sensor manufacturers, 4-6 producers of combined chip+processor parts (SoCs), and a limited number of companies that assemble all this on a board. Even the bodies do not vary much, while there are plenty of people ready to put it all together, rig it up and stick whatever label is required on the box. The most interesting part of this ecosystem is the software developers. To a large extent, it was their engineering decisions that gave free rein to Mirai.
There are few of them, too: the number of software producers in this field has not grown to any considerable value. Almost all the information is concealed, and this causes problems. Almost all cameras run a regular Linux with kernel modules. These modules provide a means to configure the chip that processes video from the sensor and to get this video into the memory of a user program. There are cameras that do not use Linux: generally they run one large program in ring 0 (if it is correct to say so about ARM at all), but that is really expensive in terms of development and thus uncommon nowadays. Sending video to the network is treated as a mere formality, and this formality is carried out in a pretty mediocre way.
Not only have the majority of cheap cameras mastered the skill of losing data over TCP, they also mostly ship with the same factory-set password. And that is exactly what we see in the botnet source code.
Not to mention those mysterious magic UDP packets that allow network settings to be changed. All the crazy stuff we make fun of in Hollywood movies, like «I’ve entered their system and now I’m going to switch their cameras to a static image», turns out to be not crazy at all but sad and crude reality.
How come manufacturers are so careless? Their attitude resembles the unease of parents who forbid their kids to use “bad” words and say “there is no such word”, while the thing the word names obviously exists. It’s almost impossible to get any exact information about camera internals from firmware producers; as a rule, they hold the opinion that a camera is a closed device with probably no Linux in it. Or maybe there is Linux installed, but that’s nobody’s business, so please don’t look into it, look at the advertising booklets instead.
Let’s try to understand what could have caused such a situation.
First of all, we should remember that video surveillance is traditionally associated with isolated networks. It all started with four cameras connected directly to a display and arranged into a mosaic on a screen watched by a security guard (by the way, we are really good at building a mosaic on the server, though not for security but to save space). The guard sits with a baton at hand for the whole shift and is ready to rush to wherever he notices a problem within 1250 milliseconds. This neural network handles any deviation far better than an Nvidia Tesla. The only catch is that this neural network sometimes likes to have a beer or a nap, and that messes it all up.
Connecting to such a network requires, first of all, physical presence, and that can be prevented with cameras pointed at the cable along its entire run. Some people say that in certain national security services these cables are placed inside hermetically sealed, pressurized tubes, so that a pressure sensor goes off on any attempt to pierce them.
What is important to understand: so far almost nobody provides even TLS encryption of outgoing video! Video surveillance as a field is not really focused on the security of its own infrastructure. At least, not as much as is claimed by those who eagerly take our money for protection against dire threats.
Secondly, IP cameras are usually part of a security environment, i.e. part of a complex of software, hardware, people and services. In other words, as a rule an IP camera is connected to a video recorder (the “registrator”), the recorder is monitored by a security guard, the guard has an alarm button, and the alarm button sends a signal to the police. The fact that people nowadays put surveillance cameras in their entrance halls, and municipal governments install thousands of them on the streets, is a bit at odds with the way it was supposed to be. In other words, a definitely good device has found new uses for which it was never actually intended.
Thirdly, an IP camera is a very isolated device. It is extremely difficult to pick up a camera and configure that particular one, since it has almost no input/output interface.
This is a subject for a separate post, and we’ll cover it as well, but for now let’s just remember the following point: setting up an IP camera is really inconvenient.
So, IP cameras have suddenly come into demand as OTT devices with online access. OTT (over the top) means that a service is provided not inside a managed network but over the public Internet, or at least across the junction of several networks. That is, there is no guaranteed bandwidth, and one can only speak of “Internet speed”, which is rather uncertain. However, as we have already found out, the “Internet speed” from a camera in Brazil to the Dyn servers is quite enough to overwhelm Dyn’s network.
Where there used to be an isolated system that ended at the security guard, there is now a device that people can reach over the Internet whenever they want to see what’s going on in their shop, check that the mailbox is still in place, or have a look at the kids playing in the backyard. Obviously, camera manufacturers (mostly Chinese, since only their cameras are affordable) accepted the market challenge. Mobile apps appeared that connect directly to the camera and display video from it (so far there is actually no way to watch video from IP cameras in a browser without ActiveX, since as a matter of fact cameras provide no web access to video).
Here we face an interesting situation: people are taught to open some ports to the outside in order to get online access. Many of these client apps do not even use the standard RTSP protocol, but get video via a curious and exotic protocol that runs on port 34567 instead (we are studying its structure at the moment). But who cares, if one can just open all the ports to the outside? Who would ever need my little camera pointed at the parking space in front of my door?
So, an important precondition for the launch of Mirai was the fact that camera producers indiscriminately made people solve the port-forwarding problem themselves and flooded the Internet with devices capable of generating tens of Mbit/s of traffic, most of them shipped with the same root password. Amazing, isn’t it? All these treasures have been lying right under our feet for at least the last 5 years.
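To get a feeling for how exposed a camera is, one can check which of its ports accept connections. Below is a minimal Python sketch, meant only for devices you own. The port list is our assumption for illustration: 554 is the standard RTSP port, 34567 is the vendor protocol port mentioned above, and 23 (telnet) is where a default root login is typically offered. The function name `open_ports` is hypothetical.

```python
import socket

# Illustrative list only; real devices may expose other ports as well.
CAMERA_PORTS = {23: "telnet", 554: "RTSP", 34567: "vendor protocol"}


def open_ports(host, ports=CAMERA_PORTS, timeout=1.0):
    """Return the subset of `ports` that accept a TCP connection on `host`."""
    found = {}
    for port, name in ports.items():
        try:
            # A successful connect means something is listening on this port.
            with socket.create_connection((host, port), timeout=timeout):
                found[port] = name
        except OSError:
            pass  # closed, filtered or unreachable
    return found
```

If a call like `open_ports("192.168.1.10")` (the address is a placeholder) returns anything when run from outside your own network, the camera is reachable by everyone else too.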
At the moment it is not clear how this issue is going to be resolved, and that’s actually not our responsibility. Maybe someone will access all the cameras found on the Internet in bulk and brick them, or the cameras will be patched, also in bulk, or Internet providers will be forced to block them. We’d like to discuss a different thing: how to prevent a camera from becoming a member of such a botnet.
Modern access to cameras
Progress goes on, and today there are two basic ways of getting online access to IP cameras: direct access and access via a server on the Internet.
Direct access is reasonable on a local network, but as we have already found out, accessing a modern IP camera directly over the Internet is not always a good idea. Even once the security issues are fixed, the problem of simultaneous access to the cameras will remain. For example, in kindergartens the administrators’ request that no more than four parents watch the cameras at a time looks simply touching.
A lot of Internet access today is ADSL, whose key feature is the A: “asymmetric”. Downloading a lot is easy, but uploading a lot is pretty hard.
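The arithmetic behind this limit is simple, and a small Python sketch makes it explicit. The bitrates used here are assumed round figures for illustration, not measurements, and both function names are hypothetical.

```python
def max_direct_viewers(uplink_kbps, stream_kbps):
    """How many viewers can each pull one stream through the camera's uplink."""
    if stream_kbps <= 0:
        raise ValueError("stream bitrate must be positive")
    return uplink_kbps // stream_kbps


def uplink_load_kbps(viewers, stream_kbps, via_server):
    """Traffic leaving the camera's network for a given number of viewers."""
    # Through a server the camera uploads one copy; the server fans it out.
    return stream_kbps if via_server else viewers * stream_kbps
```

With an assumed 1000 kbit/s uplink and a 250 kbit/s stream, `max_direct_viewers(1000, 250)` gives 4. When the video goes through a server, the load on the uplink stays at one stream however many people are watching.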
So let’s turn to online services that provide access to cameras.
An online camera access service (VSaaS, video surveillance as a service), besides the opportunity to watch a camera remotely, can offer some other interesting options, and the major one is a remote recorder. In many cases the video recorder, i.e. the device the video is recorded to, is the first thing criminals go for. If the video goes to the Internet as soon as it has been captured, it is very difficult to remove it from there; an attempt to do so would be like trying to slip a bribe to a traffic camera. This will not save the cameras, of course, and recording will be interrupted, but the chances of getting pictures of the offenders and showing them to the police grow significantly.
Does it cost much to upload all the video to the Internet? Not long ago the very idea of watching a movie online could scare us to death and make us clutch our wallets, and nowadays it is a common practice that has replaced CD shelves. The same is happening with unused upstream traffic: why not upload, especially if your provider already gives you this capacity? Some cameras record to the cloud only when they detect motion, but this approach has its pros and cons. Is saving, let’s say, 30 bucks worth it if the camera skips the very 10 minutes you need? On the other hand, in many cases switching the camera to motion-activated recording makes the cloud service almost free for its provider.
How do Internet services, especially OTT ones, i.e. those that have no dedicated communication channel with the cameras, get access to them?
Connection to the cameras
How can the connection between a camera and an online service be made more secure without exposing the camera to the outside? By means of a direct communication channel with the service.
Let’s set up something like a VPN tunnel and start publishing video from the camera to the server, and that will resolve the issue with Mirai. Some Chinese manufacturers have taken it a step further and sell cameras that are said to work only with their own service. However, here we face that awkward situation with the parents and the non-existent bad word again, because despite the Chinese managers’ claims these cameras still have an RTSP server, as well as a login with the standard password.
Camera producers from China (since, let’s face it, almost all cameras are made in China) create lots of other striking sources of future trouble in this area. For example, not long ago we came across a camera with a built-in online access service. This camera has a remarkable option: sending mail when the motion detector is triggered. The manufacturer decided to go backendless and gave all the cameras the same login and password for a single account on 163.com (a kind of Chinese hotmail.com). As you understand, one can not only watch the video content of other users of this camera, but also send them something interesting and extremely active.
But let’s turn to a situation that is more common today: online access to the camera complements the traditional work with the recorder, while the camera itself is behind NAT and not visible from outside.
We’ve seen two basic ways of getting video from a camera. In the first, software installed on the camera publishes video to the server, i.e. the camera initiates the video transfer. In the second, software on the camera makes the camera accessible to a server on the Internet (we mean a VPN tunnel).
The second way is much more practical in most cases, since it lets the server decide when it should get video and at what quality, and also lets it take screenshots from the camera.
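The idea behind the second way can be shown without any VPN at all: the camera dials out, so nothing has to be opened on the NAT, and after that the server drives the conversation. Here is a minimal Python sketch of that pattern. The line-based commands and all function names are made up for illustration and are not the protocol of any real product.

```python
import socket
import threading


def run_agent(server_addr, handle_request):
    """Camera side: dial OUT to the server, then answer whatever it asks for."""
    with socket.create_connection(server_addr) as conn, conn.makefile("rwb") as stream:
        for line in stream:  # the loop ends when the server closes the connection
            reply = handle_request(line.decode().strip())
            stream.write(reply.encode() + b"\n")
            stream.flush()


def start_agent(server_addr, handle_request):
    """Run the agent in a background thread so the caller is not blocked."""
    thread = threading.Thread(
        target=run_agent, args=(server_addr, handle_request), daemon=True
    )
    thread.start()
    return thread


def ask_camera(stream, command):
    """Server side: send one command over a connection the camera opened."""
    stream.write(command.encode() + b"\n")
    stream.flush()
    return stream.readline().decode().strip()
```

On the server, `stream` would be the result of `conn.makefile("rwb")` on a connection accepted from the camera. The server then calls `ask_camera(stream, "snapshot")` whenever it decides it needs something, and the camera never has to accept an incoming connection.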
The first way requires careful construction of complex feedback mechanisms that allow the video transfer to be stopped at any time; otherwise you run the risk of creating your own micro-Mirai: 1000 cameras trying to upload megabits of traffic to the wrong place can be a serious problem.
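A minimal Python sketch of such a feedback loop is below. It assumes a hypothetical `send(frame)` callable that returns "ok" or "stop" and raises `OSError` when the server cannot be reached; a real publisher would have its own protocol. The point is only that the camera stops on request, gives up after repeated failures, and waits longer and longer before trying again.

```python
def publish(frames, send, max_failures=3):
    """Push frames until the source ends, the server says "stop",
    or `max_failures` sends in a row fail. Returns (delivered, reason)."""
    delivered = failures = 0
    for frame in frames:
        try:
            verdict = send(frame)  # hypothetical protocol: "ok" or "stop"
        except OSError:
            failures += 1  # live video: the frame is dropped, not queued
            if failures >= max_failures:
                return delivered, "unreachable"
            continue
        failures = 0
        delivered += 1
        if verdict == "stop":
            return delivered, "stopped"
    return delivered, "finished"


def backoff_delays(base=1.0, factor=2.0, cap=300.0):
    """Seconds to wait before each reconnect attempt, growing up to `cap`."""
    delay = base
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)
```

When `publish` returns "unreachable", the caller would sleep for the next value from `backoff_delays()` before reconnecting, so a thousand misconfigured cameras do not hammer the same address every second.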
Let’s have a look at the different ways of setting up a VPN tunnel or something like it.
OpenVPN is traditionally considered the easiest solution. The server part is free of charge and not really hard to configure when you need 100-300 cameras. The key advantage of OpenVPN is that it is available out of the box in the buildroot project, with which the firmware of most cameras is built.
An OpenVPN client is started on the camera; it somehow obtains a certificate on the first start and then connects to the service. On the other side, a streaming server sees that the camera has connected and starts pulling video from it, taking screenshots, etc.
The VPN approach is also convenient because all the variable logic lives on the service side. It is much easier and handier to update, support or change something there than on the camera. An error on the camera will cost you not just the camera but the customer, and it will also result in active and annoying threads on forums.
This approach has its disadvantages, though.
VPN is a powerful, general-purpose mechanism, and it is rather resource-consuming for a server. Unfortunately, we have no practical data on whether one server can accept GB of traffic via OpenVPN and forward it to a streaming server, but it looks like that is too much.
We can say for sure that using OpenVPN means at least double the server infrastructure compared to a regular system, because traffic goes first to a separate OpenVPN server and only then to a streaming server.
At ErlyVideo we have developed our own solution for getting access to cameras behind NAT; it requires installing our agent on the camera. It works in a different way.
Our goal was to develop software that is installed on a camera and makes the camera accessible from a server on the Internet.
We rejected the idea of separate VPN and streaming servers, since each server would have to receive heavy traffic, and there is no need to pass it through two different pieces of software. Our agent connects directly to Flussonic and delivers video from the camera straight to the RTSP handling code, i.e. this is not a generic transport, but a specialized means of delivering video and screenshots from the camera to Flussonic at minimum cost.
We also answered some other questions, such as what address the agent should connect to and what it should do if Flussonic goes down.
For the case of Flussonic going down we decided on a two-phase connection scheme in the protocol: first the agent connects to the endpoint, then it goes from the endpoint to the streaming server. If the endpoint goes down after the agent has connected to the streamer, that is no problem, work goes on.
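The property described here, that the endpoint is needed only at connection time, can be shown with a small Python sketch. This is not the code of our agent: the class `Agent` and the two callables passed to it are hypothetical stand-ins for the real network operations.

```python
class Agent:
    """Hypothetical camera-side agent that connects in two phases."""

    def __init__(self, ask_endpoint, connect_streamer):
        self._ask_endpoint = ask_endpoint          # phase 1: returns a streamer address
        self._connect_streamer = connect_streamer  # phase 2: returns a send(frame) callable
        self._send = None

    def start(self):
        """Ask the endpoint where to go, then open the long-lived video session."""
        streamer_addr = self._ask_endpoint()  # the endpoint is needed only here
        self._send = self._connect_streamer(streamer_addr)
        return streamer_addr

    def push(self, frame):
        """Deliver one frame over the established session."""
        if self._send is None:
            raise RuntimeError("call start() first")
        self._send(frame)  # never touches the endpoint
```

Once `start()` has returned, `push()` depends only on the session with the streamer, so losing the endpoint afterwards does not interrupt the video.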
The situation with the endpoint address is more complicated. Since we are not service providers but software developers, we sell software that allows people to build their own services. That is why we cannot hardcode one fixed address into the agent; it must be configurable. So we have worked out a simple technical solution that redirects the agent to the proper address on initial activation.
As a result we have our own uncomplicated mechanism for getting access to cameras behind NAT. Even with encryption enabled it proves to be less resource-consuming than low-level OpenVPN, and, in contrast to OpenVPN, it helps to work around the “bug with invalid handling of non-blocking sockets” (i.e. a bug that the manufacturer denies and that customers do not see), since all the data is quickly picked up by our agent and then uploaded to the Internet by means of regular libevent.
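For readers who have not met this class of bug: on a non-blocking socket a write may accept only part of the buffer or refuse it entirely, and code that ignores this silently loses data. The Python sketch below shows the correct handling pattern in general. It is an illustration only, not the agent's code, which relies on libevent; the function name `send_all` is ours.

```python
import select
import socket


def send_all(sock: socket.socket, data: bytes, timeout: float = 5.0) -> int:
    """Send all of `data` over a non-blocking socket, coping with partial writes."""
    view = memoryview(data)
    offset = 0
    while offset < len(view):
        try:
            # send() may accept only part of the buffer; it returns how much.
            offset += sock.send(view[offset:])
        except BlockingIOError:
            # The kernel buffer is full: wait until the socket is writable
            # again instead of dropping the rest of the data.
            _, writable, _ = select.select([], [sock], [], timeout)
            if not writable:
                raise TimeoutError("peer is not reading fast enough")
    return offset
```

The typical broken version calls `send()` once and treats the call as if everything had been written, which is exactly how video gets lost over TCP.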
On the server side it is Flussonic that decides, according to its settings, whether it should connect to a camera and take video or a screenshot from it, or even start two-way audio transmission.
Such approaches (standard VPN, publishing from the camera, a home-made tunnel) make it possible to mitigate the risk of exposing cameras on the Internet. In our next posts we will tell about uploading our agent to cameras, about the way the agent works in a service that delivers video from cameras to users, about some interesting points of its setup, and about the most important part: how a camera is assigned to a user, where there are lots of options. We also have some drafts about our own IP camera firmware; please let us know in the comments which topic would be of more interest.