The importance of client certificates in IoT
The use of digital (X.509 PKI) client certificates is certainly not new, but it has not been widespread, especially outside of corporate environments or large organizations, and certainly not in comparison to server-side certificates for websites. However, client certificates are fast emerging as the key identity and authentication mechanism for machine-to-machine (M2M) communication in IoT scenarios. And for good reason.
Benefits of client certificates for IoT
I mentioned it before, but client certificates are qualitatively superior to the other authentication mechanisms we have available, and that matters especially in M2M communication. The main reason is that they are based on public and private key pairs, where the private key is never shared. That is different from usernames and passwords, or token-based authentication schemes (with the exception of one-time-use tokens), where the information sent over the wire is itself sufficient to authenticate a particular host (user or device). If an attacker managed to tap the network successfully, capturing the username and password, or the token, would allow the attacker to impersonate that device. That is not possible with client certificates: the certificate itself is public, and during the handshake the device only proves that it holds the matching private key. The only way an attacker could impersonate the device would be to steal a copy of the device’s private key. That requires a substantially more sophisticated attack.
These client certificates should be used in combination with server certificates at the end point that ensure that we are sending any data to the correct place, without tampering.
The danger of listening in to the network is of particular relevance in M2M scenarios, where such tapping might be much harder to detect. Browsers validate server certificates, have root certificates built in, and at any sign of danger will stop users from accessing the website unless they click away a strong security warning. In M2M scenarios, there is no browser to handle this, and any script that retrieves data and sends it to an end point has to take care of such checks itself.
It is imperative, therefore, that the code that sends the data verifies the server certificate, and the server validates the client certificate. Without such validation, you still get an encrypted channel, but there is a realistic danger that there is someone listening in. This is called a Man-in-the-Middle (MitM) attack.
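As a minimal sketch of the client side of this, assuming the device script is written in Python and uses only the standard `ssl` module: the function name `build_client_context` and its parameters are hypothetical, chosen for illustration. The context verifies the server certificate and host name, and presents the device certificate when one is supplied.

```python
import ssl
from typing import Optional


def build_client_context(
    ca_file: Optional[str] = None,
    cert_file: Optional[str] = None,
    key_file: Optional[str] = None,
) -> ssl.SSLContext:
    """Return a TLS context that verifies the server and can present a client certificate."""
    # create_default_context() enables certificate and host name verification.
    # If ca_file is None, the platform's default trust store is used;
    # pass your own CA file here if you act as your own CA.
    context = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)

    # State both checks explicitly so they are not disabled by accident.
    context.check_hostname = True
    context.verify_mode = ssl.CERT_REQUIRED

    if cert_file is not None:
        # The private key stays on the device; only the certificate is sent.
        context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return context
```

The resulting context can be passed to, for example, `http.client.HTTPSConnection(host, context=context)`. With these settings, the handshake fails if the server certificate does not chain to a trusted CA or does not match the host name, rather than silently continuing.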
As you can see in the diagram below, in a normal operation (described in the top half), a client makes a request and the server sends its own certificate. The client verifies this certificate with the Certificate Authority (CA) to check whether it is real, which itself sends a response that is checked by matching it with the pre-installed root certificates (that define the trust in the CAs). We’re asking a 3rd party to guarantee that the server is who it says it is.
If, however, the script does not validate the certificate, the scenario in the bottom half is eminently possible. An attacker can insert themselves between the client and the server, through ARP spoofing or DNS cache poisoning, for instance. The attacker pretends to be the server and sends a faked certificate to the client. As long as the client accepts this, it will open an encrypted channel to the attacker, which of course the attacker can decrypt. The attacker then forwards the request to the server, pretending to be the client. If the server in turn doesn’t check the client, the attacker can now monitor the traffic between the client and the server, and either manipulate it directly, or retrieve credentials they can use to send their own data to the server, manipulated however they wish.
It should also be obvious from this that the use of client certificates provides an extra bit of protection. Even if the client doesn’t validate the server certificate, the server certainly should validate the client certificate. If this doesn’t match, we can simply reject the data submission, a check that we wouldn’t have been able to perform using passwords or tokens.
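On the server side, requiring a client certificate can be sketched in the same way, again assuming Python's standard `ssl` module. The function name `build_server_context` and the file parameters are hypothetical. Note that a server-side context does not request a client certificate by default, so the requirement has to be switched on explicitly.

```python
import ssl
from typing import Optional


def build_server_context(
    client_ca_file: Optional[str] = None,
    cert_file: Optional[str] = None,
    key_file: Optional[str] = None,
) -> ssl.SSLContext:
    """Return a server-side TLS context that rejects clients without a valid certificate."""
    context = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)

    # Server-side contexts default to CERT_NONE (no client certificate requested).
    # CERT_REQUIRED makes the handshake fail unless the client presents a
    # certificate that chains to a CA we trust.
    context.verify_mode = ssl.CERT_REQUIRED

    if client_ca_file is not None:
        # Trust only the CA that issues our device certificates.
        context.load_verify_locations(cafile=client_ca_file)
    if cert_file is not None:
        # The server's own certificate and key, presented to the devices.
        context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return context
```

A device that cannot prove possession of a private key matching a trusted certificate never gets as far as submitting data.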
One interesting aspect, at least for reasonably controlled environments, and especially IoT solutions deployed within four walls or within an extended corporate network, is that public root CAs are substantially less relevant. While they are essential in human-to-machine traffic (secure web browsing wouldn’t be possible without them), we can act as our own CA as long as we are confident we can secure the location that holds the central directory of keys we verify against. Running your own CA may bring down the cost of PKI infrastructure substantially. This is very relevant in situations where the devices might run in the thousands, tens of thousands or beyond, because it reduces cost without sacrificing security.
Certificate metadata attributes
We can extend this already fairly secure client- and server-certificate mechanism for additional security. Think of it as “2 factor authentication” (but not exactly) for client certificates, as well as some other use cases I will get to in a minute. We have been working with a few partners on this and it works well. The diagram below will help to illustrate what this is.
Basically, we are adding an additional check beyond the certificate itself. We maintain a series of metadata attributes that are associated with the client certificate in a location separate from both the client and the server. When a client presents its certificate, the server validates it with the CA and, if the certificate is valid, also requests the metadata attributes. The server should be aware of what the value of a particular attribute should be, and if this matches what we retrieve from the certificate metadata, we proceed. If not, we do not accept the data submission.
An important piece here is that the client certificate itself doesn’t contain these attributes. They can only be retrieved from the attribute store when validating the certificate. Moreover, the update of these attributes happens outside of both the server and the client.
This gives us an “instant revocation” option. Certificate revocation can take a while to propagate through the system, but if this attribute doesn’t match, we instantly stop data ingestion from that device. This is what we mean by a “2 factor authentication” like approach. It could even be scripted, so that if we receive an alert from our monitoring system that there is a possible compromise of a device, we change the certificate attribute value, and essentially disable the certificate (in case it might be stolen).
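The attribute check and the "instant revocation" described above can be sketched roughly as follows. This is an illustration only: `AttributeStore` is a hypothetical in-memory stand-in for the external attribute store, keying it by the SHA-256 fingerprint of the certificate is an assumption, and the attribute values `"active"` and `"suspended"` are made up. The sketch also assumes the TLS layer has already validated the certificate chain.

```python
import hashlib
from typing import Dict, Optional


def fingerprint(cert_der: bytes) -> str:
    """SHA-256 fingerprint of a DER-encoded certificate, used here as the lookup key."""
    return hashlib.sha256(cert_der).hexdigest()


class AttributeStore:
    """Hypothetical in-memory stand-in for the attribute store kept outside client and server."""

    def __init__(self) -> None:
        self._attributes: Dict[str, str] = {}

    def set(self, cert_fingerprint: str, value: str) -> None:
        self._attributes[cert_fingerprint] = value

    def get(self, cert_fingerprint: str) -> Optional[str]:
        return self._attributes.get(cert_fingerprint)


def accept_submission(cert_der: bytes, expected_value: str, store: AttributeStore) -> bool:
    """Accept data only if the stored attribute matches what the server expects.

    Assumes the certificate chain was already validated during the TLS handshake.
    """
    actual = store.get(fingerprint(cert_der))
    return actual is not None and actual == expected_value


# Usage: a monitoring alert flips the attribute, and ingestion stops at once.
store = AttributeStore()
device_cert = b"placeholder for DER-encoded certificate bytes"
store.set(fingerprint(device_cert), "active")
accepted_before = accept_submission(device_cert, "active", store)
store.set(fingerprint(device_cert), "suspended")  # triggered by the alert
accepted_after = accept_submission(device_cert, "active", store)
```

The certificate itself is untouched in this flow; only the attribute held in the store changes, which is why the effect does not have to wait for revocation to propagate.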
In addition, this can be used for more business-related metadata. Imagine a situation where it is critical that specific steps are made in sequence and cannot happen outside of sequence. This could be, for instance, tracing a shipment from factory, through trucking company, to customs, to being loaded onto a ship, run through customs again, picked up by another truck and finally arriving at the destination. To ensure integrity throughout, we can check that all way stations are hit in the appropriate sequence and send alerts in case something arrives out of sequence. It could also be used to guarantee refrigeration: the moment the temperature hits unsafe levels in the transport of perishable goods, we change the attribute value and won’t accept delivery.
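The sequence check for the shipment example could look something like the sketch below. The station names in `ROUTE` and the function `is_next_step` are hypothetical, and the last station reached is assumed to be the attribute value held in the attribute store for that shipment's certificate.

```python
from typing import Optional, Sequence

# Hypothetical way stations, in the order they must be visited.
# Names must be unique, because the check relies on their position in the route.
ROUTE = (
    "factory",
    "truck_outbound",
    "customs_export",
    "ship",
    "customs_import",
    "truck_inbound",
    "destination",
)


def is_next_step(
    last_station: Optional[str],
    scanned_station: str,
    route: Sequence[str] = ROUTE,
) -> bool:
    """Return True only if scanned_station directly follows last_station on the route."""
    if scanned_station not in route:
        return False
    if last_station is None:
        # Nothing recorded yet: the shipment must start at the first station.
        return scanned_station == route[0]
    if last_station not in route:
        return False
    return route.index(scanned_station) == route.index(last_station) + 1
```

For instance, `is_next_step("factory", "customs_export")` is false because the outbound truck was skipped, which is the point at which an alert would be sent.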
Consequences of the use of client certificates for edge infrastructure
With the increased security all of this provides, there is a consequence: this will only work for devices that are capable of performing the computation required for PKI. That largely excludes small microcontrollers, and quickly gets you into devices and boards running Linux (or Android). There are PKI schemes for lower-powered devices (16-bit) using elliptic curve cryptography, but generally this will require 32-bit at least. This may be problematic for certain IoT scenarios, especially those out in the open, running on battery power, but for many others it won’t be an issue at all. For ultra-low-power devices, we have to use other approaches, an area in which our security research team in southern France has done important work.
IoT gateways often translate IoT-specific protocols to something more standard such as HTTPS. By placing the certificates on the gateway, you can get around the lack of PKI support in the devices somewhat, but I am not a big supporter of that: it breaks end-to-end security, and requires us to trust that the communication between the device and the IoT gateway hasn’t been tampered with. Trust alone doesn’t buy much in information security. While using certificates on the gateway may be acceptable in some scenarios, in many others it is likely insufficient.
However, with 32-bit and 64-bit boards getting cheaper and cheaper, this restriction should increasingly be less of a concern. ARMv7, already a dominant platform for mobile devices, is an important platform for IoT as well. A Raspberry Pi Zero costs just $5. Other chip vendors are coming out with specific IoT designs as well, including encryption capabilities.
It is essential in IoT that we can rely on the data feed from the devices in our IoT landscape. End-to-end encryption and identity management between those devices and a data ingestion point through client and server certificates provide us with that trust in a better way than the alternatives, when implemented correctly. Where at all possible, then, use client certificates (and an overall PKI certificate management infrastructure) for your IoT device infrastructure.
I'm trying to understand the concept of the metadata and whether this is something we should think about for our own product.
The first example you mention is about an additional check of the certificate's validity, based on an independent datastore. This sounds to me like the classical identity management approach. Yes, if you want to deny a user's access to systems, you will not wait for the user's certificate to be revoked. Rather, you will disable the user's account in all relevant systems, which instantly blocks the access. So the metadata in this example is something like an identity store, which allows you to check whether not only the certificate but also the user account is still valid.
The second example is about sending information about the current state of the business process as part of the certificate. That is certainly possible, but why would I want to use the certificate as the transport vehicle? I could also send the business data (signed or encrypted) as part of the payload, without having to change the (highly standardized) structure of the certificate. Maybe I'm missing something?
Thanks for your reply!
In the first example, yes, this is not that different from the classical identity management approach, with the difference that this is not about users but about things, and there isn't a user to react to or interact with. This could easily be integrated into a monitoring system, without really having to worry about user accounts: you simply stop ingesting data from this device immediately. Beyond that, you may still have to find out why the device is suddenly behaving strangely and take corrective action (including removing the account from the system), but this stops the ingestion immediately and can be triggered automatically by network access control or other monitoring systems. Data ingestion can happen at a high frequency (multiple times a minute or second), so there is a gap between how quickly an automated system can react and how quickly a human operator can.
For the 2nd example: it's probably important to understand that this is NOT standard (in fact, the two partners we worked with took a different approach from each other), and yes, this could be handled in the application, but it doesn't _have_ to be. Especially if this is happening in the real world, outside, etc. It will be much easier to scan something and verify a certificate in a publicly accessible certificate store/CA, than provide everyone with a device that connects back into an SAP system, which may be owned by a single provider.
Hope that answers (some) of your comments?
we're almost there 😉
So if a company uses a platform that doesn't support the management of user accounts (or the company doesn't want to manage device identities as users), then they could instead use a custom repository to map e.g. the certificate name against, which might have a smaller overhead.
I'm not sure about the second scenario. Yes, if I send status information, say in JSON format, then sender and receiver need to know the meaning of the values. On the other hand, if I send the same information as part of a non-standard certificate attribute, then both sides also need to understand the meaning. Actually it might be worse, if actors along the way (load balancer, firewall) "think" they know the certificate format and try to parse and handle it. From what I see, neither case depends on a connection back to some specific SAP system.
On 1) yes, indeed. We care about knowing where the data came from, which device. But that doesn't mean we need to have a database account in HANA for each of them; rather, the identity would be checked on ingestion, and the data dropped into the database table. If we're talking about a device infrastructure with 100,000s of sensors and things, handling things as user accounts seems ... unmanageable, while providing no real benefit.
On 2) yes, you'd need to agree on semantics of the message, but that's a reasonably easy thing to do. Basically, the code to check and read the attributes is a couple of lines, and could easily be written for all kinds of devices as long as they have some sort of internet connectivity. That is, it should not be problematic to write such code for a particular purpose for some scanner of some kind. That's much easier than exposing a port on an SAP system and having multiple parties agree on how to connect to that, deal with user accounts, etc.
Now, also realize that I don't foresee we would be using this capability in all IoT scenarios, but only where it's warranted and the extra investment is justified. But where heightened security is needed/required, we can use it for this "2-factor auth" concept, as well as this track&trace use case. It remains to be seen whether this is indeed the appropriate vehicle for it, but I find the idea very attractive.
Excellent argument, Jay.
Really interesting blog, especially the concept of 2-factor authentication.
Presumably the certificate metadata store also needs to be very secure, are you recommending transport and message based security between the server and the metadata store also?
I could imagine a man-in-the-middle attack between the server and the metadata store could have an even bigger impact on an IoT solution than a single device being hacked.
With this approach is there now also a single point of failure (the metadata store)?
Hi Jon, thanks!
This metadata store is _part_ of the certificate authority that issued the certificate, and of course, all communication with it is authenticated and TLS protected as well, so MITM should not be possible/easy without serious effort (nation state type resources), and of course, the CA itself has every intention of keeping it secure, as well as the certificate servers that get hit first.
SPoF? Sure. But if the CA goes down, that's a SPoF, too. If SAP SSO stops working, all users are locked out. If Verisign stops working, the internet grinds to a halt.