Security

Tuesday, January 06, 2009

Classic crypto fun

Enigma

It turns out that there's a much better way to kill a few minutes than looking at movies of cats playing with their favorite toys on YouTube. Instead, you can try a simulator of the classic Enigma machines, the devices that the Germans used to encrypt diplomatic and military communications in World War 2. The picture above may be a bit hard to see, but it shows that if you use the key "LWM" to encrypt the message "HELLOWORLD" with a three-rotor Enigma, you get the ciphertext "KSWKUXBXOV." This simulator also shows how the Enigma works. At each step, it shows you the path through the machine that creates each letter of ciphertext and how the rotors move to change the setup that the machine will use to create the next letter of the ciphertext. If you're even more interested, you can also find a paper that describes how to break the Enigma's encryption here.
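To make the description of the signal path and the rotor movement concrete, here is a minimal Python sketch of a three-rotor Enigma. It assumes the historical rotors I, II and III with reflector B, no plugboard, and ring settings left at "A". The simulator mentioned above may be configured differently, so this sketch is not expected to reproduce the ciphertext shown in the picture. The function name enigma and its parameters are made up for this illustration.

```python
import string

ALPHABET = string.ascii_uppercase

# Historical wirings and turnover notches for rotors I, II and III, plus
# reflector B. These choices are assumptions; the simulator in the post
# may use a different rotor set, ring settings, or a plugboard.
ROTORS = {
    "I": ("EKMFLGDQVZNTOWYHXUSPAIBRCJ", "Q"),
    "II": ("AJDKSIRUXBLHWTMCQGZNPYFVOE", "E"),
    "III": ("BDFHJLCPRTXVZNYEIWGAKMUSQO", "V"),
}
REFLECTOR_B = "YRUHQSLDPXNGOKMIEBFZCWVJAT"


def enigma(text, key, rotor_names=("I", "II", "III")):
    """Encrypt or decrypt uppercase text. key holds the left, middle and
    right rotor start letters, for example "LWM"."""
    wirings = [ROTORS[name][0] for name in rotor_names]
    notches = [ROTORS[name][1] for name in rotor_names]
    left, mid, right = (ALPHABET.index(c) for c in key)
    out = []
    for ch in text:
        # The rotors move before each letter is enciphered.
        if ALPHABET[mid] == notches[1]:
            # Middle rotor at its notch: it steps again and carries the left rotor.
            left = (left + 1) % 26
            mid = (mid + 1) % 26
        elif ALPHABET[right] == notches[2]:
            mid = (mid + 1) % 26
        right = (right + 1) % 26
        pos = (left, mid, right)

        c = ALPHABET.index(ch)
        # Signal path: right rotor to left rotor, reflector, then back again.
        for i in (2, 1, 0):
            c = (ALPHABET.index(wirings[i][(c + pos[i]) % 26]) - pos[i]) % 26
        c = ALPHABET.index(REFLECTOR_B[c])
        for i in (0, 1, 2):
            c = (wirings[i].index(ALPHABET[(c + pos[i]) % 26]) - pos[i]) % 26
        out.append(ALPHABET[c])
    return "".join(out)


ciphertext = enigma("HELLOWORLD", "LWM")
print(ciphertext)
print(enigma(ciphertext, "LWM"))  # same key, same function: recovers HELLOWORLD
```

Because the reflector sends the signal back through the same rotors, running the ciphertext through the machine with the same starting key recovers the plaintext, which is why the sketch needs only one function for both directions.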

Friday, January 02, 2009

Forging certificates with the latest attack on MD5

There’s yet another development in the history of cryptographic weaknesses associated with the MD5 hash function. After showing that it’s not too hard to find a collision in MD5 and that it's possible to use MD5 collisions to create certificates with identical signatures, researchers have now shown how to use the weakness in MD5 to create a CA certificate that most browsers will verify as being valid. Many others have commented on this, so I won’t repeat what’s been said before. I will, however, mention two thoughts that relate to this new work that haven’t been mentioned by others yet.

The first relates to how serious this newly-demonstrated vulnerability is. This research shows that it’s feasible for hackers to create valid SSL server certificates. On the other hand, carrying out the attack that takes advantage of the weakness in MD5 requires a fair amount of sophistication. It’s definitely impractical for the typical hacker, although it’s probably practical for more sophisticated cybercriminals. Even so, I don’t expect it to be used any time soon. There’s so much sensitive information available to cybercriminals that there’s almost certainly a better way for them to get what they want than by using a web site with a forged SSL certificate.

Suppose that you’re a cybercriminal who wants lots of sensitive information to help you carry out your insidious plans. One approach that’s now available is to take the time and effort to carry out a sophisticated cryptanalytic attack that lets you create a phishing web site that’s more likely to collect information for you. Another approach is to compromise a single backup tape that holds gigabytes of the very information that you’re after. It’s not that hard to get such backup tapes, and roughly half of them aren’t encrypted today, mainly out of concerns about how difficult the key management is that’s needed to encrypt and decrypt tapes.

One approach is hard; one approach is easy. If I were a cybercriminal, I’d probably take the easy alternative. Most cybercriminals would probably make the same choice, choosing to steal a tape instead of doing complicated cryptanalysis. Because of this, I don’t think that we’ll be seeing many phishing sites with forged SSL certificates any time soon.

The second thought relates to the computers used to carry out the clever attack on MD5. In this case, a cluster of roughly 200 PlayStation 3s was used. It seems that one PS3 provides the same computing power as about 30 PCs, so they're fairly useful for projects that need lots of computing power.

Using PS3s for high-end computing isn’t new. Stanford’s Folding@home project, for example, has been using volunteers' PS3s to help calculate the shape of protein molecules since March 22, 2007. PCs greatly outnumber PS3s in the Folding@home project, but PS3s actually provide the biggest contribution of computing power of any platform.

Not too many years ago, big computing projects were done almost exclusively by governments or by government-funded labs. But with inexpensive computing power like the PS3 provides, it’s now much easier for others to do the same computing-intensive research. The new MD5 research shows that doing cryptanalysis is now much more feasible than it once was. Can predicting the weather or designing nuclear weapons be far behind?

Friday, December 26, 2008

Why PKI is still used

It's always fun to watch babies. They're born knowing absolutely nothing and have to learn how the world around them works by watching what happens. Once they get a bit older, they seem to start doing experiments to check if what they think is actually true. When they're very young, for example, they notice that everything falls when it's dropped. When they get a bit older they'll try dropping things again and again to see if things really do fall every time they're dropped. After a while, they seem to decide that they've tested their hypothesis enough times and they stop dropping things. This is why adults don't do some of the things that babies do. Adults typically don't drop things just to see if they'll fall because they already did it hundreds of times when they were babies.

When they're learning about how the world around them works, babies will eventually give up if they find something doesn't work. They just file that away as part of their understanding of the world. Adults, on the other hand, often seem unwilling to learn from experience the same way that babies do. Some even insist on moving forward with technologies that have always failed in the past. It's enough to remind you of the following exchange in The Princess Bride:

Westley: "Aha! Your pig fiancé is too late! A few more steps and we'll be safe in the fire-swamp."

Buttercup: "We'll never survive."

Westley: "Nonsense! You're only saying that because no one ever has."

Some people seem to believe that they'll be able to succeed with difficult technologies, even though most others fail. PKI is probably a good example of this. PKI has been around for quite a while. The digital certificate was invented over 30 years ago and the first version of the X.509 standard that defines how to use certificates was completed over 20 years ago. But except for the single notable use in SSL, the technology has essentially gone nowhere in the past few decades. The root of the problem is essentially that while machines don't mind using digital certificates, people hate them.

Despite this clear evidence of failure, some organizations still have not noticed that trying to have people use digital certificates has a very high chance of failure. Maybe they see themselves as Westley in The Princess Bride, who does indeed manage to survive the Fire Swamp despite the failure of those who came before him. On the other hand, Westley had one thing going for him that most corporate IT departments don't: the fact that the scriptwriter was on his side. Consultants can help with many difficult issues, but even the best consultants don't have that level of influence.

Wednesday, December 24, 2008

The play's the thing

The play's the thing

Wherein I'll catch the conscience of the King.

-William Shakespeare, Hamlet II, 2, 599

One of the conventions that has been around ever since classical Greek drama is the three-act structure. In the first act you introduce the characters; in the second act you get them in trouble; in the third act you get them out of the trouble. You might summarize this as boy meets girl, boy loses girl, boy gets girl back again. All three acts aren't necessarily the same length. It's common for the second act to be much longer than the third act, for example.

It's a very common pattern. You can see it, for example, in the first three Star Wars movies as they tell the story of the redemption of Anakin Skywalker. In Star Wars (the first act) we meet the main characters. In The Empire Strikes Back (the second act), things don't look good for Anakin. He ends up cutting his son’s hand off and it looks like he’ll be on the side of evil forever. In The Return of the Jedi (the third act), however, he finally overcomes the forces of evil and returns to the side of good.

The three-act structure might even provide a framework for thinking about many information technologies. Let's try to see how well the history of the Internet fits into this structure.

The first act of this play is probably the introduction of the Internet and its adoption by businesses for communications. At that point we know all of the players. The Internet certainly has a role in this play. Businesses have another. They’re the good guys. Cyber-criminals have another role, and they’re the bad guys, although at this point we are just starting to understand that they exist.

The next act probably began roughly when spam started to choke the Internet. This was followed by phishing and the more sophisticated types of identity theft that we see today. At this point, the bad guys (the cyber-criminals) seem to have the upper hand over the good guys (businesses that want to use the Internet for communications). That's probably where we are today. Things don't look good for the good guys at this point, and there doesn't seem to be any way out of their troubles.

The final act probably hasn't started yet. The resolution that we'd like to see, of course, is that the good guys win, but this probably isn't happening yet. Cyber-criminals still seem to be fairly successful. One indication that this is true is the fact that the law of supply and demand continues to reduce the street price of a complete identity. In some cases, a complete identity is worth as little as $1 as the amount of sensitive information disclosed in data breaches makes lots of the sensitive information available. It’s certainly possible for the good guys to win so that we get back on track to the way that the third act is supposed to end, but we don’t seem to be heading in this direction. We’re probably still stuck in the second act. Maybe that's to be expected because the second act typically takes quite a while to finish.

Tuesday, December 23, 2008

Sports cars and PKI

I've known several people who aspired to one day own a Corvette. Curiously, everyone that I've known to have actually bought one ended up rarely driving it. I'm not sure why this is. Maybe they can't stand to see their Corvette dirty. Maybe they don't want to take the chance of it getting damaged in an accident. Maybe they just want to save money on insurance by rarely driving it.

On the other hand, it could be the case that the people that I've known aren’t representative of Corvette owners. I've only known a few people who dreamed of one day owning a Corvette and then actually managed to buy one, and such a small sample probably won't give you enough information to find any significant trends.

On the other hand, industry analysts estimate that much of the PKI software that's sold ends up much like those undriven Corvettes. Apparently roughly half of PKI software ends up as "shelfware," software that's purchased but never deployed. And just as the owners who don't drive their Corvettes must have a good reason for not doing so, I'm sure that the people who bought PKI software and didn't deploy it also have a good reason.

I can almost understand why someone would buy a Corvette and not drive it much. After all, many people could probably pass many enjoyable hours admiring a Corvette as it sat in their garage. On the other hand, I don't understand at all why someone would buy PKI software and not use it. People almost certainly take their Corvette for a test drive before buying it. Don't people try a demo copy of PKI software first and make sure that it works in their environment and is at least somewhat useful before buying it? If that's the case, it's hard to understand why they end up not using it.

Back in the dot-com era, the PKI projects that I worked on cost anywhere from the equivalent of 2 to about 60 Corvettes. This wasn’t all the PKI software itself. For a big implementation, you'd have roughly 20 Corvettes worth of software, another 20 worth of hardware, and another 20 or so of professional services. For smaller implementations, it was weighted much more towards the cost of the software.

In the bigger implementations, the PKI was just one part that made a larger e-commerce project work, so it was as if customers were buying an entire lot of cars and wouldn't even notice those 60 Corvettes. These larger implementations also had a definite need for the PKI, but many customers didn't even know that they were getting one because it was just one more component of a big project. And they didn't really care. They weren't really interested in the PKI piece of their project, but they were interested in the overall benefit that they would get from the larger project. These definitely didn't end up as shelfware.

In the case of the smaller implementations, there was often not a clearly-defined need for PKI software, but customers thought that it might one day help them. These probably account for the dusty, unopened boxes of PKI software that apparently still clutter shelves of IT departments. I saw several of these projects for each of the larger projects, so the smaller projects probably account for most of the PKI software purchases. Because there was really no need for them when they were purchased, it's actually surprising that only half of these ended up unused. Maybe that's the lesson to be learned from the rise and fall of the market for PKI software – don't buy software unless you have a good reason to do so, although that's also an odd lesson for people to actually need.

Monday, December 22, 2008

Perception and reality

Where I live in San Jose, there's a shortage of parking. Every house has a two-car garage, but most of the garages are used for storage instead of parking. Add a few families with three or four cars, and you have a situation where the demand for parking spaces exceeds their supply. One of my neighbors actually blames the Bush administration for our parking problems. I'm not sure of what line of reasoning led him to that conclusion. I was fortunate enough to have my wife listen to those particular details.

This is probably a case where there's a difference between perception and reality. I seriously doubt that politicians in Washington did anything that created The Great San Jose Parking Crisis, but there's at least one person out there who believes otherwise and I doubt that any amount of facts will change his opinion. His perception and reality will probably never agree.

Information security has its own set of mismatches between perception and reality. For example, there's the perception that e-mail is in danger of being intercepted and read while it's on the Internet, but that it's safe inside the firewall. On the other hand, the reality is that e-mail is definitely in danger of being intercepted and read inside the firewall. It's fairly easy for anyone on your network to watch the traffic on it, and it's also easy for mail administrators to read people's e-mail. I know of many more cases of an administrator intercepting and reading e-mail than I do of e-mail being intercepted and read on the Internet. Most security people you talk to will probably have the same story. Despite this, the perception is that e-mail is safe in the very place that it's at the most risk.

This may or may not be a serious problem. If all of your employees can see all of your data, then you have nothing to worry about, but this is probably not the case. There's almost certainly lots of sensitive information contained in some of the e-mails that are sent within any business. Your HR people probably send documents back and forth that contain all sorts of sensitive information, including salaries, social security numbers and more. Executives preparing for their quarterly board meetings probably send documents back and forth that contain all sorts of sensitive information about the financial situation of their company and its future plans. Sales managers probably send messages to other sales managers and to the sales engineers who support them that discuss the details of the deals that they're working on. All of this sensitive information may never leave your network, but you also may not want it to get into the wrong hands, and that doesn't necessarily mean that a hacker gets his hands on it. So if you're considering encryption as a way to protect sensitive information, don't forget to protect information when it's the most vulnerable, and that's when it's still in your network.

Friday, December 19, 2008

IBE as enterprise software

I frequently get asked why identity-based encryption technology is a big deal. After all, some people say, if they use it, it certainly looks and feels like any other encryption technology. This may be true, but it also overlooks the main difference between IBE and other encryption technologies, which is in how easy it is to support and operate. So while many encryption technologies may look roughly the same from the point of view of users, they're very different from the point of view of the administrators that keep them running.

The people who don't seem to appreciate the lower costs and complexity are often people who are fortunate enough to have little exposure to enterprise software. Their point of view is probably more accurately described as that of a user of consumer software, which is a very different market than enterprise software.

Enterprise software has much stricter requirements than consumer software. Enterprise software needs to have higher performance, better scalability and higher fault-tolerance than consumer software does. This usually means that it's also more expensive, and when the total cost of ownership is considered, it's often much more expensive. This is probably unavoidable because businesses work in a strict regulatory environment and consumers don't.

If a consumer loses data on a hard drive because he forgets the password needed to decrypt it, he has probably suffered an inconvenience, but he probably hasn't broken any laws. On the other hand, businesses are required to keep some types of data available for several years and can suffer severe penalties if they don't.

It's also often necessary for someone other than the person who encrypted data to have the ability to decrypt it. If a CFO encrypts sensitive data, his business will still need the ability to decrypt that data, even if he moves on to another job at a different company. This means that key recovery is a critical feature of enterprise software but not of consumer software. Consumers might even view key recovery as undesirable because it can give someone other than them the ability to decrypt their encrypted data.

It turns out that IBE has definite advantages when it comes to keeping encrypted data available for a long time and supporting the recovery of lost encryption keys. With most encryption technologies, to be able to implement key recovery you need to keep a secure database of all the decryption keys, and if you lose any of these keys, you can also lose the data that was encrypted with it.

On the other hand, all IBE keys are calculated when they're needed from a single IBE master secret. This means that you don't need to store any decryption keys at all, but you can still recover encrypted data when it's needed. All you have to back up is the IBE master secret, and with it you can recover an IBE system that's lost or destroyed in any way.

Because you don't need to securely store decryption keys, the secure database that's part of other encryption systems isn't needed with an IBE system, which makes the IBE system simpler and cheaper. This is a feature that only enterprise users will appreciate, because it's not really something that consumers need to worry about.
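The operational difference can be sketched in a few lines of Python. This is not real IBE: a practical scheme like Boneh and Franklin's uses pairings on elliptic curves, and here HMAC-SHA256 merely stands in for the step that calculates a key from the master secret and an identity. The function name derive_key, the example identity and the master secret value are all hypothetical.

```python
import hashlib
import hmac


def derive_key(master_secret: bytes, identity: str) -> bytes:
    """Recompute the key for an identity on demand.

    HMAC is only a stand-in for IBE private key extraction; it illustrates
    that nothing per-user has to be stored, not how IBE works internally.
    """
    return hmac.new(master_secret, identity.encode("utf-8"), hashlib.sha256).digest()


# Hypothetical master secret. This is the only value that needs a backup.
master_secret = bytes.fromhex("00112233445566778899aabbccddeeff")

# The key the CFO uses while he's still with the company.
key_today = derive_key(master_secret, "cfo@example.com")

# Years later the CFO has moved on and no key database was ever kept,
# but the same key can simply be calculated again.
key_recovered = derive_key(master_secret, "cfo@example.com")
print(key_today == key_recovered)
```

The point of the sketch is the absence of a key database: recovery is a calculation, so the only thing that has to be protected and backed up is the master secret.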

Thursday, December 18, 2008

PKI standards

In a perfect world, all standards would be useful and reflect a consensus of experts. Unfortunately, we don't live in such a world, so some standards aren't very useful. This makes things tricky for vendors who have to explain to customers why they don't follow certain standards. A good example of this is ISO 15782-1, Certificate management for financial services – Part 1: Public-key certificates. Section 6.3.4 of this document has the following requirement for certificate authorities:

c) ensure that there is no duplication of the requester's distinguished name with that of any other entity certified by the CA

This means that once you get your first certificate, you can't get another one, which is a requirement that makes other best practices impossible. It's common, for example, for users to have three different certificates: one that's only used for encryption, one that's only used for digital signatures and one that's only used for authentication.

But if you follow ISO 15782-1, you can't get three certificates for the same person unless the certificates are requested for different names. So while you can't get three certificates for "Bob," you could get three certificates for "Bob, the guy who needs to encrypt," "Bob, the guy who needs to use digital signatures" and "Bob, the guy who needs to authenticate." Most systems don't give users such names, and I'm not sure that modifying your naming scheme to work this way is even a good idea. This makes it very unlikely that any CA is going to follow ISO 15782-1.

This requirement also seems to make it impossible to get a new certificate after a certificate expires. After all, the user with the name "Bob" is still the user with the name "Bob" after his current certificate expires. The workaround for this is even worse. You could have a user "Bob, the guy with certificate number 8675309," and change his name to "Bob, the guy with certificate number 8675410" when his old certificate expires and he gets a new one, but this is an even worse idea than changing the name to reflect the use of a certificate. Another workaround is to have a user "Bob, the guy living in the year 2008" and another user "Bob, the guy living in the year 2009," but that's not really a good idea either.
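Here's a toy Python sketch of a CA that enforces the requirement the way it's read in this post, with one certificate per distinguished name. The class ToyCA, its issue method and the name "CN=Bob" are all made up for the illustration and don't correspond to any real CA product or to text in the standard.

```python
class DuplicateNameError(Exception):
    """Raised when a certificate already exists for a distinguished name."""


class ToyCA:
    """Hypothetical CA that allows only one certificate per distinguished name."""

    def __init__(self):
        self.issued = {}  # distinguished name -> purpose of its one certificate

    def issue(self, distinguished_name, purpose):
        if distinguished_name in self.issued:
            raise DuplicateNameError(distinguished_name)
        self.issued[distinguished_name] = purpose
        return (distinguished_name, purpose)


ca = ToyCA()
ca.issue("CN=Bob", "encryption")  # Bob's first certificate is fine

for purpose in ("digital signatures", "authentication", "renewal after expiry"):
    try:
        ca.issue("CN=Bob", purpose)
    except DuplicateNameError:
        print("refused:", purpose)
```

Every request after the first is refused, whether it's for a second key usage or for a renewal, unless Bob shows up under a different name each time.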

So there seems to be no reason for a reasonable PKI product to actually follow ISO 15782-1. It's probably too much work for vendors to try to explain why following ISO 15782-1 doesn't make sense, so they're probably more likely to just have an option for their CA products that puts you into ISO 15782-1 mode, even though absolutely nobody will ever run their product in that mode.

Tuesday, December 16, 2008

How important is IBE?

I frequently get asked why people should care about identity-based encryption. Because I work for a company that sells products that use IBE, I might give a biased answer, so a more independent point of view might be useful. So instead of the opinion of one person who definitely isn’t impartial, why not look at the broader cryptographic community's interest in IBE?

One way to do this is by using Google Scholar. Google Scholar can tell us how many times a particular paper has been cited by other papers, which can give us a rough idea of how important a paper is. Papers that are important are frequently cited, while those that aren’t as important aren’t cited as frequently.

How important does IBE look when we use this metric?

Two of the most cited papers in cryptography are probably those that first described the Diffie-Hellman and RSA public-key schemes. The original Diffie-Hellman paper has been cited 5,644 times since 1977, or about 182 times per year. The original RSA paper has been cited even more often: 6,443 times since 1978, or about 215 times per year.

The paper by Boneh and Franklin that described what's commonly agreed to be the first practical and secure IBE scheme hasn't been around as long as the Diffie-Hellman or RSA papers. The Boneh and Franklin paper wasn't published until 2001, or only seven years ago, but it has been cited 2,011 times since then. That's roughly 287 times per year, which is more often than either of the other two that we’ve mentioned.
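The per-year figures are simple division, and a short Python sketch reproduces them. It assumes that the counts were read at the end of 2008 and that each paper's age is 2008 minus the year given above. The function name citations_per_year is made up for the illustration.

```python
CURRENT_YEAR = 2008  # assumption: the year the citation counts were looked up

# (total citations, starting year) as quoted in the text
papers = {
    "Diffie-Hellman": (5644, 1977),
    "RSA": (6443, 1978),
    "Boneh-Franklin": (2011, 2001),
}


def citations_per_year(citations, since, now=CURRENT_YEAR):
    """Average number of citations per year over the paper's lifetime."""
    return citations / (now - since)


for name, (citations, since) in papers.items():
    rate = citations_per_year(citations, since)
    print(f"{name}: about {rate:.0f} citations per year")
```

This prints about 182, 215 and 287 citations per year for the three papers, matching the figures quoted above.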

So if the interest by other cryptographers is any indicator of how important IBE is, then it certainly looks like the consensus is that it's at least as important as the Diffie-Hellman and RSA schemes.

Monday, December 15, 2008

Crypto snake oil

The term "snake oil" is often used to describe cryptography that does not actually provide the level of security that its proponents claim. The origin of this term is somewhat unclear, but one story is that it can be traced back to one of the traditional remedies for joint pain and inflammation that was brought to the US in the nineteenth century by Chinese immigrants. The fat from Chinese water snakes is high in eicosapentaenoic acid (EPA), which has been shown to have some medicinal properties, so there may be some basis for believing that the traditional remedy actually had useful effects. Like the effects of many medications, however, the benefits from the traditional snake oil were subtle and varied significantly from person to person, making it difficult to rigorously prove the effectiveness of the remedy.

The fat of American rattlesnakes has a much lower concentration of EPA, however, so that when copies of the traditional remedy were made in the American West using local ingredients they turned out to be less effective than the original. Consumers could not distinguish between the two types of products, a fact that was quickly exploited by unscrupulous merchants who sold the ineffective snake oil to unsuspecting customers. Eventually this behavior became so widespread that the term "snake oil" became generalized to other products, ones that made claims of effectiveness that could not easily be substantiated by consumers and should thus be suspected of being false or misleading.

Whether this is the accurate history of the term or little more than a folk etymology, the connection to cryptography is fairly clear. Some products that provide little or no protection against a skilled adversary are sold as providing a high level of security, and most users of cryptography cannot tell the difference between secure and nonsecure versions of the technology. It seems that cryptography actually has many properties in common with snake oil, so it may be accurate to say that although cryptography may not actually be snake oil, it is very much like snake oil in some ways. And this observation is not limited to the unconventional techniques that are often labeled as such; it also includes cryptographic technologies that have withstood significant scrutiny by industry experts.

Two factors made it easy for unscrupulous vendors of ineffective snake oil to sell their product to unsuspecting customers: it was difficult for customers to distinguish between effective and ineffective versions of the product and the seller of the snake oil was also the person providing the medical advice to his customers. This situation made it extremely tempting for vendors to cheat, a temptation that many were unable to overcome. This is very similar to the situation that we still see today. Providers of car repairs and medical services both recommend purchases to their customers as well as provide what is purchased. Even after a purchase, though, it is not always clear that you really needed it. Your car may have continued to operate without a particular repair, or you might have recovered from an illness without the medication that your doctor prescribed for you. The temptation to cheat can be significant in these cases, and some studies have suggested that both car mechanics and doctors recommend a significant amount of services that their customers do not really need. Could cryptography fall into the same category?

Economists divide goods into three types: search goods, experience goods and credence goods. Search goods have properties that are easy to check before you consume them. If you are in the market for a red car, for example, it is easy to check if a potential purchase is really red. Very few, if any, information security products fall into this category.

Experience goods have properties that are not obvious before you buy, but have properties that are easy to verify after you consume them. If you are looking for a car with a certain fuel efficiency, perhaps getting at least 35 miles per gallon under your typical driving conditions, you cannot tell this by looking at the car itself (although this is why laws mandate this information be provided to consumers), but you can easily test it. Many security products are probably experience goods. You cannot tell before you deploy it whether or not antivirus software or an intrusion detection system (IDS) will really protect your network, for example, but you can observe warning messages and review the logs of the products after they have been deployed to verify that they are actually working.

Credence goods have properties that cannot easily be checked, either before or after they are consumed. Organically grown produce and meat from animals raised in humane conditions are examples of credence goods; it is very difficult to verify these particular properties, even after you consume them. Many medicines, including the historical snake oil, are also credence goods, because it is difficult to tell if your recovery was really due to the medication, a placebo effect, or even simply your body recovering on its own.

Products that implement cryptography are probably credence goods. It requires expensive and uncommon skills to verify that data is really being protected by the use of cryptography, and most people cannot easily distinguish between very weak and very strong cryptography. Even after you use cryptography, you are never quite sure that it is protecting you like it is supposed to do. It is always possible that a clever adversary could develop an attack that lets him defeat the cryptography that you are using, and he could then carry out this attack, perhaps reading encrypted messages, and you would have absolutely no idea that he was doing it.

Products cannot always be classified as purely search goods, experience goods or credence goods, and real products often have aspects of each category. Cars have some search characteristics, like their color, and some experience characteristics, like their fuel efficiency. Similarly, information security products can have aspects of more than one category. We can easily review its logs to verify that a deployed IDS is stopping some attacks on our network, so it has some experience characteristics. At the same time, the tradeoff between Type I and Type II errors that you need to make for an IDS means that a deployed IDS is probably also missing some attacks on your network that you will never be informed of. The fact that this rate of missed attacks may be acceptably low although we cannot actually verify it also gives IDS systems some credence characteristics.
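The tradeoff between the two types of errors can be illustrated with a small Python sketch. The scores and labels below are invented for the example, and the function error_counts is hypothetical. In practice the "attack" labels are exactly what an operator does not have, which is where the credence characteristics come from.

```python
# Hypothetical events: (anomaly score from 0 to 100, whether it was really an attack).
events = [
    (5, False), (20, False), (35, False), (40, True),
    (55, False), (60, True), (80, True), (95, True),
]


def error_counts(threshold):
    """Return (false alarms, missed attacks) when alerting on scores >= threshold."""
    false_alarms = sum(1 for score, attack in events if score >= threshold and not attack)
    missed = sum(1 for score, attack in events if score < threshold and attack)
    return false_alarms, missed


for threshold in (30, 50, 70):
    false_alarms, missed = error_counts(threshold)
    print(f"threshold {threshold}: {false_alarms} false alarms, {missed} missed attacks")
```

Raising the threshold from 30 to 70 takes the false alarms from 2 to 0 but the missed attacks from 0 to 2. The false alarms show up in the logs, but the missed attacks never do.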

On the other hand, cryptographic products seem to have many characteristics of credence goods and few characteristics of other types. You certainly cannot tell before you test it that such a product will operate as advertised, so there are probably no characteristics of search goods in these products. And because it is expensive and difficult to verify that the encryption provides strong protection to information or that a digital signature is really difficult to forge, even after it is used, cryptographic products show more characteristics of credence goods than of experience goods. This uncertainty in quality that is characteristic of credence goods can lead to unusual results: prices that are lower than expected and are fairly uniform, even in the face of significant quality differences.

If consumers of a product cannot easily distinguish between high-quality and low-quality goods, even after they have consumed the product, we should expect that vendors cannot easily differentiate their products from competing products. In this case, we should expect prices of competing products to be roughly the same. Consumers will not be aware of the deficiencies in low-quality products, so producers of low-quality products will tend to overcharge for them. Similarly, competitive pressures will keep down the price of high-quality products. George Akerlof first described this situation in 1970 in his classic paper "The Market for 'Lemons': Quality Uncertainty and the Market Mechanism," and eventually won the Nobel Prize for Economics in 2001 for his work in this area. In the worst of these situations, the low-quality products will actually drive the high-quality products from the market as vendors of the high-quality products refuse to sell their products at the low price that the market forces upon them. Standards like Security Requirements for Cryptographic Modules (FIPS 140-2) are designed to avoid such market failures and provide an indicator to customers that they are buying high-quality cryptography. Such products are guaranteed to be the modern equivalent of snake oil made from Chinese water snakes.
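The way that low-quality products can drive out high-quality ones can be sketched with a toy model in Python. The assumptions are mine, not Akerlof's exact formulation: each vendor will only sell at or above the true quality of his product, buyers cannot tell products apart, and buyers offer 1.5 times the average quality of whatever is still on the market. The function name unravel and the numbers are hypothetical.

```python
def unravel(qualities, buyer_premium=1.5):
    """Repeat until no more vendors leave; return (final price, goods still offered)."""
    on_market = sorted(qualities)
    while True:
        # Buyers cannot see quality, so they offer a price based on the average.
        price = buyer_premium * sum(on_market) / len(on_market)
        # Vendors whose product is worth more than the offer withdraw it.
        staying = [q for q in on_market if q <= price]
        if staying == on_market:
            return price, on_market
        on_market = staying


# Ten products with true quality values from 100 to 1000.
price, survivors = unravel(range(100, 1100, 100))
print(price, survivors)
```

With these numbers the offered price falls from 825 to 300 in a few rounds, and only the three lowest-quality products are still for sale at the end. A credible indicator of quality is what lets buyers pay more for the better products and stops this slide.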