CRYPTOGRAPHIC TERMINOLOGY 101 by Dru Lavigne
In the next few articles, I'd like to concentrate on securing data as it
travels over a network. As you may remember from the IP packets series (see Capturing TCP
Packets), most network traffic is transmitted in clear text and can be decoded by a packet
sniffing utility. This can be bad for transmissions containing usernames,
passwords, or other sensitive data. Fortunately, other utilities known as cryptosystems can protect your network
traffic from prying eyes.
To configure a cryptosystem properly, you need a good understanding of the
various terms and algorithms it uses. This article is a crash course in
Cryptographic Terminology 101. Following articles will demonstrate
configuring some of the cryptosystems that are available on FreeBSD.
What is a cryptosystem and why would you want to use one? A cryptosystem is
a utility that uses a combination of algorithms to provide the following three
components: privacy, integrity, and authenticity. Different cryptosystems use
different algorithms, but all cryptosystems provide those three components.
Each is important, so let's take a look at each individually.
Privacy
Privacy ensures that only the intended recipient understands the network
transmission. Even if a packet sniffer captures the data, it won't be able to
decode the contents of the message. The cryptosystem uses an encryption
algorithm, or cipher, to encrypt the original clear text into cipher text before it is transmitted. The
intended recipient uses a key to decrypt the cipher text back into the original
clear text. This key is shared between the sender and the recipient, and it is
used to both encrypt and decrypt the data. Obviously, to ensure the privacy of
the data, it is crucial that only the intended recipient has the key, for
anyone with the key can decrypt the data.
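To make that concrete, here is a minimal Python sketch of shared-key encryption using the third-party cryptography package's Fernet recipe. The message is made up, and this only illustrates the idea of a shared key; it is not the wire format any particular cryptosystem uses:

# Shared-key (symmetric) encryption: the same key encrypts and decrypts,
# so it must be known only to the sender and the intended recipient.
from cryptography.fernet import Fernet

key = Fernet.generate_key()                         # the shared secret
cipher = Fernet(key)

clear_text = b"username: dru  password: secret"     # made-up sensitive data
cipher_text = cipher.encrypt(clear_text)            # what a packet sniffer would see
recovered = cipher.decrypt(cipher_text)             # only a key holder can do this
assert recovered == clear_text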
It is possible for someone without the key to decrypt the data by cracking, or
guessing, the key that was used to encrypt it. The strength of the encryption
algorithm gives an indication of how difficult the key is to crack. Strength is
normally expressed as a bit size; for example, it takes far less time to crack
a 56-bit key than a 256-bit key.
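To put rough numbers on that, a brute-force attacker has to try, on average, half of the possible keys, and every extra bit doubles the keyspace. A quick back-of-the-envelope calculation in Python, where the guesses-per-second figure is an arbitrary assumption:

# Every extra bit doubles the number of possible keys an attacker must search.
guesses_per_second = 10**9          # assumed attacker speed, purely illustrative
seconds_per_year = 3600 * 24 * 365

for bits in (56, 128, 256):
    average_tries = 2 ** bits / 2   # on average, half the keyspace is searched
    years = average_tries / guesses_per_second / seconds_per_year
    print(f"{bits}-bit key: about {years:.2e} years")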
Does this mean you should always choose the algorithm with the largest bit
size? Not necessarily. Typically, as the bit size increases, so does the time it
takes to encrypt and decrypt the data. In practical terms, this translates into more
work for the CPU and slower network transmissions. Choose a bit size that is
suited to the sensitivity of the data you are transmitting and the hardware you
have. The increase in CPU power over the years has resulted in a double-edged
sword. It has allowed the use of stronger encryption algorithms, but it has
also reduced the time it takes to crack the key created by those algorithms.
Because of this, you should change the key periodically, before it is cracked.
Many cryptosystems automate this process for you.
There are some other considerations when choosing an encryption algorithm.
Some encryption algorithms are patented and require licenses or restrict their
usage. Some encryption algorithms have been successfully exploited or are
easily cracked. Some algorithms are faster or slower than their bit size would
indicate. For example, DES and 3DES are considered to be slow; Blowfish is
considered to be very fast, despite its large bit size.
Legal considerations also vary from country to country. Some countries
impose export restrictions. This
means that it is okay to use the full strength of an encryption algorithm
within the borders of the country, but there are restrictions for encrypting
data that has a recipient outside of the country. The United States used to restrict the
strength of any algorithm leaving the U.S. border to 40 bits, which is why some
algorithms support the very short bit size of 40 bits.
There are still countries where it is illegal to even use encryption. If
you are unsure if your particular country has any legal or export restrictions,
do a bit of research before you configure your FreeBSD system to use
encryption.
The following table compares the encryption algorithms you are most likely
to come across.
Algorithm      | Bit Size      | Patented | Comment
DES            | 56            |          | slow, easily cracked
3DES           | 168           |          | slow
Blowfish       | 32 - 448      | no       | extremely fast
IDEA           | 128           | yes      |
CAST           | 40 - 128      | yes      |
Arcfour        | 40, 128       |          |
AES (Rijndael) | 128, 192, 256 | no       | fast
Twofish        | 128, 256      | no       | fast
How much of the original packet is encrypted depends upon the encryption mode. If a cryptosystem uses transport mode, only the data portion of
the packet is encrypted, leaving the original headers in clear text. This means
that a packet sniffer won't be able to read the actual data but will be able
to determine the IP addresses of the sender and recipient and which port number
(or application) sent the data.
If a cryptosystem uses tunnel mode, the
entire packet, data and headers, is encrypted. Since the packet still needs to
be routed to its final destination, a new Layer 3 header is created. This is
known as encapsulation, and it is quite
possible that the new header contains totally different IP addresses than the
original IP header. We will see why in a later article when we configure your
FreeBSD system for IPSEC.
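If it helps to see the difference spelled out, here is a purely conceptual Python sketch. The addresses are made up, the encrypt() placeholder is a toy, and real IPSEC packets are not framed this way:

# Conceptual sketch only: plain byte strings stand in for real IP packets.

def encrypt(data: bytes) -> bytes:
    """Toy placeholder for a real cipher; NOT actual encryption."""
    return bytes(b ^ 0xAA for b in data)

original_header = b"src=192.0.2.10 dst=192.0.2.20 dport=23"   # made-up addresses
payload = b"login: dru"

# Transport mode: only the data portion is encrypted; the original header
# stays in clear text, so addresses and port numbers remain visible.
transport_packet = original_header + encrypt(payload)

# Tunnel mode: the entire original packet (header and data) is encrypted and
# then encapsulated behind a brand-new Layer 3 header, which may carry
# completely different IP addresses than the original header did.
new_header = b"src=203.0.113.1 dst=203.0.113.2"
tunnel_packet = new_header + encrypt(original_header + payload)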
Integrity
Integrity is the second component found in cryptosystems. This component
ensures that the data received is indeed the data that was sent and that the
data wasn't tampered with during transit. It requires a different class of
algorithms, known as cryptographic
checksums or cryptographic
hashes. You may already be familiar with checksums, as they are used to
verify that all of the bits in a frame or a header arrived intact. However,
frame and header checksums use a very simple algorithm, meaning that it is
mathematically possible to alter the bits and still end up with the same
checksum. Cryptographic checksums need to be far more tamper-resistant.
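For example, with Python's standard hashlib module you can see how changing even a single character of a message produces a completely different cryptographic checksum; the messages are made up:

# A cryptographic checksum changes drastically if even one bit of the data changes.
import hashlib

message  = b"Pay Alice $10"
tampered = b"Pay Alice $90"                 # a one-character change

print(hashlib.sha1(message).hexdigest())
print(hashlib.sha1(tampered).hexdigest())   # bears no resemblance to the first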
Like encryption algorithms, cryptographic checksums vary in their
effectiveness. The longer the checksum, the harder it is to change the data and
recreate the same checksum. Also, some checksums have known flaws. The
following table summarizes the cryptographic checksums:
Cryptographic Checksum | Checksum Length (bits) | Known Flaws
MD4                    | 128                    | yes
MD5                    | 128                    | theoretical
SHA                    | 160                    | theoretical
SHA-1                  | 160                    | not yet
The order in the above chart is intentional. When it comes to cryptographic
checksums, MD4 is the least secure, and SHA-1 is the most secure. Always choose
the most secure checksum available in your cryptosystem.
Another term to look for in a cryptographic checksum is HMAC, or Hash-based Message Authentication Code. This
indicates that the checksum algorithm uses a key as part of the checksum
calculation. This is good, as it is not feasible to alter the data and produce
a matching checksum without access to the key. If a cryptographic checksum uses
HMAC, you'll see that term before the name of the checksum. For example,
HMAC-MD4 is more secure than MD4, and HMAC-SHA is more secure than SHA. If we
were to order the checksum algorithms from least secure to most secure, it
would look like this (a short HMAC example follows the list):
- MD4
- MD5
- SHA
- SHA-1
- HMAC-MD4
- HMAC-MD5
- HMAC-SHA
- HMAC-SHA-1
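As promised, here is a small sketch of an HMAC using Python's standard hmac and hashlib modules; the key and message are made up:

# An HMAC mixes a shared key into the checksum, so someone who alters the
# data cannot recompute a matching checksum without that key.
import hashlib
import hmac

key = b"shared-secret"                      # made-up key known to both ends
message = b"transfer 100 to account 42"

digest = hmac.new(key, message, hashlib.sha1).hexdigest()   # HMAC-SHA-1

# The recipient recomputes the HMAC over the received message and compares.
expected = hmac.new(key, message, hashlib.sha1).hexdigest()
print(hmac.compare_digest(digest, expected))                # True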
Authenticity
So far, we've ensured that the data has been encrypted and that the data
hasn't been altered during transit. However, all of that work would be for
naught if the data, and more importantly, the key, were mistakenly sent to the
wrong recipient. This is where the third component, or authenticity, comes into
play.
Before any encryption can occur, a key has to be created and exchanged.
Since the same key is used to encrypt and to decrypt the data during the
session, it is known as a symmetric or session key. How do we safely exchange that key
in the first place? How can we be sure that we just exchanged that key with the
intended recipient and no one else?
This requires yet another class of algorithms known as asymmetric or public key algorithms. These
algorithms are called asymmetric as the sender and recipient do not share the
same key. Instead, both the sender and the recipient separately generate a key pair which consists of two mathematically
related keys. One key, known as the public
key, is exchanged. This means that the recipient has a copy of the
sender's public key and vice versa. The other key, known as the private key, must be kept private. The security
depends upon the fact that no one else has a copy of a user's private key. If a
user suspects that his private key has been compromised, he should immediately
revoke that key pair and generate a new key pair.
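As a concrete sketch, here is how a key pair might be generated with the third-party Python cryptography package; the 2048-bit size is just a common choice for illustration, not a recommendation from this article:

# Generating the two mathematically related keys of an RSA key pair.
from cryptography.hazmat.primitives.asymmetric import rsa

private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
public_key = private_key.public_key()   # safe to hand out; the private key is not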
When a key pair is generated, it is associated with an essentially unique
identifier known as a fingerprint: a short hash of the public key, usually
displayed as hexadecimal digits or as a string of short nonsense words. The
fingerprint is used to ensure
that you are viewing the correct public key. (Remember, you never get to see
anyone else's private key.) In order to verify a recipient, they first need to
send you a copy of their public key. You then need to double-check the
fingerprint with the other person to ensure you did indeed get their public
key. This will make more sense in the next article when we generate a key pair
and you see a fingerprint for yourself.
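In the meantime, a fingerprint can be thought of as a short hash of the public key; exactly how it is computed and displayed varies from program to program. The sketch below simply hashes a serialized public key with SHA-256 to show the idea:

# Computing a fingerprint: a short hash of the serialized public key.
import hashlib

from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric import rsa

public_key = rsa.generate_private_key(public_exponent=65537,
                                       key_size=2048).public_key()

public_bytes = public_key.public_bytes(
    encoding=serialization.Encoding.DER,
    format=serialization.PublicFormat.SubjectPublicKeyInfo,
)
fingerprint = hashlib.sha256(public_bytes).hexdigest()
print(":".join(fingerprint[i:i + 2] for i in range(0, len(fingerprint), 2)))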
The most common public key algorithm is RSA. You'll often see the term RSA associated with digital certificates or certificate authorities, also
known as CAs. A digital certificate is a signed
file that contains a recipient's public key, some information about the
recipient, and an expiration date. The X.509 standard, along with
PKCS #9, dictates the information found in
a digital certificate. You can read the standard for yourself at http://www.rsasecurity.com/rsalabs/pkcs
or http://ftp.isi.edu/in-notes/rfc2985.txt.
Digital certificates are usually issued by, and stored on, a computer known as a
Certificate Authority. This means that you don't have to exchange public keys
with a recipient manually. Instead, your system will query the CA when it needs
a copy of a recipient's public key. This provides for a scalable authentication
system. A CA can store the digital certificates of many recipients, and those
recipients can be either users or computers.
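To make the contents of a certificate less abstract, here is a hedged sketch that builds a minimal self-signed certificate with the Python cryptography package. The name and lifetime are made up, and a real CA would sign the certificate with its own private key rather than the subject's:

# A minimal self-signed X.509 certificate: a public key, some information
# about its owner, and a validity period, all signed.
import datetime

from cryptography import x509
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa
from cryptography.x509.oid import NameOID

key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
name = x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, u"host.example.org")])

certificate = (
    x509.CertificateBuilder()
    .subject_name(name)                 # information about the recipient
    .issuer_name(name)                  # self-signed, so issuer == subject
    .public_key(key.public_key())       # the recipient's public key
    .serial_number(x509.random_serial_number())
    .not_valid_before(datetime.datetime.utcnow())
    .not_valid_after(datetime.datetime.utcnow() + datetime.timedelta(days=365))
    .sign(key, hashes.SHA256())         # a real CA would sign with its own key
)
print(certificate.not_valid_after)      # the expiration date discussed above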
It is also possible to generate digital certificates using an algorithm
known as DSA. However, this algorithm is covered by a patent, and its
signature verification is slower than RSA's. RSA Laboratories publishes a FAQ
on the difference between RSA and DSA. (The entire RSA Laboratories' FAQ is
very good reading if you would like a more in-depth understanding of
cryptography.)
There is one last point to make on the subject of digital certificates and
CAs. A digital certificate contains an expiration date, and until that date
arrives, anyone holding a copy of the certificate will treat it as valid. What
if a private key is compromised before that date? You'll obviously want to
generate a new certificate containing the new public key, but simply issuing a
new certificate doesn't invalidate the old one. To ensure that the old
certificate won't inadvertently be used to authenticate a recipient, the CA
places it in the CRL, or Certificate Revocation List. Whenever a certificate
is checked, the CRL is consulted to ensure that the certificate is still valid.
Authenticating the recipient is one half of the authenticity component.
The other half involves generating and exchanging the information that will be
used to create the session key, which in turn will be used to encrypt and
decrypt the data. This again requires an asymmetric algorithm, but this time it
is usually the Diffie-Hellman, or DH, algorithm.
It is important to realize that Diffie-Hellman doesn't create the actual
session key itself; rather, it produces the keying information used to generate
that key. This involves a fair bit of fancy math which isn't for the faint of
heart. The best
explanation I've come across, in understandable language with diagrams, is Diffie-Hellman Key Exchange - A Non-Mathematician's Explanation by Keith Palmgren.
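If you would rather see the shape of the exchange than the math, here is a sketch using the Python cryptography package: each side keeps a private value, only public values cross the network, and both sides arrive at the same keying information, which a key-derivation function then turns into a session key. The parameter size and info string are arbitrary choices for the example:

# Diffie-Hellman sketch: both parties derive the same keying information
# without ever sending a private value across the network.
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import dh
from cryptography.hazmat.primitives.kdf.hkdf import HKDF

# Agree on public parameters; generating them is slow but only done once.
parameters = dh.generate_parameters(generator=2, key_size=2048)

alice_private = parameters.generate_private_key()
bob_private = parameters.generate_private_key()

# Only the public halves cross the network; the private values never do.
alice_secret = alice_private.exchange(bob_private.public_key())
bob_secret = bob_private.exchange(alice_private.public_key())
assert alice_secret == bob_secret        # identical keying information

# The shared secret is keying information, not the session key itself:
# a key-derivation function turns it into the key the cipher will use.
session_key = HKDF(algorithm=hashes.SHA256(), length=32, salt=None,
                   info=b"example session").derive(alice_secret)

Because both private values in this sketch are generated fresh and can be thrown away after the session, this style of exchange is also what makes the Perfect Forward Secrecy discussed below possible.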
It is important that the keying information is kept as secure as possible,
so the larger the bit size, the better. The possible Diffie-Hellman bit sizes
have been divided into groups. The following chart summarizes the possible
Diffie-Hellman groups:
Group | Bit Size
1     | 768
2     | 1024
5     | 1536
When configuring a cryptosystem, you should use the largest Diffie-Hellman
group that it supports.
The other term you'll see associated with the keying information is PFS, or Perfect
Forward Secrecy, which Diffie-Hellman supports. PFS ensures that the new keying
information is not mathematically related to the old keying information. This
means that even if an old session key is somehow compromised, it can't be used
to derive the new session key. PFS is always a good thing, and you should use
it if the cryptosystem supports it.
Putting It All Together
Let's do a quick recap and summarize how a cryptosystem protects the data
transmitted over a network; a short sketch tying the steps together follows the
list.
- First, the recipient's public key is used to verify that you are sending
the data to the correct recipient. That public key was created by the RSA
algorithm and is typically stored in a digital certificate that resides on a
CA.
- Once the recipient is verified, the DH algorithm is used to generate the
keying information that will be used to create the session key.
- Once the keying information is available, a key that is unique to that
session is created. This key is used by both the sender and the receiver to
encrypt and decrypt the data they send to each other. It is important that this
key changes often.
- Before the data is encrypted, a cryptographic checksum is calculated. Once
the data is decrypted, the cryptographic checksum is recalculated to ensure
that the recipient has received the original message.
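Here is that flow as one last hedged Python sketch. It strings the earlier examples together (Diffie-Hellman keying information, a derived session key, and the Fernet recipe, which bundles a cipher with an HMAC to cover both privacy and integrity), and it deliberately skips the authenticity step of verifying the recipient's certificate:

# End-to-end sketch: DH keying information -> session key -> encrypted,
# integrity-checked message. Not any particular cryptosystem's wire protocol.
import base64

from cryptography.fernet import Fernet
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import dh
from cryptography.hazmat.primitives.kdf.hkdf import HKDF

# Exchange keying information (authenticating the peer is omitted here).
parameters = dh.generate_parameters(generator=2, key_size=2048)
sender = parameters.generate_private_key()
recipient = parameters.generate_private_key()
keying_info = sender.exchange(recipient.public_key())

# Turn the keying information into a session key.
session_key = HKDF(algorithm=hashes.SHA256(), length=32, salt=None,
                   info=b"demo session").derive(keying_info)

# Fernet combines a cipher with an HMAC: privacy plus integrity.
cipher = Fernet(base64.urlsafe_b64encode(session_key))
token = cipher.encrypt(b"the actual data")
print(cipher.decrypt(token))             # what the recipient recovers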
In next week's article, you'll have the opportunity to see many of these
cryptographic terms in action as we'll be configuring a cryptosystem that comes
built into your FreeBSD system: ssh.
Dru Lavigne
is an instructor at Willis College in Ottawa. In her non-existent spare time, you can find her shooting Remic's Rapids or cycling through Gatineau
Park.