This post tries to summarize the basic concepts of cryptography and encryption that are relevant to information technologies, computer science and cybersecurity.
Technical details about mathematics, calculations or implementations (such as encryption algorithms or digital certificate distribution) are left for other posts.
Definitions and objectives of cryptography
Encryption is the process of converting a plaintext message into an encoded form, called ciphertext, which cannot be understood without converting it back to plaintext via decryption (the reverse process). This is done via a mathematical function and a special encryption/decryption secret called the key.
The elements of information security are the following:
- Confidentiality: the message is only known by the issuer and the intended recipients.
- Integrity: message has not been modified or altered while being transmitted.
- Availability: assurance that the systems responsible for delivering, storing and processing information are accessible when required by the authorised users.
- Authentication: recipient can ensure that the claimed sender matches the real sender.
- Non-repudiation: sender cannot deny the message sent.
Symmetric vs Asymmetric Encryption
First of all, we need to understand some concepts of basic cryptography.
Types of encryption in cryptography:
- Symmetric encryption: the same key is used for encrypting and decrypting.
- Asymmetric encryption: a different key is used for encrypting and decrypting.
When using asymmetric cryptography, the recipient of the message generates a pair of keys:
- Public key: it is a key that can be known by everyone. It can be used to encrypt a message.
- Private key: it is a key that is only known by the issuer of the public key. It can be used to decrypt a message that has been encrypted with its corresponding public key.
Symmetric cryptography requires that the two interlocutors share a key. When one sends the key to the other, it is possible that a third party is listening (especially when using a shared medium) and captures the key. This would break confidentiality, a basic goal of cryptography.
Asymmetric encryption was created to ease key distribution. The public key can be shared openly, and the sender can use it to send confidential messages to the recipient, who decrypts them using the matching private key. This allows confidential communication between peers even when others are listening on the medium.
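As a rough illustration of the two schemes, here is a minimal Python sketch. Everything in it is an assumption made for the example, not something defined in this post: `xor_stream` is a toy symmetric cipher built from SHA-256, and `toy_rsa_keypair` builds textbook RSA keys (no padding) from two small, well-known Mersenne primes. Neither is secure; real systems rely on vetted algorithms and libraries.

```python
import hashlib
import secrets


def xor_stream(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: the same call encrypts and decrypts."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def toy_rsa_keypair(p: int, q: int, e: int = 65537):
    """Textbook RSA keys from two primes (illustration only, not secure)."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))  # modular inverse, needs Python 3.8+
    return (n, e), (n, d)  # (public key, private key)


message = b"hello B"

# Symmetric: one shared key is used in both directions.
shared_key = secrets.token_bytes(16)
sym_ciphertext = xor_stream(shared_key, message)
sym_plaintext = xor_stream(shared_key, sym_ciphertext)

# Asymmetric: encrypt with the public key, decrypt with the private key.
public_key, private_key = toy_rsa_keypair(2**89 - 1, 2**107 - 1)
n, e = public_key
asym_ciphertext = pow(int.from_bytes(message, "big"), e, n)
n, d = private_key
recovered = pow(asym_ciphertext, d, n)
asym_plaintext = recovered.to_bytes((recovered.bit_length() + 7) // 8, "big")
```

Note that only the public key `(n, e)` is needed to produce `asym_ciphertext`, so it can be published without revealing how to decrypt.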
Advantages and disadvantages of symmetric encryption:
- Better performance in comparison with asymmetric encryption
- Keys are much shorter than asymmetric keys of comparable strength
- Key distribution poses a challenge
- Cannot be used to sign electronic documents
Advantages and disadvantages of asymmetric encryption:
- Solves problem of key distribution
- Long length of keys
- Complexity of calculations
Using Asymmetric Encryption to Distribute Symmetric Keys
We may be tempted to think that we should always send all data using asymmetric keys because of the security granted by the public/private key system. But we have to remember that asymmetric encryption requires much more processing power than symmetric encryption, so it is not suitable for bulk data. In the end, we use a mix of the two methods during communications.
A usual approach is to use asymmetric encryption to distribute the symmetric key and, once it has been distributed safely, to use the symmetric key for bulk data.
Steps to distribute symmetric keys privately using asymmetric keys:
- Interlocutor A sends a communication request to interlocutor B
- B sends its public key to A. [in order to achieve authentication, this public key is usually sent through a certificate. We will give more details about this later]
- A generates a symmetric key for this communication, encrypts it using B’s public key, and sends it to B
- B decrypts symmetric key
- Now, both interlocutors share a symmetric key that is only known by them
Note that a third party that could observe all the communications would not be able to understand the ciphertext, as it never saw the symmetric key in plaintext.
Nevertheless, this communication procedure needs to be improved because it does not meet all the principles listed earlier:
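The key distribution steps above can be sketched in Python as follows. This is only an illustration under loud assumptions: `toy_rsa_keypair` (textbook RSA from two Mersenne primes, no padding) and `xor_stream` (a toy SHA-256 keystream cipher) are hypothetical helpers invented for this example and are not secure.

```python
import hashlib
import secrets


def toy_rsa_keypair(p: int, q: int, e: int = 65537):
    """Textbook RSA keys from two primes (illustration only, not secure)."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))  # modular inverse, needs Python 3.8+
    return (n, e), (n, d)  # (public key, private key)


def xor_stream(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: the same call encrypts and decrypts."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


# B owns a key pair and sends only the public half to A.
b_public, b_private = toy_rsa_keypair(2**89 - 1, 2**107 - 1)

# A generates a symmetric key and encrypts it with B's public key.
session_key_a = secrets.token_bytes(16)
n, e = b_public
wrapped_key = pow(int.from_bytes(session_key_a, "big"), e, n)

# B decrypts the symmetric key with its private key.
n, d = b_private
session_key_b = pow(wrapped_key, d, n).to_bytes(16, "big")

# Both sides now share the key and use it for bulk data.
bulk_data = b"a long stream of application data " * 4
ciphertext = xor_stream(session_key_a, bulk_data)
plaintext = xor_stream(session_key_b, ciphertext)
```

An observer sees only `b_public`, `wrapped_key` and `ciphertext`; the symmetric key never travels in plaintext.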
- Integrity not achieved: data can be altered if there are distortions during transfer or processing, and this will be unnoticed by the recipient.
- Authentication not achieved: recipient does not have any method to authenticate sender’s identity.
- Non-repudiation not achieved: sender can deny that message was sent
- Confidentiality not achieved: if a digital certificate is not used in the second step to distribute the public key, this communication procedure is vulnerable to a man-in-the-middle (MITM) attack. When that happens, the real sender of the message can be different from the claimed sender. This kind of attack and its solution are explained later in this post.
The process above illustrates how to distribute symmetric keys privately. But we need to add further procedures when sending a message to ensure that all the cryptographic principles are met.
Adding integrity through hash functions
To understand how to achieve integrity, we need to understand what a hash is.
A hash, or digest, is a piece of data that is obtained from a bigger piece of data called original data by using a hash function or hash algorithm.
Hash functions are one-way: given the plaintext, the digest is obtained by applying the hash function, but given only the digest it is computationally infeasible to recover the plaintext.
Two pieces of data that are identical will generate the same hash. So a way to achieve integrity in data communication is to calculate the digest before sending the data. Then, the sender sends both the data and the hash to the recipient. The recipient can apply the same hash algorithm to the received plaintext and check that the result matches the received digest.
Steps to exchange information assuring integrity:
- Sender A generates plaintext to be sent
- Sender A calculates hash based on plaintext applying a hashing algorithm
- Sender A sends both plaintext and hash data to recipient B
- Recipient B calculates hash on received plaintext data
- Recipient B compares received hash with calculated hash. If they both match, it means that the original message has not been altered.
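The integrity check above can be sketched with Python's standard `hashlib` module. SHA-256 is chosen here as an assumption (the post does not name a specific hash algorithm), and the `digest` and `matches` helpers are hypothetical names used only for this example.

```python
import hashlib
import hmac


def digest(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def matches(data: bytes, received_digest: str) -> bool:
    """Recompute the digest on the received data and compare."""
    return hmac.compare_digest(digest(data), received_digest)


# Sender A computes the hash and sends message plus hash.
message = b"Transfer 100 EUR to Bob"
sent_digest = digest(message)

# Recipient B recomputes the hash on what it received.
intact = matches(message, sent_digest)
altered = matches(b"Transfer 900 EUR to Bob", sent_digest)
```

Here `intact` is `True` and `altered` is `False`: changing even one character of the message produces a different digest.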
Adding non-repudiation through Digital Signatures
When analyzing public-key encryption, we have always talked about using the public key to encrypt and the private key to decrypt. But in fact, we can also encrypt a plaintext using the private key and decrypt it using the public key. This property of asymmetric systems is the basis of digital signatures.
When a message is encrypted with a private key (what is called “digitally signed”), it can be decrypted by anyone who has the public key. Because the private key is only known by the sender, this proves that the message was created by the sender.
In principle, the whole plaintext message to be sent could be encrypted using the private key, but that would not be practical because of the processing effort to encrypt bulk data and the amount of traffic generated. Instead, only the hash of the original message is encrypted using the private key, creating what we call the digital signature.
Digital signatures ensure:
- Data integrity
- Authentication of the sender
- Non-repudiation
Steps to send a message using digital signature:
- Sender applies a hash function to plaintext of the message to generate a hash
- Sender encrypts hash with its private key, creating a digital signature for the message
- Sender sends both message and digital signature to recipient
- Recipient applies hash function to plaintext, obtaining a hash
- Recipient decrypts digital signature with sender’s public key, obtaining a second hash
- Both hashes are compared. If they match, both integrity and authentication are ensured
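The signing and verification steps above can be sketched as follows. This is a toy under stated assumptions: `toy_rsa_keypair` builds textbook RSA keys from two Mersenne primes and applies no padding, SHA-256 is picked as the hash function, and `sign` and `verify` are hypothetical helper names. Real signature schemes add padding and use vetted libraries.

```python
import hashlib


def toy_rsa_keypair(p: int, q: int, e: int = 65537):
    """Textbook RSA keys from two primes (illustration only, not secure)."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))  # modular inverse, needs Python 3.8+
    return (n, e), (n, d)  # (public key, private key)


def hash_as_int(message: bytes) -> int:
    return int.from_bytes(hashlib.sha256(message).digest(), "big")


def sign(message: bytes, private_key) -> int:
    """Hash the message, then transform the hash with the private key."""
    n, d = private_key
    return pow(hash_as_int(message), d, n)


def verify(message: bytes, signature: int, public_key) -> bool:
    """Recover the signed hash with the public key and compare hashes."""
    n, e = public_key
    return pow(signature, e, n) == hash_as_int(message)


# The modulus must be larger than a 256-bit hash, hence the bigger primes.
sender_public, sender_private = toy_rsa_keypair(2**521 - 1, 2**607 - 1)

message = b"I agree to the contract"
signature = sign(message, sender_private)

valid = verify(message, signature, sender_public)
tampered = verify(b"I agree to a different contract", signature, sender_public)
```

Only the holder of `sender_private` can produce a signature that verifies under `sender_public`, which is what gives the recipient authentication and non-repudiation on top of integrity.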
This procedure is vulnerable to replay attacks. They are avoided by adding a signed timestamp or attaching a counter to the document.
In addition, it is still vulnerable to MITM attacks. This is solved by using a PKI, as seen below.
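The counter idea for replay protection can be sketched like this. It is a minimal illustration, assuming the counter is signed together with the message and that the recipient remembers the highest counter it has accepted. The `Recipient` class and the toy RSA helpers are hypothetical and not secure.

```python
import hashlib


def toy_rsa_keypair(p: int, q: int, e: int = 65537):
    """Textbook RSA keys from two primes (illustration only, not secure)."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))  # modular inverse, needs Python 3.8+
    return (n, e), (n, d)  # (public key, private key)


def hash_as_int(data: bytes) -> int:
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def sign(data: bytes, private_key) -> int:
    n, d = private_key
    return pow(hash_as_int(data), d, n)


def verify(data: bytes, signature: int, public_key) -> bool:
    n, e = public_key
    return pow(signature, e, n) == hash_as_int(data)


class Recipient:
    """Accepts a signed message only if its counter is new."""

    def __init__(self, sender_public_key):
        self.sender_public_key = sender_public_key
        self.last_counter = -1

    def accept(self, counter: int, message: bytes, signature: int) -> bool:
        payload = counter.to_bytes(8, "big") + message
        if not verify(payload, signature, self.sender_public_key):
            return False  # forged or altered
        if counter <= self.last_counter:
            return False  # already seen: replay
        self.last_counter = counter
        return True


sender_public, sender_private = toy_rsa_keypair(2**521 - 1, 2**607 - 1)
recipient = Recipient(sender_public)

message = b"pay 100 to B"
signature_0 = sign((0).to_bytes(8, "big") + message, sender_private)

first = recipient.accept(0, message, signature_0)   # genuine delivery
replay = recipient.accept(0, message, signature_0)  # attacker resends it
```

Because the counter is part of the signed payload, an attacker cannot simply bump it without invalidating the signature.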
What is a man-in-the-middle attack?
A man-in-the-middle (MITM) attack is an attack where the attacker secretly relays and possibly alters the communications between two parties who believe that they are directly communicating with each other.
If an MITM attack succeeds while session keys are being exchanged, the communication is exposed.
Steps of how MITM is performed:
- A is intending to send data to B. C, the attacker, is listening in the same medium
- A sends a request to B to get its public key, but C intercepts the request and it never reaches B
- C requests B's public key from B, and gets it
- C sends its own public key to A, impersonating B
- A encrypts data using C’s public key, and so sends to C the messages intended for B
- C decrypts A’s original message. Its confidentiality is broken.
- C may alter the original message or forge a new one, encrypt it using B’s public key, and send it to B
- Neither A nor B suspects that the messages are visible to a third party and that they can be altered
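The attack above can be simulated in a few lines of Python. This is a sketch only: the network is reduced to plain variable assignments, and `toy_rsa_keypair`, `encrypt` and `decrypt` are hypothetical textbook RSA helpers (Mersenne primes, no padding) that are not secure.

```python
def toy_rsa_keypair(p: int, q: int, e: int = 65537):
    """Textbook RSA keys from two primes (illustration only, not secure)."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))  # modular inverse, needs Python 3.8+
    return (n, e), (n, d)  # (public key, private key)


def encrypt(message: bytes, public_key) -> int:
    n, e = public_key
    return pow(int.from_bytes(message, "big"), e, n)


def decrypt(ciphertext: int, private_key) -> bytes:
    n, d = private_key
    m = pow(ciphertext, d, n)
    return m.to_bytes((m.bit_length() + 7) // 8, "big")


b_public, b_private = toy_rsa_keypair(2**89 - 1, 2**107 - 1)   # recipient B
c_public, c_private = toy_rsa_keypair(2**107 - 1, 2**127 - 1)  # attacker C

# A asks for B's public key, but C intercepts and answers with its own.
key_received_by_a = c_public

# A encrypts with the key it believes belongs to B.
original = b"pay 100 to B"
sent_by_a = encrypt(original, key_received_by_a)

# C decrypts A's message: confidentiality is broken.
seen_by_c = decrypt(sent_by_a, c_private)

# C forges a new message and encrypts it with B's real public key.
forged = b"pay 900 to C"
sent_by_c = encrypt(forged, b_public)

# B decrypts successfully and has no reason to suspect anything.
received_by_b = decrypt(sent_by_c, b_private)
```

Nothing in the exchange lets A tell `c_public` apart from `b_public`, which is the gap that certificates are meant to close.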
Distributing public keys through PKI
To achieve safe authentication and avoid MITM attacks, a third party trusted by both sender and recipient is introduced in the scheme.
This third party holds the public key of the recipient, and it is in charge of distributing it digitally signed (i.e., encrypted with its own private key) among senders. In order to distribute it, it creates a digital certificate by appending the signed recipient’s public key with details of the recipient’s identity.
The organizational structure to distribute digital certificates through a third party is called Public Key Infrastructure (PKI). The details of a PKI are given in a separate post.
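The certificate idea can be sketched as follows. This is a simplified illustration, not how real certificates are encoded: the certificate is a plain dictionary, and `issue_certificate`, `check_certificate` and the toy RSA helpers are hypothetical names that are not secure. Real certificates (for example X.509) carry more fields, such as validity dates.

```python
import hashlib


def toy_rsa_keypair(p: int, q: int, e: int = 65537):
    """Textbook RSA keys from two primes (illustration only, not secure)."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))  # modular inverse, needs Python 3.8+
    return (n, e), (n, d)  # (public key, private key)


def hash_as_int(data: bytes) -> int:
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def certificate_body(identity: str, public_key) -> bytes:
    """Bind the identity and the public key into one signed blob."""
    return f"{identity}|{public_key[0]}|{public_key[1]}".encode()


def issue_certificate(identity: str, public_key, ca_private) -> dict:
    """The trusted third party signs identity plus public key."""
    n, d = ca_private
    body = certificate_body(identity, public_key)
    return {
        "identity": identity,
        "public_key": public_key,
        "signature": pow(hash_as_int(body), d, n),
    }


def check_certificate(cert: dict, ca_public) -> bool:
    """Anyone holding the third party's public key can verify it."""
    n, e = ca_public
    body = certificate_body(cert["identity"], cert["public_key"])
    return pow(cert["signature"], e, n) == hash_as_int(body)


ca_public, ca_private = toy_rsa_keypair(2**521 - 1, 2**607 - 1)
b_public, _ = toy_rsa_keypair(2**89 - 1, 2**107 - 1)
c_public, _ = toy_rsa_keypair(2**107 - 1, 2**127 - 1)

cert_b = issue_certificate("B", b_public, ca_private)
genuine = check_certificate(cert_b, ca_public)

# Attacker C swaps in its own key but cannot re-sign the certificate.
forged_cert = dict(cert_b, public_key=c_public)
forged = check_certificate(forged_cert, ca_public)
```

Here `genuine` is `True` and `forged` is `False`, so a sender that checks the certificate would reject the substituted key.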
In order to achieve all the cryptographic principles (confidentiality, integrity, authentication and non-repudiation) we need a combination of all the procedures above. For simplicity, we do not show all of them combined in the same diagram, but establishing a safe communication really requires a combination of all of them (asymmetric encryption for symmetric key distribution, hash functions, digital signatures and PKI).
Some applications that use the methodologies described above are:
- Transport Layer Security (TLS)
- IP Security (IPsec)
- Secure Shell (SSH)
- Secure Multipurpose Internet Mail Extensions (S/MIME)