# A quick look to Cryptography

Cryptography deals with algorithms suited to protect relevant information against potential attacks performed by malicious men that want to steal private information such as business or personal sensible data. Cryptography is concerned with all issues related to message-security, peer-authentication, and integrity-check. Historically, cryptographic systems were first used in the military field. For example, the Roman general Julius Caesar, when sending a reserved message because he did not trust the messenger, replaced every A with a D, B with E and so on. Only who knew the “shift-three” rule was able to decrypt the message. In addition, during the 2nd World War, some successes of the Allied Army were related to the discovering of the ciphering rules used by the German Army to encrypt their messages.

Nowadays, cryptography is no more confined to military applications. Currently, the modern communications society is asking every day for increasingly more privacy and protection, therefore the efforts are concentrated in building more powerful and secure algorithms for information protection and user authentication

## Terms

To be familiar with the basic concepts of cryptography here is a brief list of key terms:

is the art or science of making messages secure;*Cryptography*- The
is the art of breaking a cryptographic system down and decrypting its messages;*cryptographic analysis* is a branch of mathematics, which studies the mathematical bases of cryptography.**Cryptology**- An unencrypted message is named
or*plaintext*;*cleartext* - An encrypted message is named
. It is a random sequence of symbols, therefore it is not understandable;*ciphertext* and**Encryption**are message’s transformations from plaintext to ciphertext and vice versa, respectively. Usually, they are performed using a key. In such case decryption can happen only knowing the exact key used to encrypt the message;**Decryption**means to check the identity of an individual involved in communication;**Authentication**- The
is proof that the message has not tampered during transmission;**integrity check** - A
is a ciphered string, extracted from a message, which allows to identify the sender and to verify the message’s integrity;**digital signature** - A
is a virtual ID document which allows identifying entities on the network.**digital certificate**

## Cryptography

Cryptography provides a series of algorithms and methods to make a message not interpretable. Some algorithms are very powerful and resisted to several kinds of attacks, others are less secure even if they also are relevant in this explanation.

The goal of an encryption algorithm is to make as complex as possible decrypting a message without any knowledge about the key. With a good encryption algorithm, the unique chance to decrypt an encrypted message is trying all possible keys, one at a time, until the correct key is found. Of course, the number of attempts grows exponentially as the key length grows. Therefore, with a simple 40-bit key you need up to 1.1*1012 different attempts.

It’s easy to infer that in a cryptographic system the most sensitive operation is the key generation. To be effectively secure, a cryptographic system must adopt a long enough key; also, the key must be actually created randomly, to be considered absolutely unpredictable. For such a reason it is preferable to not use computer-provided pseudo-random generators, suitable for gaming and simulations. Better it is to adopt more sophisticated systems which take advantage of a typical background noise from the physical world which cannot be absolutely predictable. Suitable sources of random numbers are radioactive-decay processes, the background noise of a semiconductor, or the time range elapsing between user’s keyboard actions. The most commonly used noise-source rely on mouse movements or the user’s typing time in milliseconds

Anyway, it is not so easy to estimate the goodness of an algorithm. Sometimes, algorithms supposed to be very strong, after several in-depth crypto-analysis and after several attacks, they are revealed to be very easy to violate with a proper attack. Anyway, it’s better to choose algorithms which resist from a longer time.

Encryption algorithms are categorized into two classes

- private key algorithms, aka symmetric algorithms;
- public key algorithms, aka asymmetric algorithms.

The difference is that the former use the same key to encrypt and decrypt a message, while the latter adopts two complementary distinct keys, a public one and a private one (more on this in the next section).

## Private Key Algorithms

Private key algorithms, also said symmetric algorithms, are the most commonly used. They use the same key to encrypt and decrypt a message. Both peers know and share the key used to encrypt a message, named private key or symmetric key, and only they can encrypt and decrypt the message (Figure 1).

*Figure 1 - Functional Schema of Exchanging a Message using a Symmetric Algorithm*

Depending on the kind of computation, there are two types of ciphering:

(sequential ciphering): the message is interpreted as a sequence of bits and is ciphered a bit at a time. They are surely the fastest way to encrypt a message, but they are considered less secure, although the security depends on the algorithm used;**stream cipher**: the message is broken down in fixed-length blocks and it is encrypted a block at a time. Although slower than stream cipher, they are generally considered more secure because each block is encrypted mixing it properly with the previous block;**block cipher**

An example of a stream cipher is:

. Developed by RSA Data Security Inc. it is a very fast variable-length-key algorithm, it is the most widely-used software stream cipher and is used in popular protocols such as Secure Sockets Layer (SSL) (to protect Internet traffic) and WEP (to secure wireless networks). Although its security it’s still not well proven, it resisted very well to several kinds of attacks. However, has been proved that some ways of using RC4 can lead to very insecure cryptosystems. Essentially, RC4 uses a random-numbers generator. The generated number is then applied via XOR to the sequence of bits which composes the cleartext.**RC4**: stream cipher used to provide voice privacy in the GSM cellular telephone protocol. A5/1 is used in Europe and the United States; a weaker cipher, A5/2, is used in countries that are not considered trustworthy enough to have strong cryptography in communication. A5/1 was developed in 1987 when GSM was not yet considered for use outside Europe, and A5/2 was developed in 1989. Both were initially kept secret. However, the general design was leaked in 1994, and the algorithms were entirely reverse engineered in 1999 by Marc Briceno from a GSM telephone. In 2000, around 130 million GSM customers relied on A5/1 to protect the confidentiality of their voice communications. Briefly, A5/1 is used to produce from a 114-bit block of cleartext another 114-bit sequence of keystream which is XORed with the former 114 bits prior to modulation. Internally, A5/1 is based around a combination of three linear feedback shift registers (LFSRs) with irregular clocking.**A5/1**is a stream cipher used in the Bluetooth protocol. It generates a sequence of pseudorandom numbers and combines it with the cleartext using the XOR operator. The key length may vary but is generally 128 bits. At each iteration, E0 generates a bit using four shift registers of differing lengths (25, 31, 33, 39 bits) and two internal states, each 2 bits long. At each step, the registers are shifted and the two states are updated with the current state, the previous state and the values in the shift registers. Four bits are then extracted from the shift registers and added together. The algorithm XORs that sum with the value in the 2-bit register. The first bit of the result is ciphertext.**E0**

Examples of block ciphers are:

. Developed at IBM in 1970, it has been for years the adopted encryption algorithm for US government and currently it has been superseded by the Advanced Encryption Standard (AES). It uses a 64-bit block and a 56-bits key. Due to the not so long key, it can be easily broken down with modern computers. Recently, has been conceived a variation, named Triple-DES or 3DES, which ciphers three times the message using each time a different key**DES / 3DES**. Developed by Bruce Schneier, adopts a 64-bit block and an up to 448-bit key. It achieved a general agreement by the international community and there is still no successful attack known. It is used in a few well-known systems, such as Nautilus e PGPfone**Blowfish**(International Data Encryption Algorithm). Developed in Switzerland in 1991, IDEA operates on 64-bit blocks using a 128-bit key and consists of a series of eight identical transformations and an output transformation. The processes for encryption and decryption are similar. IDEA derives much of its security by interleaving operations from different groups — modular addition and multiplication, and bitwise eXclusive OR (XOR) therefore it is generally considered very secure.**IDEA**, aka AES (Advanced Encryption Standard). Developed by Joan Daemen and Vincent Rijmen (after the name derives) in the later ‘90, recently has been adopted as an encryption standard by the U.S. government instead of DES. As of this writing, AES is one of the most popular algorithms used in symmetric key cryptography. AES is not precisely Rijndael (although in practice they are used interchangeably): AES has a fixed block size of 128 bits and a key size of 128, 192 or 256 bits, whereas Rijndael can be specified with key and block sizes in any multiple of 32 bits, with a minimum of 128 bits and a maximum of 256 bits, respectively.**Rijndael**

The great advantage of private-key algorithms is in their fast execution and they are suitable to encrypt a large amount of data, but their disadvantage is that the private key must be distributed to all recipients. Therefore, it is necessary to establish a further secure channel through which keys can be distributed securely. In the recent past, such an inconsistency imposed some limits on the evolution of cryptography, until public-key algorithms were introduced.

## Public Key Algorithms

Public-key algorithms use two complementary keys, named public key and private key, which are created so that the private key cannot be in any way extracted from the public key.

The communication works as follows (see Figure 2): two peers Alice and Bob both own a key-pair. Alice who wants to send an encrypted message to Bob takes

*Figure 2 - Functional Schema of the exchange of a message using a public key algorithm*

Bob’s public key and encrypt the message using such a public key. Then Alice sends the resulting encrypted message to Bob.

A message which has been encrypted with a public key can only be decrypted with the corresponding private key. So Bob, using his private key, can decrypt and read securely the message encrypted by Alice.

Using such a method, just the private key must be kept secret, while the public key can be delivered to who need to send an encrypted message to the public key’s owner. If a malicious user intercepts the public key, he has no chance to decrypt messages.

Public-key cryptography adopts smart algorithms very easy to be predicted in one direction but nearly hard to solve inversely. For example, Diffie-Hellman algorithm is based on discrete logarithm problem (more on this later): two peers, Alice and Bob, own each one a secret number, x and y respectively and both know a public number g. Alice calculates the value g*x and sends the result to Bob so that Bob can calculate the value (g * x) * y . Such a number is common to both (Alice could calculate (g * y) * x in the same way) so it becomes they shared encryption secret key. One could argue that intercepting g * x and g * y and knowing g it is possible to go back to (g * x) * y ! To discourage even the smarter spy, Alice uses the modulus function and sends to Bob the value of g * x mod p, because is almost mathematically impossible infer x from g * x mod p, even if you know g and p.

Later, the Diffie-Hellman algorithm has been refined with further mathematical integrations resulting in modern cryptographic systems based on a pair of complementary keys, a private one (which is the value of x), and a public one (formed by g, p and the value of g * x mod p).

The most popular algorithms are:

. As said before, is generally considered secure, particularly when exchanging symmetric keys, if used with keys long enough of at least 1024 bits.**Diffie-Hellman**. Invented in 1977 by Ron Rivest, Adi Shamir and Leonard Adleman at MIT (the letters RSA are the initials of their surnames), it is the most popular, both to encrypt messages and for digital signature. It is generally considered secure when used with key-length of at least 1024 bits - a 512-bit key is not secure, 768-bit is moderately secure, 1024-bit is considered secure till 2008 (when a powerful pc will factorize a 1024 bit integer) while 2048 bit is considered secure for still many years. RSA is based on the fact that it is hard to break a number down in its prime factors (Integer Factorization Problem). In fact, using two large enough numbers x and y it is very to calculate the product p = x * y but at the same time, it is highly difficult to decompose p to retrieve x and y.**RSA**, aka ECC. It is a public key cryptosystem, invented separately by Neal Kiblitz e Victor Miller in 1985, based on Elliptic Curve theory defined on finite fields. Recently it is receiving much attention because of factorization of a 1024 bit integer which make RSA less secure and because it needs easier calculations then RSA for producing the cryptographic quantities. The most of calculations are additions and multiplications on integers and/or points on the curve. Viceversa, RSA cryptosystem requires modulus calculations considerably more complex and harder in terms of calculation resources. Therefore elliptic curve cryptosystem better fits for implementations on devices having reduced resources such as smart cards, mobile phones and so on. It is generally considered secure when used with key-length of at least 160 bits, which is comparable to RSA 1024 bit. However, it allows using keys of 192, 256, 358 bits. Elliptic Curve Cryptography is based on Discrete Logarithm Problem: considering the equation: ax = b mod p (where mod stays for the rest of the division b/p and where x belongs to multiplicative group formed by integers modulo p: G = Z * p, i.e. the set of integers {1,2....p}) Discrete Logarithm Problem consists in finding a value for x such that the above equation is satisfied. As of this writing, there isn't any efficient algorithm to calculate discrete logarithms. This because it is a problem belonging to the NP (non-polynomial) class for which finding a solution requires exponential time respect to dimensions of the problem.**Elliptic Curve Cryptography**(Digital Signature Standard). It is mainly used for digital signature and was adopted by the USA government.**DSS**

## Typical usage for encryption

Public-key algorithms are noticeably slower than private-key (symmetric key) algorithms, especially to encrypt a large amount of data. So, cryptographic systems, usually opt for symmetric algorithms to encrypt messages and public-key algorithms to encrypt/exchange symmetric keys.

The sender generates a symmetric key, encrypts the message, encrypts the key generated by the receiver’s public key and send the message along with the generated key. The receiver decrypts the symmetric key using its own private key and finally decrypt the message.

## Digital Signature

Thanks to complementariness of public and private keys, a string encrypted with one of the keys can be only decrypted with the other key. So, a text’s decryption via a key ensures that it was encrypted with the complementary key. Digital signature algorithms take advantage of that characteristic to verify the message’s actual origin (sender’s authentication).

The digital signature is a string retrieved by the message applying a particular algorithm, it is encrypted through the sender’s private key and it is sent along with the message. Decrypting the signature through the public key proves that it was encrypted by the sender (or on behalf of the sender) using the correct private key. Also, comparing the decrypted string with a freshly message-retrieved string using the same algorithm, allows to check for integrity: if there is a match between the two strings it means the message has not tampered.

Here is how it works (see Figure 3): the sender, who is the unique owner of the private key, produces a footprint of the message, named hash or message-digest, and encrypt it with his own private key. The encrypted hash string represents the digital signature.

*Figure 3 - Signature's decryption via public key: sender side*

The receiver (see Figure 4) receives the message along with the signature. Starting from the message (which could be also encrypted), he reconstructs the footprint, whereas from the hash, once decrypted using the sender’s public key, he gets the message’s footprint as it was at the time of sending it. If the two footprints match the receiver are sure that the signature was apposed using the sender’s private key and that he has not tampered during the transmission.

*Figure 4 - Signature's decryption via public key: receiver side*

## Hashing Algorithms

Digital signature algorithms mainly rely on hashing algorithms, which are one-way algorithms that produce, starting from a variable-length string, a fixed-length string (typically between 64 and 255 bit in length) which is a characteristic of the given string.

The power of hashing algorithms is the fact that given an hash-string it is impossible to retrieve the original message; it is really difficult to build two messages producing the same hash-string; any message, submitted to the same algorithm an arbitrary number of times, always produce the same hash value.

Most popular algorithms are:

(Message Digest Algorithm 5), developed by RSA Data Security Inc., it is the successor of MD2 and MD4, which are algorithms not more used. MD5 produces a 128-bits hash starting from strings of arbitrary length. It is widely adopted and is considered enough secure.**MD5**(Secure Hash Algorithm), developed by NIST (National Institute of Standards and Technology) and NSA (National Security Agency), is used by USA government and produce 160-bits / 256bit / 512 hash-strings starting from strings of arbitrary length. It is considered enough secure. Usually, it used along with DSS, RSA, and ECC.**SHA1 / SHA256 / SHA512**

## Certification

Digital certificates assume a centric role in public-key cryptography. Their goal is to authenticate an individual certifying that the public key contained into the certificate actually owns to the entity to who has been issued. So, it assumes a centric role for public key exchange. A certificate is, under any standpoint, a digital ID document. Like an actual ID, it contains a set of attributes which identify the certificate’s owner and it is issued by an official organization, said Certification Authority (CA), which guarantees for the authenticity of certificate’s information. Usually, a certificate also contains the owner’s public key, some information related to the CA itself, the CA’s digital signature and expiration date. To obtain a certificate, an individual fills a certification request form with his own personal data attaching his public key and sending the request to the CA. The CA checks for the legitimacy of data and, if the response is successful, sends the certificate to the requester signing it with its own private key. Now, the requester is able to send the certificate to others to be authenticated later and to deliver its public key. The identity-check operation is performed on the certificate’s signature, as opposed by the CA, which make available its own public key.

## Add comment

Cancel reply to comment