Message Authentication Codes (MAC) and Hash Functions in Cryptography

Message Authentication Codes (MAC)

An alternative authentication technique involves the use of a secret key to generate a small fixed-size block of data, known as a cryptographic checksum or MAC, that is appended to the message.

  • Condenses a variable-length message using a secret key to a fixed-size authenticator.
  • This technique assumes that two communicating parties, say A and B, share a common secret key.
  • When A has a message to send to B, it calculates the MAC as a function of the message and key:
    MAC = C(K, M)

Where:

  • M = Message
  • C = MAC function
  • K = Shared Secret key
  • MAC = Message authentication code

The message plus MAC are transmitted to the intended recipient. The recipient performs the same calculation on the received message, using the same shared secret key, to generate a new MAC. The received MAC is compared to the calculated MAC.

Basic Use of MAC

If we assume that only the receiver and the sender know the identity of the secret key and if the received MAC matches the calculated MAC, then:

  1. The receiver is assured that the message has not been altered.
    • If an attacker alters the message, then the receiver’s calculation of the MAC will differ from the received MAC.
  2. The receiver is assured that the message is from the alleged sender.
    • Because no one else knows the secret key.
  3. If the message includes a sequence number (used with HDLC, X.25, and TCP), then the receiver can be assured of the proper sequence because an attacker cannot successfully alter the sequence number.
    • The MAC function is similar to encryption.
    • One difference is that the MAC algorithm need not be reversible (No Decryption).
    • The MAC function is a many-to-one function.
    • For example, suppose that we are using 100-bit messages and a 10-bit MAC. Then, there are a total of 2^100 different messages but only 2^10 different MACs.

Many messages may have the same MAC. However, finding these can be very difficult.

Requirements for MAC

The MAC needs to satisfy the following:

  • Knowing a message and MAC, it is infeasible to find another message with the same MAC.
  • MACs should be uniformly distributed.
  • The MAC should depend equally on all bits of the message.

Here are three situations in which a message authentication code is used:

  1. There are a number of applications in which the same message is broadcast to a number of destinations. Ex: Notification to users that the network is now unavailable or an alarm signal in a military control center.
  2. Another possible scenario is an exchange in which one side has a heavy load and cannot afford the time to decrypt all incoming messages. Authentication is carried out on a selective basis, messages being chosen at random for checking.
  3. Authentication of a computer program in plaintext is an attractive service. The computer program can be executed without having to decrypt it every time, which would be wasteful of processor resources. However, if a MAC were attached to the program, it could be checked whenever assurance was required of the integrity of the program.

Other rationales:

  1. For some applications, it may not be of concern to keep messages secret, but it is important to authenticate messages.
  2. Separation of authentication and confidentiality functions affords architectural flexibility.
  3. A user may wish to prolong the period of protection beyond the time of reception and yet allow processing of message contents.

Hash Functions

  • A variation on the message authentication code is the one-way hash function.
  • Condenses arbitrary message to fixed-size hash code.
  • A hash function accepts a variable-size message M as input and produces a fixed-size output, referred to as a hash code H(M).
  • Unlike a MAC, a hash code does not use a key but is a function only of the input message.
  • The hash code is also referred to as a message digest or hash value.
  • The hash code is a function of all the bits of the message and provides an error-detection capability.
  • A change to any bit or bits in the message results in a change to the hash code.
  • In general, the hash is much smaller than the input data; hence, hash functions are sometimes called compression functions.
  • The hash function is assumed to be public.

Requirements for Hash Functions

The purpose of a hash function is to produce a “fingerprint” of a file, message, or other block of data. A hash function H must have the following properties:

  1. H can be applied to a block of data of any size.
  2. H produces a fixed-length output.
  3. H(x) is relatively easy to compute for any given x, making both hardware and software implementations practical.
  4. For any given value h, it is computationally infeasible to find x such that H(x) = h.
    • Referred to as the one-way property.
  5. For any given block x, it is computationally infeasible to find y ≠ x such that H(y) = H(x).
    • Referred to as weak collision resistance.
  6. It is computationally infeasible to find any pair (x, y) such that H(x) = H(y).
    • Referred to as strong collision resistance (Birthday Attack – In a group of people, at least two will have the same birthday).
  7. Pseudorandomness: The output of H meets standard tests for pseudorandomness.

Variable input size => Fixed output size

Public Key Distribution Methods

  1. Public Announcement: Here, the public key is broadcast to everyone. The major weakness of this method is forgery. Anyone can create a key claiming to be someone else and broadcast it. Until the forgery is discovered, the forger can masquerade as the claimed user.
  2. Publicly Available Directory: In this type, the public key is stored in a public directory. Directories are trusted here, with properties like participant registration, access, and allowing modification of values at any time. It contains entries like {name, public-key}. Directories can be accessed electronically but are still vulnerable to forgery or tampering.
  3. Public Key Authority: It is similar to the directory but improves security by tightening control over the distribution of keys from the directory. It requires users to know the public key for the directory. Whenever keys are needed, real-time access to the directory is made by the user to obtain any desired public key securely.
  4. Public Certification: This time, the authority provides a certificate (which binds an identity to the public key) to allow key exchange without real-time access to the public authority each time. The certificate is accompanied by some other information, such as the period of validity, rights of use, etc. All of this content is signed by the private key of the certificate authority, and it can be verified by anyone possessing the authority’s public key. The first sender and receiver both request the CA for a certificate that contains a public key and other information, and then they can exchange these certificates and start communication.

Secret Key Distribution with Confidentiality and Authentication

  • Provides protection against both active and passive attacks.
  • It is assumed that A and B have exchanged public keys by one of the schemes:
  1. A uses B’s public key to encrypt a message to B containing an identifier of A (IDA) and a nonce (N1), which is used to identify this transaction uniquely.
  2. B sends a message to A encrypted with PUa and containing A’s nonce (N1) as well as a new nonce generated by B (N2). Because only B could have decrypted message (1), the presence of N1 in message (2) assures A that the correspondent is B.
  3. A returns N2 encrypted using B’s public key to assure B that its correspondent is A.
  4. A selects a secret key Ks and sends M = E(PUb, E(PRa, Ks)) to B.
    • Encryption of this message with B’s public key ensures that only B can read it.
    • Encryption with A’s private key ensures that only A could have sent it.
  5. B computes D(PUa, D(PRb, M)) to recover the secret key Ks.

Digital Signature Standard (DSS)

  • US government-approved signature scheme, designed by NIST in the early 1990s.
  • Published in 1991, revised in 1993, 1996, and 2000.
  • Uses the SHA hash algorithm.
  • DSS is the standard; DSA is the algorithm.
  • In 2000, it included alternative RSA and elliptic curve signature variants.
  • DSA is a digital signature only. However, RSA is a public-key technique that can be used for encryption, key exchange, and digital signatures.

Two Approaches for Digital Signatures

RSA Approach

  • The message to be signed is input to a hash function that produces a secure hash code of fixed length.
  • This hash code is then encrypted using the sender’s private key to form the signature.
  • Both the message and the signature are then transmitted.
  • The recipient takes the message and produces a hash code.
  • The recipient also decrypts the signature using the sender’s public key.
  • If the calculated hash code matches the decrypted signature, the signature is accepted as valid. Because only the sender knows the private key, only the sender could have produced a valid signature.

DSS Approach

  • The DSS approach also makes use of a hash function.
  • The hash code is provided as input to a signature function along with a random number generated for this particular signature.
  • The signature function also depends on the sender’s private key and a set of parameters known to a group of communicating principals (global public key components).
  • The result is a signature consisting of two components, (s and r), which are sent along with the message.
  • At the receiving end, the hash code of the incoming message is generated.
  • The hash code plus the signature is input to a verification function.
Direct Digital Signature Arbitrated Digital Signature
It only requires the communicating parties.It requires an arbiter along with communicating parties to send or receive messages.
In this, the digital signature encrypts the whole plaintext with the sending party’s private key.The encrypted message is sent by X to arbiter Z with Y’s ID, timestamp, and some random number PQ.
The message is directly transmitted between both parties without any help from an intermediate.An arbiter is needed to transmit the message.
The timestamp is not maintained by both sides.The timestamp is maintained by all three members by default.
In case confidentiality is needed, the message will be encrypted with the receiver’s public key or a shared key.

The arbiter provides confidentiality of the message.

Diffie-Hellman Man-in-the-Middle Attack

  1. Darth prepares for the attack by generating two random private keys, XD1 and XD2, and then computing the corresponding public keys, YD1 and YD2.
  2. Alice transmits YA to Bob.
  3. Darth intercepts YA and transmits YD1 to Bob. Darth also calculates K2 = (YA)XD2 mod q.
  4. Bob receives YD1 and calculates K1 = (YD1) XB mod q.
  5. Bob transmits YB to Alice.
  6. Darth intercepts YB and transmits YD2 to Alice. Darth calculates K1 = (YB) XD1 mod q.
  7. Alice receives YD2 and calculates K2 = (YD2) XA mod q.

At this point, Bob and Alice think that they share a secret key, but instead, Bob and Darth share secret key K1, and Alice and Darth share secret key K2. However, all future communication between Bob and Alice is compromised in the following way:

  1. Alice sends an encrypted message M: E(K2, M).
  2. Darth intercepts the encrypted message and decrypts it to recover M.
  3. Darth sends Bob E(K1, M) or E(K1, M’), where M’ is any message.
    • In the first case, Darth simply wants to eavesdrop on the communication without altering it.
    • In the second case, Darth wants to modify the message going to Bob.

Attacks on RSA

1. The Mathematical Attack – Factoring Problem

We can identify three approaches to attacking RSA mathematically:

  1. Factor n into its two prime factors n = p.q.
    • Hence, this enables the calculation of Ø (n) = (p – 1) × (q – 1).
    • Which in turn enables the determination of d = e-1(mod Ø (n)).
  2. Determine Ø (n) directly, without first determining p and q.
    • Again, this enables the determination of d = e-1 (mod Ø (n)).
  3. Determine d directly, without first determining Ø (n). With presently known algorithms, determining d given e and n appears to be at least as time-consuming as the factoring problem. Hence, we can use factoring performance as a benchmark against which to evaluate the security of RSA.

To avoid values of n that may be factored more easily, the algorithm’s inventors suggest the following constraints on p and q:

  1. p and q should differ in length by only a few digits.
    • Thus, for a 1024-bit key (309 decimal digits), both p and q should be on the order of magnitude of 1075 to 10100.
  2. Both (p – 1) and (q – 1) should contain a large prime factor.
  3. Gcd (p – 1, q – 1) should be small.

In addition, it has been demonstrated that if e < n and d < n1/4, then d can be easily determined.

2. Timing Attack

  • A timing attack is somewhat analogous to a burglar guessing the combination of a safe by observing how long it takes for someone to turn the dial from number to number.
  • A snooper can determine a private key by keeping track of how long a computer takes to decipher messages.
  • This attack is alarming for two reasons:
    • It comes from a completely unexpected direction.
    • It is a ciphertext-only attack.
  • Therefore, if the observed time to execute the decryption algorithm is always slow when this particular iteration is slow with a 1 bit, then this bit is assumed to be 1.
  • If a number of observed execution times for the entire algorithm are fast, then this bit is assumed to be 0.

Countermeasures:

  1. Use constant exponentiation time:
    • Ensure that all exponentiations take the same amount of time.
    • This is a simple fix but does degrade performance.
  2. Add random delays:
    • Better performance could be achieved by adding a random delay to the exponentiation algorithm to confuse the timing attack.

MD5

  • Designed by Ronald Rivest (the R in RSA) in 1991.
  • MD5, or the Message Digest algorithm, is a hash function that is used in cryptography.
  • This algorithm/function takes an input message of arbitrary length to produce a 128-bit or 16-byte hash value or message digest.
  • Latest in a series of MD2, MD4.
  • Until recently, it was the most widely used hash algorithm. In recent times, it has had both brute-force and cryptanalytic concerns.

Algorithm

  • Append Padding Bits: In the first step, append padding bits to the original message in such a way that the total length of the message is 64 bits less than the exact multiple of 512. Pad the message so its length is 448 mod 512. The initial bit of padding is 1, while the remaining bits are all zeros: 1000….0.
  • Append a 64-bit length value to the message: In this step, append the length bit in the output of the first step in such a way that the total number of bits is the perfect multiple of 512.
  • Initialize the 4-word (128-bit) MD buffer (A, B, C, D).

MD5 Processing of a Single 512-bit Block

  • Each 512-bit block gets broken down into 16 sub-blocks of 32 bits each.
  • Process the message in 16-word (512-bit) blocks: Four auxiliary functions are used, each of which accepts three 32-bit words as input and outputs one 32-bit word as a result. These functions make use of logical operators such as AND, XOR, OR, and NOT.
  • T[i] is the 32-bit constant value derived from the mathematical sine function (64 values).
  • X[k] is the kth 32-bit word in the 512-bit block of the message.

MD5 Compression Function

  • Each round has 16 steps of the form: a = b + ((a + g(b,c,d)+ X[k]+T[i]) <<
  • a, b, c, and d refer to the 4 words of the buffer but are used in varying permutations. Note this updates 1 word only of the buffer. After 16 steps, each word is updated 4 times.
  • Where g(b,c,d) is a different nonlinear function in each round (F, G, H, I).

Advantages

  • Every hash bit is dependent on all message bits.
  • It is easier to compare and store smaller hashes using MD5 algorithms than it is to store a large variable-length text.
  • By using MD5, passwords are stored in 128-bit format.
  • A message digest can easily be created from an original message using MD5.

Disadvantages

  • When compared to other algorithms like the SHA algorithm, MD5 is comparatively slow.
  • It is possible to construct the same hash function for two distinct inputs using MD5.
  • MD5 is less secure when compared to the SHA algorithm since MD5 is more vulnerable to collision attacks.