
Cryptography Final Exam Review - English Version

Part 1: Symmetric Tools

1. PRG -> Stream Cipher

1.1 Pseudo-Random Generator (PRG)

Exam Focus: Concept Explanation - What is PRG?

A pseudo-random generator (PRG) is a deterministic algorithm that expands a short random seed into a long sequence, such that this sequence is computationally indistinguishable from a truly random sequence. PRG is a fundamental building block in cryptography that solves the core problem of "how to generate a large amount of seemingly random data using a small amount of randomness."

Formal Definition: Let $G: \{0,1\}^n \rightarrow \{0,1\}^l$ be a function where $l > n$ (expansion requirement). $G$ is a secure PRG if for all polynomial-time distinguishers $D$ , there exists a negligible function $\epsilon$ such that: $|\Pr[D(G(s)) = 1] - \Pr[D(r) = 1]| \leq \epsilon(n)$ where $s \leftarrow \{0,1\}^n$ is a random seed and $r \leftarrow \{0,1\}^l$ is a truly random string.

Key Understanding:

Input: a short random seed $s$ (e.g., 128 bits), which is the only source of randomness
Output: a long pseudo-random sequence $G(s)$ (e.g., 1GB), much longer than the input
Security: the output sequence is computationally indistinguishable from a truly random sequence, meaning no polynomial-time algorithm can distinguish $G(s)$ from a truly random string $r$
Deterministic: the same seed always produces the same output sequence
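
To make the interface concrete, here is a minimal sketch of seed expansion. It is an illustration only: the name `toy_prg` and the construction (hashing the seed together with a counter using SHA-256) are assumptions made for this example, not a construction from the course and not a proven PRG.

```python
import hashlib

def toy_prg(seed: bytes, out_len: int) -> bytes:
    """Expand a short seed into out_len bytes (illustrative, not a vetted PRG)."""
    out = bytearray()
    counter = 0
    while len(out) < out_len:
        # Each 32-byte output chunk is SHA-256(seed || counter).
        out += hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:out_len])

seed = bytes(16)               # a fixed 128-bit seed, for demonstration only
stream = toy_prg(seed, 1000)   # 1000 output bytes from 16 seed bytes
print(len(stream), toy_prg(seed, 1000) == stream)
```

The sketch shows the two structural properties from the list above: expansion (output longer than the seed) and determinism (same seed, same output). Indistinguishability cannot be demonstrated by running code; it is a security assumption.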

Why Do We Need PRG? In cryptography, we often need large amounts of random numbers (e.g., encryption key streams), but generating truly random numbers is expensive. PRG allows us to generate large amounts of "seemingly random" data using a small amount of true randomness (the seed), greatly reducing the randomness requirement.

Practical Applications:

Generating encryption key streams (for stream ciphers)
Generating session keys
Generating initialization vectors (IV)
As building blocks for other cryptographic primitives

Exam Focus: PRG Security Definition A distinguisher $D$ is an algorithm that attempts to distinguish PRG output from a truly random string. If $D$ outputs 1 to indicate it thinks the input is random, then:

$\Pr[D(G(s)) = 1]$ is the probability that the distinguisher thinks the PRG output is random
$\Pr[D(r) = 1]$ is the probability that the distinguisher thinks a truly random string is random
If the difference between these probabilities is negligible, it means the PRG output is indistinguishable from a truly random string

1.2 Stream Cipher

Exam Focus: Concept Explanation - How Stream Cipher Works

A stream cipher is a symmetric encryption scheme that uses a key stream generated by a PRG to perform bitwise XOR with the plaintext. The core idea of stream ciphers is: if the key stream is truly random, then the ciphertext is a "one-time pad," which is theoretically unbreakable encryption.

Detailed How It Works:

Key Generation: Sender and receiver share a key $k$ (e.g., 128 bits)
Initialization Vector: Choose a random or pseudo-random $IV$ (initialization vector). $IV$ need not be secret, but should be different for each encryption
Key Stream Generation: Use PRG to generate key stream $K = G(k, IV)$ , with length equal to plaintext length
Encryption: Ciphertext is obtained by bitwise XOR: $c = m \oplus K$

Mathematical Representation: For the $i$ -th bit: $c_i = m_i \oplus k_i$ where $m_i$ is the $i$ -th bit of plaintext, $k_i$ is the $i$ -th bit of key stream, and $c_i$ is the $i$ -th bit of ciphertext.

Decryption Process: Due to the property of XOR ( $A \oplus B \oplus B = A$ ), decryption simply requires XOR again: $m_i = c_i \oplus k_i = (m_i \oplus k_i) \oplus k_i = m_i \oplus (k_i \oplus k_i) = m_i \oplus 0 = m_i$

Calculation Problem: Stream Cipher Encryption/Decryption Example: Given plaintext $m = 10110101$ and key stream $k = 11001011$ , calculate ciphertext $c$ and decrypted plaintext. Solution:

Encryption: $c = m \oplus k = 10110101 \oplus 11001011 = 01111110$
Decryption: $m' = c \oplus k = 01111110 \oplus 11001011 = 10110101 = m$ ✓
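
The calculation above can be checked with a few lines of Python. The helper name `xor_bits` is chosen for this sketch only.

```python
def xor_bits(a: str, b: str) -> str:
    """Bitwise XOR of two equal-length bit strings."""
    assert len(a) == len(b)
    return "".join("1" if x != y else "0" for x, y in zip(a, b))

m = "10110101"   # plaintext from the example
k = "11001011"   # key stream from the example
c = xor_bits(m, k)          # encryption
recovered = xor_bits(c, k)  # decryption is the same operation
print(c, recovered)
```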

Security Requirements (Exam Focus):

Key Stream Length: The key stream must cover the whole plaintext: generate at least $|m|$ bits and use exactly the first $|m|$ of them
Non-Reusability: The same combination of key and IV must never be reused! If reused, an attacker can compute: $c_1 \oplus c_2 = (m_1 \oplus k) \oplus (m_2 \oplus k) = m_1 \oplus m_2$ This reveals the XOR of two plaintexts, leaking information
PRG Security: The underlying PRG must be cryptographically secure, otherwise the key stream may be predictable
IV Uniqueness: Each encryption must use a different IV to ensure different key streams
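
The key stream reuse problem can be observed directly. In this sketch the key stream bytes and the two messages are made-up values for illustration.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings of equal length."""
    return bytes(x ^ y for x, y in zip(a, b))

keystream = bytes.fromhex("a1b2c3d4e5f60718")  # hypothetical reused key stream
m1 = b"ATTACK!!"
m2 = b"RETREAT!"
c1 = xor_bytes(m1, keystream)
c2 = xor_bytes(m2, keystream)

# The key stream cancels out: c1 xor c2 equals m1 xor m2.
leak = xor_bytes(c1, c2)
print(leak == xor_bytes(m1, m2))
```

An attacker who knows or guesses $m_1$ recovers $m_2$ as `xor_bytes(leak, m1)`, without ever learning the key.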

Scheme Design Problem: Why Do We Need IV? Without IV, the same plaintext always produces the same ciphertext, which is insecure. The role of IV is to ensure that even with the same key, each encryption produces a different key stream, thus producing different ciphertexts. IV can be a counter, random number, or timestamp, but must ensure uniqueness.

Advantages:

Fast encryption (only requires XOR operation)
Simple implementation (easy to implement in both hardware and software)
Suitable for real-time communication (can generate key stream while encrypting)
Errors do not propagate (one bit error only affects one bit)

Disadvantages:

Key stream cannot be reused (must ensure IV uniqueness)
Requires synchronized IV (sender and receiver must use the same IV)
Complex key management (need to securely share and store keys)

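Proof Problem: Security of Stream Cipher If PRG $G$ is secure, then the stream cipher based on $G$ is semantically secure when each key stream encrypts a single message (security against an eavesdropper). CPA security additionally requires a fresh IV for every encryption, so that no key stream is ever repeated. Proof Idea (by reduction):

Assume there exists an attacker $A$ that can break the stream cipher with non-negligible advantage
Construct a distinguisher $D$ that, given a string $w$ (either $G(s)$ or truly random), picks a random bit $b$ , gives $A$ the ciphertext $w \oplus m_b$ , and outputs 1 if $A$ guesses $b$ correctly
If $w$ is truly random, the ciphertext is a one-time pad and $A$ guesses correctly with probability exactly $\frac{1}{2}$ ; if $w = G(s)$ , $A$ guesses correctly with non-negligible advantage, so $D$ distinguishes the two cases, contradicting PRG security
Therefore, the stream cipher is secure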

2. PRF -> Block Cipher

2.1 Pseudo-Random Function (PRF)

Exam Focus: Concept Explanation - Difference Between PRF and PRG

A pseudo-random function (PRF) is a deterministic function that takes a key and an input, producing an output, such that without knowing the key, it is impossible to distinguish this function from a truly random function. A PRF can be thought of as a "queryable random function table," where querying different inputs yields (seemingly) random outputs.

Formal Definition: Let $F: \{0,1\}^n \times \{0,1\}^m \rightarrow \{0,1\}^n$ be a function family, where the first parameter is the key and the second parameter is the input. $F$ is a secure PRF if for all polynomial-time distinguishers $D$ , there exists a negligible function $\epsilon$ such that: $|\Pr[D^{F_k(\cdot)} = 1] - \Pr[D^{f(\cdot)} = 1]| \leq \epsilon(n)$ where $k \leftarrow \{0,1\}^n$ is a random key and $f$ is a randomly chosen function from all functions $\{0,1\}^m \rightarrow \{0,1\}^n$ (a truly random function).

Key Understanding:

Function Family: For different keys $k$ , $F_k$ are different functions
Deterministic: The same input $x$ and key $k$ always produce the same output $F_k(x)$
Pseudo-randomness: Without knowing the key $k$ , the behavior of $F_k(\cdot)$ is indistinguishable from a truly random function $f(\cdot)$
Queryability: The distinguisher can query the function on arbitrary inputs

Difference Between PRF and PRG (Exam Focus):

Property	PRG	PRF
Input	Fixed-length seed	Key + arbitrary input
Output	Long sequence	Fixed-length output
Query	Generate entire sequence at once	Can query arbitrary inputs on demand
Application	Stream cipher	Block cipher, MAC
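
The "query on demand" column can be illustrated with a keyed function. This sketch instantiates $F_k$ with HMAC-SHA256, which is commonly modeled as a PRF; that choice, and the name `prf`, are assumptions of this example rather than something the notes prescribe.

```python
import hashlib
import hmac

def prf(key: bytes, x: bytes) -> bytes:
    """F_k(x), instantiated here with HMAC-SHA256 (an assumption of this sketch)."""
    return hmac.new(key, x, hashlib.sha256).digest()

key = b"k" * 32
y1 = prf(key, b"input-1")
y2 = prf(key, b"a much longer input, queried on demand")
print(len(y1), len(y2), y1 != y2)
```

Unlike a PRG, nothing is generated in advance: each query yields a fixed-length output, and repeating a query with the same key returns the same value.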

Core Properties:

Deterministic: Same input and key always produce same output, i.e., $F_k(x)$ is a fixed value once $k$ and $x$ are fixed
Pseudo-randomness: Without knowing the key, output appears random
Efficiency: Fast computation, can be computed in polynomial time
Reversibility: PRF itself may not be reversible, but can be used to construct reversible block ciphers

Practical Applications:

Constructing block ciphers (e.g., AES)
Constructing message authentication codes (MAC)
Constructing key derivation functions
As building blocks for other cryptographic primitives

2.2 Block Cipher

Exam Focus: Concept Explanation - What is Block Cipher?

A block cipher is a symmetric encryption scheme that encrypts a fixed-length plaintext block (e.g., 128 bits) into a ciphertext block of the same length. Block ciphers are one of the most fundamental and widely used cryptographic primitives.

Basic Structure: Block ciphers are typically constructed based on PRF, with AES (Advanced Encryption Standard) being the most famous example. The core of a block cipher is designing a reversible, pseudo-random permutation, i.e., for each key, the encryption function is a bijection from all possible plaintext blocks to all possible ciphertext blocks.

How AES Works (Detailed): AES treats a 128-bit plaintext block as a $4 \times 4$ byte matrix (each byte is 8 bits, total 16 bytes = 128 bits).

Input: 128-bit plaintext block $P$ and 128/192/256-bit key $k$
Key Expansion: Expand the key into multiple round keys
Initial Round Key Addition (AddRoundKey): XOR plaintext with first round key $State = P \oplus RoundKey_0$
Multiple Rounds (depending on key length: 10 rounds for 128-bit key, 12 for 192-bit, 14 for 256-bit):
- SubBytes (Byte Substitution): Use S-box (Substitution Box) to perform nonlinear substitution on each byte, providing confusion
- ShiftRows (Row Shifting): Cyclically shift each row of the matrix by different amounts, providing diffusion
- MixColumns (Column Mixing): Perform linear transformation on each column (omitted in last round), further providing diffusion
- AddRoundKey (Round Key Addition): XOR with current round key
Last Round: Omit MixColumns operation
Output: 128-bit ciphertext block $C$

Mathematical Representation: $C = E_k(P)$ where $P$ is the plaintext block, $k$ is the key, $E_k$ is the encryption function, and $C$ is the ciphertext block.

Decryption Process: $P = D_k(C) = E_k^{-1}(C)$ where $D_k$ is the decryption function, the inverse of the encryption function. AES decryption uses inverse operations: InvSubBytes, InvShiftRows, InvMixColumns.

Calculation Problem: AES Encryption Calculation Example: Given AES-128 key $k$ and plaintext block $P$ , describe the encryption process (no need for specific values, but explain each step). Solution Points:

Key expansion: Generate 11 round keys (1 initial + 10 round keys)
Initial round key addition: $State = P \oplus RoundKey_0$
10 rounds of transformation: Each round includes SubBytes, ShiftRows, MixColumns (except round 10), AddRoundKey
Output ciphertext block

Security Analysis (Exam Focus):

Single Block Security: Block ciphers themselves only provide encryption for a single block. If the plaintext is exactly one block, encryption is secure
Multiple Blocks Problem: Directly using block ciphers to encrypt multiple blocks is insecure! Identical plaintext blocks produce identical ciphertext blocks, which leaks information
Encryption Modes Needed: Must use encryption modes (such as CBC, CTR) to securely encrypt multiple blocks

Proof Problem: Why Directly Using Block Cipher to Encrypt Multiple Blocks is Insecure? Proof Idea: Assume directly using $E_k$ to encrypt multiple blocks: $C_i = E_k(P_i)$ If $P_i = P_j$ ( $i \neq j$ ), then $C_i = E_k(P_i) = E_k(P_j) = C_j$ An attacker observing $C_i = C_j$ can infer $P_i = P_j$ , which leaks plaintext information. Therefore, directly using block ciphers to encrypt multiple blocks is not semantically secure.

3. How to Use Block Cipher? -> Encryption Modes

Scheme Design Problem: Why Do We Need Encryption Modes?

Directly using block ciphers to encrypt multiple blocks is insecure because identical plaintext blocks produce identical ciphertext blocks, which leaks information. Encryption modes define how to use block ciphers to securely encrypt messages of arbitrary length (which may contain multiple blocks).

Core Problem: Block cipher $E_k$ can only encrypt fixed-length blocks (e.g., 128 bits). To encrypt longer messages, we need:

Split the message into multiple blocks
Use some method to combine the encryption of these blocks
Ensure the combination method does not leak information

3.1 ECB Mode (Electronic Codebook Mode)

Exam Focus: Concept Explanation - How ECB Mode Works and Its Security

ECB (Electronic Codebook) mode is the simplest encryption mode, where each plaintext block is encrypted independently without affecting others.

How It Works:

Split plaintext $m$ into blocks: $P_1, P_2, \ldots, P_n$ (each block length equals the block cipher block length)
Encrypt each block independently: $C_i = E_k(P_i) \quad \text{for } i = 1, 2, \ldots, n$
Ciphertext is: $C = C_1 || C_2 || \ldots || C_n$

Decryption: $P_i = D_k(C_i) = E_k^{-1}(C_i)$

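Calculation Problem: ECB Mode Encryption Calculation Example: Encrypt message $m = "HELLO WORLD"$ in ECB mode. For illustration, assume a toy block cipher with 5-character blocks (a real AES-128 block is 128 bits = 16 bytes, so this message would fit into a single padded block), and assume the message is padded to a multiple of the block length. If $E_k("HELLO") = C_1$ , $E_k(" WORL") = C_2$ , $E_k("D...") = C_3$ , write the encryption process. Solution: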

Split: $P_1 = "HELLO"$ , $P_2 = " WORL"$ , $P_3 = "D..."$
Encrypt: $C_1 = E_k(P_1)$ , $C_2 = E_k(P_2)$ , $C_3 = E_k(P_3)$
Ciphertext: $C = C_1 || C_2 || C_3$

Security Analysis (Exam Focus): ECB mode has serious security problems:

Identical plaintext blocks produce identical ciphertext blocks: If $P_i = P_j$ , then $C_i = E_k(P_i) = E_k(P_j) = C_j$
Does not hide patterns: Repetitive patterns in plaintext are reflected in ciphertext
Not CPA secure: Attackers can observe which blocks are identical, inferring plaintext information

Proof Problem: Prove ECB Mode is Not CPA Secure Proof Idea: Construct attacker $A$ :

$A$ chooses two plaintexts $m_0 = P || P$ (two identical blocks) and $m_1 = P || Q$ (two different blocks, $P \neq Q$ )
$A$ submits $(m_0, m_1)$ to the challenger and receives challenge ciphertext $c^* = Enc_k(m_b)$
If the first two blocks of $c^*$ are identical, $A$ outputs $b' = 0$ ; otherwise outputs $b' = 1$
$A$ 's success probability is 1 (perfect distinguishing), therefore ECB is not CPA secure

Practical Example: If an image is encrypted using ECB mode, even after encryption, the general outline of the image remains visible because identical pixel blocks produce identical ciphertext blocks.
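
The pattern leak is easy to reproduce with a toy cipher. In this sketch the "block cipher" is a key-seeded random permutation of byte values with 1-byte blocks; it is an assumption made purely for illustration and is not secure.

```python
import random

def make_toy_cipher(key):
    """Toy 8-bit block cipher: a key-seeded permutation of 0..255 (not secure)."""
    table = list(range(256))
    random.Random(key).shuffle(table)
    return table

def ecb_encrypt(table, plaintext: bytes) -> bytes:
    # Each 1-byte block is encrypted independently of all others.
    return bytes(table[b] for b in plaintext)

table = make_toy_cipher(key=2024)
plaintext = b"AAAABBBBAAAA"
ciphertext = ecb_encrypt(table, plaintext)

# Which positions equal the first block? The answer is the same before
# and after encryption, so the repetition pattern survives.
pattern_pt = [b == plaintext[0] for b in plaintext]
pattern_ct = [b == ciphertext[0] for b in ciphertext]
print(pattern_pt == pattern_ct)
```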

Conclusion: ECB mode should not be used in practical applications!

3.2 CBC Mode (Cipher Block Chaining Mode)

Exam Focus: Concept Explanation - How CBC Mode Solves ECB's Problems

CBC (Cipher Block Chaining) mode solves the problem of identical plaintext blocks producing identical ciphertext blocks by XORing each plaintext block with the previous ciphertext block.

How It Works:

Choose IV: Choose a random initialization vector $IV$ (length equals block length, e.g., 128 bits)
First Block: First plaintext block XORed with $IV$ then encrypted $C_1 = E_k(P_1 \oplus IV)$
Subsequent Blocks: Each plaintext block XORed with previous ciphertext block then encrypted $C_i = E_k(P_i \oplus C_{i-1}) \quad \text{for } i \geq 2$

Complete Encryption Formula: $C_0 = IV$ $C_i = E_k(P_i \oplus C_{i-1}) \quad \text{for } i \geq 1$

Decryption Process:

Decrypt first block: $P_1 = D_k(C_1) \oplus IV = D_k(C_1) \oplus C_0$
Decrypt subsequent blocks: $P_i = D_k(C_i) \oplus C_{i-1} \quad \text{for } i \geq 2$

Calculation Problem: CBC Mode Encryption/Decryption Calculation Example: In CBC mode, given $IV = 1010$ , $P_1 = 1100$ , $P_2 = 0110$ , key $k$ , assuming $E_k(0110) = 1110$ , $E_k(1000) = 0001$ , calculate $C_1$ and $C_2$ , then verify decryption. Solution:

Encryption:
- $P_1 \oplus IV = 1100 \oplus 1010 = 0110$
- $C_1 = E_k(P_1 \oplus IV) = E_k(0110) = 1110$
- $P_2 \oplus C_1 = 0110 \oplus 1110 = 1000$
- $C_2 = E_k(P_2 \oplus C_1) = E_k(1000) = 0001$
Decryption Verification:
- $P_1' = D_k(C_1) \oplus IV = D_k(1110) \oplus 1010 = 0110 \oplus 1010 = 1100 = P_1$ ✓
- $P_2' = D_k(C_2) \oplus C_1 = D_k(0001) \oplus 1110 = 1000 \oplus 1110 = 0110 = P_2$ ✓
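
The same computation can be sketched in Python. The block cipher is represented by a lookup table that contains only the two entries used in the solution, $E_k(0110) = 1110$ and $E_k(1000) = 0001$ ; the function names are chosen for this example.

```python
# Toy 4-bit block cipher as a lookup table (only the entries this example needs).
E = {0b0110: 0b1110, 0b1000: 0b0001}
D = {c: p for p, c in E.items()}  # inverse table for decryption

def cbc_encrypt(blocks, iv):
    prev, out = iv, []
    for p in blocks:
        prev = E[p ^ prev]        # C_i = E_k(P_i xor C_{i-1}), with C_0 = IV
        out.append(prev)
    return out

def cbc_decrypt(blocks, iv):
    prev, out = iv, []
    for c in blocks:
        out.append(D[c] ^ prev)   # P_i = D_k(C_i) xor C_{i-1}
        prev = c
    return out

iv = 0b1010
plain = [0b1100, 0b0110]
cipher = cbc_encrypt(plain, iv)
print([format(c, "04b") for c in cipher])
print([format(p, "04b") for p in cbc_decrypt(cipher, iv)])
```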

Key Characteristics (Exam Focus):

Requires IV: Must have an initialization vector, usually transmitted together with the first ciphertext block
Properties of IV:
- $IV$ should be random (or pseudo-random)
- $IV$ need not be secret, can be transmitted in plaintext
- Each encryption should use a different $IV$ (to ensure semantic security)
Different Ciphertexts for Same Plaintext: Even if $P_i = P_j$ , if $C_{i-1} \neq C_{j-1}$ , then $C_i \neq C_j$
Parallelism:
- Encryption is sequential: Must wait for previous block encryption to complete before encrypting next block (cannot parallelize)
- Decryption can be parallelized: All blocks can be decrypted simultaneously because $C_{i-1}$ is already known

Scheme Design Problem: Why Does IV Not Need to Be Secret? Even if the attacker knows $IV$ , as long as $IV$ is random, CBC mode is still CPA secure. The role of $IV$ is to ensure each encryption produces different ciphertexts, not to provide confidentiality. Confidentiality is provided by the encryption function $E_k$ .

Security: CBC mode is semantically secure under CPA (Chosen Plaintext Attack), provided:

The underlying block cipher $E_k$ is a pseudo-random permutation (PRP)
$IV$ is randomly chosen (different for each encryption)

Proof Problem: CPA Security of CBC Mode (Simplified Proof Idea) Proof Idea:

Assume the underlying block cipher $E_k$ is a pseudo-random permutation
If $IV$ is random, then each block's input ( $P_i \oplus C_{i-1}$ ) appears random
Since $E_k$ is pseudo-random, output $C_i$ also appears random
Therefore, the entire ciphertext appears random, cannot distinguish encryptions of two plaintexts
Therefore, CBC mode is CPA secure

3.3 CTR Mode (Counter Mode)

Exam Focus: Concept Explanation - Relationship Between CTR Mode and Stream Cipher

CTR (Counter) mode uses a counter to generate a key stream, then XORs with plaintext. CTR mode essentially converts a block cipher into a stream cipher.

How It Works:

Choose IV: Choose an initial value $IV$ (usually a random number or counter starting value)
Generate Key Stream: Apply block cipher to each counter value $IV + i$ ( $i = 0, 1, 2, \ldots$ ) to generate key stream blocks: $K_i = E_k(IV + i)$
Encryption: Plaintext block XORed with corresponding key stream block: $C_i = P_i \oplus K_i = P_i \oplus E_k(IV + i)$
Decryption: XOR ciphertext with same key stream block: $P_i = C_i \oplus K_i = C_i \oplus E_k(IV + i)$

Mathematical Representation: $C_i = P_i \oplus E_k(IV + i)$ $P_i = C_i \oplus E_k(IV + i)$

Calculation Problem: CTR Mode Encryption Calculation Example: In CTR mode, given $IV = 5$ , $P_1 = 1010$ , $P_2 = 1100$ , key $k$ , assuming $E_k(5) = 1111$ , $E_k(6) = 0101$ , calculate $C_1$ and $C_2$ . Blocks are numbered from 1 in this example, so block $P_j$ uses counter value $IV + (j - 1)$ . Solution:

$K_1 = E_k(IV + 0) = E_k(5) = 1111$
$C_1 = P_1 \oplus K_1 = 1010 \oplus 1111 = 0101$
$K_2 = E_k(IV + 1) = E_k(6) = 0101$
$C_2 = P_2 \oplus K_2 = 1100 \oplus 0101 = 1001$
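
A short sketch reproduces this calculation. The table holds only the two cipher outputs given in the example, and the single function `ctr_xor` (a name chosen here) serves for both encryption and decryption.

```python
E = {5: 0b1111, 6: 0b0101}  # toy table: E_k(5) and E_k(6) from the example

def ctr_xor(blocks, iv):
    # The block at offset j (j = 0, 1, ...) is XORed with E_k(iv + j).
    return [b ^ E[iv + j] for j, b in enumerate(blocks)]

plain = [0b1010, 0b1100]
cipher = ctr_xor(plain, 5)
print([format(c, "04b") for c in cipher])
print(ctr_xor(cipher, 5) == plain)
```

Because each key stream block depends only on the counter value, the list comprehension could process the blocks in any order, which is the parallelism and random access property of CTR mode.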

Key Characteristics (Exam Focus):

Encryption and Decryption Are Identical: Both encryption and decryption use XOR operation, exactly the same operation
Fully Parallelizable: Can encrypt/decrypt all blocks simultaneously because each $K_i = E_k(IV + i)$ can be computed independently
No Padding Needed: Can handle arbitrary length data without padding to block length multiples
Similar to Stream Cipher: CTR mode behaves similarly to stream cipher but based on block cipher rather than PRG
Random Access: Can decrypt arbitrary block without decrypting previous blocks (just need to know $IV$ and block index)

Scheme Design Problem: CTR Mode vs CBC Mode

Property	CBC Mode	CTR Mode
Parallel Encryption	❌ Sequential	✅ Fully parallel
Parallel Decryption	✅ Can parallelize	✅ Fully parallel
Random Access	❌ Need previous blocks	✅ Can directly access
Padding	✅ Required	❌ Not needed
Error Propagation	✅ A ciphertext bit error garbles its own block and flips one bit of the next block	❌ A bit error affects only the same bit of the plaintext
Implementation Complexity	Medium	Simple

Security: CTR mode is semantically secure under CPA, provided:

The underlying block cipher $E_k$ is a pseudo-random permutation (PRP)
$IV$ is randomly chosen, and $IV + i$ values do not repeat (counter does not wrap around)

Proof Problem: CPA Security of CTR Mode Proof Idea:

If $E_k$ is a pseudo-random permutation, then $E_k(IV + i)$ output appears random
Key stream $K_i = E_k(IV + i)$ appears to be a truly random string
Therefore, $C_i = P_i \oplus K_i$ appears to be "one-time pad" encryption
"One-time pad" is theoretically unbreakable, therefore CTR mode is CPA secure

Practical Applications: CTR mode is widely used in modern cryptography because it supports parallel processing, suitable for high-speed encryption scenarios.

3.4 CPA Security and Semantic Security

Exam Focus: Concept Explanation - Complete Definition of CPA Security

Chosen Plaintext Attack (CPA): In a chosen plaintext attack, the attacker can choose arbitrary plaintexts and obtain corresponding ciphertexts. This models real-world scenarios where attackers can influence or predict what gets encrypted, not merely observe encrypted communications.

Semantic Security: Semantic security means that even if the attacker knows some information about the plaintext (such as length, format, etc.), they cannot obtain additional information from the ciphertext. In other words, the ciphertext does not leak any information about the plaintext (except public information like length).

Formal Definition: An encryption scheme $(Gen, Enc, Dec)$ is CPA secure if for all polynomial-time attackers $A$ , there exists a negligible function $\epsilon$ such that: $\Pr[\text{CPA-Game}(A) = 1] \leq \frac{1}{2} + \epsilon(n)$ where $\text{CPA-Game}(A)$ is the CPA game, outputting 1 when attacker $A$ wins.

CPA Game (Detailed Steps):

Initialization: Challenger generates key $k \leftarrow Gen(1^n)$ , where $n$ is the security parameter
Learning Phase 1: Attacker $A$ can query encryption oracle $Enc_k(\cdot)$ polynomially many times, obtaining $(m_i, Enc_k(m_i))$ pairs
Challenge Phase:
- Attacker chooses two equal-length plaintexts $m_0, m_1$ ( $|m_0| = |m_1|$ )
- Challenger randomly chooses $b \leftarrow \{0,1\}$ (uniformly random)
- Challenger computes and returns challenge ciphertext $c^* = Enc_k(m_b)$
Learning Phase 2: Attacker $A$ can continue to query encryption oracle $Enc_k(\cdot)$ polynomially many times, including on $m_0$ and $m_1$ (there is no decryption oracle in the CPA game)
Guess Phase: Attacker outputs $b' \in \{0,1\}$
Decision: If $b' = b$ , attacker wins, game outputs 1; otherwise outputs 0

Attacker's Advantage: Attacker's advantage is defined as: $\text{Adv}_{CPA}(A) = |\Pr[\text{CPA-Game}(A) = 1] - \frac{1}{2}|$ If $\text{Adv}_{CPA}(A) \leq \epsilon(n)$ (negligible function), then the encryption scheme is CPA secure.

Proof Problem: Prove ECB Mode is Not CPA Secure Proof: Construct attacker $A$ :

$A$ chooses $m_0 = P || P$ (two identical blocks) and $m_1 = P || Q$ (two different blocks, $P \neq Q$ )
$A$ submits $(m_0, m_1)$ to the challenger and receives challenge ciphertext $c^* = c_1^* || c_2^*$
$A$ checks: if $c_1^* = c_2^*$ , output $b' = 0$ ; otherwise output $b' = 1$

Analysis:

If $b = 0$ (encrypt $m_0$ ), then $c_1^* = E_k(P)$ , $c_2^* = E_k(P)$ , so $c_1^* = c_2^*$
If $b = 1$ (encrypt $m_1$ ), then $c_1^* = E_k(P)$ , $c_2^* = E_k(Q)$ , since $P \neq Q$ and $E_k$ is a permutation, so $c_1^* \neq c_2^*$
Therefore, $A$ can always correctly guess $b$ with success probability 1
$\text{Adv}_{CPA}(A) = |1 - \frac{1}{2}| = \frac{1}{2}$ , which is non-negligible
Therefore, ECB mode is not CPA secure
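
The attack can be run as an experiment. This sketch plays the challenge phase of the CPA game against ECB with a toy 8-bit block cipher (a key-seeded permutation of byte values, assumed here only for illustration); the attacker needs no oracle queries.

```python
import random
import secrets

def toy_cipher(key):
    """Toy 8-bit block cipher: key-seeded permutation of 0..255 (not secure)."""
    table = list(range(256))
    random.Random(key).shuffle(table)
    return table

def ecb_encrypt(table, msg: bytes) -> bytes:
    return bytes(table[b] for b in msg)

def adversary_choose():
    # m0 = P || P and m1 = P || Q with P != Q (1-byte blocks).
    return b"PP", b"PQ"

def adversary_guess(challenge: bytes) -> int:
    # Equal ciphertext blocks can only come from equal plaintext blocks.
    return 0 if challenge[0] == challenge[1] else 1

def cpa_game() -> bool:
    table = toy_cipher(secrets.randbits(64))  # challenger's secret key
    m0, m1 = adversary_choose()
    b = secrets.randbelow(2)                  # challenger's hidden bit
    challenge = ecb_encrypt(table, (m0, m1)[b])
    return adversary_guess(challenge) == b

wins = sum(cpa_game() for _ in range(200))
print(wins, "wins out of 200")
```

The attacker wins every run, matching the success probability of 1 derived in the analysis.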

Proof Problem: Prove CBC Mode is CPA Secure (Simplified Idea) Proof Idea (modeling $E_k$ as a pseudo-random permutation):

Assume the underlying block cipher $E_k$ is a pseudo-random permutation (PRP)
In ideal case, if $IV$ is random, then each block's input $P_i \oplus C_{i-1}$ appears random
Since $E_k$ is pseudo-random, output $C_i$ also appears random
Therefore, the entire ciphertext appears random, cannot distinguish encryptions of $m_0$ and $m_1$
Any attacker's advantage is negligible
Therefore, CBC mode is CPA secure

Important Conclusions (Exam Focus):

✅ CBC Mode: Semantically secure under CPA (if $IV$ is random)
✅ CTR Mode: Semantically secure under CPA (if $IV$ is random)
❌ ECB Mode: Not CPA secure, should not be used
CPA security is a fundamental security requirement in modern cryptography: Any practical encryption scheme must be CPA secure

4. Hash Function

Exam Focus: Concept Explanation - Three Security Properties and Their Relationships

A hash function maps inputs of arbitrary length to fixed-length outputs. Hash functions are fundamental tools in cryptography, used for data integrity verification, digital signatures, password storage, etc.

Formal Definition: Let $H: \{0,1\}^* \rightarrow \{0,1\}^n$ be a function, where $\{0,1\}^*$ represents binary strings of arbitrary length and $\{0,1\}^n$ represents binary strings of fixed length $n$ .

Basic Properties:

Compression: Fixed output length (e.g., SHA-256 outputs 256 bits), regardless of input length
Efficiency: Fast computation, can be computed in polynomial time
Deterministic: Same input always produces same output, i.e., $x = x'$ implies $H(x) = H(x')$

Security Requirements (Exam Focus):

1. Preimage Resistance (One-Wayness) Definition: Given hash value $y$ , finding $x$ such that $H(x) = y$ is computationally infeasible. Formal: For all polynomial-time attackers $A$ , there exists a negligible function $\epsilon$ such that: $\Pr[x \leftarrow \{0,1\}^*, y = H(x), A(y) = x' \text{ and } H(x') = y] \leq \epsilon(n)$ Intuitive Understanding: Given a hash value, computing the preimage is difficult (one-way function property).

2. Second Preimage Resistance Definition: Given $x$ , finding $x' \neq x$ such that $H(x) = H(x')$ is computationally infeasible. Formal: For all polynomial-time attackers $A$ , there exists a negligible function $\epsilon$ such that: $\Pr[x \leftarrow \{0,1\}^*, A(x) = x' \neq x \text{ and } H(x') = H(x)] \leq \epsilon(n)$ Intuitive Understanding: Given an input, finding another input that produces the same hash value is difficult.

3. Collision Resistance Definition: Finding any $x, x'$ such that $x \neq x'$ but $H(x) = H(x')$ is computationally infeasible. Formal: For all polynomial-time attackers $A$ , there exists a negligible function $\epsilon$ such that: $\Pr[A() = (x, x') \text{ and } x \neq x' \text{ and } H(x) = H(x')] \leq \epsilon(n)$ Intuitive Understanding: Finding any pair of collisions is difficult.

Proof Problem: Relationships Between Three Security Properties Theorem:

Collision Resistance $\Rightarrow$ Second Preimage Resistance
Second Preimage Resistance $\Rightarrow$ Preimage Resistance (under random oracle model)
But Preimage Resistance $\not\Rightarrow$ Second Preimage Resistance
Second Preimage Resistance $\not\Rightarrow$ Collision Resistance

Proof Idea 1: Collision Resistance $\Rightarrow$ Second Preimage Resistance

Assume there exists attacker $A_2$ that can find second preimage with non-negligible probability
Construct attacker $A_1$ to find collision:
1. Randomly choose $x$
2. Call $A_2(x)$ to get $x' \neq x$ such that $H(x') = H(x)$
3. Output $(x, x')$ as collision
If $A_2$ succeeds, then $A_1$ also succeeds, contradiction

Calculation Problem: Birthday Attack (Collision Finding) Birthday Paradox: In a room with $N$ people, what is the probability that at least two people share the same birthday?

When $N \approx \sqrt{2 \times 365 \times \ln 2} \approx 23$ , probability is about 50%

Birthday Attack: For a hash function with output length $n$ bits, finding a collision requires approximately $O(2^{n/2})$ hash computations. Example: SHA-256 outputs 256 bits, how many hash computations are needed to find a collision using birthday attack? Solution: $2^{256/2} = 2^{128}$ hash computations (still computationally infeasible)
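
Both numbers can be explored with a small sketch. The first function computes the exact birthday probability; the second finds a collision on SHA-256 truncated to 16 bits, a deliberately tiny output size chosen here so the search finishes instantly. The function names are assumptions of this example.

```python
import hashlib

def birthday_probability(people, days=365):
    """Probability that at least two of `people` share a birthday."""
    p_distinct = 1.0
    for i in range(people):
        p_distinct *= (days - i) / days
    return 1.0 - p_distinct

def find_truncated_collision(n_bytes=2):
    """Find two inputs whose SHA-256 digests agree on the first n_bytes bytes."""
    seen = {}
    i = 0
    while True:
        msg = i.to_bytes(8, "big")
        tag = hashlib.sha256(msg).digest()[:n_bytes]
        if tag in seen:
            return seen[tag], msg, i + 1
        seen[tag] = msg
        i += 1

p23 = birthday_probability(23)
x1, x2, tries = find_truncated_collision(2)
print(round(p23, 4), tries)
```

For a 16-bit output the birthday bound predicts a collision after roughly $2^{8} = 256$ hashes, far fewer than the $2^{16}$ possible outputs. The same square-root scaling gives $2^{128}$ for full SHA-256.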

Practical Applications:

Digital Signatures: Hash message before signing, rather than signing entire message directly (efficient)
Message Authentication Codes: As building block for MAC (e.g., HMAC)
Password Storage: Store hash of password rather than plaintext password
Blockchain: Merkle trees, proof of work (PoW)
Data Integrity Verification: Verify whether files have been tampered with

Common Hash Functions:

SHA-256: 256-bit output, currently secure, widely used
SHA-512: 512-bit output, more secure but slightly slower
SHA-3: Based on Keccak, different design from SHA-2
MD5: 128-bit output, insecure, should not be used (collisions found)
SHA-1: 160-bit output, insecure, should not be used (collisions found)

Scheme Design Problem: How to Use Hash Function to Verify File Integrity? Scheme:

Sender computes file hash value $h = H(file)$
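Sender sends $file$ together with $h$ , where $h$ must be protected against modification (e.g., sent over an authenticated channel or covered by a digital signature); otherwise an attacker could replace both the file and its hash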
Receiver computes $h' = H(file')$ after receiving
If $h' = h$ , file is intact; otherwise file has been tampered with

Why This Design?

If attacker modifies file, hash value changes
Due to second preimage resistance, attacker cannot find different file producing same hash value
Therefore, any tampering can be detected
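
A minimal sketch of the check, using SHA-256 from Python's standard library. In-memory byte strings stand in for files, and the contents are made up for this example.

```python
import hashlib

def digest(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the data."""
    return hashlib.sha256(data).hexdigest()

original = b"quarterly report v1"
h = digest(original)   # assumed to reach the receiver unmodified

received_ok = b"quarterly report v1"
received_bad = b"quarterly report v2"
ok = digest(received_ok) == h
tampered = digest(received_bad) != h
print(ok, tampered)
```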

5. Message Authentication Code (MAC)

Exam Focus: Concept Explanation - Difference Between MAC and Digital Signature

Message authentication codes (MAC) are used to ensure message integrity and authenticity. MAC uses symmetric keys, so sender and receiver must share the same key.

Basic Idea:

Key Sharing: Sender and receiver pre-share a key $k$ (through secure channel)
Generate Tag: Sender computes $tag = MAC_k(m)$ and sends $(m, tag)$
Verify Tag: Receiver computes $tag' = MAC_k(m)$ and checks if $tag' = tag$
Decision: If $tag' = tag$ , message is intact and from sender; otherwise reject message

Formal Definition: A MAC scheme consists of three algorithms:

Key Generation: $Gen(1^n) \rightarrow k$ , generates key $k$
Tag Generation: $MAC_k(m) \rightarrow tag$ , computes tag for message $m$
Verification: $Verify_k(m, tag) \rightarrow \{0, 1\}$ , verifies tag, outputs 1 (valid) or 0 (invalid)

Usually $Verify_k(m, tag) = 1$ if and only if $MAC_k(m) = tag$ .

Security Requirements (Exam Focus):

Unforgeability MAC must be unforgeable, meaning even if the attacker sees many $(m_i, tag_i)$ pairs, they cannot generate a valid tag $tag^*$ for a new message $m^*$ .

Formal Security Definition: For all polynomial-time attackers $A$ , there exists a negligible function $\epsilon$ such that: $\Pr[A^{MAC_k(\cdot)} \text{ outputs } (m^*, tag^*) \text{ and } Verify_k(m^*, tag^*) = 1] \leq \epsilon(n)$ where $m^*$ is not a message that $A$ has queried (i.e., $m^* \notin \{m_1, m_2, \ldots, m_q\}$ , where $q$ is the number of queries).

MAC Security Game:

Challenger generates key $k \leftarrow Gen(1^n)$
Attacker can query MAC oracle $MAC_k(\cdot)$ polynomially many times, obtaining $(m_i, tag_i)$ pairs
Attacker outputs $(m^*, tag^*)$ , where $m^*$ is not a queried message
If $Verify_k(m^*, tag^*) = 1$ , attacker wins

Proof Problem: Why Is Simple $MAC_k(m) = H(k || m)$ Possibly Insecure? Problem: If attacker knows $(m, tag)$ , they may be able to construct new valid tags. Attack Example (Length Extension Attack):

Attacker queries $MAC_k(m)$ to get $tag = H(k || m)$
For Merkle-Damgård hash functions (e.g., MD5, SHA-1, SHA-256), $tag$ is the internal state of $H$ after processing $k || m || pad$ , where $pad$ is the padding that $H$ appends internally
Attacker resumes the hash computation from state $tag$ and feeds it $m'$ , obtaining $H(k || m || pad || m')$ ; only the length of $k$ is needed to compute $pad$
Therefore attacker can generate a valid tag for the new message $m || pad || m'$ without knowing $k$
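
The attack can be demonstrated on a toy hash. Python's `hashlib` does not expose the internal state, so this sketch builds a small Merkle-Damgård-style hash of its own: 8-byte blocks, a compression function made from truncated SHA-256, and simplified padding without a length field. All of these are assumptions for illustration; the attack on real hash functions follows the same pattern with their actual padding rules.

```python
import hashlib

BLOCK = 8  # toy block and state size in bytes

def pad(msg: bytes) -> bytes:
    """Toy padding: 0x80, then zeros up to a multiple of BLOCK (no length field)."""
    padded = msg + b"\x80"
    return padded + b"\x00" * (-len(padded) % BLOCK)

def compress(state: bytes, block: bytes) -> bytes:
    return hashlib.sha256(state + block).digest()[:BLOCK]

def md_hash(msg: bytes, state: bytes = bytes(BLOCK)) -> bytes:
    """Toy iterated hash; the digest is the final chaining value."""
    data = pad(msg)
    for i in range(0, len(data), BLOCK):
        state = compress(state, data[i:i + BLOCK])
    return state

key = b"secretk"            # unknown to the attacker; only its length is assumed known
m = b"amount=10"
tag = md_hash(key + m)      # the insecure MAC: H(k || m)

# Attacker side: uses m, tag and len(key), never key itself.
glue = pad(b"?" * len(key) + m)[len(key) + len(m):]  # the padding H added internally
extension = b"&amount=9999"
forged_msg = m + glue + extension
forged_tag = md_hash(extension, state=tag)           # resume hashing from tag

print(forged_tag == md_hash(key + forged_msg))
```

The final comparison uses the key only to confirm that the forgery verifies; the forged tag itself was computed without it.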

Common MAC Constructions:

1. HMAC (MAC Based on Hash Function) $HMAC_k(m) = H(k \oplus opad || H(k \oplus ipad || m))$ where:

$opad$ = 0x5c5c5c... (outer padding)
$ipad$ = 0x363636... (inner padding)
$||$ represents concatenation

Calculation Problem: HMAC Calculation Example: Given key $k$ and message $m$ , describe HMAC computation steps. Solution:

If $|k| >$ block length, then first replace $k$ with $H(k)$
If $|k| <$ block length (including a key shortened by the previous step), then right-pad $k$ with 0s up to the block length
Compute $inner = H(k \oplus ipad || m)$
Compute $HMAC_k(m) = H(k \oplus opad || inner)$
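
These four steps can be written out for SHA-256 (block length 64 bytes) and compared against Python's standard `hmac` module. The function name `hmac_sha256` and the sample key and message are chosen for this sketch.

```python
import hashlib
import hmac

BLOCK_SIZE = 64  # SHA-256 block length in bytes

def hmac_sha256(key: bytes, msg: bytes) -> bytes:
    if len(key) > BLOCK_SIZE:
        key = hashlib.sha256(key).digest()        # step 1: hash long keys
    key = key.ljust(BLOCK_SIZE, b"\x00")          # step 2: right-pad with zeros
    ipad = bytes(b ^ 0x36 for b in key)
    opad = bytes(b ^ 0x5C for b in key)
    inner = hashlib.sha256(ipad + msg).digest()   # step 3: inner hash
    return hashlib.sha256(opad + inner).digest()  # step 4: outer hash

tag = hmac_sha256(b"shared key", b"hello")
reference = hmac.new(b"shared key", b"hello", hashlib.sha256).digest()
print(tag == reference)
```

In practice the library function is preferable, together with `hmac.compare_digest` for comparing tags in constant time.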

2. CBC-MAC (MAC Based on Block Cipher) Encrypt message using CBC mode, use the last ciphertext block as MAC.

How It Works:

Split message $m$ into blocks: $m_1, m_2, \ldots, m_n$
Encrypt using CBC mode with a fixed $IV = 0$ (a random or attacker-chosen IV would make forgery easy): $C_1 = E_k(m_1)$ , $C_i = E_k(m_i \oplus C_{i-1})$ for $i \geq 2$
MAC is the last ciphertext block: $tag = C_n$

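Note: Basic CBC-MAC is only secure when all messages have the same fixed length. For variable-length messages, a variant is needed, e.g., prepend the message length as the first block, or encrypt the final block $C_n$ once more under a second independent key (ECBC-MAC; the standardized CMAC addresses the same problem).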
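
The fixed-length restriction can be seen in a short experiment. This is a sketch under assumptions: Python's standard library has no AES, so `prf` (truncated HMAC-SHA256) stands in for the block cipher $E_k$ ; CBC-MAC only ever evaluates $E_k$ in the forward direction, so a keyed pseudorandom function is enough for illustration. Given the tag $t$ of a one-block message $m_1$ , the two-block message $m_1 || (m_1 \oplus t)$ has the same tag.

```python
import hashlib
import hmac

BLOCK = 16  # block size in bytes

def prf(key: bytes, block: bytes) -> bytes:
    # Stand-in for the block cipher E_k (assumption, not AES).
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def cbc_mac(key: bytes, msg: bytes) -> bytes:
    """Basic CBC-MAC with IV = 0 over a whole number of blocks."""
    if not msg or len(msg) % BLOCK != 0:
        raise ValueError("message must be a positive multiple of BLOCK")
    state = bytes(BLOCK)  # IV = 0
    for i in range(0, len(msg), BLOCK):
        state = prf(key, xor(state, msg[i:i + BLOCK]))
    return state  # the last "ciphertext" block is the tag

key = b"0123456789abcdef"
m1 = b"sixteen byte msg"
t = cbc_mac(key, m1)          # attacker obtains this from the MAC oracle

forged = m1 + xor(m1, t)      # new two-block message, built without the key
forged_tag = t                # E_k(t XOR m1 XOR t) = E_k(m1) = t
```

The forgery works because the second block cancels the chaining value, which is why basic CBC-MAC must not be used across different message lengths.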

MAC vs Digital Signature (Exam Focus):

Property	MAC	Digital Signature
Key Type	Symmetric key (shared key)	Asymmetric key (public/private key pair)
Key Management	Need to securely share key	Public key can be public
Non-repudiation	❌ Both parties can generate tags	✅ Only private key holder can sign
Verifier	Must know key	Only needs public key
Efficiency	Fast	Slower
Application	Two-party communication	Public verification, non-repudiation

Scheme Design Problem: Design a Secure MAC Scheme Requirements: Ensure message integrity and authenticity. Scheme:

Choose secure hash function $H$ (e.g., SHA-256)
Use HMAC: $tag = HMAC_k(m) = H((k \oplus opad) || H((k \oplus ipad) || m))$
Send $(m, tag)$
Receiver verifies: Compute $tag' = HMAC_k(m)$ , check if $tag' = tag$

Why This Design?

HMAC resists length extension attacks
Based on secure hash function, provides unforgeability
Computationally efficient
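
A minimal sketch of this scheme with Python's standard `hmac` module follows. The helper names `make_tag` and `verify` are chosen here for illustration. The comparison uses `hmac.compare_digest`, a constant-time comparison, which is an addition to the scheme as written above (it avoids leaking how many tag bytes matched).

```python
import hashlib
import hmac
import secrets

def make_tag(key: bytes, msg: bytes) -> bytes:
    return hmac.new(key, msg, hashlib.sha256).digest()

def verify(key: bytes, msg: bytes, tag: bytes) -> bool:
    # Recompute tag' and compare it with the received tag in constant time.
    return hmac.compare_digest(make_tag(key, msg), tag)

key = secrets.token_bytes(32)      # shared secret key k
msg = b"transfer 10 to Bob"
tag = make_tag(key, msg)           # sender sends (msg, tag)

accepted = verify(key, msg, tag)
rejected = verify(key, b"transfer 99 to Bob", tag)
```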

6. Authenticated Encryption

Exam Focus: Concept Explanation - Why Do We Need Authenticated Encryption?

Authenticated encryption provides both confidentiality and integrity/authenticity. This is one of the most important security goals in modern cryptography.

Two Security Goals:

Confidentiality: Attacker cannot obtain any information about plaintext, even when seeing ciphertext
Integrity/Authenticity: Attacker cannot forge or modify messages without being detected

Why Authenticated Encryption is Needed: Encryption alone (such as CBC, CTR) only provides confidentiality, not integrity. This leads to serious security problems:

Attack Example:

Attacker intercepts ciphertext $c = Enc_k(m)$
Attacker modifies ciphertext to get $c'$ (e.g., flip some bits)
Receiver decrypts to get $m' = Dec_k(c')$
Receiver cannot detect that $m'$ is not the original message $m$ (because decryption may produce valid plaintext, but not the original message)
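
The steps above can be sketched for any XOR-based encryption such as CTR mode. In this illustration a random one-time keystream stands in for the CTR key stream (an assumption made to keep the example within the standard library); the attacker knows the original plaintext but never sees the key stream.

```python
import secrets

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

# Sender: c = m XOR keystream.
plaintext = b"PAY 100 TO BOB"
keystream = secrets.token_bytes(len(plaintext))
ciphertext = xor(plaintext, keystream)

# Attacker: flips exactly the bits that turn the known plaintext
# into the wanted one, without knowing the keystream.
wanted = b"PAY 900 TO EVE"
tampered = xor(ciphertext, xor(plaintext, wanted))

# Receiver: decrypts with no integrity check and accepts the result.
received = xor(tampered, keystream)
```

Here `received` equals `wanted`, so the modification is both undetected and fully controlled by the attacker.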

Practical Attack Scenarios:

Padding Oracle Attack: Attacker gains information by observing whether decryption succeeds
Ciphertext Modification Attack: Attacker modifies ciphertext to cause predictable plaintext changes (e.g., flipping a ciphertext bit in CTR mode flips the same plaintext bit)

Construction Methods (Exam Focus):

1. Encrypt-then-MAC (Recommended) Steps:

Encrypt first: $c = Enc_{k_1}(m)$
Compute MAC: $tag = MAC_{k_2}(c)$
Send $(c, tag)$

Verification:

Verify MAC: Check if $MAC_{k_2}(c) = tag$
If MAC is valid, decrypt: $m = Dec_{k_1}(c)$

Advantages:

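Strongest guarantee of the three compositions: if the encryption scheme is CPA secure and the MAC is strongly unforgeable, the combination is CCA secure
MAC covers the entire ciphertext, so modified ciphertexts are rejected before any decryption takes place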

2. MAC-then-Encrypt Steps:

Compute MAC first: $tag = MAC_{k_2}(m)$
Encrypt: $c = Enc_{k_1}(m || tag)$
Send $c$

Verification:

Decrypt: $m || tag = Dec_{k_1}(c)$
Verify MAC: Check if $MAC_{k_2}(m) = tag$

Disadvantages:

May be insecure if encryption scheme has vulnerabilities (e.g., padding oracle)
Not recommended

3. Encrypt-and-MAC Steps:

Encrypt and compute MAC simultaneously: $c = Enc_{k_1}(m)$ , $tag = MAC_{k_2}(m)$
Send $(c, tag)$

Verification:

Decrypt: $m' = Dec_{k_1}(c)$
Verify MAC: Check if $MAC_{k_2}(m') = tag$ ; if valid, accept $m'$

Disadvantages:

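MAC protects plaintext, not ciphertext, so the receiver must decrypt unauthenticated ciphertext before it can check the tag
The tag may leak information about the plaintext: a MAC is not required to hide its input, and a deterministic MAC (e.g., HMAC) gives equal tags for equal messages, which breaks CPA security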

Scheme Design Problem: Design an Authenticated Encryption Scheme Requirements: Provide both confidentiality and integrity. Recommended Scheme (Encrypt-then-MAC):

Choose CPA-secure encryption scheme (e.g., AES-CTR)
Choose secure MAC scheme (e.g., HMAC-SHA256)
Use two independent keys: $k_1$ (encryption) and $k_2$ (MAC)
Encrypt: $c = Enc_{k_1}(m)$
Compute MAC: $tag = MAC_{k_2}(c)$
Send $(c, tag)$
Receiver verifies: First verify $MAC_{k_2}(c) = tag$ , if valid then decrypt $m = Dec_{k_1}(c)$

Why Use Two Independent Keys?

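If the same key is used for both, the two primitives may interact in ways that break security (e.g., CBC encryption combined with CBC-MAC under the same key is insecure)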
Key separation is a best practice in cryptography

CCA Security (Semantic Security Under Chosen Ciphertext Attack)

Exam Focus: Concept Explanation - Difference Between CCA Security and CPA Security

Chosen Ciphertext Attack (CCA): In a chosen ciphertext attack, the attacker can choose arbitrary ciphertexts and obtain corresponding plaintexts (or verification failure). This models real-world scenarios where attackers can send modified ciphertexts and observe decryption results.

CCA Security: An encryption scheme is CCA secure if the attacker cannot distinguish encryptions of two plaintexts even when they can query a decryption oracle.

Formal Definition: For all polynomial-time attackers $A$ , there exists a negligible function $\epsilon$ such that: $\Pr[\text{CCA-Game}(A) = 1] \leq \frac{1}{2} + \epsilon(n)$

CCA Game (Detailed Steps):

Initialization: Challenger generates key $k \leftarrow Gen(1^n)$
Learning Phase 1: Attacker can query:
- Encryption oracle $Enc_k(\cdot)$ : Obtain $(m_i, Enc_k(m_i))$
- Decryption oracle $Dec_k(\cdot)$ : Obtain $(c_i, Dec_k(c_i))$
Challenge Phase:
- Attacker chooses two equal-length plaintexts $m_0, m_1$
- Challenger randomly chooses $b \leftarrow \{0,1\}$
- Challenger returns challenge ciphertext $c^* = Enc_k(m_b)$
Learning Phase 2: Attacker can continue to query encryption and decryption oracles, but cannot query decryption of $c^*$
Guess Phase: Attacker outputs $b' \in \{0,1\}$
Decision: If $b' = b$ , attacker wins

Proof Problem: Why Are Encryption Modes Alone (Such as CBC, CTR) Not CCA Secure? Proof Idea (using CBC as example): Construct attacker $A$ :

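$A$ chooses two distinct two-block messages $m_0 = P_1 || P_2$ and $m_1 = Q_1 || Q_2$
$A$ submits $(m_0, m_1)$ to the challenger and receives challenge ciphertext $c^* = IV || c_1^* || c_2^*$
$A$ constructs new ciphertext $c' = IV || c_1^* || c_2^* || c_3'$ (appends a random block), so $c' \neq c^*$ and the query is allowed
$A$ queries decryption oracle $Dec_k(c')$ , obtains $m' = m_1' || m_2' || m_3'$
In CBC decryption each plaintext block depends only on the current and previous ciphertext blocks ( $m_i' = D_k(c_i) \oplus c_{i-1}$ ), so $m_1' || m_2' = m_b$ (ignoring padding details)
$A$ outputs $b' = 0$ if $m_1' || m_2' = m_0$ and $b' = 1$ otherwise, so $A$ wins with probability 1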
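
The attack can be simulated end to end. The sketch below makes several assumptions for the sake of a self-contained example: the block cipher is a hypothetical 4-round Feistel network built from HMAC-SHA256 (functions `E` and `D`, not AES), CBC is used without padding on whole blocks, and the ciphertext carries the IV as its first block. The function `play_cca_game` plays one round of the CCA game against the adversary described above.

```python
import hashlib
import hmac
import secrets

BLOCK, HALF = 16, 8

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def _round(key: bytes, i: int, half: bytes) -> bytes:
    return hmac.new(key, bytes([i]) + half, hashlib.sha256).digest()[:HALF]

def E(key: bytes, block: bytes) -> bytes:
    """Toy Feistel block cipher (assumption, stands in for AES)."""
    left, right = block[:HALF], block[HALF:]
    for i in range(4):
        left, right = right, xor(left, _round(key, i, right))
    return left + right

def D(key: bytes, block: bytes) -> bytes:
    left, right = block[:HALF], block[HALF:]
    for i in reversed(range(4)):
        left, right = xor(right, _round(key, i, left)), left
    return left + right

def cbc_encrypt(key: bytes, m: bytes) -> bytes:
    prev = secrets.token_bytes(BLOCK)  # random IV, sent as first block
    out = [prev]
    for i in range(0, len(m), BLOCK):
        prev = E(key, xor(m[i:i + BLOCK], prev))
        out.append(prev)
    return b"".join(out)

def cbc_decrypt(key: bytes, c: bytes) -> bytes:
    blocks = [c[i:i + BLOCK] for i in range(0, len(c), BLOCK)]
    return b"".join(xor(D(key, blocks[i]), blocks[i - 1])
                    for i in range(1, len(blocks)))

def adversary(challenge: bytes, dec_oracle, m0: bytes) -> int:
    c_prime = challenge + secrets.token_bytes(BLOCK)  # differs from c*
    m_prime = dec_oracle(c_prime)
    return 0 if m_prime[:len(m0)] == m0 else 1

def play_cca_game() -> bool:
    key = secrets.token_bytes(16)
    m0, m1 = b"A" * 32, b"B" * 32
    b = secrets.randbelow(2)
    challenge = cbc_encrypt(key, m1 if b else m0)
    def dec_oracle(c: bytes) -> bytes:
        if c == challenge:
            raise ValueError("decrypting c* is not allowed")
        return cbc_decrypt(key, c)
    return adversary(challenge, dec_oracle, m0) == b
```

Every call to `play_cca_game()` returns `True`: the adversary never asks for the decryption of $c^*$ itself, yet always recovers $b$ .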

Important Conclusions (Exam Focus):

✅ Authenticated Encryption: Typically provides CCA security (e.g., Encrypt-then-MAC)
❌ Encryption Modes Alone (CBC, CTR): Not CCA secure
CCA security is stronger than CPA security: CCA security implies CPA security, but not vice versa
Modern applications must use authenticated encryption: Encryption alone is insufficient to provide complete security

Practical Applications:

TLS/SSL Protocols: Use authenticated encryption to protect Web communications
SSH Protocol: Use authenticated encryption to protect remote login
Modern Encrypted Communication: Most modern secure communication protocols rely on authenticated encryption

Calculation Problem: Complete Flow of Authenticated Encryption Example: Implement Encrypt-then-MAC using AES-128-CTR and HMAC-SHA256. Given: Key $k_1 = 0x1234...$ (128 bits), $k_2 = 0x5678...$ (256 bits), message $m = "Hello World"$ Solution Steps:

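Encryption: $c = AES\text{-}CTR_{k_1}(m)$
- Choose random $IV$
- Generate key stream: $K_i = AES_{k_1}(IV + i)$
- Encrypt: $c = m \oplus K$ (key stream truncated to $|m|$ )
Compute MAC over the IV and the ciphertext, so that neither can be modified: $tag = HMAC_{k_2}(IV || c)$
- $tag = SHA256((k_2 \oplus opad) || SHA256((k_2 \oplus ipad) || IV || c))$
Send: $(IV, c, tag)$
Receiver verifies:
- Verify $HMAC_{k_2}(IV || c) = tag$ ; if not, reject without decrypting
- If valid, decrypt $m = c \oplus K$ , regenerating the same key stream $K$ from $k_1$ and $IV$ (CTR decryption is the same operation as encryption)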
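
The flow can be sketched in code, with two loudly stated assumptions. First, the standard library has no AES, so the key stream here comes from HMAC-SHA256 used as a pseudorandom function in counter mode (the hypothetical helper `keystream`); with a real AES library only that helper would change. Second, the tag in this sketch is computed over $IV || c$ so that the IV is authenticated along with the ciphertext.

```python
import hashlib
import hmac
import secrets

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def keystream(k1: bytes, iv: bytes, n: int) -> bytes:
    # Stand-in for AES_k1(IV + i): a PRF evaluated on IV and a counter.
    out, counter = b"", 0
    while len(out) < n:
        out += hmac.new(k1, iv + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:n]

def encrypt_then_mac(k1: bytes, k2: bytes, m: bytes):
    iv = secrets.token_bytes(16)                       # fresh random IV
    c = xor(m, keystream(k1, iv, len(m)))              # encrypt first
    tag = hmac.new(k2, iv + c, hashlib.sha256).digest()  # then MAC
    return iv, c, tag

def verify_then_decrypt(k1: bytes, k2: bytes, iv: bytes, c: bytes, tag: bytes):
    expected = hmac.new(k2, iv + c, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        return None                                    # reject, never decrypt
    return xor(c, keystream(k1, iv, len(c)))

k1 = secrets.token_bytes(16)   # encryption key
k2 = secrets.token_bytes(32)   # independent MAC key
iv, c, tag = encrypt_then_mac(k1, k2, b"Hello World")
recovered = verify_then_decrypt(k1, k2, iv, c, tag)

# Flipping one ciphertext bit is detected before decryption.
bad_c = bytes([c[0] ^ 1]) + c[1:]
tampered_result = verify_then_decrypt(k1, k2, iv, bad_c, tag)
```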

Part 2: Public Key Cryptography

1. Public Key Encryption from Trapdoor Permutations

1.1 RSA Encryption Scheme

Exam Focus: Concept Explanation - What is a Trapdoor Permutation?

A trapdoor permutation is a family of functions with the following properties:

Forward computation is easy: Given public key $pk$ and input $x$ , computing $f_{pk}(x)$ is easy
Reverse computation is hard: Without trapdoor information, computing $x$ from $f_{pk}(x)$ is hard
Reverse computation is easy (with trapdoor): Given private key (trapdoor) $sk$ , computing $x$ from $f_{pk}(x)$ is easy

RSA is a classic example of a trapdoor permutation.

RSA Key Generation Algorithm:

Choose two large primes $p$ and $q$ (typically 1024 bits or larger each)
Compute $n = p \times q$ (modulus)
Compute Euler's totient function $\phi(n) = (p-1)(q-1)$
Choose integer $e$ such that $1 < e < \phi(n)$ and $\gcd(e, \phi(n)) = 1$ ( $e$ is public exponent, often 65537)
Compute $d$ such that $ed \equiv 1 \pmod{\phi(n)}$ ( $d$ is private exponent)
Public key: $pk = (n, e)$
Private key: $sk = (n, d)$ (or $(p, q, d)$ )

RSA Encryption (Textbook Version): Given public key $(n, e)$ and plaintext $m \in \mathbb{Z}_n$ ( $0 \leq m < n$ ), ciphertext is: $c = m^e \bmod n$

RSA Decryption: Given private key $(n, d)$ and ciphertext $c$ , plaintext is: $m = c^d \bmod n$

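Correctness Proof: Since $ed \equiv 1 \pmod{\phi(n)}$ , there exists integer $k$ such that $ed = 1 + k\phi(n)$ . By Euler's theorem, if $\gcd(m, n) = 1$ , then $m^{\phi(n)} \equiv 1 \pmod{n}$ . Therefore: $c^d \equiv (m^e)^d \equiv m^{ed} \equiv m^{1+k\phi(n)} \equiv m \cdot (m^{\phi(n)})^k \equiv m \cdot 1^k \equiv m \pmod{n}$ . The result also holds when $\gcd(m, n) \neq 1$ : apply Fermat's little theorem modulo $p$ and modulo $q$ separately and combine the two congruences with the Chinese Remainder Theorem.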
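
As a quick sanity check of key generation and correctness, the sketch below builds a toy key pair and confirms $c^d \equiv m \pmod{n}$ for every $m \in \mathbb{Z}_n$ , including values that share a factor with $n$ . The function name `rsa_keygen` and the tiny primes are illustrative assumptions; `pow(e, -1, phi)` for the modular inverse requires Python 3.8 or later.

```python
from math import gcd

def rsa_keygen(p: int, q: int, e: int):
    """Textbook RSA key generation from given primes (toy sizes only)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    if gcd(e, phi) != 1:
        raise ValueError("e must be coprime with phi(n)")
    d = pow(e, -1, phi)  # private exponent: e*d = 1 (mod phi(n))
    return (n, e), (n, d)

pk, sk = rsa_keygen(11, 13, 7)
n, e = pk
_, d = sk

# Decryption inverts encryption for every residue, coprime to n or not.
all_correct = all(pow(pow(m, e, n), d, n) == m for m in range(n))
```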

Calculation Problem: Complete RSA Encryption/Decryption Calculation

Problem 1: Given RSA parameters: $p = 11$ , $q = 13$ , $e = 7$ , plaintext $m = 5$ . (1) Compute public and private keys (2) Compute ciphertext $c$ (3) Verify decryption process

Detailed Solution:

Step 1: Compute modulus and Euler's totient function

$n = p \times q = 11 \times 13 = 143$
$\phi(n) = (p-1)(q-1) = 10 \times 12 = 120$

Step 2: Verify $e$ is coprime with $\phi(n)$

$\gcd(7, 120) = 1$ ✓ (since 7 is prime and 7 does not divide 120)

Step 3: Compute private exponent $d$ Need to find $d$ such that $7d \equiv 1 \pmod{120}$ , i.e., $7d = 1 + 120k$ for some integer $k$ .

Using extended Euclidean algorithm:

$120 = 7 \times 17 + 1$
$7 = 1 \times 7 + 0$

Back substitution:

$1 = 120 - 7 \times 17$
Therefore $d = -17 \equiv 103 \pmod{120}$

Verify: $7 \times 103 = 721 = 6 \times 120 + 1 \equiv 1 \pmod{120}$ ✓
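
The same computation can be written as a short recursive routine. This is a sketch with names (`egcd`, `mod_inverse`) chosen for illustration; it returns the Bézout coefficients and reduces the coefficient of $e$ modulo $\phi(n)$ .

```python
def egcd(a: int, b: int):
    """Return (g, x, y) with a*x + b*y = g = gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, x, y = egcd(b, a % b)
    return g, y, x - (a // b) * y

def mod_inverse(e: int, phi: int) -> int:
    g, x, _ = egcd(e, phi)
    if g != 1:
        raise ValueError("e has no inverse modulo phi")
    return x % phi  # map a negative coefficient such as -17 into [0, phi)

g, x, y = egcd(7, 120)   # 7*(-17) + 120*1 = 1
d = mod_inverse(7, 120)
```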

Step 4: Determine keys

Public key: $pk = (n, e) = (143, 7)$
Private key: $sk = (n, d) = (143, 103)$

Step 5: Encryption

Plaintext: $m = 5$
Ciphertext: $c = m^e \bmod n = 5^7 \bmod 143$

Compute $5^7 \bmod 143$ :

$5^2 = 25 \bmod 143 = 25$
$5^4 = (5^2)^2 = 25^2 = 625 \bmod 143 = 625 - 4 \times 143 = 625 - 572 = 53$
$5^7 = 5^4 \times 5^2 \times 5 = 53 \times 25 \times 5 = 6625 \bmod 143$

Compute $6625 \bmod 143$ :

$143 \times 46 = 6578$
$6625 - 6578 = 47$
Therefore $c = 47$

Step 6: Decryption verification

Ciphertext: $c = 47$
Plaintext: $m = c^d \bmod n = 47^{103} \bmod 143$

Since $103$ is large, use modular exponentiation:

$103 = 64 + 32 + 4 + 2 + 1 = 2^6 + 2^5 + 2^2 + 2^1 + 2^0$

Compute $47^{2^i} \bmod 143$ :

$47^1 \equiv 47 \pmod{143}$
$47^2 = 2209 \bmod 143 = 2209 - 15 \times 143 = 2209 - 2145 = 64$
$47^4 = (47^2)^2 = 64^2 = 4096 \bmod 143 = 4096 - 28 \times 143 = 4096 - 4004 = 92$
$47^8 = (47^4)^2 = 92^2 = 8464 \bmod 143 = 8464 - 59 \times 143 = 8464 - 8437 = 27$
$47^{16} = (47^8)^2 = 27^2 = 729 \bmod 143 = 729 - 5 \times 143 = 729 - 715 = 14$
$47^{32} = (47^{16})^2 = 14^2 = 196 \bmod 143 = 196 - 143 = 53$
$47^{64} = (47^{32})^2 = 53^2 = 2809 \bmod 143 = 2809 - 19 \times 143 = 2809 - 2717 = 92$

Now compute $47^{103}$ : $47^{103} = 47^{64} \times 47^{32} \times 47^4 \times 47^2 \times 47^1$ $= 92 \times 53 \times 92 \times 64 \times 47 \pmod{143}$

Compute step by step:

$92 \times 53 = 4876 \bmod 143 = 4876 - 34 \times 143 = 4876 - 4862 = 14$
$14 \times 92 = 1288 \bmod 143 = 1288 - 9 \times 143 = 1288 - 1287 = 1$
$1 \times 64 = 64$
$64 \times 47 = 3008 \bmod 143 = 3008 - 21 \times 143 = 3008 - 3003 = 5$

Therefore $m = 5$ , matching the original plaintext ✓
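
The hand computation follows the square-and-multiply method, which can be sketched as below. The function name `mod_exp` is illustrative (Python's built-in three-argument `pow` does the same job); the list `squares` reproduces the table of $47^{2^i} \bmod 143$ .

```python
def mod_exp(base: int, exp: int, mod: int) -> int:
    """Right-to-left square-and-multiply."""
    result = 1
    base %= mod
    while exp > 0:
        if exp & 1:                    # this bit of the exponent is set
            result = result * base % mod
        base = base * base % mod       # next repeated square
        exp >>= 1
    return result

# Repeated squares 47^(2^i) mod 143 for i = 0..6.
squares = []
x = 47
for _ in range(7):
    squares.append(x)
    x = x * x % 143

c = mod_exp(5, 7, 143)      # encryption of m = 5
m = mod_exp(c, 103, 143)    # decryption
```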

Problem 2: Given RSA public key $(n, e) = (143, 7)$ , encrypt message $m = 100$ , compute ciphertext.

Detailed Solution:

Plaintext: $m = 100$
Check: $100 < 143$ , so $m \in \mathbb{Z}_{143}$ ✓
Ciphertext: $c = m^e \bmod n = 100^7 \bmod 143$

Compute $100^7 \bmod 143$ :

$100^2 = 10000 \bmod 143 = 10000 - 69 \times 143 = 10000 - 9867 = 133$
$100^4 = (100^2)^2 = 133^2 = 17689 \bmod 143 = 17689 - 123 \times 143 = 17689 - 17589 = 100$
$100^7 = 100^4 \times 100^2 \times 100 = 100 \times 133 \times 100 = 1330000 \bmod 143$

Compute $1330000 \bmod 143$ :

$133 \bmod 143 = 133$ (since $133 < 143$ )
$1330000 = 133 \times 10000$
$10000 \bmod 143 = 10000 - 69 \times 143 = 133$
Therefore $1330000 \bmod 143 = 133 \times 133 \bmod 143 = 17689 \bmod 143 = 100$

So $c = 100$

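Note: In this example $m^e \equiv m \pmod{n}$ , i.e., $m = 100$ is a fixed point of this RSA permutation. The reason is visible via the Chinese Remainder Theorem: $100 \equiv 1 \pmod{11}$ , and $100 \equiv 9 \pmod{13}$ where $9^3 \equiv 1 \pmod{13}$ , so $9^7 = 9^6 \cdot 9 \equiv 9 \pmod{13}$ . For typical real-world parameters such fixed points make up a negligible fraction of all messages.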
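
How special $m = 100$ is can be checked by brute force for this toy key. The sketch below lists every $m$ with $m^e \equiv m \pmod{n}$ ; the variable names are illustrative.

```python
n, e = 143, 7

# All messages that textbook RSA with (n, e) = (143, 7) leaves unchanged.
fixed_points = [m for m in range(n) if pow(m, e, n) == m]
```

For this key there are 21 fixed points out of 143 residues, which is unusually many because the modulus is tiny; the count is $(1 + \gcd(e-1, p-1))(1 + \gcd(e-1, q-1)) = 3 \times 7$ .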

1.2 RSA Textbook Version vs CPA Secure Version

Exam Focus: Concept Explanation - Why is RSA Textbook Version Insecure?

RSA textbook version has the following security problems:

Deterministic encryption: Same plaintext always produces same ciphertext, attackers can observe patterns
Not CPA secure: Because the public key is known, the attacker can encrypt any candidate plaintext and compare the result with a target ciphertext
Small plaintext attack: If plaintext $m$ is small ( $m^e < n$ ), then $c = m^e$ (no modular reduction), attackers can directly compute $m = \sqrt[e]{c}$
Common modulus attack: If the same message is encrypted under the same $n$ with two coprime exponents $e_1, e_2$ , the attacker can recover $m$ from the two ciphertexts
Low encryption exponent attack: If $e$ is small and the same plaintext is sent to $e$ or more recipients with different moduli, $m$ can be recovered using the Chinese Remainder Theorem (Håstad's broadcast attack)

Proof Problem: Prove RSA Textbook Version is Not CPA Secure

Proof: Construct attacker $A$ :

$A$ chooses two plaintexts $m_0 = 0$ and $m_1 = 1$
$A$ submits $(m_0, m_1)$ to the challenger and receives challenge ciphertext $c^*$ (no encryption oracle is needed, since $A$ holds the public key)
If $c^* = 0$ , $A$ outputs $b' = 0$ ; if $c^* = 1$ , $A$ outputs $b' = 1$ ; otherwise $A$ outputs random guess

Analysis:

If $b = 0$ (encrypt $m_0 = 0$ ), then $c^* = 0^e \bmod n = 0$
If $b = 1$ (encrypt $m_1 = 1$ ), then $c^* = 1^e \bmod n = 1$
Therefore $A$ can perfectly distinguish with success probability 1
Therefore RSA textbook version is not CPA secure
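
The same idea works for any pair of distinct plaintexts, because the attacker can re-encrypt a candidate under the public key. The sketch below plays the CPA game against textbook RSA with the toy key from Problem 1; the function names and the choice $m_0 = 5$ , $m_1 = 100$ are illustrative assumptions.

```python
import secrets

def textbook_encrypt(pk, m: int) -> int:
    n, e = pk
    return pow(m, e, n)

def adversary(pk, m0: int, m1: int, c_star: int) -> int:
    # Deterministic encryption: re-encrypt m0 and compare.
    return 0 if textbook_encrypt(pk, m0) == c_star else 1

def play_cpa_game(pk, m0: int, m1: int) -> bool:
    b = secrets.randbelow(2)                      # challenger's secret bit
    c_star = textbook_encrypt(pk, m1 if b else m0)
    return adversary(pk, m0, m1, c_star) == b

pk = (143, 7)
wins = sum(play_cpa_game(pk, 5, 100) for _ in range(100))
```

The adversary wins all 100 rounds, far above the roughly 50 wins that random guessing would give.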

CPA Secure RSA (RSA-OAEP):

To achieve CPA security, padding schemes are needed, most commonly OAEP (Optimal Asymmetric Encryption Padding).

RSA-OAEP Encryption Process:

Use OAEP padding to convert plaintext $m$ to $M$ (padded message)
Encrypt: $c = M^e \bmod n$

OAEP Padding (Simplified Description):

Use random number $r$ and hash functions
Ensures same plaintext produces different ciphertexts each encryption
Provides semantic security

Scheme Design Problem: Design a CPA Secure RSA Encryption Scheme

Scheme: Use RSA-OAEP

Detailed Steps:

Key Generation: Use standard RSA key generation, obtain $(n, e, d)$
Encryption:
- Choose random number $r \leftarrow \{0,1\}^k$ ( $k$ is security parameter)
- Use hash functions $G$ and $H$ (e.g., built from SHA-256), where $G$ expands $r$ to the length of $m || 0^k$ and $H$ outputs $k$ bits
- Compute $X = (m || 0^k) \oplus G(r)$ ( $0^k$ is $k$ zeros padding)
- Compute $Y = r \oplus H(X)$
- Set $M = X || Y$
- If $M \geq n$ , reselect $r$ and repeat
- Encrypt: $c = M^e \bmod n$
Decryption:
- Decrypt: $M = c^d \bmod n$
- Split: $M = X || Y$
- Recover: $r = Y \oplus H(X)$
- Recover: $m || 0^k = X \oplus G(r)$
- Verify padding and extract $m$

Why is this design CPA secure?

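Random number $r$ ensures same plaintext produces different ciphertexts
Hash functions $G$ and $H$ mask $m$ and $r$ , so the value fed into RSA looks random and reveals nothing about $m$ unless all of $M$ is recovered
Under the random oracle model and the RSA assumption, RSA-OAEP can be proven CPA secure (and in fact CCA secure)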
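
The simplified scheme can be sketched as follows. Everything here is a toy under stated assumptions and must not be used for real encryption: the masks $G$ and $H$ are built from SHA-256 with a counter (helper `mask`), messages have a fixed length `MSG_LEN`, the lengths are in bytes rather than bits, and the modulus is the product of two Mersenne primes chosen only because they are known primes. Because $M$ is 64 bytes and $n$ is much larger, $M < n$ always holds and the reselection step is never triggered. This is the simplified padding from these notes, not the standardized PKCS#1 OAEP encoding.

```python
import hashlib
import secrets

K = 16         # bytes of randomness r and of zero padding (toy value)
MSG_LEN = 32   # fixed message length in bytes (assumption)

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def mask(seed: bytes, n: int, label: bytes) -> bytes:
    """Hash-based mask of n bytes; label separates G from H."""
    out, counter = b"", 0
    while len(out) < n:
        out += hashlib.sha256(label + seed + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:n]

def oaep_pad(m: bytes) -> bytes:
    if len(m) != MSG_LEN:
        raise ValueError("message must be MSG_LEN bytes")
    r = secrets.token_bytes(K)
    X = xor(m + bytes(K), mask(r, MSG_LEN + K, b"G"))   # X = (m || 0^k) xor G(r)
    Y = xor(r, mask(X, K, b"H"))                        # Y = r xor H(X)
    return X + Y

def oaep_unpad(M: bytes) -> bytes:
    X, Y = M[:-K], M[-K:]
    r = xor(Y, mask(X, K, b"H"))
    padded = xor(X, mask(r, MSG_LEN + K, b"G"))
    if padded[MSG_LEN:] != bytes(K):
        raise ValueError("invalid padding")
    return padded[:MSG_LEN]

# Toy RSA key from two Mersenne primes (illustration only).
P, Q, E = 2**521 - 1, 2**607 - 1, 65537
N = P * Q
D = pow(E, -1, (P - 1) * (Q - 1))

def encrypt(m: bytes) -> int:
    return pow(int.from_bytes(oaep_pad(m), "big"), E, N)

def decrypt(c: int) -> bytes:
    M = pow(c, D, N).to_bytes(MSG_LEN + 2 * K, "big")
    return oaep_unpad(M)

message = b"attack at dawn".ljust(MSG_LEN, b".")
c1, c2 = encrypt(message), encrypt(message)
```

Encrypting the same message twice gives two different ciphertexts, and both decrypt to the original message.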