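🔑 Hash Table: The High-Speed Index of the Data World!

"Give me a Key and I'll hand you back a Value. It's that fast!" ⚡

📖 1. What Is a Hash Table? Starting at the Library

1.1 An Everyday Scenario

Imagine you are looking for a book in a library:

Without a hash table (brute-force search):

😰 You: I'm looking for "Introduction to Algorithms" (《算法导论》)
📚 Librarian: Sure, let me check the books one by one...
     Book 1: no
     Book 2: no
     ...
     Book 9527: found it!

Time: 30 minutes ⏰

With a hash table (direct lookup):

😎 You: I'm looking for "Introduction to Algorithms"
🔍 System: compute the hash → shelf 3 → level 5 → slot 12
📚 Librarian: Go straight there and grab it!

Time: 1 minute ⚡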

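1.2 Formal Definition

A hash table is a data structure that accesses a value directly from its key. A hash function maps each key to an index in an array, which makes insertion, deletion, and lookup fast.

Core components:

🔹 Hash function: Key → Index (converts a key into an array index)
🔹 Array: stores the entries
🔹 Collision resolution: handles different keys that map to the same index

Core advantages:

✅ Lookup: O(1) average time complexity
✅ Insertion: O(1) average time complexity
✅ Deletion: O(1) average time complexity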

🎨 2. How a Hash Table Works

2.1 Basic Flow

        Key              Hash function      Index           Value
        "apple"    →    hash("apple")   →    3      →      "苹果"
        "banana"   →    hash("banana")  →    7      →      "香蕉"
        "orange"   →    hash("orange")  →    1      →      "橙子"

Array layout:
Index:  0      1        2     3        4     5     6     7
Value: [null]["橙子"][null]["苹果"][null][null][null]["香蕉"]

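2.2 A Sample Hash Function

// A simple hash function; arrayLength is the length of the backing array (a field of the table)
public int hash(String key) {
    int hash = 0;
    for (char c : key.toCharArray()) {
        hash += c;  // add up the character codes (ASCII values for ASCII text)
    }
    return hash % arrayLength;  // take the remainder to map into the array's index range
}

// Example, with arrayLength = 10
hash("cat") = (99 + 97 + 116) % 10 = 312 % 10 = 2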

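2.3 The Full Process, Step by Step

Insert: put("name", "Zhang San")

Step 1: compute the hash value (1234 is an illustrative number, not the real hashCode)
hash("name") = 1234

Step 2: take the remainder to get the index
index = 1234 % 16 = 2

Step 3: store the entry in the array
array[2] = Entry("name", "Zhang San")

Lookup: get("name")

Step 1: compute the hash value
hash("name") = 1234

Step 2: take the remainder to get the index
index = 1234 % 16 = 2

Step 3: read the value from the array
return array[2].value = "Zhang San"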

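⚔️ 3. Hash Collisions and How to Resolve Them

3.1 What Is a Hash Collision?

Hash collision: different keys produce the same index after going through the hash function.

hash("cat") = 2
hash("tac") = 2  ← Collision! Two keys map to the same slot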

    Key1 "cat"  ╲
                  → Index 2 → ?
    Key2 "tac"  ╱
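
To see this collision in running code, here is a minimal sketch. It assumes the additive hash from section 2.2 with the array length passed in as a parameter; the class name AdditiveHashDemo is made up for illustration. Because the hash only sums character codes, any two anagrams end up with the same value.

```java
class AdditiveHashDemo {
    // Sums the character codes and maps the sum into [0, arrayLength).
    static int hash(String key, int arrayLength) {
        int sum = 0;
        for (char c : key.toCharArray()) {
            sum += c;
        }
        return sum % arrayLength;
    }

    public static void main(String[] args) {
        // Anagrams contain the same characters, so their sums are identical.
        System.out.println(hash("cat", 10)); // 312 % 10 = 2
        System.out.println(hash("tac", 10)); // 312 % 10 = 2
        System.out.println(hash("act", 10)); // 312 % 10 = 2
    }
}
```

A hash that takes character positions into account spreads such keys better, but no hash function can rule out collisions entirely, which is why a resolution strategy is always needed.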

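3.2 Solution 1: Separate Chaining ⭐

How it works: each array slot holds a linked list, and colliding entries go into the same list.

Diagram:

Array:
[0] → null
[1] → null
[2] → ["cat":"猫"] → ["tac":"踏"] → null  ← the chain
[3] → ["dog":"狗"] → null
[4] → null
...

Insertion:
1. hash("cat") = 2 → array[2], create a list node
2. hash("tac") = 2 → array[2], append to the end of the list

How Java's HashMap does it (JDK 8+):

Array + linked list + red-black tree

Bucket holds at most 8 nodes: stays a linked list
Bucket grows past 8 nodes: converted to a red-black tree for faster lookups (only when the table has at least 64 slots; a smaller table is resized instead)

[0] → null
[1] → null
[2] → [linked list] → Node1 → Node2 → Node3
[3] → [red-black tree] ← the list got too long and became a tree
      /    \
    ...    ...

Code example: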

// A simplified separate-chaining implementation
class Entry {
    String key;
    String value;
    Entry next;  // points to the next node in the chain
    
    public Entry(String key, String value) {
        this.key = key;
        this.value = value;
    }
}

public class HashTableChaining {
    private Entry[] table;
    private int size;
    
    public HashTableChaining(int capacity) {
        table = new Entry[capacity];
    }
    
    private int hash(String key) {
        // Clear the sign bit instead of using Math.abs, which stays negative for Integer.MIN_VALUE
        return (key.hashCode() & 0x7fffffff) % table.length;
    }
    
    // Insert
    public void put(String key, String value) {
        int index = hash(key);
        Entry entry = table[index];
        
        // Check whether the key already exists
        while (entry != null) {
            if (entry.key.equals(key)) {
                entry.value = value;  // update the value
                return;
            }
            entry = entry.next;
        }
        
        // Head insertion: the new node becomes the head of the chain
        Entry newEntry = new Entry(key, value);
        newEntry.next = table[index];
        table[index] = newEntry;
        size++;
    }
    
    // Lookup
    public String get(String key) {
        int index = hash(key);
        Entry entry = table[index];
        
        while (entry != null) {
            if (entry.key.equals(key)) {
                return entry.value;
            }
            entry = entry.next;
        }
        
        return null;  // not found
    }
    
    // Delete
    public void remove(String key) {
        int index = hash(key);
        Entry entry = table[index];
        Entry prev = null;
        
        while (entry != null) {
            if (entry.key.equals(key)) {
                if (prev == null) {
                    table[index] = entry.next;  // removing the head node
                } else {
                    prev.next = entry.next;
                }
                size--;
                return;
            }
            prev = entry;
            entry = entry.next;
        }
    }
    
    // Test
    public static void main(String[] args) {
        HashTableChaining ht = new HashTableChaining(10);
        
        ht.put("name", "Zhang San");
        ht.put("age", "25");
        ht.put("city", "Beijing");
        
        System.out.println("name: " + ht.get("name"));  // Zhang San
        System.out.println("age: " + ht.get("age"));    // 25
        
        ht.remove("age");
        System.out.println("age: " + ht.get("age"));    // null
    }
}

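3.3 Solution 2: Open Addressing

How it works: on a collision, look for another empty slot in the array itself.

🔹 Linear Probing

On a collision, check the following slots one after another: index, index+1, index+2, ... (wrapping around at the end of the array)

Example:
hash("cat") = 2
hash("tac") = 2  ← Collision!

Array:
[0] → null
[1] → null
[2] → "cat"    ← stored here first
[3] → "tac"    ← stored in the next slot after the collision
[4] → null

Code example: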

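public class LinearProbing {
    private String[] keys;
    private String[] values;
    private int capacity;
    private int size;
    
    public LinearProbing(int capacity) {
        this.capacity = capacity;
        keys = new String[capacity];
        values = new String[capacity];
    }
    
    private int hash(String key) {
        // Clear the sign bit; Math.abs(Integer.MIN_VALUE) would stay negative
        return (key.hashCode() & 0x7fffffff) % capacity;
    }
    
    public void put(String key, String value) {
        int index = hash(key);
        
        // Linear probing: visit at most `capacity` slots so a full table cannot loop forever
        for (int probes = 0; probes < capacity; probes++) {
            if (keys[index] == null) {
                keys[index] = key;  // empty slot found: insert here
                values[index] = value;
                size++;
                return;
            }
            if (keys[index].equals(key)) {
                values[index] = value;  // key already present: update
                return;
            }
            index = (index + 1) % capacity;  // next slot, wrapping around
        }
        
        throw new IllegalStateException("hash table is full");
    }
    
    public String get(String key) {
        int index = hash(key);
        
        // Stop at the first empty slot, or after one full pass over the table
        for (int probes = 0; probes < capacity && keys[index] != null; probes++) {
            if (keys[index].equals(key)) {
                return values[index];
            }
            index = (index + 1) % capacity;
        }
        
        return null;
    }
}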

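Drawback: linear probing tends to produce clustering, where occupied slots form long runs and every probe sequence that hits a run gets longer, which hurts performance.

🔹 Quadratic Probing

Probe sequence: index, index+1², index+2², index+3², ... (each taken modulo the array length)

hash("cat") = 2

Probe order:
2 → 2+1 → 2+4 → 2+9 → ...
2 → 3   → 6   → 11  → ...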
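
The probe order above can be reproduced with a few lines of code. This is only a sketch: the class and method names are invented for illustration, and the table length of 16 is an assumption (the original example does not state one). Each slot is reduced modulo the capacity so the probe stays inside the array.

```java
import java.util.Arrays;

class QuadraticProbeDemo {
    // Returns the first `count` slots visited when probing from `start`
    // in a table of `capacity` slots: (start + i*i) % capacity for i = 0, 1, 2, ...
    static int[] probeSequence(int start, int capacity, int count) {
        int[] slots = new int[count];
        for (int i = 0; i < count; i++) {
            slots[i] = (start + i * i) % capacity;
        }
        return slots;
    }

    public static void main(String[] args) {
        // Start at index 2 in a 16-slot table.
        System.out.println(Arrays.toString(probeSequence(2, 16, 4)));
        // prints [2, 3, 6, 11]
    }
}
```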

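🔹 Double Hashing

Uses two hash functions:
index = hash1(key)
step = hash2(key)  (must never be 0, otherwise the probe would not move)

Probe sequence: index, index+step, index+2*step, ... (each taken modulo the array length)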
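
A rough sketch of double hashing follows. The article does not say what hash1 and hash2 should be, so both functions here are assumptions chosen for illustration: hash1 is the non-negative hashCode modulo the capacity, and hash2 is forced into the range 1 to capacity - 1 so the step is never 0. A prime capacity is also assumed, which guarantees that every step size eventually visits all slots.

```java
class DoubleHashingDemo {
    // Hypothetical primary hash: non-negative hashCode modulo the capacity.
    static int hash1(String key, int capacity) {
        return (key.hashCode() & 0x7fffffff) % capacity;
    }

    // Hypothetical secondary hash: always in [1, capacity - 1], so never 0.
    static int hash2(String key, int capacity) {
        return 1 + (key.hashCode() & 0x7fffffff) % (capacity - 1);
    }

    // i-th slot in the probe sequence: (index + i * step) % capacity.
    static int probe(int index, int step, int i, int capacity) {
        return (index + i * step) % capacity;
    }

    public static void main(String[] args) {
        int capacity = 11; // prime, so every step size visits all slots
        int index = hash1("cat", capacity);
        int step = hash2("cat", capacity);
        for (int i = 0; i < capacity; i++) {
            System.out.print(probe(index, step, i, capacity) + " ");
        }
        System.out.println();
    }
}
```

Two keys that collide on hash1 usually get different steps from hash2, so their probe sequences diverge instead of piling up in the same run of slots.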

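3.4 Separate Chaining vs. Open Addressing

Property	Separate chaining	Open addressing
Storage location	Linked list / tree	The array itself
Memory overhead	Extra node objects	Array only
Cache-friendly	❌ No	✅ Yes
Collision handling	The chain grows	Probe for another slot
Deletion	Simple	Complex (deleted slots must be marked)
Best suited for	Many collisions, frequent deletes	Few collisions, tight memory
Java example	HashMap	ThreadLocalMap (inside ThreadLocal)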
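
The "deleted slots must be marked" entry in the table deserves a closer look. With linear probing, simply setting a slot back to null would cut the probe chain and make later keys unreachable. One common remedy is a tombstone marker. The sketch below is an illustration under assumptions, not how any particular library does it: the class name TombstoneTable and the DELETED sentinel are made up, and resizing is left out.

```java
class TombstoneTable {
    // Sentinel marking a slot whose entry was deleted. Compared by identity (==).
    private static final String DELETED = new String("<deleted>");

    private final String[] keys;
    private final String[] values;

    TombstoneTable(int capacity) {
        keys = new String[capacity];
        values = new String[capacity];
    }

    private int hash(String key) {
        return (key.hashCode() & 0x7fffffff) % keys.length;
    }

    // Returns the slot holding `key`, or -1 if it is absent.
    private int find(String key) {
        int index = hash(key);
        for (int probes = 0; probes < keys.length; probes++) {
            if (keys[index] == null) return -1;  // truly empty: the key cannot be further along
            if (keys[index] != DELETED && keys[index].equals(key)) return index;
            index = (index + 1) % keys.length;   // tombstones are skipped, not stopped at
        }
        return -1;
    }

    void put(String key, String value) {
        int existing = find(key);
        if (existing >= 0) {
            values[existing] = value;
            return;
        }
        int index = hash(key);
        for (int probes = 0; probes < keys.length; probes++) {
            if (keys[index] == null || keys[index] == DELETED) {  // tombstones can be reused
                keys[index] = key;
                values[index] = value;
                return;
            }
            index = (index + 1) % keys.length;
        }
        throw new IllegalStateException("table is full");
    }

    String get(String key) {
        int index = find(key);
        return index < 0 ? null : values[index];
    }

    void remove(String key) {
        int index = find(key);
        if (index >= 0) {
            keys[index] = DELETED;  // leave a marker so keys stored after it stay reachable
            values[index] = null;
        }
    }

    public static void main(String[] args) {
        TombstoneTable t = new TombstoneTable(10);
        t.put("cat", "meow");
        t.put("tac", "tock");  // same home slot as "cat" in a 10-slot table
        t.remove("cat");
        System.out.println(t.get("tac"));  // tock
    }
}
```

In the usage example, "cat" and "tac" share a home slot, so "tac" sits one slot further on. After "cat" is removed, the lookup for "tac" walks over the tombstone and still finds it.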

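🔬 4. Java HashMap in Depth

4.1 How HashMap Evolved

JDK 7:  array + linked list
JDK 8+: array + linked list + red-black tree

4.2 Core Parameters

// Default initial capacity: 16 (must be a power of two)
static final int DEFAULT_INITIAL_CAPACITY = 16;

// Default load factor: 0.75
static final float DEFAULT_LOAD_FACTOR = 0.75f;

// Bucket size at which a linked list is converted to a red-black tree: 8
static final int TREEIFY_THRESHOLD = 8;

// Bucket size at which a red-black tree is converted back to a linked list: 6
static final int UNTREEIFY_THRESHOLD = 6;

// Resize threshold = capacity × load factor (16 × 0.75 = 12 by default)
threshold = capacity * loadFactor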

4.3 The Perturbation Function (Improving the Hash)

// The hash method in JDK 8
static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}

// Why ^ (h >>> 16)?
// The index only uses the low bits of the hash. XOR-ing in the high 16 bits
// lets them influence the index too, which reduces collisions.

Example:
hashCode = 0b 1111_1111_1111_1111_0000_0000_0000_0001
h >>> 16 = 0b 0000_0000_0000_0000_1111_1111_1111_1111
         ↓ XOR
hash     = 0b 1111_1111_1111_1111_1111_1111_1111_1110
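
To make the effect concrete, here is a small sketch that applies the same mixing step to two hash codes differing only in their high bits. The class name SpreadDemo and the sample values 0x10000 and 0x20000 are chosen for illustration; the table length of 16 matches HashMap's default capacity.

```java
class SpreadDemo {
    // Same mixing step as the hash method above: fold the high 16 bits into the low 16.
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    public static void main(String[] args) {
        int n = 16;      // table length
        int a = 0x10000; // these two hash codes differ only above bit 15
        int b = 0x20000;

        // Without mixing, both land in bucket 0.
        System.out.println((a & (n - 1)) + " " + (b & (n - 1)));                 // 0 0
        // With mixing, the high bits separate them.
        System.out.println((spread(a) & (n - 1)) + " " + (spread(b) & (n - 1))); // 1 2

        // The worked example above.
        System.out.println(Integer.toHexString(spread(0xFFFF0001)));             // fffffffe
    }
}
```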

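4.4 Why Must the Capacity Be a Power of Two?

// Computing the index:
index = hash & (length - 1)  // bitwise AND; gives the same result as hash % length when length is a power of two and hash is non-negative

// With length = 16 = 2^4
length - 1 = 15 = 0b 1111

hash = 1234 = 0b 0100_1101_0010
& (length-1)  = 0b 0000_0000_1111
              ───────────────────
index         = 0b 0000_0000_0010 = 2

Benefit: a single AND is cheaper than an integer division, and unlike %, it never yields a negative index.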
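
The relationship between the mask and the modulo can be checked directly. The sketch below uses a made-up helper, indexFor, and compares it with % and Math.floorMod, including a negative hash and a length that is not a power of two.

```java
class IndexMaskDemo {
    // Index computation used when `length` is a power of two.
    static int indexFor(int hash, int length) {
        return hash & (length - 1);
    }

    public static void main(String[] args) {
        int length = 16;
        System.out.println(indexFor(1234, length));     // 2
        System.out.println(1234 % length);              // 2

        // Negative hash codes: % keeps the sign, the mask does not.
        System.out.println(-7 % length);                // -7
        System.out.println(indexFor(-7, length));       // 9
        System.out.println(Math.floorMod(-7, length));  // 9

        // The mask trick breaks for a length that is not a power of two.
        System.out.println(indexFor(1234, 10));         // 0, but 1234 % 10 is 4
    }
}
```

The last line shows why the capacity is constrained: with a length of 10, the mask 0b1001 can only ever produce the indexes 0, 1, 8, and 9, so most buckets would never be used.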

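4.5 Resizing

// When: size > threshold
// By how much: the capacity doubles

Old capacity: 16  →  New capacity: 32
Old threshold: 12  →  New threshold: 24

Resize steps:
1. Allocate a new array (twice the capacity)
2. Recompute the position of every entry (rehash)
3. Move the entries into the new array

Optimization (JDK 8):
An entry's new position has only two possibilities, decided by the bit (hash & oldCap):
- bit is 0: it stays at the original index
- bit is 1: it moves to index + oldCap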
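
The two-position rule can be verified with a short sketch. It assumes the standard index formula hash & (capacity - 1); the class name ResizeSplitDemo and the helper newIndex are made up for illustration. Doubling the capacity adds exactly one bit to the mask, so a single bit test decides whether an entry stays or moves.

```java
class ResizeSplitDemo {
    // New index after doubling, derived from the old index and one extra hash bit.
    static int newIndex(int hash, int oldCap) {
        int oldIndex = hash & (oldCap - 1);
        return (hash & oldCap) == 0 ? oldIndex : oldIndex + oldCap;
    }

    public static void main(String[] args) {
        int oldCap = 16;
        for (int hash : new int[] {5, 21, 1234}) {
            int direct = hash & (2 * oldCap - 1); // index recomputed from scratch
            System.out.println(hash + ": " + (hash & (oldCap - 1)) + " -> "
                    + newIndex(hash, oldCap) + " (recomputed: " + direct + ")");
        }
    }
}
```

For the hash 1234 used earlier, the old index is 2 and the bit for 16 is set, so the entry moves to 2 + 16 = 18, which matches 1234 & 31.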

4.6 Complete Code Example

import java.util.*;

public class HashMapDemo {
    public static void main(String[] args) {
        // Create a HashMap
        Map<String, Integer> map = new HashMap<>();
        
        System.out.println("=== 1. Insert ===");
        map.put("Zhang San", 85);
        map.put("Li Si", 92);
        map.put("Wang Wu", 78);
        map.put("Zhao Liu", 95);
        System.out.println("Map: " + map);
        
        System.out.println("\n=== 2. Look up ===");
        System.out.println("Zhang San's score: " + map.get("Zhang San"));
        System.out.println("Contains Li Si? " + map.containsKey("Li Si"));
        
        System.out.println("\n=== 3. Update ===");
        map.put("Zhang San", 90);  // overwrites the old value
        System.out.println("Zhang San's new score: " + map.get("Zhang San"));
        
        System.out.println("\n=== 4. Remove ===");
        map.remove("Wang Wu");
        System.out.println("Map: " + map);
        
        System.out.println("\n=== 5. Iterate ===");
        // Option 1: entrySet
        for (Map.Entry<String, Integer> entry : map.entrySet()) {
            System.out.println(entry.getKey() + ": " + entry.getValue());
        }
        
        // Option 2: keySet (costs one extra lookup per key)
        for (String key : map.keySet()) {
            System.out.println(key + ": " + map.get(key));
        }
        
        // Option 3: forEach (JDK 8+)
        map.forEach((k, v) -> System.out.println(k + ": " + v));
        
        System.out.println("\n=== 6. Common methods ===");
        System.out.println("Size: " + map.size());
        System.out.println("Is empty: " + map.isEmpty());
        System.out.println("Get or default: " + map.getOrDefault("missing", 0));
        map.putIfAbsent("New Student", 88);  // inserts only if the key is absent
        map.compute("Zhang San", (k, v) -> v + 5);  // computes a new value; v would be null if the key were absent
    }
}

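🎯 5. Classic Use Cases

5.1 Counting Word Frequencies 📊

import java.util.HashMap;
import java.util.Map;

public class WordFrequency {
    public static void main(String[] args) {
        String text = "apple banana apple orange banana apple";
        Map<String, Integer> freq = new HashMap<>();
        
        for (String word : text.split(" ")) {
            // Start from 0 for a word not seen before, then add 1
            freq.put(word, freq.getOrDefault(word, 0) + 1);
        }
        
        System.out.println("Word frequencies:");
        freq.forEach((word, count) -> 
            System.out.println(word + ": " + count));
        
        // Output (HashMap does not guarantee iteration order, so the lines may appear in a different order):
        // apple: 3
        // banana: 2
        // orange: 1
    }
}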

5.2 Two Sum (LeetCode 1) ⭐

import java.util.HashMap;
import java.util.Map;

public class TwoSum {
    public int[] twoSum(int[] nums, int target) {
        // Maps each value seen so far to its index
        Map<Integer, Integer> map = new HashMap<>();
        
        for (int i = 0; i < nums.length; i++) {
            int complement = target - nums[i];
            if (map.containsKey(complement)) {
                return new int[]{map.get(complement), i};
            }
            map.put(nums[i], i);
        }
        
        return new int[0];  // no pair adds up to target
    }
    
    public static void main(String[] args) {
        TwoSum solution = new TwoSum();
        int[] nums = {2, 7, 11, 15};
        int target = 9;
        int[] result = solution.twoSum(nums, target);
        System.out.println("Indices: [" + result[0] + ", " + result[1] + "]");
        // Output: Indices: [0, 1]  (nums[0] + nums[1] = 2 + 7 = 9)
    }
}

5.3 An LRU Cache 🗂️

import java.util.LinkedHashMap;
import java.util.Map;

class LRUCache extends LinkedHashMap<Integer, Integer> {
    private final int capacity;
    
    public LRUCache(int capacity) {
        super(capacity, 0.75f, true);  // accessOrder=true: iteration order follows access recency
        this.capacity = capacity;
    }
    
    public int get(int key) {
        return super.getOrDefault(key, -1);
    }
    
    public void put(int key, int value) {
        super.put(key, value);
    }
    
    @Override
    protected boolean removeEldestEntry(Map.Entry<Integer, Integer> eldest) {
        return size() > capacity;  // over capacity: evict the least recently used entry
    }
}

// Test (run these statements inside a method such as main)
LRUCache cache = new LRUCache(2);
cache.put(1, 1);
cache.put(2, 2);
cache.get(1);       // returns 1
cache.put(3, 3);    // evicts key 2
cache.get(2);       // returns -1 (not found)

5.4 Grouping Strings (Anagrams) 🔤

import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class GroupAnagrams {
    public List<List<String>> groupAnagrams(String[] strs) {
        Map<String, List<String>> map = new HashMap<>();
        
        for (String str : strs) {
            // Anagrams share the same sorted letters, so use them as the key
            char[] chars = str.toCharArray();
            Arrays.sort(chars);
            String key = new String(chars);
            
            map.computeIfAbsent(key, k -> new ArrayList<>()).add(str);
        }
        
        return new ArrayList<>(map.values());
    }
    
    public static void main(String[] args) {
        GroupAnagrams solution = new GroupAnagrams();
        String[] strs = {"eat", "tea", "tan", "ate", "nat", "bat"};
        List<List<String>> result = solution.groupAnagrams(strs);
        System.out.println(result);
        // Prints the three groups [eat, tea, ate], [tan, nat], [bat]; the order of the groups is not guaranteed
    }
}

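🎓 6. Classic Interview Questions

Interview Question 1: How does HashMap's put work?

Answer:

1. Compute the hash: hash(key)
2. Compute the index: index = hash & (length - 1)
3. Check whether table[index] is empty:
   - Empty: insert directly
   - Not empty: handle the collision
4. Collision handling:
   - Linked list: walk the list; overwrite the value if the key already exists, otherwise append at the tail
   - Red-black tree: insert using the tree's rules
5. Check whether to treeify: the list has grown past 8 nodes (and the table has at least 64 slots; otherwise it is resized instead)
6. Check whether to resize: size > threshold
7. Resize: resize()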
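
The steps above can be sketched as a tiny map of its own. This is a teaching sketch under stated assumptions, not the JDK source: the class MiniHashMap is made up, it only handles non-null String keys, it skips step 5 (no red-black trees), and its resize relinks nodes by head insertion rather than the split described in section 4.5.

```java
class MiniHashMap {
    private static final float LOAD_FACTOR = 0.75f;

    private static final class Node {
        final int hash;
        final String key;
        String value;
        Node next;

        Node(int hash, String key, String value) {
            this.hash = hash;
            this.key = key;
            this.value = value;
        }
    }

    private Node[] table = new Node[16]; // length stays a power of two
    private int size;

    // Step 1: mix the high bits into the low bits.
    private static int hash(String key) {
        int h = key.hashCode();
        return h ^ (h >>> 16);
    }

    void put(String key, String value) {
        int h = hash(key);
        int index = h & (table.length - 1);              // step 2
        Node node = table[index];
        if (node == null) {                              // step 3: empty bucket
            table[index] = new Node(h, key, value);
        } else {                                         // step 4: walk the chain
            while (true) {
                if (node.hash == h && node.key.equals(key)) {
                    node.value = value;                  // same key: overwrite, size unchanged
                    return;
                }
                if (node.next == null) {
                    node.next = new Node(h, key, value); // append at the tail
                    break;
                }
                node = node.next;
            }
        }
        if (++size > table.length * LOAD_FACTOR) {       // step 6
            resize();                                    // step 7
        }
    }

    String get(String key) {
        int h = hash(key);
        for (Node n = table[h & (table.length - 1)]; n != null; n = n.next) {
            if (n.hash == h && n.key.equals(key)) return n.value;
        }
        return null;
    }

    int size() { return size; }

    int capacity() { return table.length; }

    // Doubles the table and relinks every node into its new bucket.
    private void resize() {
        Node[] old = table;
        table = new Node[old.length * 2];
        for (Node head : old) {
            while (head != null) {
                Node next = head.next;
                int index = head.hash & (table.length - 1);
                head.next = table[index]; // head insertion keeps the sketch short
                table[index] = head;
                head = next;
            }
        }
    }
}
```

With the default 16 slots and a load factor of 0.75, the twelfth insert still fits and the thirteenth triggers the resize to 32 slots, matching the threshold of 12 from section 4.2.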

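Interview Question 2: Why is HashMap not thread-safe?

Answer:

JDK 7: concurrent resizes can link a bucket's list into a cycle (resize re-inserts nodes at the head), causing an infinite loop
JDK 8: concurrent puts can overwrite each other, so entries are lost
Solutions:
- ConcurrentHashMap (recommended)
- Collections.synchronizedMap()
- Hashtable (legacy)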

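Interview Question 3: How do HashMap and Hashtable differ?

Property	HashMap	Hashtable
Thread-safe	❌ No	✅ Yes (synchronized)
null keys/values	✅ Allowed	❌ Not allowed
Performance	High	Low (coarse-grained locking)
Initial capacity	16	11
Growth	2×	2× + 1
Recommended	✅	❌ (legacy)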

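Interview Question 4: Why is the load factor 0.75?

Answer:

It is a trade-off between space and time
Too small (e.g. 0.5): wastes space and resizes often
Too large (e.g. 1.0): more collisions, longer chains, worse performance
0.75: balances collision probability against space utilization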

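Interview Question 5: How do you fix HashMap's concurrency problems?

Answer:

// Option 1: ConcurrentHashMap (recommended)
Map<String, Integer> concurrentMap = new ConcurrentHashMap<>();

// Option 2: synchronized wrapper
Map<String, Integer> syncMap = Collections.synchronizedMap(new HashMap<>());

// Option 3: manual locking (every access to the map must hold the same lock)
Map<String, Integer> plainMap = new HashMap<>();
synchronized (plainMap) {
    plainMap.put("key", 1);
}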
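
As a runnable illustration of Option 1, the sketch below has several threads increment a shared counter. The class name, the key "hits", and the thread counts are invented for the example. It relies on ConcurrentHashMap.merge, which performs the read-modify-write atomically; a separate get followed by put would not be safe even on a ConcurrentHashMap.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class ConcurrentCounterDemo {
    // Starts `threads` threads that each increment the same key `perThread` times.
    static int countWith(Map<String, Integer> map, int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    map.merge("hits", 1, Integer::sum); // atomic on ConcurrentHashMap
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return map.get("hits");
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new ConcurrentHashMap<>();
        System.out.println(countWith(map, 4, 1000)); // 4000
    }
}
```

Passing a plain HashMap to the same method may produce a smaller total, because concurrent updates can overwrite each other.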

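🎪 7. A Fun Little Story

Story: The Library Revolution

Once upon a time there was a library where all the books were piled together.

Life without a hash table:

Old Wang, the head librarian, was worn out every day:

👤 Reader: "I'd like 《算法导论》 (Suanfa Daolun, 'Introduction to Algorithms')"
😰 Old Wang: "Sure, let me go through them one by one..."
- Checked book 1: no
- Checked book 2: no
- ...
- Checked book 9999: finally found it!
👤 The reader left long ago... 💨

After introducing a hash table:

Old Wang came up with a scheme (the hash function):

Title → take the first letter of its pinyin → matching shelf

"算法导论" (Suanfa Daolun) → first letter "S" → shelf 19 → level 3

The workflow now:

Reader: "I'd like 《算法导论》"
System computes: hash("算法导论") = 19-3
Old Wang: "Go to shelf 19, level 3, and take it!"
Reader: has the book in 10 seconds! 😄

What about collisions?

One day, two books mapped to the same spot:

"算法导论" (Suanfa Daolun) → 19-3
"算术基础" (Suanshu Jichu, 'Arithmetic Basics') → 19-3 ← Collision!

Old Wang's fix (separate chaining):

Install a drawer (linked list) on level 3 of shelf 19:
[算法导论] → [算术基础] → null

To find a book:

Go to position 19-3
Open the drawer
Check the titles one by one
Found it!

It is a bit slower than grabbing the book directly, but far faster than searching the whole library!

That is the magic of a hash table: fast lookups in O(1) time on average! ⚡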

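📚 8. Summary of Key Points

Key takeaways ✨

Definition: a data structure that uses a hash function to achieve O(1) average-time lookups
Components: hash function + array + collision resolution
Collision resolution:
- Separate chaining (HashMap)
- Open addressing
HashMap:
- JDK 8+: array + linked list + red-black tree
- Load factor 0.75
- Capacity must be a power of two
- Doubles when resizing
Time complexity: O(1) average, O(n) worst case (O(log n) within a treeified HashMap bucket)

Mnemonic rhyme 🎵

The hash table is a wondrous thing,
key to value with no struggling.
A hash function does the mapping,
an array index does the grabbing.
Collisions have their remedies:
chaining and open addressing.
HashMap is the one most used:
array, list, and red-black tree.
Load factor: zero point seven five;
capacity: a power of two.
Lookup and insert both take O(1),
interviews will ask, so learn it well!

Complexity comparison 📊

Operation	Average	Worst case
Lookup	O(1)	O(n)
Insertion	O(1)	O(n)
Deletion	O(1)	O(n)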

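🌟 9. Closing Bonus

Congratulations! 🎉 You now have a solid grasp of the hash table, one of the most important data structures!

Remember:

🔑 Hash table = go straight from Key to Value, super fast!
⚡ O(1) average time complexity is the core advantage
🔧 Resolve hash collisions with separate chaining or open addressing
🎯 HashMap is a frequent interview topic

One last diagram for you

       Key
        ↓
   [hash function]
        ↓
      Index
        ↓
    [array][list/tree]
        ↓
      Value
      
     ⚡Super fast⚡

See you next time, and keep it up! 💪😄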

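Author: AI算法导师 (AI Algorithm Mentor)
Last updated: November 2025
Difficulty: ⭐⭐⭐⭐ (intermediate to advanced)
Estimated study time: 4-5 hours

💡 Tip: understanding the hash function and collision resolution is the key. Try drawing diagrams to follow the whole process!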