Data Structures: Dictionaries


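The articles in the Hitchhiker's Guide to the World of Algorithms series can be read alongside the Chinese undergraduate teaching edition of The Algorithm Design Manual (《算法设计指南》, 本科教学版).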


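This article was translated by Xie Xie (谢勰; Weibo: 算法时空), translator of the Chinese undergraduate teaching edition of The Algorithm Design Manual. Do not repost without permission. Original: www.algorist.com/problems/Di…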


Input: [Figure: input]

Output: [Figure: output]

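Input description: A set of n records, each identified by one or more key fields.

Problem: Build and maintain a data structure that, given any query key q, can efficiently locate, insert, or delete the record associated with q.

Locate: find the position of the record associated with q.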

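Excerpt from The Algorithm Design Manual: The abstract data type "dictionary" is one of the most important structures in computer science. Researchers have proposed many data structures for implementing dictionaries, including hash tables, skip lists, and balanced/unbalanced binary search trees. This means that picking the best implementation for a dictionary can take some skill. Indeed, choosing the data structure that implements a dictionary to suit the application is a decision that can significantly affect performance. In practice, however, avoiding a poor data structure matters more than identifying the single best one among all the options.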

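Here is a piece of advice you should know: carefully separate the implementation of the dictionary data structure from its interface. Initialize, search, and modify the data structure through explicit subroutine calls rather than embedding the code for those operations directly in the application. This makes the program cleaner, and it makes it easier to try different implementations and observe how each affects performance. Do not worry about the procedure call overhead inherent in this abstraction. If your application is so time-critical that this overhead affects performance, then being able to swap in and test different dictionary implementations freely is all the more indispensable.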


Input Description: A set of n records, each identified by one or more key fields.

Problem: Build and maintain a data structure to efficiently locate, insert, or delete the record associated with any query key q.
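As a rough illustration of the problem statement, the sketch below keys each record by two fields and supports the three operations on top of Python's built-in `dict`. The record fields and the function names `locate`, `insert`, and `delete` are hypothetical, chosen only for this example.

```python
# Hypothetical records; the field names are illustrative only.
records = [
    {"last": "Knuth", "first": "Donald", "year": 1938},
    {"last": "Hopper", "first": "Grace", "year": 1906},
]

# Each record is identified by two key fields; a tuple acts as the composite key.
index = {(r["last"], r["first"]): r for r in records}


def locate(q):
    """Return the record associated with key q, or None if q is absent."""
    return index.get(q)


def insert(q, record):
    """Associate record with key q, replacing any earlier record for q."""
    index[q] = record


def delete(q):
    """Remove the record for key q; return True if one was present."""
    if q in index:
        del index[q]
        return True
    return False
```

Here a query key q is a tuple such as `("Knuth", "Donald")`. A hash-based `dict` is only one possible choice; the point is that all three operations take the key, not the record's position, as their argument.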

Excerpt from The Algorithm Design Manual: The abstract data type "dictionary" is one of the most important structures in computer science. Dozens of different data structures have been proposed for implementing dictionaries, including hash tables, skip lists, and balanced/unbalanced binary search trees, so choosing the right one can be tricky. Depending on the application, it is also a decision that can significantly impact performance. In practice, it is more important to avoid using a bad data structure than to identify the single best option available.

An essential piece of advice is to carefully isolate the implementation of the dictionary data structure from its interface. Use explicit calls to subroutines that initialize, search, and modify the data structure, rather than embedding them within the code. This leads to a much cleaner program, but it also makes it easy to try different dictionary implementations to see how they impact performance. Do not obsess about the cost of the procedure call overhead inherent in such an abstraction. If your application is so time-critical that such overhead can impact performance, then it is even more essential that you be able to easily experiment with different implementations of your dictionary.
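One way to picture this separation is the minimal Python sketch below. It is an illustration under assumed names, not code from the book: the `Dictionary` interface, the two implementations, and the `count_words` application are all hypothetical. The application calls only `search`, `insert`, and `delete`, so either implementation can be swapped in without touching it.

```python
import bisect
from typing import Any, Optional, Protocol


class Dictionary(Protocol):
    """Hypothetical interface: the only calls the application may use."""

    def search(self, key: Any) -> Optional[Any]: ...
    def insert(self, key: Any, value: Any) -> None: ...
    def delete(self, key: Any) -> bool: ...


class HashDictionary:
    """Implementation backed by Python's built-in hash table."""

    def __init__(self):
        self._table = {}

    def search(self, key):
        return self._table.get(key)

    def insert(self, key, value):
        self._table[key] = value

    def delete(self, key):
        if key in self._table:
            del self._table[key]
            return True
        return False


class SortedArrayDictionary:
    """Implementation backed by parallel sorted lists and binary search."""

    def __init__(self):
        self._keys = []
        self._values = []

    def _find(self, key):
        # Position where key is, or would be inserted, plus a found flag.
        i = bisect.bisect_left(self._keys, key)
        return i, i < len(self._keys) and self._keys[i] == key

    def search(self, key):
        i, found = self._find(key)
        return self._values[i] if found else None

    def insert(self, key, value):
        i, found = self._find(key)
        if found:
            self._values[i] = value
        else:
            self._keys.insert(i, key)
            self._values.insert(i, value)

    def delete(self, key):
        i, found = self._find(key)
        if found:
            del self._keys[i]
            del self._values[i]
        return found


def count_words(words, table: Dictionary):
    """Application code: touches the dictionary only through its interface."""
    for w in words:
        table.insert(w, (table.search(w) or 0) + 1)
    return table
```

Calling `count_words(words, HashDictionary())` and `count_words(words, SortedArrayDictionary())` runs the same application against two implementations, which makes it straightforward to time each on realistic input before committing to one.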