Part Three: Implementing Linear Probing
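The idea behind linear probing is fairly simple. Before starting, I recommend reading the source of HashFunction.h to understand how calling an instance of the hash class with a value of type T returns an int. The details are annotated in the source below.
The C++ features involved are: class templates (generics); the explicit keyword, which blocks implicit type conversions; std::function, which lets a function be stored and passed around as a value; and the function call operator, which lets an instance be called like a function. Here the hash value is obtained by calling the instance with a value of type T; internally, the stored callback does the actual computation.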
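To make the combination of these features concrete, here is a minimal sketch. MiniHash is a hypothetical stand-in written for illustration, not the course library's HashFunction, and it assumes std::hash as the underlying hash, which the real class does not use.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical stand-in for HashFunction<T>; names and internals are assumptions.
template <typename T>
class MiniHash {
public:
    // explicit: "MiniHash<std::string> h = 8;" will not compile,
    // while "MiniHash<std::string> h(8);" will.
    explicit MiniHash(int numSlots)
        : mNumSlots(numSlots),
          callback([numSlots](const T& key) {
              // std::hash returns an unsigned value; reduce it to a slot index.
              std::size_t raw = std::hash<T>{}(key);
              return static_cast<int>(raw % static_cast<std::size_t>(numSlots));
          }) {
        assert(numSlots > 0);
    }

    int numSlots() const { return mNumSlots; }

    // Function call operator: lets an instance be used as hashFn(key).
    int operator()(const T& key) const { return callback(key); }

private:
    int mNumSlots;
    std::function<int(const T&)> callback;  // does the actual computation
};
```

With this in place, `MiniHash<std::string> hashFn(8);` followed by `hashFn("hello")` yields a slot index in the range 0 to 7, and the same key always maps to the same slot.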
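// A class template. The placeholder T in its definition can stand for any data type, and concrete classes are created from the template as needed.
// For example, HashFunction<int> gives a class that handles integers, and HashFunction<string> gives one that handles strings.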
template <typename T> class HashFunction {
public:
/**
* Constructs a new HashFunction for the given number of slots. Each hash
* function constructed this way will be initialized randomly.
*
* The second argument is a random seed. Setting this value is
* useful if you'd like to have your hash function behave consistently
* across runs of the program.
*/
// explicit blocks implicit conversions; this constructor configures the hash function with a slot count and an optional seed
explicit HashFunction(int numSlots, int randomSeed = 0);
/**
* Constructs a new HashFunction. This HashFunction cannot be used because
* it won't have been initialized with a number of buckets, and trying to
* use it will cause a runtime error.
*
* You shouldn't directly use this constructor; it's only here so that
* you can declare variables of type HashFunction and initialize them
* later.
*/
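// The default constructor lets you declare a HashFunction now and initialize it later. Calling an object built this way reports an error; see the implementation below.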
HashFunction();
/**
* Constructs a hash function that specifically uses the underlying raw
* hash code as its hash function. This is useful if you want to guarantee
* predictable values for your hash function when testing.
*/
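// A static member function is called through the class name, with no object required.
// HashFunction is the return type: wrap returns a HashFunction object.
/*
wrap lets the caller supply a custom hash function in place of the default one. This is very useful
for testing and debugging, because it makes the hash values predictable. The supplied function must
match a specific signature: it takes a const T & (T is the element type stored in the hash table)
and returns an int. That function is then used to map elements to slots in the hash table.
*/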
static HashFunction wrap(int numSlots,
std::function<int (const T &)> hashFn);
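/*
* std::function<int (const T &)> denotes a callable object that takes a const T & argument
* and returns an int. T is a generic type and can be any data type, such as an integer
* or a string. The template argument of std::function specifies the callable's signature,
* that is, its parameter types and return type.
*/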
/**
* Returns the number of slots this hash function is designed to operate
* over.
*/
int numSlots() const;
/**
* Applies the hash function to the specified argument. The syntax for
* using this function is
*
* hashFn(argument)
*
* That is, you'll treat the variable of type HashFunction as though it's
* an honest-to-goodness function rather than a variable of some type.
*/
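// Defines the function call operator, operator(), so that an object can be called like a function.
/*
int: the return type of the call operator; the returned integer is the hash value.
operator(): the name of the call operator, which is what makes the object callable.
(const T& argument): the parameter list. It takes one argument of type T (usually an element of the hash table), marked const to show that the call does not modify it.
Once a hash function object has been constructed, calling it with a T goes through this method and returns the hash value.
*/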
int operator() (const T& argument) const;
private:
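std::function<int(const T&)> callback; // std::function object holding the function that actually computes the hash (built from the user's function when wrap is used)
int mNumSlots; // number of slots in the hash table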
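/*
static_assert(stanfordcpplib::collections::IsHashable<T>::value, ...);:
A static_assert checks at compile time whether a condition is true. Here it checks whether type T
satisfies stanfordcpplib::collections::IsHashable. If it does not, compilation fails
with the given error message.
stanfordcpplib::collections::IsHashable is a type trait, a template metaprogramming technique. It checks whether a type can be hashed.
Because static_assert fails the build when its condition does not hold, it guarantees that
only hashable types can be used as the template argument T of HashFunction.
The message "Oops! You've tried to make a HashFunction for a type that isn't hashable
. ..." is a friendly compile-time error that tells the developer they tried to create a HashFunction
for a non-hashable type and points them to further details.
*/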
static_assert(stanfordcpplib::collections::IsHashable<T>::value,
"Oops! You've tried to make a HashFunction for a type that isn't hashable. "
"Double-click this error message for more details.");
/*
* Hello CS106 students! If you got directed to this line of code in a compiler error,
* it probably means that you tried making a HashFunction<T> with a custom struct or
* class type.
*
* In order to have a HashFunction<T> for a type T, the type T needs to have a hashCode
* function defined and be capable of being compared using the == operator. If you were
* directed here, one of those two conditions wasn't met.
*
* There are two ways to fix this. The first option would simply be to not use your custom
* type in conjunction with HashFunction<T>. This is probably the easiest option.
*
* The second way to fix this is to explicitly define a hashCode() and operator== function
* for your type. To do so, first define hashCode as follows:
*
* (Note: the stanfordcpplib::collections::hashCode function is the helper that computes the hash value.)
*
* int hashCode(const YourCustomType& obj) {
* return stanfordcpplib::collections::hashCode(obj.data1, obj.data2, ..., obj.dataN);
* }
*
* where data1, data2, ... dataN are the data members of your type. For example, if you had
* a custom type
*
* struct MyType {
* int myInt;
* string myString;
* };
*
* you would define the function
*
* int hashCode(const MyType& obj) {
* return stanfordcpplib::collections::hashCode(obj.myInt, obj.myString);
* }
*
* Second, define operator== as follows:
*
* bool operator== (const YourCustomType& lhs, const YourCustomType& rhs) {
* return lhs.data1 == rhs.data1 &&
* lhs.data2 == rhs.data2 &&
* ...
* lhs.dataN == rhs.dataN;
* }
*
* Using the MyType example from above, we'd write
*
* bool operator== (const MyType& lhs, const MyType& rhs) {
* return lhs.myInt == rhs.myInt && lhs.myString == rhs.myString;
* }
*
* Hope this helps!
*/
};
namespace hashfunction_detail {
std::function<int(int)> tabulationHashFunction(int seed);
}
/* * * * * Implementation Below This Point * * * * */
template <typename T>
HashFunction<T>::HashFunction(int numSlots, int seed) {
if (numSlots <= 0) {
error("HashFunction<T>::wrap(): numSlots must be positive.");
}
auto scrambler = hashfunction_detail::tabulationHashFunction(seed);
mNumSlots = numSlots;
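/*
callback is assigned a lambda that takes one parameter, const T& key, the input to be hashed.
[scrambler, numSlots]: the lambda's capture list. It tells the lambda to capture two variables,
scrambler and numSlots, so they can be used inside the lambda without being passed explicitly.
const T& key: the lambda's parameter list. The lambda takes a const reference named key,
which is the value whose hash is computed.
*/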
callback = [scrambler, numSlots](const T& key) {
return (scrambler(hashCode(key)) & 0x7FFFFFF) % numSlots;
};
}
template <typename T>
HashFunction<T> HashFunction<T>::wrap(int numSlots,
std::function<int (const T&)> hashFn) {
if (numSlots <= 0) {
error("HashFunction<T>::wrap(): numSlots must be positive.");
}
HashFunction result;
result.callback = [hashFn, numSlots] (const T& key) {
return (0x7FFFFFFF & hashFn(key)) % numSlots;
};
result.mNumSlots = numSlots;
return result;
}
template <typename T> int HashFunction<T>::numSlots() const {
return mNumSlots;
}
/* Default constructor sets up a hash function that always reports an error. */
template <typename T> HashFunction<T>::HashFunction() {
// Lambda expression; the trailing -> int gives it the return type that callback expects
callback = [](const T&) -> int {
error("Attempted to use an uninitialized HashFunction object.");
};
mNumSlots = 0;
}
/* Call operator forwards to the callback. */
template <typename T> int HashFunction<T>::operator()(const T& arg) const {
return callback(arg);
}
#endif
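One detail in the implementation above is worth a worked example: the raw hash is masked with 0x7FFFFFFF before the modulo. A likely reason is that a raw hash code can be negative, and in C++ the % operator keeps the sign of the left operand, so an unmasked result could be a negative slot index. The sketch below assumes a 32-bit two's-complement int; toSlot is a hypothetical helper that mirrors the expression in wrap() and is not part of the library.

```cpp
#include <cassert>

// Hypothetical helper mirroring the expression used in wrap().
// Assumes int is 32 bits wide with two's-complement representation.
int toSlot(int rawHash, int numSlots) {
    assert(numSlots > 0);
    // Clearing the sign bit makes the value non-negative,
    // so the remainder always lies in [0, numSlots).
    return (0x7FFFFFFF & rawHash) % numSlots;
}

// Without the mask, a negative hash gives a negative remainder:
// -7 % 5 evaluates to -2, which is not a valid array index.
int unmaskedSlot(int rawHash, int numSlots) {
    assert(numSlots > 0);
    return rawHash % numSlots;
}
```

For non-negative inputs the mask changes nothing, so toSlot(7, 5) and unmaskedSlot(7, 5) agree; only negative hash codes are affected.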
Implementing the linear probing table is straightforward; the code below can serve as a reference for the approach.
LinearProbingHashTable::LinearProbingHashTable(HashFunction<string> hashFn) {
    allocatedSize = hashFn.numSlots();
    logicalSize = 0; // start empty (harmless if the header already initializes it)
    elems = new Slot[allocatedSize];
    // Every slot starts out EMPTY.
    for (int i = 0; i < allocatedSize; i++) {
        elems[i].type = SlotType::EMPTY;
    }
    this->hashFn = hashFn;
}
LinearProbingHashTable::~LinearProbingHashTable() {
    delete[] elems;
}
// The const qualifier on a member function declares that the function does not modify the object's state.
bool LinearProbingHashTable::isEmpty() const {
    return logicalSize == 0;
}
int LinearProbingHashTable::size() const{
return logicalSize;
}
bool LinearProbingHashTable::contains(const std::string& key) const {
    // Start at the slot given by the hash and probe forward, wrapping around at the end.
    //   EMPTY:     the key cannot be further along, so return false.
    //   FILLED:    return true on a match; otherwise keep probing.
    //   TOMBSTONE: skip it and keep probing.
    int pos = hashFn(key);
    // Visit at most allocatedSize slots so a table with no EMPTY slot cannot loop forever.
    for (int count = 0; count < allocatedSize; count++) {
        if (elems[pos].type == SlotType::EMPTY) {
            return false;
        }
        if (elems[pos].type == SlotType::FILLED && elems[pos].value == key) {
            return true;
        }
        pos = (pos + 1) % allocatedSize;
    }
    return false;
}
bool LinearProbingHashTable::insert(const std::string& key) {
    // Inserts key into the table and returns whether it was added.
    // Returns false, leaving the table unchanged, if the key is already present
    // or if every slot is full and there is no room left.
    if (contains(key) || logicalSize == allocatedSize) {
        return false;
    }
    // logicalSize < allocatedSize, so some slot is EMPTY or a TOMBSTONE and the loop terminates.
    int pos = hashFn(key);
    while (elems[pos].type == SlotType::FILLED) {
        pos = (pos + 1) % allocatedSize;
    }
    elems[pos].value = key;
    elems[pos].type = SlotType::FILLED;
    logicalSize++;
    return true;
}
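bool LinearProbingHashTable::remove(const string& elem) {
    if (!contains(elem)) {
        return false;
    }
    // contains() succeeded, so a FILLED slot holding elem lies on the probe path.
    int pos = hashFn(elem);
    while (true) {
        // Check the type as well: a TOMBSTONE still holds the value it had before removal.
        if (elems[pos].type == SlotType::FILLED && elems[pos].value == elem) {
            elems[pos].type = SlotType::TOMBSTONE;
            logicalSize--;
            return true;
        }
        pos = (pos + 1) % allocatedSize;
    }
}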
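The reason removal marks a slot as TOMBSTONE instead of EMPTY can be seen in a small experiment. The sketch below is a toy written for illustration, not the assignment's class: ToyTable is hypothetical, and it assumes every key hashes to slot 0 so that collisions are guaranteed.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class SlotType { EMPTY, FILLED, TOMBSTONE };

struct Slot {
    std::string value;
    SlotType type = SlotType::EMPTY;
};

// Hypothetical toy table: all keys start probing at slot 0, and probing never wraps.
struct ToyTable {
    std::vector<Slot> slots;

    explicit ToyTable(int n) : slots(n) { assert(n > 0); }

    // Places key in the first slot that is not FILLED (no duplicate check in this toy).
    bool insert(const std::string& key) {
        for (Slot& s : slots) {
            if (s.type != SlotType::FILLED) {
                s.value = key;
                s.type = SlotType::FILLED;
                return true;
            }
        }
        return false;
    }

    // Same rules as the real lookup: stop at EMPTY, skip TOMBSTONE.
    bool contains(const std::string& key) const {
        for (const Slot& s : slots) {
            if (s.type == SlotType::EMPTY) return false;
            if (s.type == SlotType::FILLED && s.value == key) return true;
        }
        return false;
    }

    // Removes key by marking its slot with the given marker, to compare both choices.
    void remove(const std::string& key, SlotType marker) {
        for (Slot& s : slots) {
            if (s.type == SlotType::FILLED && s.value == key) {
                s.type = marker;
                return;
            }
        }
    }
};
```

Insert "a" and then "b", so they occupy slots 0 and 1. After remove("a", SlotType::TOMBSTONE), contains("b") still succeeds because the lookup skips the tombstone. After remove("a", SlotType::EMPTY) on an identical table, contains("b") fails: the lookup stops at the now-empty slot 0 even though "b" is still stored in slot 1.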
Part Five: Robin Hood Hashing
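I could not follow the handout's description on its own, but the slides make the implementation clear; they lay out the whole approach. When implementing removal, be sure to handle the boundary case where the scan reaches the end of the array and wraps around.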
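The boundary case mentioned above comes from backward-shift deletion: each element after the removed one moves back by one slot, and the slot before index 0 is the last slot of the array. Here is a minimal sketch of just that step, under some assumptions: Slot, EMPTY_SLOT (set to -1 here), prevIndex, and backwardShift are illustrative stand-ins, and the assignment's own definitions may differ.

```cpp
#include <cassert>
#include <string>
#include <vector>

const int EMPTY_SLOT = -1;  // assumed sentinel; the assignment's value may differ

struct Slot {
    std::string value;
    int distance = EMPTY_SLOT;  // distance from the element's home slot
};

// Index of the slot before pos, wrapping from 0 to n - 1.
// Adding n before the modulo keeps the result non-negative.
int prevIndex(int pos, int n) {
    assert(n > 0);
    return (pos - 1 + n) % n;
}

// Empties slots[hole], then shifts the following elements back one slot each,
// stopping at an empty slot or at an element already at home (distance 0).
// Assumes the table is a valid Robin Hood table, so the loop terminates.
void backwardShift(std::vector<Slot>& slots, int hole) {
    int n = static_cast<int>(slots.size());
    slots[hole].distance = EMPTY_SLOT;
    for (int pos = (hole + 1) % n;
         slots[pos].distance != EMPTY_SLOT && slots[pos].distance != 0;
         pos = (pos + 1) % n) {
        int prev = prevIndex(pos, n);
        slots[prev].value = slots[pos].value;
        slots[prev].distance = slots[pos].distance - 1;
        slots[pos].distance = EMPTY_SLOT;
    }
}
```

For example, in a four-slot table where slot 3 holds "x" at distance 0 and slot 0 holds "y" at distance 1, calling backwardShift(slots, 3) moves "y" from slot 0 back across the array boundary into slot 3 with distance 0 and leaves slot 0 empty.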
RobinHoodHashTable::RobinHoodHashTable(HashFunction<string> hashFn) {
    allocatedSize = hashFn.numSlots();
    logicalSize = 0; // start empty (harmless if the header already initializes it)
    elems = new Slot[allocatedSize];
    // Every slot starts out empty.
    for (int i = 0; i < allocatedSize; i++) {
        elems[i].distance = EMPTY_SLOT;
    }
    this->hashFn = hashFn;
}
RobinHoodHashTable::~RobinHoodHashTable() {
delete[] elems;
}
int RobinHoodHashTable::size() const {
return logicalSize;
}
bool RobinHoodHashTable::isEmpty() const {
    return logicalSize == 0;
}
bool RobinHoodHashTable::insert(const string& elem) {
    if (contains(elem) || logicalSize == allocatedSize) {
        return false;
    }
    // value and distance describe the element currently being placed;
    // after a swap they describe the displaced resident instead.
    int pos = hashFn(elem);
    int distance = 0;
    string value = elem;
    while (true) {
        if (elems[pos].distance == EMPTY_SLOT) {
            elems[pos].value = value;
            elems[pos].distance = distance;
            logicalSize++;
            return true;
        }
        if (elems[pos].distance < distance) {
            // The resident is closer to its home than we are: take its slot and carry it onward.
            string tempvalue = elems[pos].value;
            int tempdistance = elems[pos].distance;
            elems[pos].value = value;
            elems[pos].distance = distance;
            value = tempvalue;
            distance = tempdistance;
        }
        pos = (pos + 1) % allocatedSize;
        distance++;
    }
}
bool RobinHoodHashTable::contains(const string& elem) const {
    int pos = hashFn(elem);
    int distance = 0;
    while (true) {
        if (elems[pos].distance == EMPTY_SLOT) {
            return false;
        }
        if (elems[pos].value == elem) {
            return true;
        }
        // Had elem been inserted, it would have displaced this closer-to-home resident,
        // so it cannot be further along.
        if (distance > elems[pos].distance) {
            return false;
        }
        pos = (pos + 1) % allocatedSize;
        distance++;
    }
}
bool RobinHoodHashTable::remove(const string& elem) {
    if (!contains(elem)) {
        return false;
    }
    // contains() succeeded, so this scan is guaranteed to find elem.
    // Check the distance too: an emptied slot still holds its old value.
    int pos = hashFn(elem);
    while (elems[pos].distance == EMPTY_SLOT || elems[pos].value != elem) {
        pos = (pos + 1) % allocatedSize;
    }
    elems[pos].distance = EMPTY_SLOT;
    logicalSize--;
    // Backward shift: move each following element back one slot, stopping at an
    // empty slot or at an element already at its home (distance 0).
    for (pos = (pos + 1) % allocatedSize;
         elems[pos].distance != 0 && elems[pos].distance != EMPTY_SLOT;
         pos = (pos + 1) % allocatedSize) {
        // When pos is 0, pos - 1 would be negative; adding allocatedSize wraps it to the last slot.
        int prev = (pos - 1 + allocatedSize) % allocatedSize;
        elems[prev].value = elems[pos].value;
        elems[prev].distance = elems[pos].distance - 1;
        elems[pos].distance = EMPTY_SLOT;
    }
    return true;
}