CS144 Lab 1 Notes

Task Recap

Lab 1 asks for a Reassembler class built on top of the byte stream class from Lab 0. It must provide the following:

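It holds a byte stream object together with its reader and writer interfaces.
It provides an insert method:
- It takes three arguments: a byte index (index), a string (data), and an end flag (is_last_substring)
  - Substrings arrive in arbitrary order
  - Substrings may overlap
  - The bytes within each substring are contiguous
- insert sorts and deduplicates the incoming substrings to restore the original data and writes it into the byte stream in order. Once everything has been written, it closes the writer
It tracks next_index, the index of the next byte the stream expects to receive.
It buffers bytes that fit within the stream's available capacity but cannot be written yet.
It discards bytes that lie beyond the stream's available capacity.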

The hard part is restoring the original data automatically from substrings that arrive in random order.
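To make the goal concrete, here is a deliberately naive toy model, not the lab solution. It stores pending bytes one at a time and ignores capacity limits and the end flag entirely. ToyReassembler and its members are hypothetical names, and a plain std::string stands in for the Lab 0 byte stream.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical toy model: one map entry per pending byte, no capacity limit.
struct ToyReassembler {
    std::map<uint64_t, char> pending;  // byte index -> byte not yet written
    std::string output;                // stands in for the Lab 0 byte stream
    uint64_t next_index = 0;           // next byte index the output expects

    void insert(uint64_t first_index, const std::string& data) {
        for (std::size_t i = 0; i < data.size(); ++i) {
            const uint64_t idx = first_index + i;
            // Bytes already written are skipped; emplace ignores duplicates.
            if (idx >= next_index) pending.emplace(idx, data[i]);
        }
        // Flush every byte that has become contiguous with the output.
        for (auto it = pending.find(next_index); it != pending.end();
             it = pending.find(next_index)) {
            output.push_back(it->second);
            pending.erase(it);
            ++next_index;
        }
    }
};
```

Inserting "def" at index 3 leaves the output empty. A later insert of "abcd" at index 0 fills the gap, and the overlapping byte at index 3 is stored only once. Keeping one map entry per byte is wasteful, which is why the approach described next works with whole substrings.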

Approach

Whatever the design, a next_index must be maintained to mark the index of the next byte we want to write.

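Writing is then done lazily: each time a new substring arrives, write as many bytes as can currently be written. Note that if the buffered bytes exceed the writable capacity when the last substring arrives, not everything can be written at once, so the stream must not be closed merely because the last substring has been seen.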
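Deciding what "can currently be written" means comes down to trimming each substring to the window [next_index, next_index + capacity). A minimal sketch of that step follows, assuming a hypothetical free function clamp_to_window; it is not part of the lab's interface. Any end-of-stream position has to be recorded before this trimming, since trimming changes both the index and the length.

```cpp
#include <cstdint>
#include <string>

// Trim data so only bytes inside [next_index, next_index + capacity) remain.
// Returns the (possibly advanced) first index; data is modified in place.
uint64_t clamp_to_window(uint64_t first_index, std::string& data,
                         uint64_t next_index, uint64_t capacity) {
    const uint64_t window_end = next_index + capacity;     // exclusive bound
    const uint64_t data_end = first_index + data.size();   // exclusive bound
    if (first_index >= window_end || data_end <= next_index) {
        data.clear();  // entirely outside the window: nothing to keep
        return first_index;
    }
    // Drop the tail that lies beyond the available capacity.
    if (data_end > window_end) data.resize(window_end - first_index);
    // Drop the head that has already been written to the stream.
    if (first_index < next_index) {
        data.erase(0, next_index - first_index);
        first_index = next_index;
    }
    return first_index;
}
```

Using an exclusive upper bound avoids the "capacity - 1" arithmetic, which underflows on unsigned integers when both next_index and the capacity are zero.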

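My first idea was to keep an array of substrings sorted by starting index, one substring per element. Each new substring would be placed by binary search, then compared with its neighbors on both sides so that the overlapping parts could be trimmed.

Binary search finds the insertion point in O(log n), but inserting into an array costs O(n).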
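The trade-off can be seen in a small sketch of the sorted-array idea. Segment and insert_sorted are hypothetical names used only for illustration, and overlap trimming is left out.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

using Segment = std::pair<uint64_t, std::string>;  // (first index, bytes)

// Insert seg into a vector kept sorted by first index; returns the slot used.
std::size_t insert_sorted(std::vector<Segment>& segs, Segment seg) {
    // Binary search for the first element whose index is not less: O(log n).
    auto pos = std::lower_bound(
        segs.begin(), segs.end(), seg.first,
        [](const Segment& s, uint64_t idx) { return s.first < idx; });
    const std::size_t at = static_cast<std::size_t>(pos - segs.begin());
    // vector::insert shifts every later element by one slot: O(n).
    segs.insert(pos, std::move(seg));
    return at;
}
```

The search is cheap, but the shift inside vector::insert dominates once many substrings are buffered.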

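So I considered C++'s std::map instead. It is typically implemented as a red-black tree, keeps its keys sorted automatically, and performs insertion and lookup in O(log n).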
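Because the keys are sorted, the two neighbors that could overlap a new substring can be located with a single lower_bound call. The sketch below assumes the stored segments do not overlap each other; Buffer and overlaps_neighbor are hypothetical names.

```cpp
#include <cstdint>
#include <iterator>
#include <map>
#include <string>

using Buffer = std::map<uint64_t, std::string>;  // first index -> bytes

// True if [first_index, first_index + len) overlaps a stored segment.
bool overlaps_neighbor(const Buffer& buf, uint64_t first_index, uint64_t len) {
    if (len == 0) return false;  // an empty range overlaps nothing
    // First stored segment starting at or after first_index: O(log n).
    auto next = buf.lower_bound(first_index);
    if (next != buf.end() && next->first < first_index + len) return true;
    // The segment just before it, if any, may reach into the new range.
    if (next != buf.begin()) {
        auto prev = std::prev(next);
        if (prev->first + prev->second.size() > first_index) return true;
    }
    return false;
}
```

Checking next != buf.begin() before stepping back matters: moving an iterator before begin() is undefined behavior.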

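The termination condition is that the number of bytes written equals the total number of bytes. The total becomes known once a substring with is_last_substring set is received.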
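This bookkeeping can be sketched in isolation as follows. EndTracker is a hypothetical helper, and holding the total in a std::optional is an assumption made here so that "total not yet known" cannot be mistaken for a real length.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical bookkeeping for the end-of-stream check.
struct EndTracker {
    std::optional<uint64_t> last_index;  // total length, unknown at first
    uint64_t next_index = 0;             // bytes written to the stream so far

    // Call on every insert, before the substring is trimmed.
    void on_insert(uint64_t first_index, uint64_t size, bool is_last_substring) {
        if (is_last_substring) last_index = first_index + size;
    }
    // Call after n bytes have been pushed to the stream.
    void on_written(uint64_t n) { next_index += n; }
    // Close only when the total is known and every byte has been written.
    bool should_close() const {
        return last_index.has_value() && next_index == *last_index;
    }
};
```

If the last substring arrives early, should_close() stays false until the earlier bytes have been written as well.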

Code

void Reassembler::insert( uint64_t first_index, string data, bool is_last_substring )
{
    // The last substring fixes the total stream length; record it before data is trimmed.
    if ( is_last_substring ) {
        last_index_ = first_index + data.size();
    }
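    // Exclusive upper bound of the window the stream can currently accept.
    const uint64_t end_index = next_index_ + output_.writer().available_capacity();
    if ( first_index >= end_index || first_index + data.size() <= next_index_ ) {
        // Entirely outside the window: store nothing, but still reach the close check.
        data.clear();
    } else {
        if ( first_index < next_index_ ) {
            // Drop the prefix that has already been written.
            data = data.substr( next_index_ - first_index );
            first_index = next_index_;
        }
        if ( first_index + data.size() > end_index ) {
            // Drop the suffix that exceeds the available capacity.
            data = data.substr( 0, end_index - first_index );
        }
    }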
    if (buffer_.find( first_index ) != buffer_.end()){
        if (buffer_[first_index].size() >= data.size()){
            return;
        }
    }
    if (!data.empty()){
        buffer_[first_index] = data;
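        // Deduplicate against the neighboring buffered substrings.
        bool live_flag = true;
        auto n_t = buffer_.find( first_index );
        if (n_t != buffer_.begin()){
            --n_t; // step back only after the check; decrementing begin() is undefined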
            if (n_t->first + n_t->second.size() >= first_index+ buffer_[first_index].size()){
                buffer_.erase( ++n_t );
                live_flag = false;
            } else if (n_t->first + n_t->second.size() > first_index){
                n_t->second.erase( first_index - n_t->first, n_t->second.size() - first_index +n_t->first);
            }
        }
        if(live_flag) {
            n_t = buffer_.find( first_index );
            n_t++;
            while ( n_t != buffer_.end()){
                if (n_t->first < first_index + buffer_[first_index].size()){
                    if (n_t->first + n_t->second.size() <= first_index + buffer_[first_index].size()){
                        buffer_.erase( n_t );
                    } else{
                        buffer_[first_index].erase( n_t->first - first_index, buffer_[first_index].size() - n_t->first +first_index);
                    }
                    n_t = buffer_.find( first_index );
                    n_t++;
                } else{
                    break;
                }
            }
        }
    }

    // Write every buffered substring that is now contiguous with the stream.
    while ( buffer_.find( next_index_ ) != buffer_.end() ) {
        auto it = buffer_.find( next_index_ );
        output_.writer().push( it->second );
        next_index_ += it->second.size();
        buffer_.erase( it );
    }
    // Close the writer once every byte up to the recorded end has been written.
    if ( next_index_ == last_index_ ) {
        output_.writer().close();
    }
}