An Analysis of Deletes, the Core Class Behind Iceberg's Record Deletion

First, the Deletes class has four private static nested classes:

private static class EqualitySetDeleteFilter<T> extends Filter<T>
private static class PositionSetDeleteFilter<T> extends Filter<T>
private static class PositionStreamDeleteFilter<T> extends CloseableGroup implements CloseableIterable<T>
private static class DataFileFilter<T extends StructLike> extends Filter<T>

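Of these, EqualitySetDeleteFilter, PositionSetDeleteFilter, and DataFileFilter all extend the abstract class Filter, which requires subclasses to implement the method protected abstract boolean shouldKeep(T item); to decide whether a record is kept. Below is the EqualitySetDeleteFilter implementation: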

  private static class EqualitySetDeleteFilter<T> extends Filter<T> {
    private final StructLikeSet deletes;
    private final Function<T, StructLike> extractEqStruct;

    protected EqualitySetDeleteFilter(Function<T, StructLike> extractEq,
                                      StructLikeSet deletes) {
      this.extractEqStruct = extractEq;
      this.deletes = deletes;
    }

    @Override
    protected boolean shouldKeep(T row) {
      return !deletes.contains(extractEqStruct.apply(row));
    }
  }

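The member variable private final StructLikeSet deletes; holds the set of keys to be removed by equality deletes. private final Function<T, StructLike> extractEqStruct; is the function that extracts the equality struct: it takes a parameter of type T and returns a StructLike, that is, it maps a data row to its key. The logic of shouldKeep is simple: a row is kept only if deletes does not contain the row's key.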
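The filtering logic described above can be pictured with a small, self-contained sketch. Everything in it is a simplified stand-in and not Iceberg's actual code: SimpleFilter, EqualitySetFilter, Row, and demo are hypothetical names, the key is a plain Integer instead of a StructLike, and the filter is applied eagerly to a list, whereas Iceberg's Filter works over an iterable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Simplified stand-ins for illustration only; these are NOT Iceberg's classes.
class EqualityDeleteSketch {

  // Mirrors the shape of the Filter contract: subclasses decide per item.
  abstract static class SimpleFilter<T> {
    protected abstract boolean shouldKeep(T item);

    // Eagerly collects the items that pass shouldKeep (an assumption of this sketch).
    List<T> filter(Iterable<T> items) {
      List<T> kept = new ArrayList<>();
      for (T item : items) {
        if (shouldKeep(item)) {
          kept.add(item);
        }
      }
      return kept;
    }
  }

  // Keeps a row only when its extracted key is absent from the delete set.
  static class EqualitySetFilter<T, K> extends SimpleFilter<T> {
    private final Function<T, K> extractKey;
    private final Set<K> deletes;

    EqualitySetFilter(Function<T, K> extractKey, Set<K> deletes) {
      this.extractKey = extractKey;
      this.deletes = deletes;
    }

    @Override
    protected boolean shouldKeep(T row) {
      return !deletes.contains(extractKey.apply(row));
    }
  }

  // Hypothetical row type with an id (the key) and a name (the payload).
  static class Row {
    final int id;
    final String name;

    Row(int id, String name) {
      this.id = id;
      this.name = name;
    }
  }

  // Filters three rows against a delete set that contains only id 2.
  static List<String> demo() {
    List<Row> rows = Arrays.asList(new Row(1, "a"), new Row(2, "b"), new Row(3, "c"));
    Set<Integer> deletedIds = new HashSet<>(Arrays.asList(2));
    EqualitySetFilter<Row, Integer> filter =
        new EqualitySetFilter<>(row -> row.id, deletedIds);

    List<String> names = new ArrayList<>();
    for (Row row : filter.filter(rows)) {
      names.add(row.name);
    }
    return names;
  }

  public static void main(String[] args) {
    System.out.println(demo()); // prints [a, c]
  }
}
```

The row with id 2 is dropped because its key is in the delete set; the other two rows are kept. The key extraction is passed in as a Function, so the filter itself never needs to know the row type.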

Next, look at StructLikeSet, a class that implements Set. Its factory method, main member variables, and constructor are:

  public static StructLikeSet create(Types.StructType type) {
    return new StructLikeSet(type);
  }

  private final Types.StructType type;
  private final Set<StructLikeWrapper> wrapperSet;
  private final ThreadLocal<StructLikeWrapper> wrappers;

  private StructLikeSet(Types.StructType type) {
    this.type = type;
    this.wrapperSet = Sets.newHashSet();
    this.wrappers = ThreadLocal.withInitial(() -> StructLikeWrapper.forType(type));
  }

Here type is a Types.StructType object that describes the type of the StructLike; in effect it is the set of key fields. private final Set<StructLikeWrapper> wrapperSet; is the set that actually stores the data,