Writing to YugabyteDB with the Cassandra SDK


Bulk-loading data with the Cassandra SDK

YugabyteDB supports the Cassandra client protocol. We download the SDK from https://github.com/datastax/cpp-driver.git and build it:

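  • Dependencies include cmake, libuv, gcc, and so on.
  • cpp-driver/examples/concurrent_executions can be compiled as a test; remember to specify the rpath. We use cpp-driver/examples/concurrent_executions because it is an example that writes data concurrently and makes the write volume easy to configure, but we need to modify its code.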
 apt-get install libuv1-dev:amd64
 git clone https://github.com/datastax/cpp-driver.git
 cd cpp-driver
 mkdir build && cd build
 cmake ..
 make
 cd ../examples/concurrent_executions
 gcc concurrent_executions.c  -I../../include/ -Wl,-rpath,../../build/ ../../build/libcassandra.so

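Then run ./a.out 127.0.1.1, where 127.0.1.1 is the address of YugabyteDB's CQL endpoint (port 9042). You can find it on the startup screen: in the output below, YCQL is the Cassandra server entry point.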

+----------------------------------------------------------------------------------------------------------+
|                                                yugabyted                                                 |
+----------------------------------------------------------------------------------------------------------+
| Status              : Running.                                                                           |
| Replication Factor  : 1                                                                                  |
| Web console         : http://127.0.1.1:7000                                                              |
| JDBC                : jdbc:postgresql://127.0.1.1:5433/yugabyte?user=yugabyte&password=yugabyte          |
| YSQL                : bin/ysqlsh -h 127.0.1.1  -U yugabyte -d yugabyte                                   |
| YCQL                : bin/ycqlsh 127.0.1.1 9042 -u cassandra                                             |
| Data Dir            : /root/var/data                                                                     |
| Log Dir             : /root/var/logs                                                                     |
| Universe UUID       : e0764452-c361-420f-8131-9c4c2ab890fe                                               |
+----------------------------------------------------------------------------------------------------------+

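The key functions are modified as follows. We need large keys and values so that enough data accumulates to trigger a flush to RocksDB, which helps when testing the effect of YugabyteDB's heavily modified RocksDB. Here we use rand_str to generate a random value of 511 characters, which fills the 512-byte buffer together with its terminating NUL.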

 CassError prepare_insert(CassSession* session, const CassPrepared** prepared) {
   CassError rc = CASS_OK;
     
   const char* query = "INSERT INTO examples.concurrent_executions (id, value) VALUES (?, ?) using TTL 100;";
 //  const char* query = "INSERT INTO examples.concurrent_executions (id, value) VALUES (?, ?);"; // use this statement if you are not testing TTL-related features
   CassFuture* future = cass_session_prepare(session, query);
   cass_future_wait(future);
 
   rc = cass_future_error_code(future);
   if (rc != CASS_OK) {
     print_error(future);
   } else {
     *prepared = cass_future_get_prepared(future);
   }
 
   cass_future_free(future);
 
   return rc;
 }
 
 
 void rand_str(char *dest, size_t length) {
     char charset[] = "0123456789"
                      "abcdefghijklmnopqrstuvwxyz"
                      "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
 
     while (length-- > 0) {
         size_t index = (size_t) rand() % (sizeof charset - 1); /* modulo keeps the index inside the alphabet, so the terminating NUL is never picked */
         *dest++ = charset[index];
     }
     *dest = '\0';
 }
 
 void insert_into_concurrent_executions() {
   CassFuture* futures[CONCURRENCY_LEVEL];
   int num_requests = NUM_REQUESTS;
 
   while (num_requests > 0) {
     int i;
     int num_outstanding_requests = CONCURRENCY_LEVEL;
     if (num_requests < num_outstanding_requests) {
       num_outstanding_requests = num_requests;
     }
     num_requests -= num_outstanding_requests;
 
     for (i = 0; i < num_outstanding_requests; ++i) {
       CassUuid uuid;
       CassStatement* statement = cass_prepared_bind(prepared);
       cass_statement_set_is_idempotent(statement, cass_true);
       cass_uuid_gen_random(uuid_gen, &uuid);
       cass_statement_bind_uuid_by_name(statement, "id", uuid);
       char value_buffer[512];
       // sprintf(value_buffer, "%d", i);
       rand_str(value_buffer,511);
       cass_statement_bind_string_by_name(statement, "value", value_buffer);
       futures[i] = cass_session_execute(session, statement);
       cass_statement_free(statement);
     }
 
     for (i = 0; i < num_outstanding_requests; ++i) {
       CassFuture* future = futures[i];
       CassError rc = cass_future_error_code(future);
       if (rc != CASS_OK) {
         print_error(future);
       }
       cass_future_free(future);
     }
   }
 }
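
The loop above sends requests in windows: it submits at most CONCURRENCY_LEVEL statements, waits for all of their futures, then moves on to the next window. The sketch below isolates only that arithmetic so it can be checked without a cluster. The helper names next_batch_size and count_batches are hypothetical and are not part of the cpp-driver example.

```c
#include <assert.h>

/* Size of the next window: at most `concurrency` requests,
 * and never more than the number of requests still left. */
int next_batch_size(int remaining, int concurrency) {
    assert(remaining >= 0 && concurrency > 0);
    return remaining < concurrency ? remaining : concurrency;
}

/* Counts how many submit-then-wait rounds the insert loop performs
 * for a given total request count and concurrency level. */
int count_batches(int num_requests, int concurrency) {
    int batches = 0;
    while (num_requests > 0) {
        num_requests -= next_batch_size(num_requests, concurrency);
        ++batches;
    }
    return batches;
}
```

For example, 100 requests at a concurrency level of 32 run as windows of 32, 32, 32, and 4 requests. The write volume is therefore controlled by the total request count, while the concurrency level only bounds how many futures are in flight at once.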
 

Tuning the flush parameters

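In the previous article I mentioned that, to reduce write amplification, YugabyteDB heavily modified RocksDB's flush parameters. The default flush threshold is 2 GB, so a write volume as small as ours rarely triggers a flush to SST files. To check, run du -d 1 -h in the data directory: if the wals directory is large and the data directory is small, the data is still sitting in the RocksDB memtable. The screenshot below shows the result after the change (I did not take one before the change). image.png To change the threshold, find the corresponding gflag: add "--global_memstore_size_mb_max=10" to yb_tserver_cmd in the startup script bin/yugabyted, then restart.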
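
A rough estimate shows why the threshold matters for a test of this size. The sketch below assumes each row costs about 527 bytes (a 16-byte UUID plus the 511-character value) and ignores all per-row overhead such as key encoding, timestamps, and TTL metadata, so real row counts will be lower. rows_to_fill is a hypothetical helper, not a YugabyteDB API.

```c
#include <assert.h>
#include <stdint.h>

/* Approximate number of rows needed before the memstore reaches
 * limit_mb megabytes, given an assumed payload size per row.
 * Overhead is ignored, so this is an upper bound on the row count. */
uint64_t rows_to_fill(uint64_t limit_mb, uint64_t bytes_per_row) {
    assert(bytes_per_row > 0);
    return limit_mb * 1024u * 1024u / bytes_per_row;
}
```

Under these assumptions, rows_to_fill(2048, 527) comes to roughly four million rows, while rows_to_fill(10, 527) is just under twenty thousand, which a short run of the example can reach.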

Can seek use the file filter?

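Earlier we wrote a large amount of TTL data into YugabyteDB. The goal was to test the TTL expiration optimization described at docs.yugabyte.com/preview/dev…, and the ability to skip expired SST files directly during read and seek: seeking with a filter specified in the ReadOptions should quickly reclaim and skip tombstones.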
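
To make the idea concrete, here is a minimal sketch of what a TTL file filter does conceptually. It is not YugabyteDB's implementation: SstMeta, max_expire_time_us, file_is_expired, and select_live_files are hypothetical names, and I am assuming each SST file records the latest expiration time of any entry it holds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-file metadata: the latest expiration time
 * (in microseconds) of any entry stored in the SST file. */
typedef struct {
    const char *name;
    uint64_t max_expire_time_us;
} SstMeta;

/* A file can be skipped entirely when even its longest-lived
 * entry has already expired at time now_us. */
int file_is_expired(const SstMeta *f, uint64_t now_us) {
    return f->max_expire_time_us <= now_us;
}

/* Stores pointers to the files that still need to be read in `out`
 * (which must have room for n pointers) and returns how many were kept. */
size_t select_live_files(const SstMeta *files, size_t n,
                         uint64_t now_us, const SstMeta **out) {
    size_t kept = 0;
    assert(out != NULL);
    for (size_t i = 0; i < n; ++i) {
        if (!file_is_expired(&files[i], now_us)) {
            out[kept++] = &files[i];
        }
    }
    return kept;
}
```

The point of filtering at file granularity is that a seek never has to open a fully expired file and step over its tombstones one by one, which is exactly the cost that makes the select below time out.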

Open the client with bin/ycqlsh 127.0.1.1 9042 -u cassandra; the default password is empty. Then query the data with SELECT * from examples.concurrent_executions limit 10 ;

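image-20221201190944660.png Because the scan hit a large number of tombstones from TTL-expired data, every select timed out. I found that one gflag was not enabled, so let's turn on tablet_enable_ttl_file_filter and try again: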

image-20221201191129103.png

The feature did not help; the select still timed out, which is rather awkward. Let's attach GDB, set a breakpoint, and investigate:

gdb -p 1541
b bounded_rocksdb_iterator.cc:30
(gdb) p read_opts
$1 = (const rocksdb::ReadOptions &) @0x7f7b65d20c48: {verify_checksums = true, fill_cache = true, snapshot = 0x0, iterate_upper_bound = 0x0, read_tier = rocksdb::kReadAllTier, tailing = false, 
  managed = false, total_order_seek = false, prefix_same_as_start = false, pin_data = false, query_id = 369471131232, table_aware_file_filter = {__ptr_ = 0x0, __cntrl_ = 0x0}, file_filter = {
    __ptr_ = 0x560a5f8498, __cntrl_ = 0x560a5f8480}, static kDefault = {verify_checksums = true, fill_cache = true, snapshot = 0x0, iterate_upper_bound = 0x0, read_tier = rocksdb::kReadAllTier, tailing = false, 
    managed = false, total_order_seek = false, prefix_same_as_start = false, pin_data = false, query_id = 0, table_aware_file_filter = {__ptr_ = 0x0, __cntrl_ = 0x0}, file_filter = {__ptr_ = 0x0, 
      __cntrl_ = 0x0}, static kDefault = <same as static member of an already seen type>}}

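table_aware_file_filter is still null, while file_filter has been allocated, which is a bit strange. Below is the stack of a select * in YugabyteDB. It is a good guide for reading the code later, so I am noting it down: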

 ../../src/yb/docdb/bounded_rocksdb_iterator.cc:30
 yb::docdb::IntentAwareIterator::IntentAwareIterator src/yb/docdb/intent_aware_iterator.cc:272
 yb::docdb::CreateIntentAwareIterator src/yb/docdb/docdb_rocksdb_util.cc:382
 yb::docdb::DocRowwiseIterator::DoInit<yb::docdb::DocQLScanSpec> /src/yb/docdb/doc_rowwise_iterator.cc:170
 yb::docdb::DocRowwiseIterator::Init /src/yb/docdb/doc_rowwise_iterator.cc:210
 yb::docdb::QLRocksDBStorage::GetIterator /src/yb/docdb/ql_rocksdb_storage.cc:49
 yb::docdb::QLReadOperation::Execute /src/yb/docdb/cql_operation.cc:1561
 yb::tablet::AbstractTablet::HandleQLReadRequest ./src/yb/tablet/abstract_tablet.cc:66
 yb::tablet::Tablet::HandleQLReadRequest /src/yb/tablet/tablet.cc:1481
 ReadQuery::DoReadImpl (this=0x5607508798) at ../../src/yb/tserver/read_query.cc:640
 ReadQuery::DoRead (this=0x5607508798) at ../../src/yb/tserver/read_query.cc:550
 ReadQuery::Complete (this=0x5607508798) at ../../src/yb/tserver/read_query.cc:467
 ReadQuery::DoPerform (this=0x5607508798) at ../../src/yb/tserver/read_query.cc:409
 ReadQuery::Perform (this=0x5607508798) at ../../src/yb/tserver/read_query.cc:102
 yb::tserver::PerformRead /src/yb/tserver/read_query.cc:699
 yb::tserver::TabletServiceImpl::Read  /src/yb/tserver/tablet_service.cc:1852
 yb::tserver::TabletServerServiceIf::InitMethods yb/rpc/local_call.h:116
 yb::tserver::TabletServerServiceIf::InitMethods src/yb/tserver/tserver_service.service.cc:541
 std::__1::__invoke[abi:v15002]<yb::tserver::TabletServerServiceIf::InitMethods invoke.h:394
 std::__1::__invoke_void_return_wrapper<void, true>::__call<yb::tserver::TabletServerServiceIf::InitMethods
 yb::tserver::TabletServerServiceIf::Handle (this=0x5604c6e020, call=...) at src/yb/tserver/tserver_service.service.cc:511
 yb::rpc::ServicePoolImpl::Handle (this=0x5604ed4480, incoming=...) at ../../src/yb/rpc/service_pool.cc:267
 yb::rpc::InboundCall::InboundCallTask::Run (this=0x5605306640) at ../../src/yb/rpc/inbound_call.cc:236
 yb::rpc::(anonymous namespace)::Worker::Execute (this=0x560fdfd960) at ../../src/yb/rpc/thread_pool.cc:104
 yb::Thread::SuperviseThread (arg=0x5604e62f00) at ../../src/yb/util/thread.cc:800
 start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
 clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:100

Checking properties with sst_dump

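Let's check whether the SST properties correctly collected the necessary information, in this case whether the TTL information is recorded in the properties. In the largest rocksdb directory we ran /root/tmp/yugabyte-db/build/debug-clang15-dynamic-ninja/bin/sst_dump --file=./ --show_properties and did not find the information we wanted in the properties. A disappointing result; to be continued another day.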

 Process .//000048.sst
 Sst file format: block-based
 Table Properties:
 ------------------------------
   # data blocks: 1
   # data index blocks: 1
   # filter blocks: 1
   # entries: 36
   raw key size: 5273
   raw average key size: 146.472222222222
   raw value size: 446
   raw average value size: 12.3888888888889
   data blocks total size: 1006
   data index size: 28
   filter blocks total size: 65482
   filter index block size: 21
   (estimated) table size: 66516
   filter policy name: DocKeyV3Filter
   # deleted keys: 0
   User collected properties:
   ------------------------------
   rocksdb.block.based.table.data.block.key.value.encoding.format: 02
   rocksdb.block.based.table.index.num.levels: 01000000
   rocksdb.block.based.table.index.type: 02000000
   rocksdb.block.based.table.prefix.filtering: 0
   rocksdb.block.based.table.whole.key.filtering: 1
   rocksdb.deleted.keys: 00