Introduction to flow_study (English Version)

The flow_study open source project builds a development and learning platform for Flow developers. The project consists of three modules: a code download module, a search module, and a display module. It is intended mainly as a learning tool and development platform for Flow. This chapter introduces flow_study_spider, the ETL pipeline for Flow contract code. The full code is available at the URLs below.

flow_study_server github.com/flowstudy/f…
flow_study_spider github.com/flowstudy/f…
flow_study_sql github.com/flowstudy/f…
flow_study_web github.com/flowstudy/f…

The following figure is the basic structure of the flow block.

Basic flow chart of the flow_study_spider program.

get_block.py

To collect block information, get_block.py first uses the interface provided by flow_py_sdk to obtain the latest block, then stores the relevant fields in the flow_block table of the database.

    import time

    from flow_py_sdk import flow_client


    async def get_latest_block_data():  # wrapper name is illustrative
        # Connect to the Flow mainnet access node
        async with flow_client(
                host="access.mainnet.nodes.onflow.org", port=9000
        ) as client:
            latest_block = await client.get_latest_block()
            # Fields to be stored in the flow_block table
            data = {
                "block_id": latest_block.id.hex(),
                "signatures": latest_block.signatures[0].hex(),
                "parent_id": latest_block.parent_id.hex(),
                "height": latest_block.height,
                "fetch_time": time.strftime("%Y-%m-%d %H:%M:%S", time.localtime()),
                "timestamp": latest_block.timestamp,
            }
            return data
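The excerpt stops once the data dictionary is built. As a minimal sketch of the storage step, the snippet below inserts such a dictionary into a flow_block table. It uses Python's built-in sqlite3 as a stand-in for the project's database, and the save_block helper, the column types, and the sample values are assumptions for illustration, not the project's actual schema.

```python
import sqlite3

# Hypothetical schema: column names follow the keys of the data dictionary,
# the types are assumed.
CREATE_SQL = """
CREATE TABLE IF NOT EXISTS flow_block (
    block_id   TEXT PRIMARY KEY,
    signatures TEXT,
    parent_id  TEXT,
    height     INTEGER,
    fetch_time TEXT,
    timestamp  TEXT
)
"""

INSERT_SQL = """
INSERT OR IGNORE INTO flow_block
    (block_id, signatures, parent_id, height, fetch_time, timestamp)
VALUES
    (:block_id, :signatures, :parent_id, :height, :fetch_time, :timestamp)
"""


def save_block(conn, data):
    """Insert one block row; a block_id that is already stored is skipped."""
    conn.execute(CREATE_SQL)
    # Convert the timestamp to text so that a datetime value can be bound too
    row = dict(data, timestamp=str(data["timestamp"]))
    conn.execute(INSERT_SQL, row)
    conn.commit()


# Placeholder values, not real chain data
sample = {
    "block_id": "aa" * 32,
    "signatures": "bb" * 8,
    "parent_id": "cc" * 32,
    "height": 100,
    "fetch_time": "2022-11-26 19:45:22",
    "timestamp": "2022-11-26 19:45:20",
}

conn = sqlite3.connect(":memory:")
save_block(conn, sample)
save_block(conn, sample)  # second call is ignored because block_id is the primary key
stored = conn.execute("SELECT height FROM flow_block").fetchall()
```

Making block_id the primary key keeps the insert idempotent, which matters if the script polls the latest block more often than new blocks are produced.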

Visit flowscan.org/ to view information about blocks.

update_trans_multi.py

The previous step stored the block information in the flow_block table. Next, the get_trans function in get_trans.py uses the block height from flow_block to obtain the transactions in that block, including the contract name (contract_name) and contract address (contract_address) each transaction references, and stores the results in the flow_trans_data table.

The following is the part of get_trans.py that obtains transaction information by block height. As the basic structure of the Flow block shows, transactions are grouped into collections, so get_trans must fetch each collection before it can fetch the transactions. For each transaction it extracts the user address, the transaction script code, the contract addresses, and other information. Finally, the collected information is stored in the flow_trans_data table.

import time

from flow_py_sdk import flow_client


async def get_trans(height):  # signature assumed; the original excerpt shows only the body
    async with flow_client(
            host="access.mainnet.nodes.onflow.org", port=9000
    ) as client:
        # Obtain block information through height
        block = await client.get_block_by_height(height=height)
        trans_list = []
        # A collection is a group of transactions; walk every collection in the block
        for guarantee in block.collection_guarantees:
            collection_id = guarantee.collection_id  # id of the current collection
            collection = await client.get_collection_by_i_d(id=collection_id)  # get collection details
            for trans_id in collection.transaction_ids:  # traverse all transactions in the collection
                trans_id_hex = trans_id.hex()  # transaction id in hexadecimal form
                transaction = await client.get_transaction(id=trans_id)  # get transaction details by id
                user_address = transaction.proposal_key.address.hex()  # user address in the transaction
                trans_script = transaction.script.decode("utf-8")  # script code of the transaction
                import_data_list = get_contract_address(trans_script)  # parse contract addresses from the script
                # Build one row per referenced contract for insertion into the database
                for import_data in import_data_list:
                    trans_data = {
                        "trans_id": trans_id_hex,
                        "user_address": "0x" + user_address,
                        "contract_name": import_data["contract_name"],
                        "contract_address": import_data["contract_address"],
                        "fetch_time": time.strftime('%Y-%m-%d %H:%M:%S', time.localtime()),
                        "height": height,
                    }  # Note: in practice, base contracts such as the basic NFT classes can be left out
                    trans_list.append(trans_data)

The trans_script is parsed by the get_contract_address function to extract the contract_name and contract_address of every import. The code is as follows.

import re


def get_contract_address(trans_script):
    # Step 1: find all import statements, e.g. "import Name from 0x..."
    p = re.compile(r'import.{3,30}from 0x\w{10,25}')
    import_list = p.findall(trans_script)

    # Step 2: split each import statement into contract name and address
    import_data_list = []
    for item in import_list:
        item_list = item.split()
        contract_name = item_list[1]
        contract_address = item_list[3]
        import_data = {
            "contract_name": contract_name,
            "contract_address": contract_address,
        }
        import_data_list.append(import_data)
    return import_data_list
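
To see what the pattern extracts, the sketch below applies the same regular expression to a made-up transaction script. The sample script and the parse_imports name are illustrative only; the two addresses are taken from the contract_relation example later in this article.

```python
import re

# Same pattern as in get_contract_address, written as a raw string
IMPORT_PATTERN = re.compile(r"import.{3,30}from 0x\w{10,25}")


def parse_imports(script):
    """Return one {"contract_name", "contract_address"} dict per import statement."""
    result = []
    for statement in IMPORT_PATTERN.findall(script):
        parts = statement.split()  # ["import", name, "from", address]
        result.append({"contract_name": parts[1], "contract_address": parts[3]})
    return result


sample_script = """
import NonFungibleToken from 0x1d7e57aa55817448
import CharityNFT from 0x097bafa4e0b48eef

transaction {
    prepare(acct: AuthAccount) {}
}
"""

imports = parse_imports(sample_script)
for entry in imports:
    print(entry["contract_name"], entry["contract_address"])
```

The .{3,30} part caps the text between import and from at 30 characters, spaces included, so an import with a longer contract name would be skipped by this pattern.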

update_contract.py

update_contract.py runs once an hour. It deduplicates the contract_address values in the flow_trans_data table and inserts them into the flow_contract_address table. That table acts as a queue of pending tasks: get_contract.py then processes the contracts listed in it.
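
The deduplication step can be sketched in a single SQL statement. The example below uses the built-in sqlite3 module as a stand-in for the project's database; the minimal table layouts and the sync_contract_addresses helper are assumptions for illustration.

```python
import sqlite3


def sync_contract_addresses(conn):
    """Copy each distinct contract_address into the task table, skipping known ones."""
    conn.execute(
        """
        INSERT OR IGNORE INTO flow_contract_address (contract_address)
        SELECT DISTINCT contract_address FROM flow_trans_data
        """
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
# Assumed minimal layouts; the real tables have more columns
conn.execute("CREATE TABLE flow_trans_data (trans_id TEXT, contract_address TEXT)")
conn.execute("CREATE TABLE flow_contract_address (contract_address TEXT PRIMARY KEY)")
conn.executemany(
    "INSERT INTO flow_trans_data VALUES (?, ?)",
    [
        ("t1", "0x1d7e57aa55817448"),
        ("t2", "0x1d7e57aa55817448"),
        ("t3", "0x097bafa4e0b48eef"),
    ],
)

sync_contract_addresses(conn)
sync_contract_addresses(conn)  # running again (e.g. an hour later) adds nothing new
pending = [row[0] for row in conn.execute(
    "SELECT contract_address FROM flow_contract_address ORDER BY contract_address"
)]
```

With a unique key on contract_address, the hourly job can be rerun safely without creating duplicate tasks.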

get_contract.py

get_contract.py obtains the contract code for each contract_address in the table. The hexadecimal contract address is converted to bytes (service_account_address) and used to fetch the account information; parsing the account then yields the contract_address, contract_code, and contract_name. The contract_name is extracted from the contract_code by simple regular-expression matching. The implementation is as follows.

async def get_contract(address):
    async with flow_client(
            host="access.mainnet.nodes.onflow.org", port=9000
    ) as client:
        service_account_address = bytes.fromhex(address)
        # get the account information at the latest block height
        account = await client.get_account(
            address=service_account_address
        )
        contract_data_list = []
        for key in account.contracts:
            contract = account.contracts[key]
            #print(contract.decode())
            contract_code = contract.decode()
            contract_name = get_contract_name(contract_code)
            contract_data = {
                "contract_address": "0x"+address,
                "contract_name": contract_name,
                "contract_code": contract_code
            }
            contract_data_list.append(contract_data)
        return contract_data_list
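
One practical detail: bytes.fromhex raises ValueError if the string starts with 0x, while the addresses parsed from import statements carry that prefix. Since get_contract prepends "0x" itself, it appears to expect the bare hex form. A small helper such as the hypothetical address_to_bytes below could normalize either form before the call; it is an illustration, not part of the project.

```python
def address_to_bytes(address):
    """Convert a Flow address string, with or without a 0x prefix, to bytes."""
    hex_part = address[2:] if address.lower().startswith("0x") else address
    return bytes.fromhex(hex_part)


with_prefix = address_to_bytes("0x1d7e57aa55817448")
without_prefix = address_to_bytes("1d7e57aa55817448")
```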

The regular-expression matching code for contract_name is as follows.

import re


def get_contract_name(code):
    # Step 1: find the contract declaration line ("." stops at the newline).
    # The parentheses in access(all) are escaped so they match literally.
    p = re.compile(r'pub contract.*|access\(all\) contract.*')
    declaration_list = p.findall(code)

    # Step 2: normalize the first declaration so it splits into clean tokens
    contract_name_line = declaration_list[0]
    contract_name_line = contract_name_line.replace(" interface", "")
    contract_name_line = contract_name_line.replace(":", " ")
    contract_name_line = contract_name_line.replace("{", " ")

    # The tokens are now: access modifier, "contract", contract name, ...
    contract_name_line_item = contract_name_line.split()
    contract_name = contract_name_line_item[2].strip()
    return contract_name
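
A detail worth checking in a pattern like this is how parentheses are treated. In a regular expression, unescaped parentheses form a capturing group, so access(all) matches the text accessall rather than the literal access(all), and findall returns the group contents instead of the whole match. The sketch below compares the two spellings on made-up declaration lines.

```python
import re

unescaped = re.compile(r"pub contract.*|access(all) contract.*")
escaped = re.compile(r"pub contract.*|access\(all\) contract.*")

new_style = "access(all) contract CharityNFT: NonFungibleToken {"
old_style = "pub contract CharityNFT: NonFungibleToken {"

# Unescaped: "(all)" is a group, so the literal parentheses never match
no_match = unescaped.findall(new_style)
# Unescaped: the line matches, but findall returns the (empty) group, not the line
group_only = unescaped.findall(old_style)
# Escaped: findall returns the full declaration line in both cases
full_new = escaped.findall(new_style)
full_old = escaped.findall(old_style)
```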

parse_flow_code.py

parse_flow_code.py fills in the contract_type and contract_category attributes in the flow_code table. It reads the contract_code of each row and uses regular expressions to classify its contract_type as interface, contract, or transaction. The get_code_category function then parses the contract name to obtain the code_category, which is stored in the flow_code table.
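
A minimal sketch of such a classification is shown below. The get_contract_type name and the exact patterns are assumptions based on the description above, not the project's code, and the rules used by get_code_category are not reproduced here.

```python
import re

# Access modifier (pub or access(all)) that precedes a contract declaration
_MODIFIER = r"(?:pub|access\(all\))\s+"
INTERFACE_RE = re.compile(r"^\s*" + _MODIFIER + r"contract\s+interface\b", re.MULTILINE)
CONTRACT_RE = re.compile(r"^\s*" + _MODIFIER + r"contract\b", re.MULTILINE)
TRANSACTION_RE = re.compile(r"^\s*transaction\b", re.MULTILINE)


def get_contract_type(code):
    """Classify Cadence source as "interface", "contract", or "transaction"."""
    # Check interface first, because an interface declaration also contains "contract"
    if INTERFACE_RE.search(code):
        return "interface"
    if CONTRACT_RE.search(code):
        return "contract"
    if TRANSACTION_RE.search(code):
        return "transaction"
    return "unknown"
```

Anchoring each pattern at the start of a line (re.MULTILINE) avoids matching the word contract or transaction when it only appears inside a comment or identifier later on a line.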

contract_struct_parser2.py

To locate each code block in the contract code and support quick jumps, contract_struct_parser2.py calls a parsing service written in Go. The code of the service can be viewed on GitHub at github.com/HoppingChar… . The returned code_text contains the location of each part of the contract code, from its first line to its last. After parsing code_text, the parse_code function yields, for each code block: struct_type (the type of the code block), struct_name (the name of the code block), start_pos (the starting position of the code block), and end_pos (the end position of the code block).
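
To illustrate how such records can support a quick jump, the sketch below models each parsed code block with the four fields named above and looks up the block that contains a given position. The CodeStruct class, the find_struct helper, the sample values, and the reading of start_pos and end_pos as line numbers are all assumptions; the actual output format of the parsing service is not shown in this article.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CodeStruct:
    struct_type: str   # type of the code block, e.g. "resource" or "function"
    struct_name: str   # name of the code block
    start_pos: int     # assumed here to be the first line of the block
    end_pos: int       # assumed here to be the last line of the block


def find_struct(structs: List[CodeStruct], line: int) -> Optional[CodeStruct]:
    """Return the innermost block whose range contains the given line."""
    containing = [s for s in structs if s.start_pos <= line <= s.end_pos]
    # The innermost block is the one with the smallest range
    return min(containing, key=lambda s: s.end_pos - s.start_pos, default=None)


# Made-up sample records
structs = [
    CodeStruct("contract", "CharityNFT", 1, 120),
    CodeStruct("resource", "NFT", 10, 30),
    CodeStruct("function", "mintNFT", 40, 55),
]

hit = find_struct(structs, 12)
miss = find_struct(structs, 500)
```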

update_relate_code.py

update_relate_code.py analyzes the contracts related to each contract in flow_code and stores the results in the contract_relation table. The update_relate_code function reads the contract name, contract address, and contract code of each row to be processed from the flow_code table, then calls the get_code_related function in code_relation.py to extract the contract name and contract address of each related contract.

code_relation.py

code_relation.py contains two functions. The first, get_code_related, finds the related code of a contract, meaning the contracts that the contract code imports, and returns the contract name and contract address of each related contract as a list in the following format.

[{"contract_name":"name1","contract_address":"address1"}, {"contract_name":"name2","contract_address":"address2"}]

import re


def get_code_related(contract_code):  # signature assumed; the original excerpt shows only the body
    relate_contract_list = []
    # Step 1: get all import lines
    p = re.compile(r'import.{3,30}from 0x\w{10,25}')
    import_list = p.findall(contract_code)

    # Step 2: parse each line to obtain the related contract name and address
    for item in import_list:
        item_list = item.split()
        contract_name = item_list[1]
        contract_address = item_list[3]
        import_data = {
            "contract_name": contract_name,
            "contract_address": contract_address,
        }
        relate_contract_list.append(import_data)
    return relate_contract_list

The second, get_code_relate_transaction, finds the transactions related to a contract and stores them in the flow_code_relate_transaction table of the database. The relationship is derived as follows.

Source database table: contract_relation
Storage database table: flow_code_relate_transaction

The relationships are read from the related-contract table contract_relation. In the example below, the transaction related to row 41 (CharityNFT) is row 33: the table shows that CharityNFT is referenced by row 33, and the contract_type of row 33 in flow_code is transaction.

contract_relation table example
33	Admin	0x097bafa4e0b48eef	CharityNFT	0x097bafa4e0b48eef	2022-11-26 19:45:22
41	CharityNFT	0x097bafa4e0b48eef	NonFungibleToken	0x1d7e57aa55817448	2022-11-27 00:32:59
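
The lookup described above can be sketched in a few lines. The column meanings (code id, contract name, contract address, referenced contract name, referenced contract address), the contract_type assumed for row 41, and the find_related_transactions helper are assumptions made for illustration; only the two rows themselves come from the example table.

```python
# Rows from the contract_relation example, without the timestamp column:
# (code_id, contract_name, contract_address, related_name, related_address)
relations = [
    (33, "Admin", "0x097bafa4e0b48eef", "CharityNFT", "0x097bafa4e0b48eef"),
    (41, "CharityNFT", "0x097bafa4e0b48eef", "NonFungibleToken", "0x1d7e57aa55817448"),
]

# contract_type per code id as stored in flow_code; the value for 41 is assumed
contract_types = {33: "transaction", 41: "contract"}


def find_related_transactions(relations, contract_types):
    """Pair each contract with the transactions that import it."""
    # Map (name, address) of every known code entry to its id
    ids = {(name, address): code_id for code_id, name, address, _, _ in relations}
    links = []
    for code_id, _, _, related_name, related_address in relations:
        target_id = ids.get((related_name, related_address))
        # Keep the link only if the importing code is a transaction
        if target_id is not None and contract_types.get(code_id) == "transaction":
            links.append({"code_id": target_id, "transaction_id": code_id})
    return links


links = find_related_transactions(relations, contract_types)
```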

Other scripts

update_fcode_es.py
Imports the full flow_code table into es (Elasticsearch).

update_trans.py
Reads unprocessed blocks from flow_block, obtains the transaction information in each block (mainly the contract code and name), and inserts it into the flow_trans_data table.

get_flow_trans_link.py
Obtains the event records of Flow transfers and stores them in the flow_token table.

get_flow_trans_link_to.py
Obtains the event records of Flow transfers and stores them in the flow_token_to table.