The flow_study open source project provides a development and learning platform for Flow developers.
The project consists of three modules: a code download module, a search module, and a display module. It is mainly intended as a tool and development platform for learning Flow. This chapter introduces flow_study_spider, the ETL pipeline for Flow contract code.
The full source code is available at the URLs below.
flow_study_server github.com/flowstudy/f…
flow_study_spider github.com/flowstudy/f…
flow_study_sql github.com/flowstudy/f…
flow_study_web github.com/flowstudy/f…
The following figure shows the basic structure of a Flow block.
Basic flow chart of the flow_study_spider program.
get_block.py
To collect all block information, get_block.py first calls the interface provided by flow_py_sdk to obtain the latest block, and then stores the relevant fields in the flow_block table of the database.
import time

from flow_py_sdk import flow_client

# Runs inside an async function
async with flow_client(
    host="access.mainnet.nodes.onflow.org", port=9000
) as client:
    # Fetch the latest block from the access node
    latest_block = await client.get_latest_block()
    height = latest_block.height
    parent_id_hex = latest_block.parent_id.hex()
    # Fields to be stored in the flow_block table
    data = {
        "block_id": latest_block.id.hex(),
        "signatures": latest_block.signatures[0].hex(),
        "parent_id": parent_id_hex,
        "height": latest_block.height,
        "fetch_time": time.strftime("%Y-%m-%d %H:%M:%S", time.localtime()),
        "timestamp": latest_block.timestamp,
    }
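The text does not show how the dictionary is written to the flow_block table. The following is a minimal sketch of that step, assuming SQLite as a stand-in for the project's actual database and a hypothetical column layout that mirrors the keys of the data dictionary. The save_block helper and the sample values are illustrative only and are not taken from the project.

```python
import sqlite3

# Hypothetical schema: one column per key of the data dictionary.
CREATE_SQL = """
CREATE TABLE IF NOT EXISTS flow_block (
    block_id   TEXT PRIMARY KEY,
    signatures TEXT,
    parent_id  TEXT,
    height     INTEGER,
    fetch_time TEXT,
    timestamp  TEXT
)
"""

def save_block(conn, data):
    """Insert one block row; skip it if the block_id is already stored."""
    conn.execute(CREATE_SQL)
    conn.execute(
        "INSERT OR IGNORE INTO flow_block "
        "(block_id, signatures, parent_id, height, fetch_time, timestamp) "
        "VALUES (:block_id, :signatures, :parent_id, :height, :fetch_time, :timestamp)",
        # str() keeps the sketch independent of the timestamp's Python type
        {**data, "timestamp": str(data["timestamp"])},
    )
    conn.commit()

# Illustrative values only, not real chain data.
sample = {
    "block_id": "ab" * 32,
    "signatures": "cd" * 48,
    "parent_id": "ef" * 32,
    "height": 1,
    "fetch_time": "2022-11-26 19:45:22",
    "timestamp": "2022-11-26 19:45:20",
}
conn = sqlite3.connect(":memory:")
save_block(conn, sample)
save_block(conn, sample)  # the second call is ignored because of the primary key
stored_rows = conn.execute("SELECT height FROM flow_block").fetchall()
```

Using the block id as the primary key is one possible way to keep repeated runs from storing the same block twice; the real table may be defined differently.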
Visit flowscan.org/ to view information about blocks.
update_trans_multi.py
The previous step stored block information in the flow_block table. Using the block height recorded there, the get_trans function in get_trans.py can now obtain the transactions of each block, including the contract name (contract_name) and contract address (contract_address) referenced by each transaction, and store the results in the flow_trans_data table.
The following is the part of get_trans.py that obtains transaction information by block height. As the basic structure of a Flow block shows, get_trans must first obtain the block's collections and then the transactions in each collection. For each transaction it extracts the user address, the transaction script code, the contract addresses, and other information. Finally, the collected information is stored in the flow_trans_data table.
import time

from flow_py_sdk import flow_client

# Runs inside the async get_trans function; height is the block height to process
async with flow_client(
    host="access.mainnet.nodes.onflow.org", port=9000
) as client:
    # Obtain block information through height
    block = await client.get_block_by_height(height=height)
    # A collection is a set of transactions
    trans_list = []
    # Traverse the collection list in the block
    for guarantee in block.collection_guarantees:
        collection_id = guarantee.collection_id  # id of this collection
        collection = await client.get_collection_by_i_d(id=collection_id)  # collection details
        for trans_id in collection.transaction_ids:  # traverse all transactions in the collection
            trans_id_hex = trans_id.hex()  # transaction id in hexadecimal form
            transaction = await client.get_transaction(id=trans_id)  # transaction details by id
            user_address = transaction.proposal_key.address.hex()  # user address in the transaction
            trans_script = transaction.script.decode("utf-8")  # script code of the transaction
            import_data_list = get_contract_address(trans_script)  # contract addresses parsed from the script
            # Build one database row per imported contract
            for import_data in import_data_list:
                trans_data = {
                    "trans_id": trans_id_hex,
                    "user_address": "0x" + user_address,
                    "contract_name": import_data["contract_name"],
                    "contract_address": import_data["contract_address"],
                    "fetch_time": time.strftime('%Y-%m-%d %H:%M:%S', time.localtime()),
                    "height": height,
                }  # Note: when using this data, the basic NFT and other base classes cannot be used
                # print(trans_data)
                trans_list.append(trans_data)
The trans_script is parsed by the get_contract_address function to obtain contract_name and contract_address. The code is as follows.
import re

def get_contract_address(trans_script):
    # Step 1: find all import statements
    p = re.compile(r'import.{3,30}from 0x\w{10,25}')  # regular expression for import lines
    import_list = p.findall(trans_script)
    # Step 2: split each import statement into contract name and address
    import_data_list = []
    for item in import_list:
        item_list = item.split()  # ['import', name, 'from', address]
        contract_name = item_list[1]
        contract_address = item_list[3]
        # print(contract_name, contract_address)
        import_data = {
            "contract_name": contract_name,
            "contract_address": contract_address,
        }
        import_data_list.append(import_data)
    return import_data_list
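To illustrate what get_contract_address returns, the sketch below runs the same regular expression over a made-up transaction script. The script text is an assumption for illustration; the two contract names and addresses are the ones that appear in the contract_relation example in this chapter.

```python
import re

def get_contract_address(trans_script):
    # Same pattern as in the article, written as a raw string
    p = re.compile(r'import.{3,30}from 0x\w{10,25}')
    result = []
    for item in p.findall(trans_script):
        parts = item.split()  # ['import', name, 'from', address]
        result.append({"contract_name": parts[1], "contract_address": parts[3]})
    return result

# Made-up Cadence transaction, used only for illustration
sample_script = """
import NonFungibleToken from 0x1d7e57aa55817448
import CharityNFT from 0x097bafa4e0b48eef

transaction {
    prepare(acct: AuthAccount) {}
}
"""

imports = get_contract_address(sample_script)
for entry in imports:
    print(entry["contract_name"], entry["contract_address"])
```

Because the pattern requires an address that starts with 0x, import statements written in any other form are not matched and produce no rows.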
update_contract.py
update_contract.py runs once an hour. It removes duplicate contract_address values from the flow_trans_data table and inserts the distinct addresses into the flow_contract_address table. This table serves as the task list for the next step, in which get_contract.py processes these contracts.
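The deduplication step can be sketched as a single SQL statement. This is a minimal example under stated assumptions: SQLite stands in for the project's actual database, the table layouts are reduced to hypothetical columns, and update_contract_addresses is an illustrative name rather than the function used in update_contract.py.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, reduced table layouts for illustration
conn.executescript("""
CREATE TABLE flow_trans_data (trans_id TEXT, contract_name TEXT, contract_address TEXT);
CREATE TABLE flow_contract_address (contract_address TEXT PRIMARY KEY);
""")
conn.executemany(
    "INSERT INTO flow_trans_data VALUES (?, ?, ?)",
    [
        ("t1", "CharityNFT", "0x097bafa4e0b48eef"),
        ("t2", "CharityNFT", "0x097bafa4e0b48eef"),
        ("t3", "NonFungibleToken", "0x1d7e57aa55817448"),
    ],
)

def update_contract_addresses(conn):
    """Copy each distinct contract_address into the task table once."""
    conn.execute(
        "INSERT OR IGNORE INTO flow_contract_address (contract_address) "
        "SELECT DISTINCT contract_address FROM flow_trans_data"
    )
    conn.commit()

update_contract_addresses(conn)
update_contract_addresses(conn)  # repeating the call, as on the next hourly run, adds nothing
addresses = sorted(
    row[0] for row in conn.execute("SELECT contract_address FROM flow_contract_address")
)
print(addresses)
```

With a primary key on contract_address, running the job repeatedly is harmless: addresses already in the task table are skipped.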
get_contract.py
get_contract.py obtains the contract code for each contract_address in the table. The implementation is shown below. The contract address is converted into service_account_address and used to fetch the account information; parsing the account then yields contract_address, contract_code, and contract_name. The contract_name is extracted from contract_code with a simple regular expression match.
from flow_py_sdk import flow_client

async def get_contract(address):
    # address is a hexadecimal string without the 0x prefix
    async with flow_client(
        host="access.mainnet.nodes.onflow.org", port=9000
    ) as client:
        service_account_address = bytes.fromhex(address)
        # Get the account information at the latest block height
        account = await client.get_account(
            address=service_account_address
        )
        contract_data_list = []
        # account.contracts maps each key to the contract source as bytes
        for key in account.contracts:
            contract = account.contracts[key]
            # print(contract.decode())
            contract_code = contract.decode()
            contract_name = get_contract_name(contract_code)
            contract_data = {
                "contract_address": "0x" + address,
                "contract_name": contract_name,
                "contract_code": contract_code
            }
            contract_data_list.append(contract_data)
        return contract_data_list
The regular expression matching code for contract_name is as follows.
import re

def get_contract_name(code):
    # Step 1: find the contract declaration lines
    # The parentheses in access(all) are escaped so that they match literally
    p = re.compile(r'pub contract.*|access\(all\) contract.*')
    import_list = p.findall(code)  # a match may include a trailing carriage return
    contract_name_line = import_list[0]
    # Step 2: normalize the line so that the name is the third word
    contract_name_line = contract_name_line.replace(" interface", "")
    contract_name_line = contract_name_line.replace(":", " ")
    contract_name_line = contract_name_line.replace("{", " ")
    contract_name_line_item = contract_name_line.split()
    contract_name = contract_name_line_item[2].strip()
    return contract_name
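As an alternative to the string replacements, the name can be captured directly with a group. This is a sketch of a different technique, not the project's code: the pattern, the extract_contract_name helper, and the sample declarations are all illustrative. Note that in a regular expression the parentheses of access(all) must be escaped to match literally; unescaped, they form a group and the pattern matches the text accessall instead.

```python
import re

# Matches 'pub contract X', 'access(all) contract X' and the 'contract interface X' forms.
DECLARATION = re.compile(r'(?:pub|access\(all\))\s+contract\s+(?:interface\s+)?(\w+)')

def extract_contract_name(code):
    """Return the first declared contract name, or None if no declaration is found."""
    match = DECLARATION.search(code)
    return match.group(1) if match else None

# Made-up declaration lines for illustration
names = [
    extract_contract_name("pub contract CharityNFT: NonFungibleToken {"),
    extract_contract_name("access(all) contract Admin {"),
    extract_contract_name("pub contract interface NonFungibleToken {"),
    extract_contract_name("transaction { }"),
]
print(names)  # ['CharityNFT', 'Admin', 'NonFungibleToken', None]
```

Returning None when nothing matches avoids the IndexError that indexing into an empty findall result would raise.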
parse_flow_code.py
parse_flow_code.py fills in the contract_type and contract_category attributes of the flow_code table. It reads the contract_code of each row and uses regular expressions to determine whether its contract_type is interface, contract, or transaction. The get_code_category function parses the contract name to obtain the code_category and stores it in the flow_code table.
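The classification rules are not shown in the text. The following sketch illustrates one way such a regular expression check could look; the classify_code function and its rules are assumptions for illustration and may differ from what parse_flow_code.py actually does.

```python
import re

def classify_code(contract_code):
    """Classify Cadence source as 'interface', 'contract' or 'transaction' (hypothetical rules)."""
    # Check the interface form first, because it also contains the word 'contract'
    if re.search(r'\bcontract\s+interface\s+\w+', contract_code):
        return "interface"
    if re.search(r'\bcontract\s+\w+', contract_code):
        return "contract"
    if re.search(r'\btransaction\b', contract_code):
        return "transaction"
    return "unknown"

# Made-up snippets for illustration
samples = [
    "pub contract interface NonFungibleToken {}",
    "pub contract CharityNFT: NonFungibleToken {}",
    "import CharityNFT from 0x097bafa4e0b48eef\ntransaction { execute {} }",
]
types = [classify_code(code) for code in samples]
print(types)
```

A check this simple can be misled by comments or strings that contain the word contract, so a production version would probably need stricter patterns or a real parser.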
contract_struct_parser2.py
To locate each code block in the contract code and support quick jumps, contract_struct_parser2.py calls a parsing service written in Go. The code of the service can be viewed at github.com/HoppingChar… . The parsed code_text contains the location of each part of the contract code, from its start to its end. After parsing, the parse_code function returns four fields for each code block: struct_type, the type of the code block; struct_name, the name of the code block; start_pos, the starting position of the code block; and end_pos, the end position of the code block.
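The four fields are enough to implement a quick jump. The sketch below shows one possible way to use them; the CodeBlock class, the two lookup helpers, and the sample blocks are hypothetical, and the unit of start_pos and end_pos (line number or character offset) is treated simply as an integer because the text does not specify it.

```python
from dataclasses import dataclass

@dataclass
class CodeBlock:
    struct_type: str  # type of the code block
    struct_name: str  # name of the code block
    start_pos: int    # starting position of the code block
    end_pos: int      # end position of the code block

def find_block(blocks, name):
    """Return the first block with the given name, or None."""
    for block in blocks:
        if block.struct_name == name:
            return block
    return None

def innermost_block_at(blocks, pos):
    """Return the smallest block whose range contains pos, or None."""
    containing = [b for b in blocks if b.start_pos <= pos <= b.end_pos]
    return min(containing, key=lambda b: b.end_pos - b.start_pos, default=None)

# Made-up parse result for illustration
blocks = [
    CodeBlock("contract", "CharityNFT", 1, 120),
    CodeBlock("function", "mintNFT", 40, 55),
]
jump_target = find_block(blocks, "mintNFT").start_pos
current = innermost_block_at(blocks, 45).struct_name
print(jump_target, current)
```

A viewer could call find_block to jump to a named block and innermost_block_at to highlight the block under the cursor.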
update_relate_code.py
update_relate_code.py analyzes the code related to each contract in flow_code and stores the result in the contract_relation table. The update_relate_code function reads the contract name, contract address, and contract code to be processed from the flow_code table, and calls the get_code_related function in code_relation.py to analyze the contract_code and obtain the names and addresses of the related contracts.
code_relation.py
code_relation.py contains two functions. get_code_related finds the code related to a contract, that is, the contracts referenced by the contract's code, and outputs the contract name and contract address of each related contract. The output is a list in the following format.
[{"contract_name":"name1","contract_address":"address1"}, {"contract_name":"name2","contract_address":"address2"}]
import re

def get_code_related(contract_code):
    relate_contract_list = []
    # Step 1: get all import lines
    p = re.compile(r'import.{3,30}from 0x\w{10,25}')  # regular expression for import lines
    import_list = p.findall(contract_code)
    # Step 2: parse each line to obtain the related contract name and address
    for item in import_list:
        item_list = item.split()  # ['import', name, 'from', address]
        contract_name = item_list[1]
        contract_address = item_list[3]
        # print(contract_name, contract_address)
        import_data = {
            "contract_name": contract_name,
            "contract_address": contract_address,
        }
        relate_contract_list.append(import_data)
    return relate_contract_list
The other function, get_code_relate_transaction, finds the transactions related to a contract and stores them in the flow_code_relate_transaction table of the database. The relationship is determined as follows:
Source database table: contract_relation
Storage database table: flow_code_relate_transaction
The relationship is taken from the contract_relation table. In the example, the transaction related to CharityNFT (row 41) is the code in row 33, because the table shows that row 33 references CharityNFT and the contract_type of row 33 in flow_code is transaction.
contract_relation table example

| Row ID | Contract name | Contract address | Related contract name | Related contract address | Time |
|---|---|---|---|---|---|
| 33 | Admin | 0x097bafa4e0b48eef | CharityNFT | 0x097bafa4e0b48eef | 2022-11-26 19:45:22 |
| 41 | CharityNFT | 0x097bafa4e0b48eef | NonFungibleToken | 0x1d7e57aa55817448 | 2022-11-27 00:32:59 |
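The lookup described above can be sketched in a few lines. The rows are copied from the contract_relation example; the contract_type mapping is a hypothetical stand-in for the contract_type column of flow_code (only the transaction type of row 33 is stated in the text, the type of CharityNFT is an assumption), and related_transactions is an illustrative name.

```python
# Rows from the contract_relation example:
# (row id, contract_name, contract_address, related name, related address)
contract_relation = [
    (33, "Admin", "0x097bafa4e0b48eef", "CharityNFT", "0x097bafa4e0b48eef"),
    (41, "CharityNFT", "0x097bafa4e0b48eef", "NonFungibleToken", "0x1d7e57aa55817448"),
]

# Hypothetical stand-in for the contract_type column of flow_code
contract_type = {
    ("Admin", "0x097bafa4e0b48eef"): "transaction",
    ("CharityNFT", "0x097bafa4e0b48eef"): "contract",  # assumed for illustration
}

def related_transactions(name, address):
    """Return (row id, name, address) of transaction code that references the given contract."""
    result = []
    for row_id, src_name, src_addr, rel_name, rel_addr in contract_relation:
        references_target = (rel_name, rel_addr) == (name, address)
        if references_target and contract_type.get((src_name, src_addr)) == "transaction":
            result.append((row_id, src_name, src_addr))
    return result

found = related_transactions("CharityNFT", "0x097bafa4e0b48eef")
print(found)  # [(33, 'Admin', '0x097bafa4e0b48eef')]
```

In a database this would likely be a join between contract_relation and flow_code filtered on contract_type; the in-memory version only shows the logic.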
Other scripts
update_fcode_es.py
Imports all data from the flow_code table into Elasticsearch (es).
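A full import is typically sent through the Elasticsearch Bulk API, whose request body is newline-delimited JSON: one action line followed by one document line per record. The sketch below only builds that body with the standard library. The build_bulk_body helper, the index name flow_code, and the row fields are assumptions for illustration, and how update_fcode_es.py actually talks to Elasticsearch is not shown in the text.

```python
import json

def build_bulk_body(rows, index_name="flow_code"):
    """Build a Bulk API body: one action line plus one document line per row."""
    lines = []
    for row in rows:
        # Action line: index this document under the row's id
        lines.append(json.dumps({"index": {"_index": index_name, "_id": str(row["id"])}}))
        # Document line: the row itself
        lines.append(json.dumps(row))
    return "\n".join(lines) + "\n"  # the Bulk API requires a trailing newline

# Made-up rows for illustration
rows = [
    {"id": 33, "contract_name": "Admin", "contract_address": "0x097bafa4e0b48eef"},
    {"id": 41, "contract_name": "CharityNFT", "contract_address": "0x097bafa4e0b48eef"},
]
body = build_bulk_body(rows)
print(body)
```

The resulting string would be posted to the _bulk endpoint with the content type application/x-ndjson. Using the table's row id as the document id makes a repeated full import overwrite documents instead of duplicating them.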
update_trans.py
Reads unprocessed blocks from flow_block, obtains the transaction information in each block (mainly the contract code and name), and inserts it into the flow_trans_data table.
get_flow_trans_link.py
Obtains the event records of Flow transfers and stores them in the flow_token table.
get_flow_trans_link_to.py
Obtains the event records of Flow transfers and stores them in the flow_token_to table.