一、环境
- Python 3.8.10
- Node v16.16.0
- Windows 10 专业版 64 位操作系统
- Superset 1.5
二、前端操作步骤
我们可以参照csv的导出方式,去扩展excel。以以下页面为例
2.1 确定导出csv的地方
2.1.1查找样式
发现有个chart-slice的类名,以这个关键词进行全局搜索
发现Chart下有包含
2.1.2 找到渲染导出csv的button组件
进入chart组件可以看到关联组件SliceHeader
按照查找思路SliceHeader --> SliceHeaderControls,该文件下全局搜索CSV,可以确定csv的button组件导出位置
2.2 写入导出EXCEL组件
\superset-1.5\superset-frontend\src\dashboard\components\SliceHeaderControls\index.tsx
2.2.1 添加导出菜单
目前可以看菜单按钮有两块
{this.props.slice.viz_type !== 'filter_box' &&
this.props.supersetCanCSV && (
<Menu.Item key={MENU_KEYS.EXPORT_CSV}>{t('Export CSV')}</Menu.Item>
)}
{this.props.slice.viz_type !== 'filter_box' &&
isFeatureEnabled(FeatureFlag.ALLOW_FULL_CSV_EXPORT) &&
this.props.supersetCanCSV &&
isTable && (
<Menu.Item key={MENU_KEYS.EXPORT_FULL_CSV}>
{t('Export full CSV')}
</Menu.Item>
)}
依葫芦画瓢,在后面新增两个excel导出按钮
{this.props.slice.viz_type !== 'filter_box' &&
this.props.supersetCanExcel && (
<Menu.Item key={MENU_KEYS.EXPORT_EXCEL}>{t('Export EXCEL')}</Menu.Item>
)}
{this.props.slice.viz_type !== 'filter_box' &&
isFeatureEnabled(FeatureFlag.ALLOW_FULL_EXCEL_EXPORT) &&
this.props.supersetCanExcel &&
isTable && (
<Menu.Item key={MENU_KEYS.EXPORT_FULL_EXCEL}>
{t('Export full EXCEL')}
</Menu.Item>
)}
2.2.2 添加相应字段
此时应该有很多未定义报错,所以需要添加以下代码
找到MENU_KEYS,在末尾添加
EXPORT_FULL_CSV: 'export_full_csv',
EXPORT_FULL_EXCEL: 'export_full_excel',
找到\superset-1.5\superset-frontend\packages\superset-ui-core\src\utils\featureFlags.ts中的FeatureFlag,在末尾添加
ALLOW_FULL_EXCEL_EXPORT = 'ALLOW_FULL_EXCEL_EXPORT',
2.2.3 消除props不通过的校验
现在还有些问题是因为有些属性没有声明,props校验不通过,所以需要添加属性。目前有三个属性未添加supersetCanExcel、exportFullExcel、exportExcel。
- \superset-1.5\superset-frontend\src\dashboard\components\SliceHeaderControls\index.tsx
找到export interface SliceHeaderControlsProps,在末尾添加
exportExcel?: (sliceId: number) => void;
exportFullExcel?: (sliceId: number) => void;
supersetCanExcel?: boolean;
- \superset-1.5\superset-frontend\src\dashboard\components\SliceHeader\index.tsx
找到const SliceHeader: FC,在末尾添加
exportExcel = () => ({}),
exportFullExcel,
supersetCanExcel = false,
- superset-1.5\superset-frontend\src\dashboard\containers\Chart.jsx
在supersetCanCSV: !!dashboardInfo.superset_can_csv, 后面新增
supersetCanExcel: !!dashboardInfo.superset_can_excel,
- \superset-1.5\superset-frontend\src\dashboard\actions\hydrate.js
由于数据源来自于dashboardInfo,所以我们找到来源为hydrate.js,此时发现这个步骤牵扯到权限问题,目前还未实现,所以我们暂时写死,
// superset_can_excel: findPermission('can_excel', 'Superset', roles),
superset_can_excel: true,
2.3 添加Excel导出方法
2.3.1 \superset-1.5\superset-frontend\src\dashboard\components\gridComponents\Chart.jsx
现在需要新增导出excel方法,同样参照exportCSV添加以下代码
exportExcel(isFullExcel = false) {
this.props.logEvent(LOG_ACTIONS_EXPORT_EXCEL_DASHBOARD_CHART, {
slice_id: this.props.slice.slice_id,
is_cached: this.props.isCached,
});
exportChart({
formData: isFullExcel
? { ...this.props.formData, row_limit: this.props.maxRows }
: this.props.formData,
// resultType: 'results',
resultType: 'full',
resultFormat: 'excel',
});
}
exportFullExcel() {
this.exportExcel(true);
}
2.3.2 \superset-1.5\superset-frontend\src\logger\LogUtils.ts
需要添加
export const LOG_ACTIONS_EXPORT_EXCEL_DASHBOARD_CHART =
'export_excel_dashboard_chart';
2.4 修改exportChart相关导出方法
exportChart是主要的导出csv的方法
export const exportChart = ({
formData,
resultFormat = 'json',
resultType = 'full',
force = false,
ownState = {},
}) => {
let url;
let payload;
if (shouldUseLegacyApi(formData)) {
const endpointType = getLegacyEndpointType({ resultFormat, resultType });
url = getExploreUrl({
formData,
endpointType,
allowDomainSharding: false,
});
payload = formData;
} else {
url = '/api/v1/chart/data';
payload = buildV1ChartDataPayload({
formData,
force,
resultFormat,
resultType,
ownState,
});
}
console.log("url:", url, payload);
postForm(url, payload);
};
shouldUseLegacyApi这个函数主要是根据useLegacyApi来判断调用哪个接口,useLegacyApi是在图表注册时就已经确定好了。里面有两个函数,getLegacyEndpointType(生成最终类型的函数), getExploreUrl(生成导出url的函数)。
2.4.1 修改getLegacyEndpointType
export const getLegacyEndpointType = ({ resultType, resultFormat }) =>
(resultFormat === 'csv' || resultFormat === 'excel' ) ? resultFormat : resultType;
2.4.2 修改 getExploreUrl
找到
if (endpointType === 'csv') {
search.csv = 'true';
}
在后面新增
if(endpointType === 'excel') {
search.excel = 'true';
}
前端已经修改完毕,最终修改结果
三、后端操作步骤
3.1 下载openpyxl
安装excel的python驱动包openpyxl包,执行
pip install openpyxl
\
3.2 修改config.py
\superset-1.5\superset\config.py
CSV_EXPORT = {"encoding": "utf-8"} 后面新增
CSV_EXPORT = {"encoding": "utf-8"}
EXCEL_EXPORT = {"encoding": "utf-8"}
3.3 添加类型
我们需要在ChartDataResultFormat里面新增一个excel的类型
\superset-1.5\superset\common\chart_data.py
class ChartDataResultFormat(str, Enum):
"""
Chart data response format
"""
CSV = "csv"
JSON = "json"
EXCEL = "excel" // 新增
3.4 实现ExcelResponse
\superset-1.5\superset\views\base.py
class ExcelResponse(Response): # pylint: disable=too-many-ancestors
"""
Override Response to take into account xlsx encoding from config.py
"""
charset = conf.get("EXCEL_EXPORT").get("encoding", "utf-8")
3.5 扩展generate_json
\superset-1.5\superset\views\core.py
找到最后return的地方,可以看到需要实现generate_json,在这个方法新增如下代码
1、import上面的ExcelResponse
from superset.views.base import (
api,
BaseSupersetView,
check_ownership,
common_bootstrap_payload,
create_table_permissions,
CsvResponse,
ExcelResponse,
data_payload_response,
generate_download_headers,
get_error_msg,
handle_api_exception,
json_error_response,
json_errors_response,
json_success,
validate_sqlatable,
)
2、增加response_type
def generate_json(
self, viz_obj: BaseViz, response_type: Optional[str] = None
) -> FlaskResponse:
if response_type == ChartDataResultFormat.CSV:
return CsvResponse(
viz_obj.get_csv(), headers=generate_download_headers("csv")
)
#20220905
if response_type == ChartDataResultFormat.EXCEL:
return ExcelResponse(
viz_obj.get_excel(), headers=generate_download_headers("xlsx")
)
3.6 实现get_excel
\superset-1.5\superset\viz.py
在头部导入
import io
在get_csv后面新增如下代码
def get_excel(self) -> bytes:
data = io.BytesIO()
df = self.get_df_payload()["df"]
include_index = not isinstance(df.index, pd.RangeIndex)
df.to_excel(data, index=include_index, **config.get("EXCEL_EXPORT"))
data.seek(0)
return data.read()
3.7 修改接口charts\data\api.py
下载接口一共有两个,一个是explore_json,但不支持表格之类的下载,此时需要修改另一个接口
\superset-1.5\superset\charts\data\api.py中的_send_chart_response方法,在
return Response(
create_zip(files),
headers=generate_download_headers("zip"),
mimetype="application/zip",
)
后面新增
# import pandas
import pandas as pd
……
# import ExcelResponse
from superset.views.base import CsvResponse, generate_download_headers, ExcelResponse
……
)
# add by yangyuan on 20220811
if result_format == ChartDataResultFormat.EXCEL:
# pass
# Verify user has permission to export excel file
# if not security_manager.can_access("can_csv", "Superset"):
# return self.response_403()
import io
config = current_app.config
if not result["queries"]:
return self.response_400(_("Empty query result"))
if len(result["queries"]) == 1:
# print('excel1' * 80)
# return single query results excel format
try:
str_data = result["queries"][0]["data"]
# print(type(str_data))
# print(type(str_data[0]))
# print(str_data[0:2])
df = pd.DataFrame(str_data)
# print(df[0:3])
data = io.BytesIO()
include_index = not isinstance(df.index, pd.RangeIndex)
df.to_excel(data, index=include_index, **config.get("EXCEL_EXPORT"))
data.seek(0)
return ExcelResponse(data.read(), headers=generate_download_headers("xlsx"))
except Exception as e:
print('error', str(e))
return self.response_400(_("deal_queries.error:[{}]".format(str(e))))
# return multi-query excel results bundled as a zip file
# encoding = current_app.config["EXCEL_EXPORT"].get("encoding", "utf-8")
files = {}
for idx, result in enumerate(result["queries"]):
str_data = result["data"]
df = pd.DataFrame(str_data)
data = io.BytesIO()
include_index = not isinstance(df.index, pd.RangeIndex)
df.to_excel(data, index=include_index, **config.get("EXCEL_EXPORT"))
data.seek(0)
files[f"query_{idx + 1}.xlsx"] = data.read()
return Response(
create_zip(files),
headers=generate_download_headers("zip"),
mimetype="application/zip",
)
return self.response_400(_("Deal with excel error."))
最后一个流程就大致完成了,我们可以在这个页面测试下
下载成功
四、总结
这是superset二次开发支持excel的初步探索,目前还有一些问题未解决。如下:
- Sqllab中点击.xlsx
点击.xlsx,需要实现对应的/superset/excel/${this.props.query.id}方法。可以参考
\superset-1.5\superset\views\core.py中的这段实现csv的方法
@has_access
@event_logger.log_this
@expose("/csv/<client_id>")
def csv( # pylint: disable=no-self-use,too-many-locals
self, client_id: str
) -> FlaskResponse:
"""Download the query results as csv."""
logger.info("Exporting CSV file [%s]", client_id)
query = db.session.query(Query).filter_by(client_id=client_id).one()
try:
query.raise_for_access()
except SupersetSecurityException as ex:
flash(ex.error.message)
return redirect("/")
blob = None
if results_backend and query.results_key:
logger.info("Fetching CSV from results backend [%s]", query.results_key)
blob = results_backend.get(query.results_key)
if blob:
logger.info("Decompressing")
payload = utils.zlib_decompress(
blob, decode=not results_backend_use_msgpack
)
obj = _deserialize_results_payload(
payload, query, cast(bool, results_backend_use_msgpack)
)
columns = [c["name"] for c in obj["columns"]]
df = pd.DataFrame.from_records(obj["data"], columns=columns)
logger.info("Using pandas to convert to CSV")
else:
logger.info("Running a query to turn into CSV")
if query.select_sql:
sql = query.select_sql
limit = None
else:
sql = query.executed_sql
limit = ParsedQuery(sql).limit
if limit is not None and query.limiting_factor in {
LimitingFactor.QUERY,
LimitingFactor.DROPDOWN,
LimitingFactor.QUERY_AND_DROPDOWN,
}:
# remove extra row from `increased_limit`
limit -= 1
df = query.database.get_df(sql, query.schema)[:limit]
csv_data = csv.df_to_escaped_csv(df, index=False, **config["CSV_EXPORT"])
quoted_csv_name = parse.quote(query.name)
response = CsvResponse(
csv_data, headers=generate_download_headers("csv", quoted_csv_name)
)
event_info = {
"event_type": "data_export",
"client_id": client_id,
"row_count": len(df.index),
"database": query.database.name,
"schema": query.schema,
"sql": query.sql,
"exported_format": "csv",
}
event_rep = repr(event_info)
logger.debug(
"CSV exported: %s", event_rep, extra={"superset_event": event_info}
)
return response
- 导出excel权限问题
其余的导出excel功能也可以参考这种思路去实现
参考: