Master Core Big Data Technology in 7 Days: A Complete Walkthrough of a Catering Service License Data Analysis System, from Zero to Finished Project


💖💖Author: 计算机毕业设计小途 💙💙About me: I taught computer science training courses for many years and genuinely enjoy teaching. I work in Java, WeChat Mini Programs, Python, Golang, and Android, and my projects span big data, deep learning, websites, mini programs, Android apps, and algorithms. I also take on custom project development, code walkthroughs, thesis-defense coaching, and documentation writing, and I know some techniques for reducing similarity-check scores. I like sharing solutions to problems I hit during development and talking shop, so feel free to ask me anything about code! 💛💛A word of thanks: thank you all for your attention and support! 💜💜 Website projects · Android/Mini Program projects · Big data projects · Deep learning projects


Introduction to the Catering Service License Data Visualization and Analysis System

The big-data-based catering service license data visualization and analysis system is a comprehensive big-data application platform built on the Hadoop + Spark stack: HDFS handles storage and management of large data volumes, Spark SQL provides fast query processing, and Pandas and NumPy support deeper analysis. The system ships with two backend options, Python + Django and Java + Spring Boot; the frontend interactive interface is built with Vue + ElementUI + ECharts, and structured data is stored in MySQL.

Functionally, the system covers user management, a large-screen visualization dashboard, catering license information management, and multi-dimensional analyses of business types, enterprise profiles, spatial geography, and time trends. Through rich charts and data-mining techniques it supports evidence-based decision making for catering regulators and businesses, implementing the full pipeline from data collection to visual analysis. It demonstrates the value of big-data technology in government services and industry analysis, and serves as a practical big-data capstone project for computer science students.

Catering Service License Data Visualization and Analysis System Demo Video

Demo video

Catering Service License Data Visualization and Analysis System Screenshots

License Information Management.png

Login Page.png

Business Type Analysis.png

Spatial Geographic Analysis.png

Enterprise Profile Analysis.png

Time Trend Analysis.png

Data Dashboard.png

User Management.png

Catering Service License Data Visualization and Analysis System Code Showcase

from pyspark.sql import SparkSession
# Note: sum, max and min below are Spark column functions that shadow the
# Python builtins of the same name throughout this module.
from pyspark.sql.functions import col, count, sum, avg, max, min, year, when
import pandas as pd
import numpy as np
from django.http import JsonResponse

# One shared SparkSession with adaptive query execution enabled.
spark = (
    SparkSession.builder
    .appName("RestaurantLicenseAnalysis")
    .config("spark.sql.adaptive.enabled", "true")
    .config("spark.sql.adaptive.coalescePartitions.enabled", "true")
    .getOrCreate()
)

def business_type_analysis(request):
   # Load the license table from MySQL over JDBC.
   license_df = spark.read.format("jdbc").option("url", "jdbc:mysql://localhost:3306/restaurant_db").option("dbtable", "restaurant_license").option("user", "root").option("password", "password").load()
   # Headline statistics per business type: license count, mean capital, total headcount.
   business_type_stats = license_df.groupBy("business_type").agg(count("*").alias("count"), avg("registered_capital").alias("avg_capital"), sum("employee_count").alias("total_employees")).orderBy(col("count").desc())
   # Yearly issuance trend per business type.
   type_trend = license_df.withColumn("issue_year", year(col("issue_date"))).groupBy("business_type", "issue_year").agg(count("*").alias("yearly_count")).orderBy("business_type", "issue_year")
   # Business-type mix within each district.
   region_type_dist = license_df.groupBy("district", "business_type").agg(count("*").alias("count")).orderBy("district", col("count").desc())
   # Approval processing time statistics per business type.
   avg_processing_time = license_df.withColumn("processing_days", col("processing_time")).groupBy("business_type").agg(avg("processing_days").alias("avg_days"), max("processing_days").alias("max_days"), min("processing_days").alias("min_days"))
   # License status breakdown per business type.
   status_analysis = license_df.groupBy("business_type", "license_status").agg(count("*").alias("status_count")).orderBy("business_type", "license_status")
   business_stats_pd = business_type_stats.toPandas()
   trend_pd = type_trend.toPandas()
   region_pd = region_type_dist.toPandas()
   processing_pd = avg_processing_time.toPandas()
   status_pd = status_analysis.toPandas()
   # Growth rate per business type: percentage change between the first and last year on record.
   growth_rate = trend_pd.groupby('business_type').apply(lambda x: ((x['yearly_count'].iloc[-1] - x['yearly_count'].iloc[0]) / x['yearly_count'].iloc[0] * 100) if len(x) > 1 and x['yearly_count'].iloc[0] > 0 else 0).reset_index()
   growth_rate.columns = ['business_type', 'growth_rate']
   # Simple three-tier risk score keyed off average registered capital.
   risk_score = business_stats_pd.copy()
   risk_score['risk_score'] = np.where(risk_score['avg_capital'] < 100000, 3, np.where(risk_score['avg_capital'] < 500000, 2, 1))
   # Herfindahl-style concentration: sum of squared district counts over the squared total.
   concentration_index = region_pd.groupby('business_type')['count'].apply(lambda x: (x**2).sum() / (x.sum()**2)).reset_index()
   concentration_index.columns = ['business_type', 'concentration_index']
   result_data = {
       'business_stats': business_stats_pd.to_dict('records'),
       'trend_data': trend_pd.to_dict('records'),
       'region_distribution': region_pd.to_dict('records'),
       'processing_time': processing_pd.to_dict('records'),
       'status_analysis': status_pd.to_dict('records'),
       'growth_rates': growth_rate.to_dict('records'),
       'risk_scores': risk_score.to_dict('records'),
       'concentration_index': concentration_index.to_dict('records')
   }
   return JsonResponse(result_data)
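The concentration index computed above is a Herfindahl-style measure: the sum of squared district counts divided by the squared total, which is the same as the sum of squared district shares. A standalone sketch of that calculation, using hypothetical counts (the business types and numbers here are made up for illustration):

```python
import pandas as pd

# Hypothetical district counts per business type, mirroring the region_pd frame above.
region_pd = pd.DataFrame({
    "business_type": ["restaurant", "restaurant", "cafe", "cafe"],
    "district": ["A", "B", "A", "B"],
    "count": [30, 10, 20, 20],
})

# Sum of squared shares per type: 1.0 means fully concentrated in one district,
# 1/n means an even spread over n districts.
concentration = (
    region_pd.groupby("business_type")["count"]
    .apply(lambda x: (x ** 2).sum() / (x.sum() ** 2))
    .reset_index(name="concentration_index")
)
print(concentration)
# restaurant: (30/40)^2 + (10/40)^2 = 0.625; cafe: 0.5^2 + 0.5^2 = 0.5
```

Note that `(x**2).sum() / (x.sum()**2)` is used instead of the builtin `sum()`, since the latter is shadowed by the `pyspark.sql.functions` import at module level.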

def enterprise_portrait_analysis(request):
   enterprise_id = request.GET.get('enterprise_id')
   # Load the license table from MySQL over JDBC.
   license_df = spark.read.format("jdbc").option("url", "jdbc:mysql://localhost:3306/restaurant_db").option("dbtable", "restaurant_license").option("user", "root").option("password", "password").load()
   target_enterprise = license_df.filter(col("enterprise_id") == enterprise_id)
   if target_enterprise.count() == 0:
       return JsonResponse({'error': 'Enterprise not found'}, status=404)
   enterprise_info = target_enterprise.select("enterprise_name", "business_type", "registered_capital", "employee_count", "district", "issue_date", "license_status").first()
   same_type_enterprises = license_df.filter(col("business_type") == enterprise_info.business_type)
   # Industry medians via Spark's approximate percentile.
   capital_percentile = same_type_enterprises.selectExpr("percentile_approx(registered_capital, 0.5) as median").collect()[0].median
   employee_percentile = same_type_enterprises.selectExpr("percentile_approx(employee_count, 0.5) as median").collect()[0].median
   same_district_count = license_df.filter(col("district") == enterprise_info.district).count()
   total_count = license_df.count()
   district_concentration = same_district_count / total_count * 100
   historical_licenses = license_df.filter(col("enterprise_id") == enterprise_id).orderBy("issue_date")
   license_history = historical_licenses.select("license_id", "issue_date", "expiry_date", "license_status", "business_scope").toPandas()
   # Years in operation, measured from the earliest license issue date.
   operation_years = (pd.Timestamp.now() - pd.to_datetime(license_history['issue_date'].min())).days / 365
   # Compliance scoring: start at 100, deduct 10 points per expired license and 20 per revoked one.
   compliance_score = 100
   expired_count = len(license_history[license_history['license_status'] == 'expired'])
   revoked_count = len(license_history[license_history['license_status'] == 'revoked'])
   compliance_score -= (expired_count * 10 + revoked_count * 20)
   # Clamp manually: the builtin max() is shadowed by the pyspark.sql.functions import.
   compliance_score = compliance_score if compliance_score > 0 else 0
   similar_enterprises = same_type_enterprises.filter((col("registered_capital").between(enterprise_info.registered_capital * 0.8, enterprise_info.registered_capital * 1.2)) & (col("employee_count").between(enterprise_info.employee_count * 0.8, enterprise_info.employee_count * 1.2)) & (col("district") == enterprise_info.district)).select("enterprise_name", "registered_capital", "employee_count", "license_status").limit(10).toPandas()
   # Flag simple risk factors relative to the industry medians and compliance score.
   risk_factors = []
   if enterprise_info.registered_capital < capital_percentile:
       risk_factors.append("Registered capital below the industry median")
   if enterprise_info.employee_count < employee_percentile:
       risk_factors.append("Employee count below the industry median")
   if compliance_score < 80:
       risk_factors.append("Low compliance score")
   # Expansion tendency (more than one license) and activity within the past year.
   business_expansion = len(license_history) > 1
   recent_activity = (pd.Timestamp.now() - pd.to_datetime(license_history['issue_date'].max())).days < 365
   portrait_result = {
       'basic_info': {
           'enterprise_name': enterprise_info.enterprise_name,
           'business_type': enterprise_info.business_type,
           'registered_capital': enterprise_info.registered_capital,
           'employee_count': enterprise_info.employee_count,
           'district': enterprise_info.district,
           'operation_years': round(operation_years, 1)
       },
       'industry_comparison': {
           'capital_vs_median': 'higher' if enterprise_info.registered_capital > capital_percentile else 'lower',
           'employee_vs_median': 'higher' if enterprise_info.employee_count > employee_percentile else 'lower',
           'district_concentration': round(district_concentration, 2)
       },
       'license_history': license_history.to_dict('records'),
       'compliance_score': compliance_score,
       'risk_factors': risk_factors,
       'similar_enterprises': similar_enterprises.to_dict('records'),
       'business_characteristics': {
           'expansion_tendency': business_expansion,
           'recent_activity': recent_activity,
           'license_count': len(license_history)
       }
   }
   return JsonResponse(portrait_result)
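The compliance scoring in `enterprise_portrait_analysis` is a simple deduction scheme: start at 100, subtract 10 points per expired license and 20 per revoked one, and floor the result at 0. Factored out as a tiny standalone helper (the function name is mine, not part of the system):

```python
def compliance_score(expired_count: int, revoked_count: int) -> int:
    """Deduct 10 points per expired license and 20 per revoked one, floored at 0."""
    score = 100 - (expired_count * 10 + revoked_count * 20)
    # Clamp without calling max(): in the module above, the builtin max is
    # shadowed by the pyspark.sql.functions import of the same name.
    return score if score > 0 else 0
```

For example, three expired and two revoked licenses score 100 - (30 + 40) = 30, and heavy violation histories bottom out at 0 rather than going negative.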

def spatial_geographic_analysis(request):
   license_df = spark.read.format("jdbc").option("url", "jdbc:mysql://localhost:3306/restaurant_db").option("dbtable", "restaurant_license").option("user", "root").option("password", "password").load()
   # Per-district totals: license count, mean capital, headcount, and active licenses.
   district_stats = license_df.groupBy("district").agg(count("*").alias("total_count"), avg("registered_capital").alias("avg_capital"), sum("employee_count").alias("total_employees"), count(when(col("license_status") == "active", 1)).alias("active_count")).orderBy(col("total_count").desc())
   # District x business-type count matrix.
   business_district_matrix = license_df.groupBy("district", "business_type").agg(count("*").alias("count")).orderBy("district", "business_type")
   # Licenses per thousand employees as a rough density score.
   density_analysis = district_stats.withColumn("density_score", col("total_count") / col("total_employees") * 1000).select("district", "density_score", "total_count")
   # Keep only rows with usable coordinates for the map layers.
   coordinate_data = license_df.select("enterprise_id", "latitude", "longitude", "district", "business_type", "registered_capital").filter((col("latitude").isNotNull()) & (col("longitude").isNotNull()))
   # Geographic center of each district's enterprises.
   hotspot_analysis = coordinate_data.groupBy("district").agg(count("*").alias("enterprise_count"), avg("latitude").alias("center_lat"), avg("longitude").alias("center_lng"))
   # Yearly issuance trend per district.
   growth_trend = license_df.withColumn("issue_year", year(col("issue_date"))).groupBy("district", "issue_year").agg(count("*").alias("yearly_count")).orderBy("district", "issue_year")
   district_pd = district_stats.toPandas()
   matrix_pd = business_district_matrix.toPandas()
   density_pd = density_analysis.toPandas()
   coordinate_pd = coordinate_data.toPandas()
   hotspot_pd = hotspot_analysis.toPandas()
   trend_pd = growth_trend.toPandas()
   # Share of licenses still active, and capital per employee, for each district.
   district_pd['activity_rate'] = district_pd['active_count'] / district_pd['total_count'] * 100
   district_pd['capital_intensity'] = district_pd['avg_capital'] / district_pd['total_employees']
   # Composite competitiveness index: percentile ranks of count, capital and activity, weighted 40/30/30.
   competitiveness_score = district_pd.copy()
   competitiveness_score['competition_index'] = (competitiveness_score['total_count'].rank(pct=True) * 0.4 + competitiveness_score['avg_capital'].rank(pct=True) * 0.3 + competitiveness_score['activity_rate'].rank(pct=True) * 0.3) * 100
   # Pivot to a district x business-type count matrix.
   pivot_matrix = matrix_pd.pivot(index='district', columns='business_type', values='count').fillna(0)
   # Gini-Simpson diversity: 1 minus the sum of squared type shares per district.
   # Use the pandas .sum() method, since the builtin sum() is shadowed by the pyspark import.
   diversity_index = pivot_matrix.apply(lambda x: 1 - ((x / x.sum()) ** 2).sum() if x.sum() > 0 else 0, axis=1).reset_index()
   diversity_index.columns = ['district', 'diversity_index']
   # Bucket coordinates into a 10x10 grid and count enterprises per cell.
   clustering_data = coordinate_pd.groupby(['district', pd.cut(coordinate_pd['latitude'], bins=10), pd.cut(coordinate_pd['longitude'], bins=10)]).size().reset_index(name='cluster_count')
   max_cluster_size = clustering_data.groupby('district')['cluster_count'].max().reset_index()
   max_cluster_size.columns = ['district', 'max_cluster_size']
   spatial_result = {
       'district_overview': district_pd.to_dict('records'),
       'business_matrix': matrix_pd.to_dict('records'),
       'density_analysis': density_pd.to_dict('records'),
       'coordinate_data': coordinate_pd.to_dict('records'),
       'hotspot_centers': hotspot_pd.to_dict('records'),
       'growth_trends': trend_pd.to_dict('records'),
       'competitiveness_ranking': competitiveness_score[['district', 'competition_index']].to_dict('records'),
       'diversity_index': diversity_index.to_dict('records'),
       'clustering_analysis': max_cluster_size.to_dict('records')
   }
   return JsonResponse(spatial_result)
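The clustering step in `spatial_geographic_analysis` buckets coordinates into a latitude/longitude grid with `pd.cut` and counts enterprises per cell. A minimal sketch of the same idea with made-up coordinates and a coarse 2x2 grid instead of the 10x10 one used above:

```python
import pandas as pd

# Hypothetical coordinates: two enterprises in the southwest, three in the northeast.
pts = pd.DataFrame({
    "latitude":  [31.10, 31.12, 31.30, 31.31, 31.32],
    "longitude": [121.40, 121.41, 121.60, 121.61, 121.62],
})

# Cut each axis into 2 equal-width bins and count points per grid cell.
grid = (
    pts.groupby(
        [pd.cut(pts["latitude"], bins=2), pd.cut(pts["longitude"], bins=2)],
        observed=True,  # keep only non-empty cells
    )
    .size()
    .reset_index(name="cluster_count")
)
max_cluster_size = grid["cluster_count"].max()
print(max_cluster_size)  # the northeast cell holds 3 enterprises
```

The largest cell count is a crude hotspot indicator; finer `bins` values trade spatial resolution against sparser cells.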

Catering Service License Data Visualization and Analysis System Documentation

Documentation.png
