【Big Data】Haidilao Store Data Visualization System | Computer Science Graduation Project | Hadoop + Spark Environment Setup | Data Science and Big Data Technology | Source Code, Documentation, and Walkthrough Included


Preface

💖💖Author: 计算机程序员小杨 💙💙About me: I work in the computer field and am proficient in Java, WeChat Mini Programs, Python, Golang, Android, and several other IT areas. I take on custom project development, code walkthroughs, thesis-defense coaching, and documentation writing, and I also know some techniques for lowering plagiarism-check scores. I love technology, enjoy digging into new tools and frameworks, and like solving real problems with code, so feel free to ask me anything about programming! 💛💛A quick note: thank you all for your interest and support! 💕💕To get the source code, contact 计算机程序员小杨 at the end of this post 💜💜 Website projects | Android/Mini Program projects | Big data projects | Deep learning projects | Graduation project topic ideas 💜💜

I. Development Stack Overview

Big data framework: Hadoop + Spark (Hive is not used in this build; customization is supported)
Development languages: Python and Java (both versions available)
Backend frameworks: Django and Spring Boot (Spring + SpringMVC + MyBatis) (both versions available)
Frontend: Vue + ElementUI + ECharts + HTML + CSS + JavaScript + jQuery
Key technologies: Hadoop, HDFS, Spark, Spark SQL, Pandas, NumPy
Database: MySQL

II. System Overview

The Haidilao store data visualization system is a decision-support platform for the restaurant business built on big data technology. It uses the Hadoop + Spark stack as its core processing engine, with Python as the primary development language, Django providing the backend services, and a Vue + ElementUI + ECharts frontend delivering the user interface and data visualization. Store operating data is stored at scale in HDFS, queried and analyzed efficiently with Spark SQL, and mined in depth with data-processing libraries such as Pandas and NumPy. The platform provides core modules for user and permission management, store statistics, competitive-landscape analysis, business-strategy evaluation, geographic distribution maps, and site-selection support. Through intuitive dashboards it lets managers monitor store operations in real time and gives the company a data-driven basis for expansion and operational optimization. By wrapping complex big data analysis in a clean, easy-to-use visual interface, the system lets managers without a technical background obtain business insights with ease.
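To make the trend analysis described above concrete, here is a minimal Pandas sketch of two calculations this kind of system relies on: a 7-day revenue moving average and a period-over-period growth rate. The data and the `daily_revenue` column name are illustrative only, not taken from the real store tables.

```python
import pandas as pd

# Synthetic daily revenue for one store over two weeks (values are made up).
daily = pd.DataFrame({
    "date": pd.date_range("2024-01-01", periods=14, freq="D"),
    "daily_revenue": [42000, 45000, 39000, 51000, 48000, 60000, 62000,
                      44000, 47000, 41000, 53000, 50000, 63000, 65000],
})

# A 7-day moving average smooths out day-of-week swings.
daily["revenue_ma7"] = daily["daily_revenue"].rolling(window=7).mean()

# Week-over-week growth: compare the totals of the two 7-day halves.
first_week = daily["daily_revenue"][:7].sum()
second_week = daily["daily_revenue"][7:].sum()
growth_rate = (second_week - first_week) / first_week * 100

print(round(daily["revenue_ma7"].iloc[6], 2))  # first complete 7-day window
print(round(growth_rate, 2))                   # growth in percent
```

The same rolling-mean idea appears later in the source listing, where the Spark result is converted with `toPandas()` before the window calculation.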

III. Feature Demonstration

Haidilao Store Data Visualization System

IV. System Interface Screenshots

(System interface screenshots)

V. Source Code Highlights


from pyspark.sql import SparkSession
from pyspark.sql.functions import col, sum as spark_sum, avg, count, desc
from django.http import JsonResponse
from django.views.decorators.http import require_http_methods
import json

# Shared SparkSession with adaptive query execution enabled.
spark = (SparkSession.builder
         .appName("HaiDiLaoDataAnalysis")
         .config("spark.sql.adaptive.enabled", "true")
         .config("spark.sql.adaptive.coalescePartitions.enabled", "true")
         .getOrCreate())

@require_http_methods(["GET"])
def get_store_performance_analysis(request):
    store_id = request.GET.get('store_id')
    start_date = request.GET.get('start_date')
    end_date = request.GET.get('end_date')
    df = (spark.read.format("jdbc")
          .option("url", "jdbc:mysql://localhost:3306/haidilao")
          .option("dbtable", "store_daily_data")
          .option("user", "root")
          .option("password", "password")
          .load())
    filtered_df = df.filter((col("store_id") == store_id) & (col("date") >= start_date) & (col("date") <= end_date))
    revenue_stats = filtered_df.agg(
        spark_sum("daily_revenue").alias("total_revenue"),
        avg("daily_revenue").alias("avg_revenue"),
        count("daily_revenue").alias("business_days")
    ).collect()[0]
    customer_stats = filtered_df.agg(
        spark_sum("customer_count").alias("total_customers"),
        avg("customer_count").alias("avg_customers"),
        avg("per_capita_consumption").alias("avg_per_capita")
    ).collect()[0]
    peak_analysis = filtered_df.select("date", "daily_revenue", "customer_count").orderBy(desc("daily_revenue")).limit(5)
    trend_data = filtered_df.select("date", "daily_revenue", "customer_count").orderBy("date").toPandas()
    # The first 6 days have no complete 7-day window; fill them with 0 so the
    # JSON response stays serializable (NaN is not valid JSON).
    trend_data['revenue_ma7'] = trend_data['daily_revenue'].rolling(window=7).mean().fillna(0)
    # Growth vs. the same store's revenue before start_date. Note: the prior
    # period must come from the unfiltered df; filtered_df already excludes
    # dates before start_date, so filtering it again would always be empty.
    prior_revenue = df.filter(
        (col("store_id") == store_id) & (col("date") < start_date)
    ).agg(spark_sum("daily_revenue")).collect()[0][0]
    if prior_revenue:
        growth_rate = (revenue_stats['total_revenue'] - prior_revenue) / prior_revenue * 100
    else:
        growth_rate = 0
    # Weighted score clamped to [0, 100]: revenue vs. a 50k baseline (40%),
    # guest traffic vs. 200 (30%), per-capita spend vs. 150 (30%).
    performance_score = min(100, max(0,
        (revenue_stats['avg_revenue'] / 50000 * 40)
        + (customer_stats['avg_customers'] / 200 * 30)
        + (customer_stats['avg_per_capita'] / 150 * 30)))
    result = {
        "revenue_analysis": {
            "total_revenue": float(revenue_stats['total_revenue']),
            "average_revenue": float(revenue_stats['avg_revenue']),
            "business_days": int(revenue_stats['business_days']),
            "growth_rate": round(growth_rate, 2)
        },
        "customer_analysis": {
            "total_customers": int(customer_stats['total_customers']),
            "average_customers": float(customer_stats['avg_customers']),
            "avg_per_capita": float(customer_stats['avg_per_capita'])
        },
        "peak_days": [
            {"date": str(row['date']), "revenue": float(row['daily_revenue']),
             "customers": int(row['customer_count'])}
            for row in peak_analysis.collect()
        ],
        "trend_data": trend_data.to_dict('records'),
        "performance_score": round(performance_score, 1)
    }
    return JsonResponse(result)

@require_http_methods(["GET"])
def get_market_competition_analysis(request):
    city = request.GET.get('city')
    district = request.GET.get('district')
    radius = float(request.GET.get('radius', 3.0))
    store_df = (spark.read.format("jdbc")
                .option("url", "jdbc:mysql://localhost:3306/haidilao")
                .option("dbtable", "store_info")
                .option("user", "root")
                .option("password", "password")
                .load())
    competitor_df = (spark.read.format("jdbc")
                     .option("url", "jdbc:mysql://localhost:3306/haidilao")
                     .option("dbtable", "competitor_info")
                     .option("user", "root")
                     .option("password", "password")
                     .load())
    target_stores = store_df.filter((col("city") == city) & (col("district") == district))
    nearby_competitors = competitor_df.filter((col("city") == city) & (col("district") == district))
    competitor_density = nearby_competitors.groupBy("brand_name").agg(count("*").alias("store_count"), avg("avg_price").alias("avg_brand_price")).orderBy(desc("store_count"))
    # Total competitor store count, guarded so an empty result cannot cause a
    # None value or a division by zero below.
    total_competitor_stores = competitor_density.agg(spark_sum("store_count")).collect()[0][0] or 1
    market_share_data = (target_stores.join(nearby_competitors, ["city", "district"], "left")
                         .groupBy("brand_name")
                         .agg(count("*").alias("count"))
                         .withColumn("market_share", col("count") / total_competitor_stores * 100))
    price_comparison = nearby_competitors.groupBy("brand_name").agg(avg("avg_price").alias("average_price"), avg("rating").alias("average_rating")).orderBy("average_price")
    daily_df = (spark.read.format("jdbc")
                .option("url", "jdbc:mysql://localhost:3306/haidilao")
                .option("dbtable", "store_daily_data")
                .option("user", "root")
                .option("password", "password")
                .load())
    haidilao_performance = (target_stores.join(daily_df, "store_id")
                            .groupBy("store_id")
                            .agg(avg("daily_revenue").alias("avg_revenue"),
                                 avg("customer_count").alias("avg_customers")))
    competitor_performance = nearby_competitors.filter(col("brand_name") != "海底捞").groupBy("brand_name").agg(avg("estimated_revenue").alias("avg_competitor_revenue"), count("*").alias("competitor_count"))
    # avg() over an empty DataFrame yields None, so guard both sides before dividing.
    haidilao_avg = haidilao_performance.agg(avg("avg_revenue")).collect()[0][0]
    competitor_avg = competitor_performance.agg(avg("avg_competitor_revenue")).collect()[0][0]
    competitive_advantage = haidilao_avg / competitor_avg if haidilao_avg and competitor_avg else 1.0
    threat_level = "高" if competitor_density.count() > 10 else "中" if competitor_density.count() > 5 else "低"
    total_stores = target_stores.count() + nearby_competitors.count()
    competitor_store_total = competitor_density.agg(spark_sum("store_count")).collect()[0][0] or 0
    market_saturation = min(100, competitor_store_total / total_stores * 100) if total_stores > 0 else 0
    result = {
        "competition_overview": {
            "total_competitors": int(nearby_competitors.count()),
            "threat_level": threat_level,
            "market_saturation": round(market_saturation, 1),
            "competitive_advantage_ratio": round(competitive_advantage, 2)
        },
        "competitor_analysis": [
            {"brand": row['brand_name'], "store_count": int(row['store_count']),
             "avg_price": float(row['avg_brand_price'])}
            for row in competitor_density.collect()
        ],
        "market_share": [
            {"brand": row['brand_name'], "share": round(float(row['market_share']), 2)}
            for row in market_share_data.collect()
        ],
        "price_positioning": [
            {"brand": row['brand_name'], "price": float(row['average_price']),
             "rating": float(row['average_rating'])}
            for row in price_comparison.collect()
        ]
    }
    return JsonResponse(result)

@require_http_methods(["POST"])
def analyze_store_location_potential(request):
    data = json.loads(request.body)
    latitude = float(data.get('latitude'))
    longitude = float(data.get('longitude'))
    city = data.get('city')
    district = data.get('district')
    location_df = (spark.read.format("jdbc")
                   .option("url", "jdbc:mysql://localhost:3306/haidilao")
                   .option("dbtable", "location_analysis")
                   .option("user", "root")
                   .option("password", "password")
                   .load())
    demographic_df = (spark.read.format("jdbc")
                      .option("url", "jdbc:mysql://localhost:3306/haidilao")
                      .option("dbtable", "demographic_data")
                      .option("user", "root")
                      .option("password", "password")
                      .load())
    poi_df = (spark.read.format("jdbc")
              .option("url", "jdbc:mysql://localhost:3306/haidilao")
              .option("dbtable", "poi_data")
              .option("user", "root")
              .option("password", "password")
              .load())
    # Euclidean distance in raw degrees; 0.01 degrees is roughly 1 km at
    # mid-latitudes, which is adequate for this coarse neighborhood filter.
    nearby_locations = (location_df
                        .filter((col("city") == city) & (col("district") == district))
                        .withColumn("distance",
                                    ((col("latitude") - latitude) ** 2
                                     + (col("longitude") - longitude) ** 2) ** 0.5)
                        .filter(col("distance") <= 0.01))
    demographic_score = demographic_df.filter((col("city") == city) & (col("district") == district)).select("population_density", "income_level", "age_structure").collect()
    if demographic_score:
        demo_data = demographic_score[0]
        population_score = min(100, demo_data['population_density'] / 10000 * 100)
        income_score = min(100, demo_data['income_level'] / 8000 * 100)
        age_score = 100 if 25 <= demo_data['age_structure'] <= 45 else 70
    else:
        population_score = income_score = age_score = 50
    nearby_pois = (poi_df
                   .filter((col("city") == city) & (col("district") == district))
                   .withColumn("poi_distance",
                               ((col("latitude") - latitude) ** 2
                                + (col("longitude") - longitude) ** 2) ** 0.5)
                   .filter(col("poi_distance") <= 0.005))
    shopping_malls = nearby_pois.filter(col("poi_type") == "购物中心").count()
    office_buildings = nearby_pois.filter(col("poi_type") == "写字楼").count()
    residential_areas = nearby_pois.filter(col("poi_type") == "住宅区").count()
    transportation_hubs = nearby_pois.filter(col("poi_type") == "交通枢纽").count()
    accessibility_score = min(100, (shopping_malls * 20 + office_buildings * 15 + residential_areas * 10 + transportation_hubs * 25))
    existing_stores = nearby_locations.filter(col("store_type") == "海底捞").count()
    competitors = nearby_locations.filter(col("store_type") != "海底捞").count()
    competition_score = max(0, 100 - (existing_stores * 30 + competitors * 15))
    traffic_data = nearby_locations.agg(avg("foot_traffic").alias("avg_traffic")).collect()[0]
    traffic_score = min(100, traffic_data['avg_traffic'] / 1000 * 100) if traffic_data['avg_traffic'] else 60
    overall_score = (population_score * 0.25 + income_score * 0.2 + age_score * 0.15 + accessibility_score * 0.2 + competition_score * 0.15 + traffic_score * 0.05)
    risk_factors = []
    if existing_stores > 2:
        risk_factors.append("附近海底捞门店密度过高")
    if competitors > 5:
        risk_factors.append("竞争对手较多")
    if population_score < 40:
        risk_factors.append("人口密度偏低")
    recommendation = "强烈推荐" if overall_score >= 80 else "推荐" if overall_score >= 60 else "谨慎考虑" if overall_score >= 40 else "不推荐"
    result = {
        "location_score": round(overall_score, 1),
        "score_breakdown": {
            "demographic": round(population_score, 1),
            "income": round(income_score, 1),
            "age_structure": round(age_score, 1),
            "accessibility": round(accessibility_score, 1),
            "competition": round(competition_score, 1),
            "traffic": round(traffic_score, 1)
        },
        "nearby_analysis": {
            "shopping_malls": int(shopping_malls),
            "office_buildings": int(office_buildings),
            "residential_areas": int(residential_areas),
            "transport_hubs": int(transportation_hubs),
            "existing_stores": int(existing_stores),
            "competitors": int(competitors)
        },
        "recommendation": recommendation,
        "risk_factors": risk_factors,
        "estimated_daily_customers": int(traffic_data['avg_traffic'] * 0.15) if traffic_data['avg_traffic'] else 0
    }
    return JsonResponse(result)
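Pulled out of the Spark and Django context, the weighting and thresholding used by `analyze_store_location_potential` can be stated as a small pure-Python sketch. The helper names `location_score` and `recommend` are ours, and the English labels stand in for the Chinese ones the view actually returns; the weights and cutoffs are taken from the listing above.

```python
def location_score(population, income, age_structure, accessibility,
                   competition, traffic):
    """Weighted site score on a 0-100 scale, mirroring the weights above."""
    return (population * 0.25 + income * 0.2 + age_structure * 0.15
            + accessibility * 0.2 + competition * 0.15 + traffic * 0.05)

def recommend(score):
    # Same thresholds as the view: 80 / 60 / 40.
    if score >= 80:
        return "strongly recommended"
    if score >= 60:
        return "recommended"
    if score >= 40:
        return "consider with caution"
    return "not recommended"

score = location_score(90, 80, 100, 75, 70, 60)
print(round(score, 1), recommend(score))
```

Keeping the scoring logic this small makes the weights easy to recalibrate against real opening data without touching the Spark pipeline.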

VI. Documentation

(Documentation screenshot)

Closing

💕💕To get the source code, contact 计算机程序员小杨