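💖💖 Author: 计算机毕业设计杰瑞 💙💙 About me: I spent many years teaching computer science training courses and still enjoy teaching. I work mainly in Java, WeChat Mini Programs, Python, Golang, and Android, and my projects cover big data, deep learning, websites, mini programs, Android apps, and algorithms. I also take on custom project development, code walkthroughs, thesis-defense coaching, and documentation writing, and I know some techniques for lowering similarity scores in plagiarism checks. I like sharing solutions to problems I run into during development and talking about technology, so feel free to ask me any technical or code questions! 💛💛 A note to readers: thank you for your follows and support! 💜💜 Website projects, Android/Mini Program projects, big data projects, deep learning projects, recommended graduation project topics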
Introduction to the Big Data Analysis and Visualization System Built with Vue.js
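The big data analysis and visualization system built with Vue.js is an integrated data processing platform for the tourism industry. It uses Hadoop for distributed storage and Spark as the compute engine, and pairs them with the Vue.js front-end framework, the ElementUI component library, and the ECharts visualization library to form a complete solution for managing scenic spot information and analyzing the related data. The back end runs on Spring Boot, with Spring MVC handling requests and responses, MyBatis handling persistence, and MySQL as the underlying relational database. The core modules cover scenic spot management, user behavior analysis, order data processing, rating statistics, and regional distribution visualization. Complex queries run through Spark SQL, Pandas and NumPy handle data preprocessing and statistical calculations, and ECharts chart components present the results visually. The system is intended to give tourism operators data to support business decisions, and it also gives computer science students a comprehensive practice project that combines big data processing, web development, and data visualization.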
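To make the last step of that pipeline concrete, here is a minimal sketch of how a list of result rows could be turned into an ECharts bar-chart option before it is sent to the Vue.js front end. The helper name `to_bar_option` and the sample rows are hypothetical and the values are made up; only the field names `region_name` and `total_revenue` come from the regional analysis function in the code showcase, and the `xAxis`, `yAxis`, and `series` keys follow the ECharts option format. The project itself may assemble its chart options differently, for example on the front end.

```python
import json


def to_bar_option(rows, name_key, value_key, title):
    """Build an ECharts bar-chart option dict from a list of row dicts.

    Hypothetical helper for illustration; it is not part of the original project.
    """
    return {
        "title": {"text": title},
        "tooltip": {"trigger": "axis"},
        # Category axis: one label per row.
        "xAxis": {"type": "category", "data": [r[name_key] for r in rows]},
        "yAxis": {"type": "value"},
        # One bar series holding the numeric value of each row.
        "series": [
            {"type": "bar", "name": value_key, "data": [r[value_key] for r in rows]}
        ],
    }


# Illustrative rows (made-up values) shaped like the regional analysis output.
sample_rows = [
    {"region_name": "Region A", "total_revenue": 1200.0},
    {"region_name": "Region B", "total_revenue": 800.0},
]

option = to_bar_option(sample_rows, "region_name", "total_revenue", "Revenue by region")
# ensure_ascii=False keeps Chinese region names readable in the JSON payload.
payload = json.dumps(option, ensure_ascii=False)
```

On the front end, a payload like this could be parsed and passed to the chart's `setOption` call.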
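Demo Video of the Big Data Analysis and Visualization System Built with Vue.js
Demo Screenshots of the Big Data Analysis and Visualization System Built with Vue.js
Code Showcase for the Big Data Analysis and Visualization System Built with Vue.js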
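from pyspark.sql import SparkSession, Window
# The star import shadows Python built-ins such as sum, max, and min with Spark's column functions.
from pyspark.sql.functions import *
from pyspark.sql.types import *
import pandas as pd
import numpy as np
from datetime import datetime, timedelta
import json
# Adaptive execution lets Spark coalesce small shuffle partitions at runtime.
# The spark.sql() queries below assume their tables are already registered in the catalog.
spark = (
    SparkSession.builder.appName("TourismDataAnalysis")
    .config("spark.sql.adaptive.enabled", "true")
    .config("spark.sql.adaptive.coalescePartitions.enabled", "true")
    .getOrCreate()
)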
def analyze_scenic_spot_popularity():
    """Rank the top 50 scenic spots by a blend of order volume, visitor count, and rating."""
    scenic_df = spark.sql("SELECT spot_id, spot_name, region_id, category_id, visit_count, rating_avg FROM scenic_spots")
    # '已完成' is the stored status value meaning "completed".
    order_df = spark.sql("SELECT spot_id, order_date, user_id, order_amount FROM order_info WHERE order_status = '已完成'")
    rating_df = spark.sql("SELECT spot_id, rating_score, rating_date, user_id FROM rating_info")
    # Left join so that spots without any completed order still appear (with zero counts).
    spot_order_df = scenic_df.join(order_df, "spot_id", "left")
    spot_with_stats = spot_order_df.groupBy("spot_id", "spot_name", "region_id", "category_id").agg(
        count("order_date").alias("total_orders"),
        sum("order_amount").alias("total_revenue"),
        countDistinct("user_id").alias("unique_visitors")
    )
    # Activity over the last 30 days.
    recent_orders = order_df.filter(col("order_date") >= date_sub(current_date(), 30))
    monthly_stats = recent_orders.groupBy("spot_id").agg(
        count("order_date").alias("monthly_orders"),
        avg("order_amount").alias("avg_order_amount")
    )
    # Spots with recent orders weight them at 0.4; spots without fall back to lifetime totals only.
    popularity_score = spot_with_stats.join(monthly_stats, "spot_id", "left").withColumn(
        "popularity_index",
        when(col("monthly_orders").isNull(), col("total_orders") * 0.3 + col("unique_visitors") * 0.7)
        .otherwise(col("monthly_orders") * 0.4 + col("total_orders") * 0.2 + col("unique_visitors") * 0.4)
    )
    # Blend popularity (weight 0.6) with the average rating (weight 0.4); unrated spots default to 3.0.
    final_ranking = popularity_score.join(
        rating_df.groupBy("spot_id").agg(avg("rating_score").alias("current_rating")), "spot_id", "left"
    ).withColumn(
        "final_score",
        col("popularity_index") * 0.6 + coalesce(col("current_rating"), lit(3.0)) * 0.4
    ).orderBy(col("final_score").desc())
    result_pandas = final_ranking.limit(50).toPandas()
    result_json = []
    # toPandas() yields a 0-based index in sorted order, so index + 1 is the rank.
    for index, row in result_pandas.iterrows():
        spot_data = {
            'spot_id': int(row['spot_id']),
            'spot_name': row['spot_name'],
            'region_id': int(row['region_id']),
            'category_id': int(row['category_id']),
            'total_orders': int(row['total_orders']) if pd.notna(row['total_orders']) else 0,
            'total_revenue': float(row['total_revenue']) if pd.notna(row['total_revenue']) else 0.0,
            'unique_visitors': int(row['unique_visitors']) if pd.notna(row['unique_visitors']) else 0,
            'monthly_orders': int(row['monthly_orders']) if pd.notna(row['monthly_orders']) else 0,
            'avg_order_amount': float(row['avg_order_amount']) if pd.notna(row['avg_order_amount']) else 0.0,
            'current_rating': float(row['current_rating']) if pd.notna(row['current_rating']) else 3.0,
            'popularity_index': float(row['popularity_index']) if pd.notna(row['popularity_index']) else 0.0,
            'final_score': float(row['final_score']) if pd.notna(row['final_score']) else 0.0,
            'rank': index + 1
        }
        result_json.append(spot_data)
    return result_json
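The scoring logic in `analyze_scenic_spot_popularity` is easier to check by hand when written as plain Python. The sketch below restates the same weights for a single spot; the function names and input numbers are illustrative assumptions, not part of the original code.

```python
def popularity_index(total_orders, unique_visitors, monthly_orders=None):
    """Mirror the popularity_index column for one spot.

    monthly_orders is None when the spot had no orders in the last 30 days.
    """
    if monthly_orders is None:
        # No recent activity: rely on lifetime totals only.
        return total_orders * 0.3 + unique_visitors * 0.7
    return monthly_orders * 0.4 + total_orders * 0.2 + unique_visitors * 0.4


def final_score(pop_index, current_rating=None, default_rating=3.0):
    """Mirror the final_score column; unrated spots use the 3.0 default."""
    rating = default_rating if current_rating is None else current_rating
    return pop_index * 0.6 + rating * 0.4


# Made-up inputs: 100 lifetime orders, 80 distinct visitors, 20 orders this month.
busy_spot = popularity_index(100, 80, monthly_orders=20)
quiet_spot = popularity_index(100, 80)  # same totals, no recent orders
rated = final_score(busy_spot, current_rating=4.5)
unrated = final_score(busy_spot)
```

One thing this makes visible: the popularity index grows with raw counts while the rating term is a bounded average, so for busy spots the rating contributes little to the final score. Normalizing both terms to a common scale would be one way to balance them, if that matters for the ranking.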
def generate_regional_distribution_analysis():
    """Aggregate orders, visitors, and revenue per region, with visitor origin and monthly peaks."""
    region_df = spark.sql("SELECT region_id, region_name, province, city FROM region_info")
    scenic_df = spark.sql("SELECT spot_id, spot_name, region_id, category_id FROM scenic_spots")
    # '已完成' means "completed" and '已确认' means "confirmed".
    order_df = spark.sql("SELECT spot_id, order_date, user_id, order_amount FROM order_info WHERE order_status IN ('已完成', '已确认')")
    user_df = spark.sql("SELECT user_id, user_province, user_city, registration_date FROM user_info")
    # Inner joins drop regions without spots, spots without orders, and orders without a known user.
    regional_scenic = region_df.join(scenic_df, "region_id", "inner")
    regional_orders = regional_scenic.join(order_df, "spot_id", "inner")
    regional_users = regional_orders.join(user_df, "user_id", "inner")
    regional_stats = regional_users.groupBy("region_id", "region_name", "province", "city").agg(
        countDistinct("spot_id").alias("spot_count"),
        countDistinct("user_id").alias("visitor_count"),
        count("order_date").alias("total_orders"),
        sum("order_amount").alias("total_revenue"),
        avg("order_amount").alias("avg_order_value")
    )
    # Visitors and orders per (region, visitor home province).
    visitor_source_analysis = regional_users.groupBy("region_id", "region_name", "user_province").agg(
        countDistinct("user_id").alias("province_visitors"),
        count("order_date").alias("province_orders")
    )
    # Number of distinct home provinces that send visitors to each region.
    visitor_diversity = visitor_source_analysis.groupBy("region_id", "region_name").agg(
        countDistinct("user_province").alias("source_provinces"),
        sum("province_visitors").alias("total_unique_visitors")
    )
    # Bucket orders by calendar month (yyyy-MM) to find each region's busiest month.
    seasonal_analysis = regional_orders.withColumn("order_month", date_format(col("order_date"), "yyyy-MM")).groupBy("region_id", "region_name", "order_month").agg(
        count("order_date").alias("monthly_orders"),
        sum("order_amount").alias("monthly_revenue")
    )
    peak_months = seasonal_analysis.groupBy("region_id", "region_name").agg(
        max("monthly_orders").alias("peak_orders"),
        avg("monthly_orders").alias("avg_monthly_orders")
    )
    # Per-spot ratios; spot_count is at least 1 here because of the inner joins above.
    comprehensive_regional = regional_stats.join(visitor_diversity, ["region_id", "region_name"], "left").join(peak_months, ["region_id", "region_name"], "left").withColumn(
        "tourism_intensity",
        col("total_orders") / col("spot_count")
    ).withColumn(
        "visitor_attraction_rate",
        col("visitor_count") / col("spot_count")
    ).withColumn(
        "revenue_per_spot",
        col("total_revenue") / col("spot_count")
    ).orderBy(col("total_revenue").desc())
    result_pandas = comprehensive_regional.toPandas()
    distribution_data = []
    for index, row in result_pandas.iterrows():
        region_analysis = {
            'region_id': int(row['region_id']),
            'region_name': row['region_name'],
            'province': row['province'],
            'city': row['city'],
            'spot_count': int(row['spot_count']),
            'visitor_count': int(row['visitor_count']),
            'total_orders': int(row['total_orders']),
            # Guard against a null sum when every order_amount in the region is null.
            'total_revenue': float(row['total_revenue']) if pd.notna(row['total_revenue']) else 0.0,
            'avg_order_value': float(row['avg_order_value']) if pd.notna(row['avg_order_value']) else 0.0,
            'source_provinces': int(row['source_provinces']) if pd.notna(row['source_provinces']) else 0,
            'peak_orders': int(row['peak_orders']) if pd.notna(row['peak_orders']) else 0,
            'avg_monthly_orders': float(row['avg_monthly_orders']) if pd.notna(row['avg_monthly_orders']) else 0.0,
            'tourism_intensity': float(row['tourism_intensity']) if pd.notna(row['tourism_intensity']) else 0.0,
            'visitor_attraction_rate': float(row['visitor_attraction_rate']) if pd.notna(row['visitor_attraction_rate']) else 0.0,
            'revenue_per_spot': float(row['revenue_per_spot']) if pd.notna(row['revenue_per_spot']) else 0.0
        }
        distribution_data.append(region_analysis)
    return distribution_data
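The monthly bucketing and per-spot ratios in `generate_regional_distribution_analysis` can be illustrated for a single region without Spark. This is a small sketch under assumptions: the helper names `monthly_peak` and `per_spot` and the sample dates are made up, and the zero guard in `per_spot` is an addition that the Spark version does not need because its inner joins keep `spot_count` at 1 or more.

```python
from collections import Counter
from datetime import date


def monthly_peak(order_dates):
    """Return (peak_orders, avg_monthly_orders) for one region's order dates.

    Like the Spark aggregation, the average only covers months that have
    at least one order.
    """
    per_month = Counter(d.strftime("%Y-%m") for d in order_dates)
    if not per_month:
        return 0, 0.0
    counts = list(per_month.values())
    return max(counts), sum(counts) / len(counts)


def per_spot(total, spot_count):
    """Divide a regional total by the number of spots, guarding against zero."""
    return total / spot_count if spot_count else 0.0


# Made-up order dates: three orders in January 2024, one in February 2024.
sample_dates = [
    date(2024, 1, 3),
    date(2024, 1, 15),
    date(2024, 1, 28),
    date(2024, 2, 9),
]
peak, average = monthly_peak(sample_dates)
tourism_intensity = per_spot(len(sample_dates), 2)  # 4 orders across 2 spots
```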
def process_user_behavior_patterns():
    """Segment users by age, gender, activity, and spending, then summarize each segment."""
    user_df = spark.sql("SELECT user_id, registration_date, user_age, user_gender, user_province FROM user_info")
    order_df = spark.sql("SELECT user_id, spot_id, order_date, order_amount, order_status FROM order_info")
    rating_df = spark.sql("SELECT user_id, spot_id, rating_score, rating_date, rating_comment FROM rating_info")
    scenic_df = spark.sql("SELECT spot_id, spot_name, category_id, region_id FROM scenic_spots")
    category_df = spark.sql("SELECT category_id, category_name FROM scenic_category")
    # Keep only completed ('已完成') and confirmed ('已确认') orders.
    user_orders = user_df.join(order_df, "user_id", "inner").filter(col("order_status").isin(["已完成", "已确认"]))
    user_with_scenic = user_orders.join(scenic_df, "spot_id", "inner").join(category_df, "category_id", "inner")
    user_ratings = user_df.join(rating_df, "user_id", "inner")
    user_behavior_stats = user_with_scenic.groupBy("user_id", "user_age", "user_gender", "user_province").agg(
        count("order_date").alias("total_orders"),
        sum("order_amount").alias("total_spending"),
        avg("order_amount").alias("avg_order_amount"),
        countDistinct("spot_id").alias("unique_spots_visited"),
        countDistinct("category_id").alias("category_diversity"),
        countDistinct("region_id").alias("region_diversity"),
        min("order_date").alias("first_order_date"),
        max("order_date").alias("last_order_date")
    )
    user_rating_behavior = user_ratings.groupBy("user_id").agg(
        count("rating_score").alias("total_ratings"),
        avg("rating_score").alias("avg_rating_given"),
        # stddev() returns the standard deviation, so the alias says so (it was "rating_variance").
        stddev("rating_score").alias("rating_stddev")
    )
    user_category_preferences = user_with_scenic.groupBy("user_id", "category_name").agg(
        count("order_date").alias("category_orders"),
        sum("order_amount").alias("category_spending")
    )
    # Keep each user's three most-ordered categories; collect_list does not guarantee their order.
    user_top_categories = user_category_preferences.withColumn(
        "row_num", row_number().over(Window.partitionBy("user_id").orderBy(col("category_orders").desc()))
    ).filter(col("row_num") <= 3).groupBy("user_id").agg(
        collect_list("category_name").alias("preferred_categories")
    )
    # The label strings below are kept as emitted: 活跃/中等/轻度用户 = active/moderate/light user,
    # 高/中等/低消费 = high/medium/low spending.
    comprehensive_user_analysis = user_behavior_stats.join(user_rating_behavior, "user_id", "left").join(user_top_categories, "user_id", "left").withColumn(
        "days_active",
        datediff(col("last_order_date"), col("first_order_date")) + 1
    ).withColumn(
        "order_frequency",
        col("total_orders") / col("days_active")
    ).withColumn(
        "user_type",
        when(col("total_orders") >= 10, "活跃用户")
        .when(col("total_orders") >= 5, "中等用户")
        .otherwise("轻度用户")
    ).withColumn(
        "spending_level",
        when(col("total_spending") >= 5000, "高消费")
        .when(col("total_spending") >= 2000, "中等消费")
        .otherwise("低消费")
    )
    # Age bands: under 25 youth, 25-34 young adult, 35-49 middle-aged, 50 and over older.
    age_groups = comprehensive_user_analysis.withColumn(
        "age_group",
        when(col("user_age") < 25, "青年群体")
        .when(col("user_age") < 35, "中青年群体")
        .when(col("user_age") < 50, "中年群体")
        .otherwise("中老年群体")
    )
    pattern_analysis = age_groups.groupBy("age_group", "user_gender", "user_type", "spending_level").agg(
        count("user_id").alias("user_count"),
        avg("avg_order_amount").alias("group_avg_order"),
        avg("category_diversity").alias("group_category_diversity"),
        avg("avg_rating_given").alias("group_avg_rating")
    ).orderBy(col("user_count").desc())
    result_pandas = pattern_analysis.toPandas()
    behavior_patterns = []
    for index, row in result_pandas.iterrows():
        pattern_info = {
            'age_group': row['age_group'],
            'user_gender': row['user_gender'],
            'user_type': row['user_type'],
            'spending_level': row['spending_level'],
            'user_count': int(row['user_count']),
            'group_avg_order': float(row['group_avg_order']) if pd.notna(row['group_avg_order']) else 0.0,
            'group_category_diversity': float(row['group_category_diversity']) if pd.notna(row['group_category_diversity']) else 0.0,
            'group_avg_rating': float(row['group_avg_rating']) if pd.notna(row['group_avg_rating']) else 0.0
        }
        behavior_patterns.append(pattern_info)
    return behavior_patterns
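The segmentation thresholds in `process_user_behavior_patterns` can be restated for a single user in plain Python, which makes the boundary cases easy to test. In this sketch the function names and the English labels are stand-ins chosen for illustration (the Spark code emits Chinese label strings), and the sample inputs are made up.

```python
from datetime import date


def segment_user(total_orders, total_spending, age):
    """Apply the same thresholds as the user_type, spending_level, and age_group columns."""
    if total_orders >= 10:
        user_type = "active"
    elif total_orders >= 5:
        user_type = "moderate"
    else:
        user_type = "light"

    if total_spending >= 5000:
        spending_level = "high"
    elif total_spending >= 2000:
        spending_level = "medium"
    else:
        spending_level = "low"

    if age < 25:
        age_group = "youth"
    elif age < 35:
        age_group = "young_adult"
    elif age < 50:
        age_group = "middle_aged"
    else:
        age_group = "older"

    return {"user_type": user_type, "spending_level": spending_level, "age_group": age_group}


def order_frequency(total_orders, first_order, last_order):
    """Orders per active day; the +1 counts the first day, so a single-day user divides by 1."""
    days_active = (last_order - first_order).days + 1
    return total_orders / days_active


# Made-up user: 5 orders between 1 and 10 January 2024.
frequency = order_frequency(5, date(2024, 1, 1), date(2024, 1, 10))
boundary_user = segment_user(5, 2000, 35)
```

Because the comparisons are `>=` for orders and spending and `<` for age, a user with exactly 5 orders, 2000 in spending, and age 35 lands in the moderate, medium, and middle-aged segments.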
Documentation Showcase for the Big Data Analysis and Visualization System Built with Vue.js