I. About the Author
💖💖Author: 计算机编程果茶熊 💙💙About me: I taught computer science in professional training programs for years and worked as a programming instructor; I enjoy teaching and am proficient in Java, WeChat Mini Programs, Python, Golang, Android, and several other IT areas. I take on custom project development, code walkthroughs, thesis-defense coaching, and documentation writing, and I also know some techniques for reducing thesis similarity scores. I like sharing solutions to problems I hit during development and talking shop, so feel free to ask me anything about code! 💛💛A word of thanks: thank you all for your attention and support! 💜💜 Website projects | Android/Mini Program projects | Big data projects | Graduation project topic selection 💕💕To get the source code, contact 计算机编程果茶熊 at the end of this post
II. System Introduction
Big data framework: Hadoop + Spark (Hive supported with custom modifications)
Development languages: Java + Python (both versions supported)
Database: MySQL
Backend frameworks: SpringBoot (Spring + SpringMVC + MyBatis) and Django (both versions supported)
Frontend: Vue + Echarts + HTML + CSS + JavaScript + jQuery
The Game Industry Sales Data Visualization and Analysis System is a game-market analysis platform built on big data technology. It processes large volumes of game sales data with the Hadoop + Spark distributed computing stack, provides data-processing services through a Python backend built on Django, and renders an interactive visualization interface with a Vue + ElementUI + Echarts frontend. The system's core functionality spans six modules: market overview analysis, platform strategy analysis, genre preference analysis, publisher strategy analysis, sales characteristics analysis, and a visualization dashboard, digging into the market patterns and commercial value hidden in game sales data. It runs large-scale queries with Spark SQL, performs data processing and statistics with Pandas and NumPy, and turns complex sales data into intuitive charts and reports, giving game-industry practitioners a scientific, data-driven basis for decisions: spotting market trends, refining product strategy, and improving sales performance.
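As a rough sketch of the data flow described above (not the project's actual routing: the module path and endpoint name here are assumed for illustration), a Django view can hand the dictionaries produced by the Spark layer straight to the Echarts frontend as JSON:

from django.http import JsonResponse
# Assumed import path; GameSalesAnalyzer is the class shown in the code section below
from analysis.spark_jobs import GameSalesAnalyzer

analyzer = GameSalesAnalyzer()

def market_overview(request):
    # The analysis methods return plain dicts/lists, which serialize directly
    # into the JSON consumed by the Echarts options on the Vue side
    return JsonResponse(analyzer.market_overview_analysis())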
III. Video Walkthrough
IV. Selected Features
V. Selected Code
from pyspark.sql import SparkSession

# Shared Spark session; adaptive query execution lets Spark tune shuffle partitions at runtime
spark = SparkSession.builder.appName("GameSalesAnalysis").config("spark.sql.adaptive.enabled", "true").getOrCreate()

# Class name assumed: the original excerpt shows only the methods, which take self
class GameSalesAnalyzer:
    def market_overview_analysis(self):
        # Load the sales fact table from MySQL over JDBC and expose it to Spark SQL
        sales_df = spark.read.format("jdbc").option("url", "jdbc:mysql://localhost:3306/gamedb").option("dbtable", "sales_data").option("user", "root").option("password", "password").load()
        sales_df.createOrReplaceTempView("sales")
        # Monthly revenue and title count, in chronological order
        monthly_sales = spark.sql("SELECT DATE_FORMAT(sale_date, 'yyyy-MM') as month, SUM(sales_amount) as total_sales, COUNT(*) as game_count FROM sales GROUP BY DATE_FORMAT(sale_date, 'yyyy-MM') ORDER BY month")
        monthly_pandas = monthly_sales.toPandas()
        # Month-over-month growth; the first month has no predecessor, so it is pinned to 0
        growth_rates = []
        for i in range(1, len(monthly_pandas)):
            prev_sales = monthly_pandas.iloc[i-1]['total_sales']
            curr_sales = monthly_pandas.iloc[i]['total_sales']
            growth_rate = ((curr_sales - prev_sales) / prev_sales) * 100 if prev_sales > 0 else 0
            growth_rates.append(growth_rate)
        monthly_pandas['growth_rate'] = [0] + growth_rates
        # Revenue and average rating per platform, plus each platform's market share
        platform_analysis = spark.sql("SELECT platform, SUM(sales_amount) as platform_sales, AVG(user_rating) as avg_rating FROM sales GROUP BY platform ORDER BY platform_sales DESC")
        platform_pandas = platform_analysis.toPandas()
        market_share = platform_pandas['platform_sales'] / platform_pandas['platform_sales'].sum() * 100
        platform_pandas['market_share'] = market_share
        # Top 10 genres by revenue, regional breakdown, and quarterly trend with active users
        top_genres = spark.sql("SELECT genre, SUM(sales_amount) as genre_sales, COUNT(DISTINCT game_id) as game_count FROM sales GROUP BY genre ORDER BY genre_sales DESC LIMIT 10")
        regional_analysis = spark.sql("SELECT region, SUM(sales_amount) as regional_sales, AVG(price) as avg_price FROM sales GROUP BY region ORDER BY regional_sales DESC")
        quarterly_trend = spark.sql("SELECT CONCAT(YEAR(sale_date), '-Q', QUARTER(sale_date)) as quarter, SUM(sales_amount) as quarterly_sales, COUNT(DISTINCT user_id) as active_users FROM sales GROUP BY YEAR(sale_date), QUARTER(sale_date) ORDER BY quarter")
        # Convert everything to plain dicts/lists so the web layer can serialize to JSON
        result_data = {'monthly_trend': monthly_pandas.to_dict('records'), 'platform_analysis': platform_pandas.to_dict('records'), 'top_genres': top_genres.toPandas().to_dict('records'), 'regional_analysis': regional_analysis.toPandas().to_dict('records'), 'quarterly_trend': quarterly_trend.toPandas().to_dict('records')}
        return result_data
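    # Aside: the explicit growth-rate loop above has a one-line vectorized equivalent
    # in pandas, assuming total_sales is never zero (pct_change leaves the first row
    # NaN, hence the fillna):
    #     monthly_pandas['growth_rate'] = (monthly_pandas['total_sales'].pct_change() * 100).fillna(0)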
    def platform_strategy_analysis(self):
        # Platform-level metrics live in a separate table, also read over JDBC
        platform_df = spark.read.format("jdbc").option("url", "jdbc:mysql://localhost:3306/gamedb").option("dbtable", "platform_data").option("user", "root").option("password", "password").load()
        platform_df.createOrReplaceTempView("platforms")
        # Headline performance per platform: revenue, catalogue size, commission, downloads
        platform_performance = spark.sql("SELECT platform_name, SUM(total_sales) as revenue, COUNT(game_id) as game_count, AVG(commission_rate) as avg_commission, SUM(download_count) as total_downloads FROM platforms GROUP BY platform_name ORDER BY revenue DESC")
        platform_pandas = platform_performance.toPandas()
        # Derived efficiency metrics computed in pandas
        platform_pandas['revenue_per_game'] = platform_pandas['revenue'] / platform_pandas['game_count']
        platform_pandas['download_to_sales_ratio'] = platform_pandas['total_downloads'] / platform_pandas['revenue']
        # Pricing spread per platform, excluding free titles (STDDEV, so the alias says stddev)
        pricing_strategy = spark.sql("SELECT platform_name, AVG(game_price) as avg_price, MIN(game_price) as min_price, MAX(game_price) as max_price, STDDEV(game_price) as price_stddev FROM platforms WHERE game_price > 0 GROUP BY platform_name")
        # Engagement, competitive landscape, regional penetration, growth, and feature mix
        user_behavior = spark.sql("SELECT platform_name, AVG(session_duration) as avg_session, AVG(retention_rate) as avg_retention, SUM(in_app_purchases) as total_iap FROM platforms GROUP BY platform_name")
        competitive_analysis = spark.sql("SELECT platform_name, COUNT(DISTINCT publisher_id) as publisher_count, COUNT(DISTINCT genre) as genre_diversity, AVG(metacritic_score) as avg_quality FROM platforms GROUP BY platform_name")
        market_penetration = spark.sql("SELECT platform_name, region, SUM(sales_amount) as regional_sales, COUNT(DISTINCT user_id) as unique_users FROM platforms GROUP BY platform_name, region")
        platform_growth = spark.sql("SELECT platform_name, DATE_FORMAT(launch_date, 'yyyy-MM') as month, SUM(new_user_count) as new_users, SUM(monthly_revenue) as monthly_rev FROM platforms GROUP BY platform_name, DATE_FORMAT(launch_date, 'yyyy-MM') ORDER BY platform_name, month")
        feature_analysis = spark.sql("SELECT platform_name, social_features, cloud_save, cross_platform, COUNT(*) as feature_usage FROM platforms GROUP BY platform_name, social_features, cloud_save, cross_platform")
        result_data = {'platform_performance': platform_pandas.to_dict('records'), 'pricing_strategy': pricing_strategy.toPandas().to_dict('records'), 'user_behavior': user_behavior.toPandas().to_dict('records'), 'competitive_analysis': competitive_analysis.toPandas().to_dict('records'), 'market_penetration': market_penetration.toPandas().to_dict('records'), 'platform_growth': platform_growth.toPandas().to_dict('records'), 'feature_analysis': feature_analysis.toPandas().to_dict('records')}
        return result_data
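    # Aside (hypothetical helper, not part of the original excerpt): each method repeats
    # the same JDBC boilerplate with a different table name, so a single reader keeps the
    # connection details in one place. The root/password credentials mirror the sample
    # values above; in practice they belong in a config file or environment variables.
    def _read_table(self, table):
        return (spark.read.format("jdbc")
                .option("url", "jdbc:mysql://localhost:3306/gamedb")
                .option("dbtable", table)
                .option("user", "root")
                .option("password", "password")
                .load())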
    def sales_characteristics_analysis(self):
        sales_df = spark.read.format("jdbc").option("url", "jdbc:mysql://localhost:3306/gamedb").option("dbtable", "detailed_sales").option("user", "root").option("password", "password").load()
        sales_df.createOrReplaceTempView("detailed_sales")
        # Sales by month, weekday, and hour, to expose seasonal and diurnal patterns
        seasonal_patterns = spark.sql("SELECT MONTH(sale_date) as month, DAYOFWEEK(sale_date) as day_of_week, HOUR(sale_timestamp) as hour, SUM(sales_amount) as hourly_sales, COUNT(*) as transaction_count FROM detailed_sales GROUP BY MONTH(sale_date), DAYOFWEEK(sale_date), HOUR(sale_timestamp) ORDER BY month, day_of_week, hour")
        seasonal_pandas = seasonal_patterns.toPandas()
        # Average transaction value per time slot
        seasonal_pandas['sales_intensity'] = seasonal_pandas['hourly_sales'] / seasonal_pandas['transaction_count']
        # Bucket titles into price tiers; the CASE expression must appear in both SELECT and GROUP BY
        price_elasticity = spark.sql("SELECT CASE WHEN price <= 10 THEN 'Low' WHEN price <= 30 THEN 'Medium' WHEN price <= 60 THEN 'High' ELSE 'Premium' END as price_range, COUNT(*) as sales_count, SUM(sales_amount) as total_revenue, AVG(user_rating) as avg_rating FROM detailed_sales GROUP BY CASE WHEN price <= 10 THEN 'Low' WHEN price <= 30 THEN 'Medium' WHEN price <= 60 THEN 'High' ELSE 'Premium' END ORDER BY total_revenue DESC")
        # Spending by customer segment, plus discount, payment, and refund behaviour
        customer_segments = spark.sql("SELECT customer_type, age_group, gender, SUM(purchase_amount) as segment_spending, COUNT(DISTINCT customer_id) as customer_count, AVG(purchase_frequency) as avg_frequency FROM detailed_sales GROUP BY customer_type, age_group, gender ORDER BY segment_spending DESC")
        discount_impact = spark.sql("SELECT discount_percentage, COUNT(*) as discounted_sales, SUM(original_price - final_price) as total_discount, AVG(sales_boost) as avg_sales_boost FROM detailed_sales WHERE discount_percentage > 0 GROUP BY discount_percentage ORDER BY discount_percentage")
        payment_analysis = spark.sql("SELECT payment_method, region, COUNT(*) as payment_count, SUM(transaction_amount) as payment_volume, AVG(processing_time) as avg_processing_time FROM detailed_sales GROUP BY payment_method, region ORDER BY payment_volume DESC")
        refund_patterns = spark.sql("SELECT reason_category, platform, COUNT(*) as refund_count, SUM(refund_amount) as total_refunds, AVG(days_to_refund) as avg_refund_days FROM detailed_sales WHERE refund_status = 'completed' GROUP BY reason_category, platform ORDER BY refund_count DESC")
        # Day-over-day sales velocity per game; LAG runs after aggregation, so each row
        # carries the previous day's total (NULL on a game's first day)
        sales_velocity = spark.sql("SELECT game_id, DATE_FORMAT(sale_date, 'yyyy-MM-dd') as sale_day, SUM(daily_sales) as daily_volume, LAG(SUM(daily_sales)) OVER (PARTITION BY game_id ORDER BY DATE_FORMAT(sale_date, 'yyyy-MM-dd')) as prev_day_sales FROM detailed_sales GROUP BY game_id, DATE_FORMAT(sale_date, 'yyyy-MM-dd')")
        velocity_pandas = sales_velocity.toPandas()
        # First-day rows divide by NaN, which fillna(0) then neutralizes
        velocity_pandas['velocity_change'] = ((velocity_pandas['daily_volume'] - velocity_pandas['prev_day_sales']) / velocity_pandas['prev_day_sales'] * 100).fillna(0)
        # Bundle sales performance
        bundle_analysis = spark.sql("SELECT bundle_type, COUNT(*) as bundle_sales, AVG(bundle_discount) as avg_bundle_discount, SUM(bundle_revenue) as total_bundle_revenue FROM detailed_sales WHERE is_bundle = true GROUP BY bundle_type ORDER BY total_bundle_revenue DESC")
        result_data = {'seasonal_patterns': seasonal_pandas.to_dict('records'), 'price_elasticity': price_elasticity.toPandas().to_dict('records'), 'customer_segments': customer_segments.toPandas().to_dict('records'), 'discount_impact': discount_impact.toPandas().to_dict('records'), 'payment_analysis': payment_analysis.toPandas().to_dict('records'), 'refund_patterns': refund_patterns.toPandas().to_dict('records'), 'sales_velocity': velocity_pandas.to_dict('records'), 'bundle_analysis': bundle_analysis.toPandas().to_dict('records')}
        return result_data
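As a minimal usage sketch (an assumption for illustration: in the deployed system these results come back through Django views rather than a script, as shown after the system introduction above), the analyzer can be exercised standalone like this:

if __name__ == "__main__":
    import json
    analyzer = GameSalesAnalyzer()
    overview = analyzer.market_overview_analysis()
    # default=str covers Decimal and date values coming back from MySQL via JDBC
    print(json.dumps(overview, ensure_ascii=False, indent=2, default=str))
    spark.stop()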
VI. Selected Documentation
VII. END
💕💕To get the source code, contact 计算机编程果茶熊