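A Must-Read for Final-Year Students: A Hadoop-Based Moutai Stock Data Analysis System, So You Can Stop Agonizing Over a Capstone Topic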


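💖💖Author: 计算机编程小咖 💙💙About me: I spent years teaching computer science training courses and still enjoy teaching. I work in Java, WeChat Mini Programs, Python, Golang, and Android, and my projects cover big data, deep learning, websites, mini programs, Android apps, and algorithms. I take on custom project development, code walkthroughs, thesis defense coaching, and documentation writing, and I know a few techniques for lowering similarity scores. I like sharing solutions to problems I run into during development and talking about technology, so feel free to ask me anything about the code! 💛💛A word from me: thank you all for your follows and support! 💜💜 Hands-on website projects, Android/mini program projects, big data projects, deep learning projects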


Introduction to the Kweichow Moutai Stock Data Analysis System

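The Kweichow Moutai stock data analysis system is a financial data analysis platform that covers data collection, storage, analysis, and visualization. Its core architecture pairs Hadoop for distributed storage with Spark for processing: HDFS stores the stock data, Spark SQL handles querying and processing, and Pandas and NumPy support further analysis. The system comes in a Python version and a Java version, with Django and Spring Boot as the respective back ends. The front end is built with Vue and ElementUI, and charts are rendered with Echarts.

The functional modules include basics such as the home page, personal center, and user management, plus the core business features: Moutai stock data management, price trend analysis, technical indicator analysis, volatility and risk analysis, and volume and liquidity analysis, along with a large-screen visualization dashboard. The system analyzes Moutai's historical data along several dimensions: technical indicators describe the price trend, volatility analysis gauges investment risk, and volume analysis gauges market liquidity, giving investors data to inform their decisions. Overall, the project shows how big data tooling can be applied to financial data analysis.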
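To make the volume and liquidity idea concrete, here is a minimal sketch of one common liquidity signal, the volume ratio: the latest day's volume divided by the average volume of the preceding days. This is only an illustration. The function name `volume_ratio`, the 5-day window, and the sample volumes are assumptions made for this sketch and are not taken from the system itself.

```python
from statistics import mean


def volume_ratio(volumes, window=5):
    """Latest volume divided by the mean volume of the `window` days before it.

    `volumes` is ordered oldest first. Hypothetical helper for illustration.
    """
    if len(volumes) < window + 1:
        raise ValueError("need at least window + 1 observations")
    baseline = mean(volumes[-(window + 1):-1])  # the `window` days before the latest
    if baseline == 0:
        raise ValueError("baseline volume is zero")
    return volumes[-1] / baseline


# Hypothetical daily volumes, oldest first.
sample = [100, 120, 80, 100, 100, 150]
ratio = volume_ratio(sample)
print(ratio)
```

A ratio well above 1 suggests unusually active trading relative to the recent baseline, and a ratio well below 1 suggests thin trading. Any thresholds would need to be chosen for the data at hand.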

Demo Video of the Kweichow Moutai Stock Data Analysis System

Demo video

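Screenshots of the Kweichow Moutai Stock Data Analysis System

Volatility and risk analysis (波动率与风险分析.png)

Volume and liquidity analysis (成交量与流动性分析.png)

Login screen (登陆界面.png)

Technical indicator analysis (技术指标分析.png)

Price trend analysis (价格趋势分析.png)

Moutai stock data management (茅台股票数据管理.png)

Data dashboard (数据大屏.png)

User management (用户管理.png)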

Code Walkthrough of the Kweichow Moutai Stock Data Analysis System

from pyspark.sql import SparkSession
# Note: max and min below are Spark column functions and shadow Python's built-ins in this module.
from pyspark.sql.functions import col, avg, stddev, max, min, lag, when
from pyspark.sql.window import Window
import numpy as np
from django.http import JsonResponse
from django.views.decorators.csrf import csrf_exempt
import json

# One SparkSession shared by all views, with adaptive query execution enabled.
spark = (
    SparkSession.builder
    .appName("MaotaiStockAnalysis")
    .config("spark.sql.adaptive.enabled", "true")
    .getOrCreate()
)

@csrf_exempt
def price_trend_analysis(request):
    """Moving averages, daily change, and an overall trend label for a date range."""
    if request.method != 'POST':
        # Without this branch a non-POST request returns None, which Django rejects.
        return JsonResponse({"status": "error", "message": "POST required"}, status=405)
    data = json.loads(request.body)
    start_date = data.get('start_date')
    end_date = data.get('end_date')
    # Filter through the DataFrame API so request values are never spliced into SQL text.
    df = spark.table("maotai_stock").filter(col("trade_date").between(start_date, end_date))
    by_date = Window.orderBy("trade_date")
    # 5-day and 20-day simple moving averages of the closing price.
    df_with_ma = df.withColumn("ma5", avg("close_price").over(by_date.rowsBetween(-4, 0)))
    df_with_ma = df_with_ma.withColumn("ma20", avg("close_price").over(by_date.rowsBetween(-19, 0)))
    # Day-over-day change, absolute and in percent.
    df_with_change = df_with_ma.withColumn("prev_close", lag("close_price", 1).over(by_date))
    df_with_change = df_with_change.withColumn("price_change", col("close_price") - col("prev_close"))
    df_with_change = df_with_change.withColumn("change_rate", (col("price_change") / col("prev_close")) * 100)
    # One aggregation pass instead of four separate collect() calls.
    stats = df_with_change.agg(avg("change_rate").alias("avg_change_rate"), stddev("change_rate").alias("volatility"), max("close_price").alias("max_price"), min("close_price").alias("min_price")).first()
    if stats["max_price"] is None:
        return JsonResponse({"status": "error", "message": "No data in the requested date range"}, status=404)
    # Both can be NULL when the range holds too few rows; treat that as no movement.
    trend_direction = stats["avg_change_rate"] or 0.0
    volatility = stats["volatility"] or 0.0
    max_price, min_price = stats["max_price"], stats["min_price"]
    trend_analysis = "uptrend" if trend_direction > 0.5 else "downtrend" if trend_direction < -0.5 else "sideways"
    # Row.asDict() maps SQL NULL to None (JSON null); toPandas() would yield NaN, which is not valid JSON.
    rows = df_with_change.select("trade_date", "close_price", "ma5", "ma20", "price_change", "change_rate").orderBy("trade_date").collect()
    result_data = [row.asDict() for row in rows]
    analysis_result = {"trend_data": result_data, "trend_direction": trend_analysis, "avg_change_rate": round(trend_direction, 2), "volatility": round(volatility, 2), "max_price": max_price, "min_price": min_price, "price_range": round(max_price - min_price, 2)}
    return JsonResponse({"status": "success", "data": analysis_result})
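The `ma5` and `change_rate` columns in `price_trend_analysis` are easier to check by hand on a small series. The following is a minimal pure-Python sketch of the same arithmetic, assuming a trailing window that includes the current row (as `rowsBetween(-4, 0)` does) and the same ±0.5 threshold for the trend label. The helper names and the sample prices are hypothetical.

```python
def trailing_mean(values, window):
    """Mean of each value and up to window - 1 values before it."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]  # shorter at the start of the series
        out.append(sum(chunk) / len(chunk))
    return out


def pct_change(values):
    """Percent change from the previous value; None for the first entry."""
    out = [None]
    for prev, cur in zip(values, values[1:]):
        out.append((cur - prev) / prev * 100)
    return out


# Hypothetical closing prices, oldest first.
closes = [100.0, 102.0, 101.0, 105.0, 110.0, 108.0]
ma5 = trailing_mean(closes, 5)
changes = pct_change(closes)

# Average daily change decides the label, using the same cutoffs as the view.
known = [c for c in changes if c is not None]
avg_change = sum(known) / len(known)
trend = "uptrend" if avg_change > 0.5 else "downtrend" if avg_change < -0.5 else "sideways"
print(ma5, trend)
```

Because the window is shorter than five rows at the start of the series, the first few `ma5` values are averages over fewer days, which is also how the Spark window behaves.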

@csrf_exempt
def technical_indicator_analysis(request):
    """RSI, Bollinger Bands, and a MACD-style indicator over the most recent rows."""
    if request.method != 'POST':
        return JsonResponse({"status": "error", "message": "POST required"}, status=405)
    data = json.loads(request.body)
    stock_code = data.get('stock_code', '600519')
    period = int(data.get('period', 30))
    # Latest `period` rows, selected through the DataFrame API so request values are never spliced into SQL text.
    df = spark.table("maotai_stock").filter(col("stock_code") == stock_code).orderBy(col("trade_date").desc()).limit(period)
    by_date = Window.orderBy("trade_date")
    # RSI over 14 rows, using simple averages of gains and losses.
    window_spec = by_date.rowsBetween(-13, 0)
    df_with_rsi = df.withColumn("price_change", col("close_price") - lag("close_price", 1).over(by_date))
    df_with_rsi = df_with_rsi.withColumn("gain", when(col("price_change") > 0, col("price_change")).otherwise(0))
    df_with_rsi = df_with_rsi.withColumn("loss", when(col("price_change") < 0, -col("price_change")).otherwise(0))
    df_with_rsi = df_with_rsi.withColumn("avg_gain", avg("gain").over(window_spec))
    df_with_rsi = df_with_rsi.withColumn("avg_loss", avg("loss").over(window_spec))
    # With no losses in the window RS is unbounded and RSI is 100; handling it explicitly avoids a NULL from dividing by zero.
    df_with_rsi = df_with_rsi.withColumn("rsi", when(col("avg_loss") == 0, 100.0).otherwise(100 - (100 / (1 + col("avg_gain") / col("avg_loss")))))
    # Bollinger Bands: 20-row mean plus or minus two standard deviations.
    bollinger_window = by_date.rowsBetween(-19, 0)
    df_with_bollinger = df_with_rsi.withColumn("bb_middle", avg("close_price").over(bollinger_window))
    df_with_bollinger = df_with_bollinger.withColumn("bb_std", stddev("close_price").over(bollinger_window))
    df_with_bollinger = df_with_bollinger.withColumn("bb_upper", col("bb_middle") + (2 * col("bb_std")))
    df_with_bollinger = df_with_bollinger.withColumn("bb_lower", col("bb_middle") - (2 * col("bb_std")))
    # Note: "ema12" and "ema26" are simple moving averages standing in for the exponential averages of a textbook MACD.
    df_with_macd = df_with_bollinger.withColumn("ema12", avg("close_price").over(by_date.rowsBetween(-11, 0)))
    df_with_macd = df_with_macd.withColumn("ema26", avg("close_price").over(by_date.rowsBetween(-25, 0)))
    df_with_macd = df_with_macd.withColumn("macd_line", col("ema12") - col("ema26"))
    df_with_macd = df_with_macd.withColumn("signal_line", avg("macd_line").over(by_date.rowsBetween(-8, 0)))
    df_with_macd = df_with_macd.withColumn("macd_histogram", col("macd_line") - col("signal_line"))
    latest = df_with_macd.orderBy(col("trade_date").desc()).first()
    if latest is None:
        return JsonResponse({"status": "error", "message": "No data for the requested stock"}, status=404)
    current_rsi = latest["rsi"]
    # The bands are NULL when only one row is available, because there is no standard deviation yet.
    has_bands = latest["bb_upper"] is not None
    bb_signal = "overbought" if has_bands and latest["close_price"] > latest["bb_upper"] else "oversold" if has_bands and latest["close_price"] < latest["bb_lower"] else "normal range"
    rsi_signal = "overbought" if current_rsi > 70 else "oversold" if current_rsi < 30 else "normal range"
    # Row.asDict() maps SQL NULL to None (JSON null); toPandas() would yield NaN, which is not valid JSON.
    rows = df_with_macd.select("trade_date", "close_price", "rsi", "bb_upper", "bb_middle", "bb_lower", "macd_line", "signal_line", "macd_histogram").orderBy("trade_date").collect()
    technical_data = [row.asDict() for row in rows]
    indicator_summary = {"current_rsi": round(current_rsi, 2), "rsi_signal": rsi_signal, "bb_signal": bb_signal, "technical_data": technical_data}
    return JsonResponse({"status": "success", "data": indicator_summary})
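In `technical_indicator_analysis`, the `ema12` and `ema26` columns are computed with `avg` over a fixed window, which makes them simple moving averages. For comparison, a textbook MACD uses exponential moving averages. Below is a minimal pure-Python sketch of that variant. It assumes a smoothing factor of 2 / (span + 1) and seeds each average with the first value, which is one common convention among several. The helper names `ema` and `macd` and the sample prices are hypothetical.

```python
def ema(values, span):
    """Exponential moving average, seeded with the first value."""
    alpha = 2 / (span + 1)
    out = [values[0]]
    for v in values[1:]:
        out.append(alpha * v + (1 - alpha) * out[-1])
    return out


def macd(closes, fast=12, slow=26, signal=9):
    """Return (macd_line, signal_line, histogram) built from exponential averages."""
    fast_ema = ema(closes, fast)
    slow_ema = ema(closes, slow)
    macd_line = [f - s for f, s in zip(fast_ema, slow_ema)]
    signal_line = ema(macd_line, signal)
    histogram = [m - s for m, s in zip(macd_line, signal_line)]
    return macd_line, signal_line, histogram


# Hypothetical steadily rising prices, oldest first.
closes = [100.0 + i for i in range(40)]
macd_line, signal_line, histogram = macd(closes)
print(macd_line[-1] > 0)  # the fast average sits above the slow one in a steady rise
```

Spark has no built-in exponential window average, so one option under these assumptions is to collect the (small) result set and compute the recursion locally, as this sketch does.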

@csrf_exempt
def volatility_risk_analysis(request):
    """Annualized volatility, Sharpe ratio, historical VaR, and win rate over recent rows."""
    if request.method != 'POST':
        return JsonResponse({"status": "error", "message": "POST required"}, status=405)
    data = json.loads(request.body)
    analysis_period = int(data.get('period', 60))
    risk_free_rate = float(data.get('risk_free_rate', 0.03))
    # Latest rows, selected through the DataFrame API so request values are never spliced into SQL text.
    df = spark.table("maotai_stock").orderBy(col("trade_date").desc()).limit(analysis_period)
    by_date = Window.orderBy("trade_date")
    df_with_returns = df.withColumn("prev_close", lag("close_price", 1).over(by_date))
    df_with_returns = df_with_returns.withColumn("daily_return", (col("close_price") - col("prev_close")) / col("prev_close"))
    df_with_returns = df_with_returns.filter(col("daily_return").isNotNull())
    volatility_stats = df_with_returns.select(stddev("daily_return").alias("daily_volatility"), avg("daily_return").alias("mean_return")).first()
    if volatility_stats["daily_volatility"] is None:
        # Fewer than two returns: the sample standard deviation is undefined.
        return JsonResponse({"status": "error", "message": "Not enough data to compute volatility"}, status=404)
    # Annualize with 252 trading days; float() also covers Decimal-typed price columns.
    annual_volatility = float(volatility_stats["daily_volatility"]) * float(np.sqrt(252))
    annual_return = float(volatility_stats["mean_return"]) * 252
    sharpe_ratio = (annual_return - risk_free_rate) / annual_volatility if annual_volatility != 0 else 0
    returns_array = np.array([row[0] for row in df_with_returns.select("daily_return").collect()], dtype=float)
    # Historical VaR: the 5th and 1st percentiles of the daily returns.
    var_95 = float(np.percentile(returns_array, 5))
    var_99 = float(np.percentile(returns_array, 1))
    # 20-row rolling standard deviation of daily returns, with its range from one aggregation pass.
    df_with_rolling_vol = df_with_returns.withColumn("rolling_volatility", stddev("daily_return").over(by_date.rowsBetween(-19, 0)))
    vol_range = df_with_rolling_vol.agg(max("rolling_volatility").alias("max_vol"), min("rolling_volatility").alias("min_vol")).first()
    max_volatility, min_volatility = vol_range["max_vol"], vol_range["min_vol"]
    # Share of days with a positive return, in percent.
    win_rate = float((returns_array > 0).mean() * 100)
    risk_level = "high risk" if annual_volatility > 0.3 else "medium risk" if annual_volatility > 0.15 else "low risk"
    # Row.asDict() maps SQL NULL to None (JSON null); toPandas() would yield NaN, which is not valid JSON.
    rows = df_with_rolling_vol.select("trade_date", "close_price", "daily_return", "rolling_volatility").orderBy("trade_date").collect()
    volatility_data = [row.asDict() for row in rows]
    risk_analysis = {"annual_volatility": round(annual_volatility, 4), "annual_return": round(annual_return, 4), "sharpe_ratio": round(sharpe_ratio, 4), "var_95": round(var_95, 4), "var_99": round(var_99, 4), "win_rate": round(win_rate, 2), "risk_level": risk_level, "max_volatility": round(max_volatility, 4) if max_volatility else 0, "min_volatility": round(min_volatility, 4) if min_volatility else 0, "volatility_data": volatility_data}
    return JsonResponse({"status": "success", "data": risk_analysis})
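The risk figures in `volatility_risk_analysis` come down to a few formulas: annualized volatility is the daily standard deviation times the square root of 252, annualized return is the mean daily return times 252, the Sharpe ratio divides the excess return by the volatility, and historical VaR is a low percentile of the daily returns. Here is a minimal standard-library sketch of that arithmetic. The helper names and the five sample returns are hypothetical, and the percentile uses linear interpolation between ranks, which is also the default rule of `numpy.percentile`.

```python
import math
from statistics import mean, stdev


def risk_summary(returns, risk_free_rate=0.03, trading_days=252):
    """Return (annual_volatility, annual_return, sharpe_ratio) from daily returns."""
    annual_vol = stdev(returns) * math.sqrt(trading_days)  # sample standard deviation
    annual_ret = mean(returns) * trading_days
    sharpe = (annual_ret - risk_free_rate) / annual_vol if annual_vol else 0.0
    return annual_vol, annual_ret, sharpe


def historical_var(returns, confidence=0.95):
    """Empirical (1 - confidence) quantile of returns, linearly interpolated."""
    ordered = sorted(returns)
    pos = (1 - confidence) * (len(ordered) - 1)  # fractional rank
    lo = math.floor(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


# Hypothetical daily returns.
returns = [-0.02, -0.01, 0.0, 0.01, 0.02]
annual_vol, annual_ret, sharpe = risk_summary(returns)
var_95 = historical_var(returns, 0.95)
print(round(annual_vol, 4), round(var_95, 4))
```

With a mean return of zero, the Sharpe ratio in this sample is negative, because the assumed 3% risk-free rate exceeds the annualized return.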

Documentation for the Kweichow Moutai Stock Data Analysis System

Documentation (文档.png)
