[Data Analysis] A Big-Data-Based Personal Financial Health Analysis System | Big Data Capstone Project · Data Visualization Dashboard · Topic Recommendation · Hadoop Spark Java


💖💖 Author: 计算机毕业设计江挽 💙💙 About me: I spent years teaching computer science training courses and genuinely enjoy teaching. My languages include Java, WeChat Mini Programs, Python, Golang, and Android, and my projects span big data, deep learning, websites, mini programs, Android, and algorithms. I regularly take on custom project development, code walkthroughs, thesis-defense coaching, and documentation writing, and I know a few techniques for reducing similarity-check scores. I like sharing solutions to problems I run into during development and talking shop, so feel free to ask me anything about code! 💛💛 A word of thanks: thank you all for your attention and support! 💜💜 Website projects · Android/Mini Program projects · Big data projects · Deep learning projects

Introduction to the Big-Data-Based Personal Financial Health Analysis System

This system is a personal financial health analysis platform built on big data technology, with the Hadoop + Spark distributed computing stack as its core architecture, capable of deep mining and intelligent analysis of users' financial data. Massive volumes of financial records are stored in HDFS; Spark SQL handles efficient querying and processing, while Pandas and NumPy take care of data cleaning and statistical computation. The front end is a visualization interface built with Vue + ElementUI + Echarts; the back end is implemented in both Django and Spring Boot, with MySQL as the persistence layer. Functionally, the system covers user management, personal financial health management, income/expense structure and consumption behavior analysis, saving capacity and investment habit analysis, debt level and credit risk analysis, and financial stability and stress assessment. It quantifies an individual's financial situation across multiple dimensions and generates intuitive reports and visual charts, helping users understand their overall financial health and giving personal finance decisions a solid data foundation. The project has real practical value as well as significance as a technical demonstration.
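To give a feel for the analysis layer, the category-level income/expense breakdown described above can be sketched at small scale with Pandas alone. This is a minimal illustrative sketch, not the production path: in the real system the same aggregation runs through Spark SQL over records read from MySQL, and the sample rows below are made up (the column names mirror the `finance_records` table used in the code section later).

```python
import pandas as pd

# Toy stand-in for the finance_records table; in the real system these rows
# come from MySQL via Spark's JDBC reader and are processed at scale on HDFS.
records = pd.DataFrame([
    {"transaction_type": "收入", "category": "工资", "amount": 8000.0},
    {"transaction_type": "收入", "category": "兼职", "amount": 2000.0},
    {"transaction_type": "支出", "category": "餐饮", "amount": 1500.0},
    {"transaction_type": "支出", "category": "房租", "amount": 2500.0},
])

def structure_by_category(df, txn_type):
    """Total each category and express it as a share of that transaction type's total."""
    sub = df[df["transaction_type"] == txn_type]
    totals = sub.groupby("category")["amount"].sum().sort_values(ascending=False)
    grand = totals.sum()
    return [
        {"category": cat, "amount": float(amt),
         "percentage": round(amt / grand * 100, 2) if grand > 0 else 0}
        for cat, amt in totals.items()
    ]

income_structure = structure_by_category(records, "收入")
expense_structure = structure_by_category(records, "支出")
print(income_structure[0])   # 工资 dominates income at 80.0% in this toy data
```

The Spark version in the code section performs the same groupBy/sum/percentage computation, just distributed across the cluster.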

Demo Video of the Big-Data-Based Personal Financial Health Analysis System

Demo video

Demo Screenshots of the Big-Data-Based Personal Financial Health Analysis System


Code Showcase of the Big-Data-Based Personal Financial Health Analysis System

from pyspark.sql import SparkSession
from pyspark.sql.functions import col,sum as spark_sum,avg,count,when,round as spark_round
from pyspark.sql.types import StructType,StructField,StringType,DoubleType,IntegerType,TimestampType
import pandas as pd
import numpy as np
from django.http import JsonResponse
from django.views import View
from datetime import datetime,timedelta
import json
def analyze_income_expense_structure(request):
    # Income/expense structure & consumption behavior analysis for one user over a date range.
    # Note: a SparkSession is built per request for simplicity; a shared session would perform better.
    spark=SparkSession.builder.appName("FinanceAnalysis").config("spark.sql.warehouse.dir","/user/hive/warehouse").config("spark.executor.memory","2g").config("spark.driver.memory","1g").getOrCreate()
    user_id=request.GET.get('user_id')
    start_date=request.GET.get('start_date')
    end_date=request.GET.get('end_date')
    # Expected schema of finance_records (documentation only; the JDBC read below infers it from MySQL).
    schema=StructType([StructField("id",IntegerType(),True),StructField("user_id",IntegerType(),True),StructField("transaction_type",StringType(),True),StructField("category",StringType(),True),StructField("amount",DoubleType(),True),StructField("transaction_date",TimestampType(),True),StructField("description",StringType(),True)])
    finance_df=spark.read.format("jdbc").option("url","jdbc:mysql://localhost:3306/finance_db").option("driver","com.mysql.cj.jdbc.Driver").option("dbtable","finance_records").option("user","root").option("password","123456").load()
    filtered_df=finance_df.filter((col("user_id")==int(user_id))&(col("transaction_date")>=start_date)&(col("transaction_date")<=end_date))
    income_df=filtered_df.filter(col("transaction_type")=="收入")
    expense_df=filtered_df.filter(col("transaction_type")=="支出")
    income_by_category=income_df.groupBy("category").agg(spark_sum("amount").alias("total_amount"),count("id").alias("transaction_count")).orderBy(col("total_amount").desc())
    expense_by_category=expense_df.groupBy("category").agg(spark_sum("amount").alias("total_amount"),count("id").alias("transaction_count")).orderBy(col("total_amount").desc())
    total_income=income_df.agg(spark_sum("amount").alias("total")).collect()[0]["total"]
    total_expense=expense_df.agg(spark_sum("amount").alias("total")).collect()[0]["total"]
    total_income=0 if total_income is None else float(total_income)
    total_expense=0 if total_expense is None else float(total_expense)
    net_income=total_income-total_expense
    income_category_list=income_by_category.collect()
    expense_category_list=expense_by_category.collect()
    income_structure=[]
    for row in income_category_list:
        category_name=row["category"]
        amount=float(row["total_amount"])
        trans_count=row["transaction_count"]
        percentage=round((amount/total_income)*100,2) if total_income>0 else 0
        income_structure.append({"category":category_name,"amount":amount,"count":trans_count,"percentage":percentage})
    expense_structure=[]
    for row in expense_category_list:
        category_name=row["category"]
        amount=float(row["total_amount"])
        trans_count=row["transaction_count"]
        percentage=round((amount/total_expense)*100,2) if total_expense>0 else 0
        expense_structure.append({"category":category_name,"amount":amount,"count":trans_count,"percentage":percentage})
    expense_pandas_df=expense_df.select("category","amount").toPandas()
    if len(expense_pandas_df)>0:
        category_std=expense_pandas_df.groupby("category")["amount"].std().to_dict()
        category_mean=expense_pandas_df.groupby("category")["amount"].mean().to_dict()
        consumption_volatility={cat:round(float(category_std.get(cat,0)),2) for cat in category_std}
        consumption_average={cat:round(float(category_mean.get(cat,0)),2) for cat in category_mean}
    else:
        consumption_volatility={}
        consumption_average={}
    spark.stop()
    result_data={"total_income":round(total_income,2),"total_expense":round(total_expense,2),"net_income":round(net_income,2),"income_structure":income_structure,"expense_structure":expense_structure,"consumption_volatility":consumption_volatility,"consumption_average":consumption_average,"analysis_period":{"start_date":start_date,"end_date":end_date}}
    return JsonResponse({"code":200,"message":"收支结构消费行为分析成功","data":result_data})
def analyze_saving_investment_capacity(request):
    # Saving capacity & investment habit analysis over the most recent `months` months:
    # pivots monthly income vs. expense, derives a saving rate, and profiles investment categories.
    spark=SparkSession.builder.appName("SavingInvestmentAnalysis").config("spark.sql.warehouse.dir","/user/hive/warehouse").config("spark.executor.memory","2g").getOrCreate()
    user_id=request.GET.get('user_id')
    months=int(request.GET.get('months',6))
    end_date=datetime.now()
    start_date=end_date-timedelta(days=months*30)
    finance_df=spark.read.format("jdbc").option("url","jdbc:mysql://localhost:3306/finance_db").option("driver","com.mysql.cj.jdbc.Driver").option("dbtable","finance_records").option("user","root").option("password","123456").load()
    filtered_df=finance_df.filter((col("user_id")==int(user_id))&(col("transaction_date")>=start_date.strftime("%Y-%m-%d"))&(col("transaction_date")<=end_date.strftime("%Y-%m-%d")))
    monthly_data=filtered_df.withColumn("month",col("transaction_date").substr(1,7)).groupBy("month","transaction_type").agg(spark_sum("amount").alias("monthly_amount"))
    monthly_pivot=monthly_data.groupBy("month").pivot("transaction_type",["收入","支出"]).sum("monthly_amount").fillna(0)
    monthly_pivot=monthly_pivot.withColumn("monthly_saving",col("收入")-col("支出"))
    monthly_pivot=monthly_pivot.withColumn("saving_rate",when(col("收入")>0,spark_round((col("monthly_saving")/col("收入"))*100,2)).otherwise(0))
    monthly_list=monthly_pivot.orderBy("month").collect()
    saving_trend=[]
    total_saving=0
    for row in monthly_list:
        month=row["month"]
        income=float(row["收入"]) if row["收入"] else 0
        expense=float(row["支出"]) if row["支出"] else 0
        saving=float(row["monthly_saving"]) if row["monthly_saving"] else 0
        rate=float(row["saving_rate"]) if row["saving_rate"] else 0
        total_saving+=saving
        saving_trend.append({"month":month,"income":round(income,2),"expense":round(expense,2),"saving":round(saving,2),"saving_rate":rate})
    avg_monthly_saving=total_saving/len(monthly_list) if len(monthly_list)>0 else 0
    investment_df=filtered_df.filter(col("category").isin(["股票","基金","理财产品","债券","其他投资"]))
    investment_by_type=investment_df.groupBy("category").agg(spark_sum("amount").alias("investment_amount"),count("id").alias("transaction_count")).orderBy(col("investment_amount").desc())
    total_investment=investment_df.agg(spark_sum("amount").alias("total")).collect()[0]["total"]
    total_investment=0 if total_investment is None else float(total_investment)
    investment_list=investment_by_type.collect()
    investment_structure=[]
    for row in investment_list:
        inv_type=row["category"]
        amount=float(row["investment_amount"])
        trans_count=row["transaction_count"]
        percentage=round((amount/total_investment)*100,2) if total_investment>0 else 0
        investment_structure.append({"type":inv_type,"amount":round(amount,2),"count":trans_count,"percentage":percentage})
    investment_pandas_df=investment_df.select("amount").toPandas()
    if len(investment_pandas_df)>0:
        investment_frequency=len(investment_pandas_df)
        avg_investment_amount=investment_pandas_df["amount"].mean()
        investment_std=investment_pandas_df["amount"].std()
    else:
        investment_frequency=0
        avg_investment_amount=0
        investment_std=0
    saving_pandas_df=pd.DataFrame([{"saving":row["saving"]} for row in saving_trend])
    if len(saving_pandas_df)>0:
        saving_stability=saving_pandas_df["saving"].std()
        saving_consistency_score=100-min(saving_stability/10,100) if saving_stability>0 else 100
    else:
        saving_stability=0
        saving_consistency_score=0
    spark.stop()
    result_data={"total_saving":round(total_saving,2),"avg_monthly_saving":round(avg_monthly_saving,2),"saving_trend":saving_trend,"total_investment":round(total_investment,2),"investment_structure":investment_structure,"investment_frequency":investment_frequency,"avg_investment_amount":round(float(avg_investment_amount),2) if avg_investment_amount else 0,"investment_volatility":round(float(investment_std),2) if investment_std else 0,"saving_consistency_score":round(float(saving_consistency_score),2),"analysis_months":months}
    return JsonResponse({"code":200,"message":"储蓄能力投资习惯分析成功","data":result_data})
def analyze_debt_credit_risk(request):
    # Debt level & credit risk analysis: combines debt_records with recent income to compute
    # a debt-to-income ratio, overdue exposure, and a rule-based risk level and score.
    spark=SparkSession.builder.appName("DebtCreditRiskAnalysis").config("spark.sql.warehouse.dir","/user/hive/warehouse").config("spark.executor.memory","2g").config("spark.driver.memory","1g").getOrCreate()
    user_id=request.GET.get('user_id')
    finance_df=spark.read.format("jdbc").option("url","jdbc:mysql://localhost:3306/finance_db").option("driver","com.mysql.cj.jdbc.Driver").option("dbtable","finance_records").option("user","root").option("password","123456").load()
    debt_df=spark.read.format("jdbc").option("url","jdbc:mysql://localhost:3306/finance_db").option("driver","com.mysql.cj.jdbc.Driver").option("dbtable","debt_records").option("user","root").option("password","123456").load()
    user_debt_df=debt_df.filter(col("user_id")==int(user_id))
    total_debt=user_debt_df.agg(spark_sum("remaining_amount").alias("total")).collect()[0]["total"]
    total_debt=0 if total_debt is None else float(total_debt)
    debt_by_type=user_debt_df.groupBy("debt_type").agg(spark_sum("remaining_amount").alias("debt_amount"),spark_sum("monthly_payment").alias("monthly_payment_sum"),count("id").alias("debt_count")).orderBy(col("debt_amount").desc())
    debt_type_list=debt_by_type.collect()
    debt_structure=[]
    total_monthly_payment=0
    for row in debt_type_list:
        debt_type=row["debt_type"]
        amount=float(row["debt_amount"])
        monthly_pay=float(row["monthly_payment_sum"]) if row["monthly_payment_sum"] else 0
        debt_count=row["debt_count"]
        percentage=round((amount/total_debt)*100,2) if total_debt>0 else 0
        total_monthly_payment+=monthly_pay
        debt_structure.append({"debt_type":debt_type,"amount":round(amount,2),"monthly_payment":round(monthly_pay,2),"count":debt_count,"percentage":percentage})
    months=6
    end_date=datetime.now()
    start_date=end_date-timedelta(days=months*30)
    user_finance_df=finance_df.filter((col("user_id")==int(user_id))&(col("transaction_date")>=start_date.strftime("%Y-%m-%d"))&(col("transaction_date")<=end_date.strftime("%Y-%m-%d")))
    monthly_income_df=user_finance_df.filter(col("transaction_type")=="收入").withColumn("month",col("transaction_date").substr(1,7)).groupBy("month").agg(spark_sum("amount").alias("monthly_income"))
    avg_monthly_income_row=monthly_income_df.agg(avg("monthly_income").alias("avg_income")).collect()[0]
    avg_monthly_income=float(avg_monthly_income_row["avg_income"]) if avg_monthly_income_row["avg_income"] else 0
    debt_to_income_ratio=round((total_monthly_payment/avg_monthly_income)*100,2) if avg_monthly_income>0 else 0
    overdue_df=user_debt_df.filter(col("status")=="逾期")
    overdue_count=overdue_df.count()
    overdue_amount=overdue_df.agg(spark_sum("remaining_amount").alias("total")).collect()[0]["total"]
    overdue_amount=0 if overdue_amount is None else float(overdue_amount)
    if debt_to_income_ratio<30:
        risk_level="低风险"
        risk_score=85
    elif debt_to_income_ratio<50:
        risk_level="中风险"
        risk_score=60
    else:
        risk_level="高风险"
        risk_score=35
    if overdue_count>0:
        risk_score=max(risk_score-overdue_count*5,0)
        risk_level="高风险" if risk_score<50 else risk_level
    credit_payment_df=user_finance_df.filter(col("category").isin(["信用卡还款","贷款还款"]))
    on_time_payment_count=credit_payment_df.filter(col("description").like("%按时%")).count()
    total_payment_count=credit_payment_df.count()
    payment_punctuality_rate=round((on_time_payment_count/total_payment_count)*100,2) if total_payment_count>0 else 0
    debt_pandas_df=user_debt_df.select("remaining_amount","monthly_payment","interest_rate").toPandas()
    if len(debt_pandas_df)>0:
        avg_interest_rate=debt_pandas_df["interest_rate"].mean()
        max_single_debt=debt_pandas_df["remaining_amount"].max()
    else:
        avg_interest_rate=0
        max_single_debt=0
    spark.stop()
    result_data={"total_debt":round(total_debt,2),"debt_structure":debt_structure,"total_monthly_payment":round(total_monthly_payment,2),"avg_monthly_income":round(avg_monthly_income,2),"debt_to_income_ratio":debt_to_income_ratio,"overdue_count":overdue_count,"overdue_amount":round(overdue_amount,2),"risk_level":risk_level,"risk_score":risk_score,"payment_punctuality_rate":payment_punctuality_rate,"avg_interest_rate":round(float(avg_interest_rate),2) if avg_interest_rate else 0,"max_single_debt":round(float(max_single_debt),2) if max_single_debt else 0}
    return JsonResponse({"code":200,"message":"债务水平信用风险分析成功","data":result_data})
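The debt-risk thresholds embedded in `analyze_debt_credit_risk` become easy to unit-test once they are pulled out into a pure function. The sketch below is a hypothetical refactoring (this helper does not exist in the original code) that reproduces the same rules: a debt-to-income ratio under 30% is low risk, under 50% medium, otherwise high; each overdue debt deducts 5 points, and a score below 50 demotes the level to high risk.

```python
def score_debt_risk(debt_to_income_ratio, overdue_count=0):
    """Mirror the rule-based thresholds used in analyze_debt_credit_risk.

    Hypothetical extraction for testability; returns (risk_level, risk_score).
    """
    if debt_to_income_ratio < 30:
        risk_level, risk_score = "低风险", 85
    elif debt_to_income_ratio < 50:
        risk_level, risk_score = "中风险", 60
    else:
        risk_level, risk_score = "高风险", 35
    if overdue_count > 0:
        # Each overdue debt costs 5 points; the score never drops below zero.
        risk_score = max(risk_score - overdue_count * 5, 0)
        if risk_score < 50:
            risk_level = "高风险"
    return risk_level, risk_score

print(score_debt_risk(25))     # low risk, full base score
print(score_debt_risk(45, 3))  # medium ratio, but overdue penalty demotes it to high risk
```

Extracting the policy like this lets the scoring rules be verified independently of Spark and the database, which is useful both for the thesis write-up and for regression testing.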

Documentation Showcase of the Big-Data-Based Personal Financial Health Analysis System

