Medical Big Data, a Hot Graduation-Project Topic: Technical Implementation of a Parkinson's Disease Data Visualization and Analysis System | Graduation Project | CS Capstone | Development | Hands-On Project


Preface

💖💖 Author: Programmer Xiaoyang (计算机程序员小杨) 💙💙 About me: I work in a computer-related field and am skilled in Java, WeChat Mini Programs, Python, Golang, Android, and several other IT areas. I do custom project development, code walkthroughs, thesis-defense coaching, and documentation writing, and I also know some techniques for lowering similarity-check scores. I love technology, enjoy digging into new tools and frameworks, and like solving real problems with code, so feel free to ask me about anything code-related! 💛💛 In closing: thank you all for your attention and support! 💕💕 Contact Programmer Xiaoyang at the end of the article to get the source code 💜💜 Website projects · Android/Mini Program projects · Big data projects · Deep learning projects · Graduation project topic selection 💜💜

1. Development Tools

Big data framework: Hadoop + Spark (Hive is not used in this build; customization is supported)
Development languages: Python + Java (both versions are supported)
Back-end frameworks: Django + Spring Boot (Spring + SpringMVC + MyBatis) (both versions are supported)
Front end: Vue + ElementUI + Echarts + HTML + CSS + JavaScript + jQuery
Key technologies: Hadoop, HDFS, Spark, Spark SQL, Pandas, NumPy
Database: MySQL
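For the Python flavor of the back end, the analysis views in the source-code section would typically be exposed as Django JSON endpoints. A minimal `urls.py` sketch is shown below; the module path `analysis.views` and the URL paths are assumptions for illustration, not taken from the actual project:

```python
# urls.py -- hypothetical routing sketch (module path and URL paths are assumed)
from django.urls import path

from analysis import views  # assumed app module holding the analysis views

urlpatterns = [
    path("api/analysis/overall/", views.parkinson_data_overall_analysis),
    path("api/analysis/voice/", views.voice_acoustic_feature_analysis),
    path("api/analysis/correlation/", views.multidimensional_correlation_analysis),
]
```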

2. System Overview

The big-data-based Parkinson's disease data visualization and analysis system is a medical data analysis platform that integrates Hadoop distributed storage with Spark big data processing. It uses Python as the primary development language and combines the Django back-end framework with a Vue front-end stack to build a fully featured environment for analyzing Parkinson's disease data.

Large volumes of Parkinson's-related data are stored on the HDFS distributed file system, queried and processed efficiently with Spark SQL, and analyzed in depth with data-science libraries such as Pandas and NumPy. For visualization, the system uses the Echarts charting library to present complex medical data as intuitive charts across several dimensions, including voice acoustic feature analysis, multidimensional feature correlation analysis, and nonlinear dynamics analysis.

The system provides nine core functional modules, including user management, data management, dataset-wide analysis, and a visualization dashboard. It helps medical researchers and data analysts better understand the data characteristics of Parkinson's patients and the patterns of disease progression, providing data support for early diagnosis and treatment planning.
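On the front end, Echarts consumes plain JSON. As a minimal sketch of how an API payload becomes a chart, the function below shapes the `gender_distribution` field (as returned by the overall-analysis view in the source-code section) into a pie-chart option; the sample counts are invented for illustration:

```python
# Shape the gender-distribution payload into an Echarts pie-chart option dict.
def gender_pie_option(gender_distribution):
    data = [
        {"name": "Male", "value": gender_distribution["male"]},
        {"name": "Female", "value": gender_distribution["female"]},
    ]
    return {
        "title": {"text": "Gender Distribution"},
        "series": [{"type": "pie", "radius": "60%", "data": data}],
    }

# Illustrative counts only; real values come from the back-end JSON response.
option = gender_pie_option({"male": 120, "female": 75})
```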

3. Feature Demo


4. System Screenshots


5. Source Code


from pyspark.sql import SparkSession
# Note: max and min here are Spark column functions that shadow Python's built-ins
from pyspark.sql.functions import col, avg, count, stddev, max, min
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.stat import Correlation
from django.http import JsonResponse

# Shared SparkSession with adaptive query execution enabled
spark = (SparkSession.builder
         .appName("ParkinsonDataAnalysis")
         .config("spark.sql.adaptive.enabled", "true")
         .config("spark.sql.adaptive.coalescePartitions.enabled", "true")
         .getOrCreate())

def parkinson_data_overall_analysis(request):
    try:
        df = spark.read.option("header", "true").option("inferSchema", "true").csv("hdfs://localhost:9000/parkinson_data/dataset.csv")
        total_records = df.count()
        parkinson_patients = df.filter(col("status") == 1).count()
        healthy_patients = df.filter(col("status") == 0).count()
        parkinson_ratio = round((parkinson_patients / total_records) * 100, 2)
        age_stats = df.select(avg("age").alias("avg_age"), min("age").alias("min_age"), max("age").alias("max_age"), stddev("age").alias("std_age")).collect()[0]
        gender_distribution = df.groupBy("gender").agg(count("*").alias("count")).collect()
        male_count = next((row["count"] for row in gender_distribution if row["gender"] == "M"), 0)
        female_count = next((row["count"] for row in gender_distribution if row["gender"] == "F"), 0)
        voice_features = ["MDVP:Fo(Hz)", "MDVP:Fhi(Hz)", "MDVP:Flo(Hz)", "MDVP:Jitter(%)", "MDVP:Shimmer"]
        voice_stats = {}
        for feature in voice_features:
            stats = df.select(
                avg(feature).alias("mean"), stddev(feature).alias("std"),
                min(feature).alias("min"), max(feature).alias("max")
            ).collect()[0]
            voice_stats[feature] = {
                "mean": round(stats["mean"], 4), "std": round(stats["std"], 4),
                "min": round(stats["min"], 4), "max": round(stats["max"], 4),
            }
        result_data = {
            "total_records": total_records,
            "parkinson_patients": parkinson_patients,
            "healthy_patients": healthy_patients,
            "parkinson_ratio": parkinson_ratio,
            "age_statistics": {
                "average": round(age_stats["avg_age"], 2),
                "minimum": int(age_stats["min_age"]),
                "maximum": int(age_stats["max_age"]),
                "standard_deviation": round(age_stats["std_age"], 2),
            },
            "gender_distribution": {"male": male_count, "female": female_count},
            "voice_feature_statistics": voice_stats,
        }
        return JsonResponse({"status": "success", "data": result_data})
    except Exception as e:
        return JsonResponse({"status": "error", "message": str(e)})
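The view above pushes all aggregation down to Spark. For a quick local sanity check of the same ratio and age statistics on a small sample, plain Python is enough; the sample values below are invented for illustration, and `stdev` is the sample standard deviation, matching Spark's `stddev`:

```python
from statistics import mean, stdev

# Invented sample: status 1 = Parkinson's, 0 = healthy
status = [1, 1, 1, 0, 1, 0, 1, 1, 0, 1]
ages = [62, 71, 58, 45, 66, 50, 73, 69, 48, 64]

total_records = len(status)
parkinson_patients = sum(status)
parkinson_ratio = round(parkinson_patients / total_records * 100, 2)

age_statistics = {
    "average": round(mean(ages), 2),
    "minimum": min(ages),
    "maximum": max(ages),
    "standard_deviation": round(stdev(ages), 2),
}
```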

def voice_acoustic_feature_analysis(request):
    try:
        df = spark.read.option("header", "true").option("inferSchema", "true").csv("hdfs://localhost:9000/parkinson_data/voice_features.csv")
        parkinson_df = df.filter(col("status") == 1)
        healthy_df = df.filter(col("status") == 0)
        acoustic_features = ["MDVP:Fo(Hz)", "MDVP:Fhi(Hz)", "MDVP:Flo(Hz)", "MDVP:Jitter(%)", "MDVP:Jitter(Abs)", "MDVP:RAP", "MDVP:PPQ", "Jitter:DDP", "MDVP:Shimmer", "MDVP:Shimmer(dB)", "Shimmer:APQ3", "Shimmer:APQ5", "MDVP:APQ", "Shimmer:DDA"]
        comparison_results = {}
        for feature in acoustic_features:
            parkinson_stats = parkinson_df.select(
                avg(feature).alias("avg"), stddev(feature).alias("std"),
                min(feature).alias("min"), max(feature).alias("max")
            ).collect()[0]
            healthy_stats = healthy_df.select(
                avg(feature).alias("avg"), stddev(feature).alias("std"),
                min(feature).alias("min"), max(feature).alias("max")
            ).collect()[0]
            difference_percentage = round(
                ((parkinson_stats["avg"] - healthy_stats["avg"]) / healthy_stats["avg"]) * 100, 2
            )
            comparison_results[feature] = {
                "parkinson_group": {
                    "average": round(parkinson_stats["avg"], 4),
                    "std_dev": round(parkinson_stats["std"], 4),
                    "min_value": round(parkinson_stats["min"], 4),
                    "max_value": round(parkinson_stats["max"], 4),
                },
                "healthy_group": {
                    "average": round(healthy_stats["avg"], 4),
                    "std_dev": round(healthy_stats["std"], 4),
                    "min_value": round(healthy_stats["min"], 4),
                    "max_value": round(healthy_stats["max"], 4),
                },
                "difference_percentage": difference_percentage,
            }
        jitter_features = [f for f in acoustic_features if "Jitter" in f]
        shimmer_features = [f for f in acoustic_features if "Shimmer" in f]
        fundamental_freq_features = ["MDVP:Fo(Hz)", "MDVP:Fhi(Hz)", "MDVP:Flo(Hz)"]
        jitter_analysis = {}
        for feature in jitter_features:
            parkinson_avg = parkinson_df.select(avg(feature)).collect()[0][0]
            healthy_avg = healthy_df.select(avg(feature)).collect()[0][0]
            jitter_analysis[feature] = {
                "parkinson_avg": round(parkinson_avg, 6),
                "healthy_avg": round(healthy_avg, 6),
                "severity_indicator": ("high" if parkinson_avg > healthy_avg * 1.5
                                       else "moderate" if parkinson_avg > healthy_avg * 1.2
                                       else "mild"),
            }
        shimmer_analysis = {}
        for feature in shimmer_features:
            parkinson_avg = parkinson_df.select(avg(feature)).collect()[0][0]
            healthy_avg = healthy_df.select(avg(feature)).collect()[0][0]
            shimmer_analysis[feature] = {
                "parkinson_avg": round(parkinson_avg, 6),
                "healthy_avg": round(healthy_avg, 6),
                "amplitude_variation": ("significant" if parkinson_avg > healthy_avg * 1.3
                                        else "moderate" if parkinson_avg > healthy_avg * 1.1
                                        else "mild"),
            }
        return JsonResponse({"status": "success", "data": {
            "feature_comparison": comparison_results,
            "jitter_detailed_analysis": jitter_analysis,
            "shimmer_detailed_analysis": shimmer_analysis,
        }})
    except Exception as e:
        return JsonResponse({"status": "error", "message": str(e)})
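The jitter and shimmer branches above bury the grading thresholds in nested inline conditionals. Pulled out into a small helper with the same 1.5x / 1.2x multipliers (the label strings are illustrative), the logic is easier to read and test:

```python
# Grade a jitter-type feature by how far the Parkinson group mean exceeds
# the healthy group mean; the multipliers mirror the view above.
def jitter_severity(parkinson_avg, healthy_avg):
    if parkinson_avg > healthy_avg * 1.5:
        return "high"
    if parkinson_avg > healthy_avg * 1.2:
        return "moderate"
    return "mild"
```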

def multidimensional_correlation_analysis(request):
    try:
        df = spark.read.option("header", "true").option("inferSchema", "true").csv("hdfs://localhost:9000/parkinson_data/comprehensive_features.csv")
        feature_columns = ["MDVP:Fo(Hz)", "MDVP:Fhi(Hz)", "MDVP:Flo(Hz)", "MDVP:Jitter(%)", "MDVP:Shimmer", "NHR", "HNR", "RPDE", "DFA", "spread1", "spread2", "D2", "PPE"]
        assembler = VectorAssembler(inputCols=feature_columns, outputCol="features")
        feature_df = assembler.transform(df)
        correlation_matrix = Correlation.corr(feature_df, "features").head()[0].toArray()
        correlation_dict = {}
        for i, feature1 in enumerate(feature_columns):
            correlation_dict[feature1] = {}
            for j, feature2 in enumerate(feature_columns):
                correlation_dict[feature1][feature2] = round(float(correlation_matrix[i][j]), 4)
        strong_correlations = []
        for i, feature1 in enumerate(feature_columns):
            for j, feature2 in enumerate(feature_columns):
                if i < j:
                    corr_value = correlation_matrix[i][j]
                    if abs(corr_value) > 0.7:
                        strong_correlations.append({
                            "feature1": feature1,
                            "feature2": feature2,
                            "correlation": round(float(corr_value), 4),
                            "strength": "strong positive correlation" if corr_value > 0 else "strong negative correlation",
                        })
        parkinson_df = df.filter(col("status") == 1)
        healthy_df = df.filter(col("status") == 0)
        group_correlations = {}
        for group_name, group_df in [("parkinson", parkinson_df), ("healthy", healthy_df)]:
            group_assembler = VectorAssembler(inputCols=feature_columns, outputCol="features")
            group_feature_df = group_assembler.transform(group_df)
            if group_feature_df.count() > 1:
                group_corr_matrix = Correlation.corr(group_feature_df, "features").head()[0].toArray()
                group_correlations[group_name] = {}
                for i, feature1 in enumerate(feature_columns):
                    group_correlations[group_name][feature1] = {}
                    for j, feature2 in enumerate(feature_columns):
                        group_correlations[group_name][feature1][feature2] = round(float(group_corr_matrix[i][j]), 4)
        feature_importance = {}
        for feature in feature_columns:
            total_correlation = sum([abs(correlation_dict[feature][other_feature]) for other_feature in feature_columns if other_feature != feature])
            feature_importance[feature] = round(total_correlation / (len(feature_columns) - 1), 4)
        sorted_importance = sorted(feature_importance.items(), key=lambda x: x[1], reverse=True)
        dimensional_analysis = {"voice_quality": ["MDVP:Fo(Hz)", "MDVP:Fhi(Hz)", "MDVP:Flo(Hz)", "NHR", "HNR"], "voice_stability": ["MDVP:Jitter(%)", "MDVP:Shimmer"], "nonlinear_dynamics": ["RPDE", "DFA", "D2", "PPE"], "spectral_measures": ["spread1", "spread2"]}
        dimension_correlations = {}
        for dimension, dimension_features in dimensional_analysis.items():
            dimension_correlations[dimension] = {}
            for feature1 in dimension_features:
                for feature2 in dimension_features:
                    if feature1 != feature2:
                        dimension_correlations[dimension][f"{feature1}_vs_{feature2}"] = correlation_dict[feature1][feature2]
        return JsonResponse({"status": "success", "data": {
            "correlation_matrix": correlation_dict,
            "strong_correlations": strong_correlations,
            "group_specific_correlations": group_correlations,
            "feature_importance_ranking": sorted_importance,
            "dimensional_correlation_analysis": dimension_correlations,
        }})
    except Exception as e:
        return JsonResponse({"status": "error", "message": str(e)})
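`Correlation.corr` computes the Pearson correlation matrix over the assembled vector column. When a sample is small enough to fit in memory, the same matrix, and the same mean-absolute-correlation importance score used above, can be cross-checked with NumPy; the toy data below is invented:

```python
import numpy as np

# Toy sample: rows are records, columns are three hypothetical features.
X = np.array([
    [1.0, 2.0, 0.5],
    [2.0, 4.1, 0.4],
    [3.0, 6.2, 0.6],
    [4.0, 7.9, 0.3],
])

# np.corrcoef treats rows as variables, so transpose first.
corr = np.corrcoef(X.T)

# Importance as in the view above: mean absolute correlation with the other
# features (subtract the self-correlation of 1 before averaging).
n = corr.shape[0]
importance = [round((np.abs(corr[i]).sum() - 1.0) / (n - 1), 4) for i in range(n)]
```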

6. Documentation


Conclusion

💕💕 To get the source code, contact Programmer Xiaoyang (计算机程序员小杨)