A Big-Data-Based Mammal Sleep Data Visualization and Analysis System [Python graduation project, Python practice, Hadoop, Spark, big-data thesis topics, big-data thesis projects]


💖💖 Author: 计算机毕业设计小途 💙💙 About me: I have long worked in computer-science training and teaching, which I genuinely enjoy. My languages include Java, WeChat Mini Programs, Python, Golang, and Android, and my projects span big data, deep learning, websites, mini programs, Android apps, and algorithms. I also take on custom project development, code walkthroughs, thesis-defense coaching, and documentation writing, and know some techniques for reducing similarity-check scores. I like sharing solutions to problems I hit during development and talking shop, so feel free to ask me about code and technical issues! 💛💛 A word of thanks: thank you all for your attention and support! 💜💜 Website projects · Android/mini-program projects · Big-data projects · Deep-learning projects


Introduction to the Big-Data-Based Mammal Sleep Data Visualization and Analysis System

The Big-Data-Based Mammal Sleep Data Visualization and Analysis System is an integrated data-analysis platform that combines Hadoop distributed storage, the Spark big-data processing engine, and a modern web stack. Hadoop + Spark form the core processing framework: HDFS stores the mammal sleep dataset, Spark SQL handles efficient querying and processing, and Python scientific-computing libraries such as Pandas and NumPy support deeper data mining. The backend exposes RESTful APIs built on Django (or Spring Boot), the frontend uses Vue + ElementUI + Echarts for a modern UI and rich visualizations, and MySQL provides data persistence.

Functionally, the system covers a complete user-management module (home page, personal profile, password change), system administration, and the core analysis module, which offers 15 specialized, multi-dimensional analyses of mammal sleep characteristics, including:

- radar analysis of physiological indicators
- sleep-duration statistics
- sleep vs. body and brain weight
- sleep vs. life span
- brain-to-body ratio vs. sleep pattern
- correlation between danger level and sleep duration
- danger level vs. life span
- impact of predation risk on sleep
- sleep exposure vs. sleep duration
- Top-10 sleep-duration ranking
- sleep-type share statistics
- composite sleep-danger index
- sleep-pattern clustering analysis
- cross-indicator correlation matrix analysis

A large-screen dashboard, rendered with dynamic, interactive Echarts charts, rounds out the system and gives users comprehensive, multi-angle insight into mammal sleep data.
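As an illustration of how the backend might hand an analysis result to Echarts, here is a minimal sketch (not the project's actual code; the helper name, category labels, and counts are assumptions) that shapes sleep-type counts into an Echarts pie-chart option dict, ready to be serialized as JSON by a REST endpoint:

```python
# Sketch: turn sleep-type counts into an Echarts "pie" series option.
# Labels, counts, and the helper name are illustrative assumptions.
import json

def build_sleep_type_pie_option(category_counts):
    """category_counts: list of (category_name, count) tuples."""
    return {
        "title": {"text": "Sleep type share", "left": "center"},
        "tooltip": {"trigger": "item", "formatter": "{b}: {c} ({d}%)"},
        "series": [{
            "type": "pie",
            "radius": "60%",
            "data": [{"name": name, "value": cnt} for name, cnt in category_counts],
        }],
    }

option = build_sleep_type_pie_option([
    ("short sleeper", 12), ("medium sleeper", 25),
    ("long sleeper", 18), ("very long sleeper", 7),
])
payload = json.dumps(option)  # what the REST API would return to the Vue frontend
```

On the frontend, the Vue component would pass this object straight to `chart.setOption(...)`.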

Demo Video of the Big-Data-Based Mammal Sleep Data Visualization and Analysis System

Demo video

Demo Screenshots of the Big-Data-Based Mammal Sleep Data Visualization and Analysis System

Predation risk and sleep analysis.png

Data dashboard (upper half).png

Data dashboard (lower half).png

Sleep type share analysis.png

Top 10 shortest sleep durations analysis.png

Top 10 longest sleep durations analysis.png

Danger level and life span analysis.png

Indicator correlation matrix analysis.png

Code Showcase of the Big-Data-Based Mammal Sleep Data Visualization and Analysis System

from pyspark.sql import SparkSession
from pyspark.sql.functions import col, avg, count, max, min, stddev, when
from pyspark.ml.clustering import KMeans
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.stat import Correlation
import pandas as pd
import numpy as np
spark = (SparkSession.builder
         .appName("MammalSleepAnalysis")
         .config("spark.sql.adaptive.enabled", "true")
         .config("spark.sql.adaptive.coalescePartitions.enabled", "true")
         .getOrCreate())
def analyze_sleep_duration_statistics():
    # Overall stats, per-species rankings, duration buckets, REM share, and a
    # sleep-vs-weight correlation, all read from the registered Spark SQL table
    df = spark.sql("SELECT species_name, sleep_total, sleep_rem, sleep_cycle, body_weight, brain_weight, life_span FROM mammal_sleep_data WHERE sleep_total IS NOT NULL")
    sleep_stats = df.groupBy().agg(
        avg("sleep_total").alias("avg_sleep_total"),
        max("sleep_total").alias("max_sleep_total"),
        min("sleep_total").alias("min_sleep_total"),
        stddev("sleep_total").alias("stddev_sleep_total"),
        count("sleep_total").alias("total_records")
    ).collect()[0]
    species_sleep_stats = df.groupBy("species_name").agg(
        avg("sleep_total").alias("species_avg_sleep"),
        avg("sleep_rem").alias("species_avg_rem"),
        avg("sleep_cycle").alias("species_avg_cycle")
    ).orderBy(col("species_avg_sleep").desc())
    # Bucket each species by total daily sleep hours
    sleep_duration_categories = df.withColumn("sleep_category", 
        when(col("sleep_total") < 5, "short sleeper")
        .when((col("sleep_total") >= 5) & (col("sleep_total") < 10), "medium sleeper")
        .when((col("sleep_total") >= 10) & (col("sleep_total") < 15), "long sleeper")
        .otherwise("very long sleeper")
    )
    category_distribution = sleep_duration_categories.groupBy("sleep_category").agg(
        count("*").alias("category_count"),
        avg("sleep_total").alias("category_avg_sleep"),
        avg("body_weight").alias("category_avg_weight")
    ).orderBy("category_avg_sleep")
    sleep_weight_correlation = df.select("sleep_total", "body_weight").filter(
        (col("sleep_total").isNotNull()) & (col("body_weight").isNotNull())
    )
    correlation_result = sleep_weight_correlation.stat.corr("sleep_total", "body_weight")
    rem_sleep_analysis = df.filter(col("sleep_rem").isNotNull()).select(
        "species_name", "sleep_total", "sleep_rem", 
        (col("sleep_rem") / col("sleep_total") * 100).alias("rem_percentage")
    ).orderBy(col("rem_percentage").desc())
    species_sleep_efficiency = df.withColumn("sleep_efficiency", 
        when(col("body_weight") > 0, col("sleep_total") / col("body_weight")).otherwise(0)
    ).select("species_name", "sleep_total", "body_weight", "sleep_efficiency").orderBy(col("sleep_efficiency").desc())
    return {
        "overall_stats": sleep_stats.asDict(),
        "species_rankings": [row.asDict() for row in species_sleep_stats.collect()[:20]],
        "category_distribution": [row.asDict() for row in category_distribution.collect()],
        "sleep_weight_correlation": correlation_result,
        "rem_analysis": [row.asDict() for row in rem_sleep_analysis.collect()[:15]],
        "efficiency_rankings": [row.asDict() for row in species_sleep_efficiency.collect()[:15]]
    }
def perform_sleep_pattern_clustering():
    # Cluster species by sleep and physiology features with KMeans at k = 3, 4, 5
    df = spark.sql("SELECT species_name, sleep_total, sleep_rem, sleep_cycle, body_weight, brain_weight, life_span, predation_risk, exposure_index FROM mammal_sleep_data WHERE sleep_total IS NOT NULL AND sleep_rem IS NOT NULL AND body_weight IS NOT NULL AND brain_weight IS NOT NULL")
    feature_columns = ["sleep_total", "sleep_rem", "sleep_cycle", "body_weight", "brain_weight", "life_span", "predation_risk", "exposure_index"]
    # handleInvalid="skip" drops rows whose feature columns still contain nulls
    # (the SQL filter above only checks four of the eight columns)
    assembler = VectorAssembler(inputCols=feature_columns, outputCol="features", handleInvalid="skip")
    # Note: the features go into KMeans unscaled, so large body_weight values
    # dominate the Euclidean distance; a StandardScaler step would balance them
    feature_df = assembler.transform(df).select("species_name", "features")
    kmeans_3 = KMeans(k=3, seed=42, featuresCol="features", predictionCol="cluster_3")
    model_3 = kmeans_3.fit(feature_df)
    clustered_3 = model_3.transform(feature_df)
    kmeans_4 = KMeans(k=4, seed=42, featuresCol="features", predictionCol="cluster_4")
    model_4 = kmeans_4.fit(feature_df)
    clustered_4 = model_4.transform(feature_df)
    kmeans_5 = KMeans(k=5, seed=42, featuresCol="features", predictionCol="cluster_5")
    model_5 = kmeans_5.fit(feature_df)
    clustered_5 = model_5.transform(feature_df)
    cluster_3_analysis = clustered_3.join(df, "species_name").groupBy("cluster_3").agg(
        count("*").alias("cluster_size"),
        avg("sleep_total").alias("avg_sleep_total"),
        avg("sleep_rem").alias("avg_sleep_rem"),
        avg("body_weight").alias("avg_body_weight"),
        avg("predation_risk").alias("avg_predation_risk")
    ).orderBy("cluster_3")
    cluster_4_analysis = clustered_4.join(df, "species_name").groupBy("cluster_4").agg(
        count("*").alias("cluster_size"),
        avg("sleep_total").alias("avg_sleep_total"),
        avg("sleep_rem").alias("avg_sleep_rem"),
        avg("body_weight").alias("avg_body_weight"),
        avg("brain_weight").alias("avg_brain_weight")
    ).orderBy("cluster_4")
    species_cluster_mapping = clustered_3.join(df, "species_name").select(
        "species_name", "cluster_3", "sleep_total", "sleep_rem", "body_weight"
    ).orderBy("cluster_3", col("sleep_total").desc())
    cluster_characteristics = {}
    for cluster_id in range(3):
        cluster_species = clustered_3.join(df, "species_name").filter(col("cluster_3") == cluster_id)
        cluster_characteristics[f"cluster_{cluster_id}"] = {
            "dominant_sleep_pattern": cluster_species.agg(avg("sleep_total")).collect()[0][0],
            "dominant_body_size": cluster_species.agg(avg("body_weight")).collect()[0][0],
            "species_count": cluster_species.count(),
            "representative_species": [row.species_name for row in cluster_species.select("species_name").limit(5).collect()]
        }
    return {
        "cluster_3_results": [row.asDict() for row in cluster_3_analysis.collect()],
        "cluster_4_results": [row.asDict() for row in cluster_4_analysis.collect()],
        "species_mappings": [row.asDict() for row in species_cluster_mapping.collect()],
        "cluster_characteristics": cluster_characteristics,
        "model_metrics": {
            "k3_cost": model_3.summary.trainingCost,
            "k4_cost": model_4.summary.trainingCost,
            "k5_cost": model_5.summary.trainingCost
        }
    }
def calculate_indicators_correlation_matrix():
    # Pearson correlation matrix across all nine sleep and physiology indicators
    df = spark.sql("SELECT sleep_total, sleep_rem, sleep_cycle, body_weight, brain_weight, life_span, predation_risk, exposure_index, danger_level FROM mammal_sleep_data WHERE sleep_total IS NOT NULL AND body_weight IS NOT NULL AND brain_weight IS NOT NULL")
    correlation_columns = ["sleep_total", "sleep_rem", "sleep_cycle", "body_weight", "brain_weight", "life_span", "predation_risk", "exposure_index", "danger_level"]
    filtered_df = df.select(*correlation_columns).filter(
        col("sleep_total").isNotNull() & 
        col("sleep_rem").isNotNull() & 
        col("body_weight").isNotNull() & 
        col("brain_weight").isNotNull() &
        col("life_span").isNotNull()
    )
    # handleInvalid="skip" drops rows where the remaining columns (sleep_cycle,
    # predation_risk, exposure_index, danger_level) are still null
    assembler = VectorAssembler(inputCols=correlation_columns, outputCol="correlation_features", handleInvalid="skip")
    vector_df = assembler.transform(filtered_df)
    correlation_matrix = Correlation.corr(vector_df, "correlation_features", "pearson").head()
    corr_array = correlation_matrix[0].toArray()
    correlation_results = {}
    for i, col1 in enumerate(correlation_columns):
        correlation_results[col1] = {}
        for j, col2 in enumerate(correlation_columns):
            correlation_results[col1][col2] = float(corr_array[i][j])
    strong_correlations = []
    for i, col1 in enumerate(correlation_columns):
        for j, col2 in enumerate(correlation_columns):
            if i < j and abs(corr_array[i][j]) > 0.5:
                strong_correlations.append({
                    "indicator1": col1,
                    "indicator2": col2,
                    "correlation_value": float(corr_array[i][j]),
                    "correlation_strength": "strong positive" if corr_array[i][j] > 0.7 else "strong negative" if corr_array[i][j] < -0.7 else "moderate"
                })
    sleep_related_correlations = {}
    sleep_indicators = ["sleep_total", "sleep_rem", "sleep_cycle"]
    for sleep_indicator in sleep_indicators:
        sleep_idx = correlation_columns.index(sleep_indicator)
        sleep_related_correlations[sleep_indicator] = {
            "body_weight": float(corr_array[sleep_idx][correlation_columns.index("body_weight")]),
            "brain_weight": float(corr_array[sleep_idx][correlation_columns.index("brain_weight")]),
            "life_span": float(corr_array[sleep_idx][correlation_columns.index("life_span")]),
            "predation_risk": float(corr_array[sleep_idx][correlation_columns.index("predation_risk")]),
            "exposure_index": float(corr_array[sleep_idx][correlation_columns.index("exposure_index")])
        }
    biological_factor_correlations = {
        "body_brain_correlation": float(corr_array[correlation_columns.index("body_weight")][correlation_columns.index("brain_weight")]),
        "weight_lifespan_correlation": float(corr_array[correlation_columns.index("body_weight")][correlation_columns.index("life_span")]),
        "brain_lifespan_correlation": float(corr_array[correlation_columns.index("brain_weight")][correlation_columns.index("life_span")]),
        "risk_exposure_correlation": float(corr_array[correlation_columns.index("predation_risk")][correlation_columns.index("exposure_index")])
    }
    return {
        "full_correlation_matrix": correlation_results,
        "strong_correlations": sorted(strong_correlations, key=lambda x: abs(x["correlation_value"]), reverse=True),
        "sleep_related_correlations": sleep_related_correlations,
        "biological_correlations": biological_factor_correlations,
        "matrix_summary": {
            "total_indicators": len(correlation_columns),
            "strong_positive_count": len([c for c in strong_correlations if c["correlation_value"] > 0.5]),
            "strong_negative_count": len([c for c in strong_correlations if c["correlation_value"] < -0.5]),
            "data_points_analyzed": vector_df.count()
        }
    }
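The `model_metrics` block returned by `perform_sleep_pattern_clustering` carries the KMeans training costs (WSSSE) for k = 3, 4, and 5, which can drive a simple elbow check when choosing k. A minimal pure-Python sketch (the cost values and the 0.15 threshold below are made up for illustration):

```python
def pick_k_by_elbow(costs, drop_threshold=0.15):
    """costs: dict mapping k -> KMeans trainingCost (WSSSE).
    Returns the smallest k after which the relative cost drop
    falls below drop_threshold (a simple elbow heuristic)."""
    ks = sorted(costs)
    for prev_k, next_k in zip(ks, ks[1:]):
        rel_drop = (costs[prev_k] - costs[next_k]) / costs[prev_k]
        if rel_drop < drop_threshold:
            return prev_k  # adding clusters past prev_k barely helps
    return ks[-1]

# Hypothetical trainingCost values for k = 3, 4, 5:
chosen = pick_k_by_elbow({3: 1200.0, 4: 700.0, 5: 650.0})
print(chosen)  # -> 4: the 3->4 drop is large, the 4->5 drop is marginal
```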

Documentation Preview of the Big-Data-Based Mammal Sleep Data Visualization and Analysis System

Documentation.png
