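[Big Data] Body Fat Data Visualization and Analysis System: Computer Science Project, Hadoop+Spark Environment Setup, Data Science and Big Data Technology, with Source Code, Documentation, and Walkthrough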


1. About the Author

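💖💖 Author: 计算机编程果茶熊 💙💙 About me: I spent many years teaching and training in computer science and worked as a programming instructor. I enjoy teaching, and I work across several IT areas, including Java, WeChat Mini Programs, Python, Golang, and Android. I take on custom project development, code walkthroughs, thesis defense coaching, and documentation writing, and I also know some techniques for reducing similarity scores in plagiarism checks. I like sharing solutions to problems I run into during development and talking about technology, so feel free to ask me about anything technical or code related! 💛💛 A word from me: thank you all for following and supporting me! 💜💜 Website projects, Android/Mini Program projects, big data projects, graduation project topic ideas 💕💕 To get the source code, contact 计算机编程果茶熊 (details at the end of the article)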

2. System Overview

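Big data framework: Hadoop+Spark (Hive requires custom modification)
Languages: Java+Python (both versions are supported)
Database: MySQL
Backend frameworks: SpringBoot (Spring+SpringMVC+Mybatis)+Django (both versions are supported)
Frontend: Vue+Echarts+HTML+CSS+JavaScript+jQuery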

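The Body Fat Data Visualization and Analysis System is a health data analysis platform built on big data technology. It uses the Hadoop+Spark distributed computing stack to process large volumes of body fat data, with a Django backend serving the processed results. The frontend is built with Vue.js and the ElementUI component library, and Echarts renders the data as multi-dimensional charts. The core features cover distribution analysis of the body fat rate as the key indicator, body fat analysis by age structure, BMI health assessment, consistency analysis between BMI and body fat rate, and health risk analysis based on the waist-to-hip ratio, giving users a rounded assessment of their physical health. Large-scale body fat data is stored in HDFS, queried and analyzed with Spark SQL, and further processed with Pandas and NumPy for statistical calculations. The results are presented on a visualization dashboard that turns complex health data into readable charts and reports, helping users understand their own health and supporting health management decisions with data.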
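As a rough illustration of two of the indicators mentioned above, here is a minimal pure-Python sketch of the BMI and waist-to-hip ratio calculations. The BMI cutoffs (18.5, 24, 28) are the ones used in the code later in this article. The function names, the English labels, and the waist-to-hip cutoffs (0.90 for men, 0.85 for women) are assumptions made for this example and are not taken from the system's source.

```python
def bmi(weight_kg: float, height_cm: float) -> float:
    """Body mass index: weight in kilograms divided by height in metres squared."""
    height_m = height_cm / 100
    return weight_kg / height_m ** 2


def bmi_category(value: float) -> str:
    """Map a BMI value to a category using the cutoffs 18.5, 24 and 28."""
    if value < 18.5:
        return "underweight"
    if value < 24:
        return "normal"
    if value < 28:
        return "overweight"
    return "obese"


def waist_hip_risk(waist_cm: float, hip_cm: float, gender: str) -> str:
    """Classify the waist-to-hip ratio.

    Assumption: cutoffs of 0.90 (male) and 0.85 (female) are used here
    for illustration; the real system may use different thresholds.
    """
    ratio = waist_cm / hip_cm
    limit = 0.90 if gender == "male" else 0.85
    return "elevated" if ratio >= limit else "normal"


if __name__ == "__main__":
    value = bmi(70, 175)
    print(round(value, 2), bmi_category(value))
    print(waist_hip_risk(95, 100, "male"))
```

Keeping each indicator in a small pure function makes the thresholds easy to change and to test separately from the Spark pipeline.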

3. Video Walkthrough

Body Fat Data Visualization and Analysis System

4. Selected Feature Screenshots

[Seven screenshots of the system's features; images not available in this copy]

5. Selected Code


from pyspark.sql import SparkSession
from pyspark.sql.functions import col, avg, count, when, percentile_approx, corr
from django.http import JsonResponse
from django.views.decorators.csrf import csrf_exempt
import pandas as pd
import numpy as np
import json

# One SparkSession shared by all views, with adaptive query execution enabled.
spark = SparkSession.builder.appName("BodyFatAnalysis").config("spark.sql.adaptive.enabled", "true").getOrCreate()

def core_indicator_distribution_analysis(request):
    # Load the measurement table over JDBC. The connection settings are hard-coded for demonstration; read them from configuration in a real deployment.
    body_fat_df = spark.read.format("jdbc").option("url", "jdbc:mysql://localhost:3306/bodyfat_db").option("dbtable", "body_fat_data").option("user", "root").option("password", "password").load()
    body_fat_df.createOrReplaceTempView("body_fat_view")
    # Count samples per body fat level. CASE takes the first matching branch, so a rate of exactly 15 is counted as 'normal'.
    distribution_stats = spark.sql("SELECT CASE WHEN body_fat_rate < 10 THEN 'underweight' WHEN body_fat_rate BETWEEN 10 AND 15 THEN 'normal' WHEN body_fat_rate BETWEEN 15 AND 20 THEN 'mildly obese' ELSE 'severely obese' END as fat_level, COUNT(*) as count FROM body_fat_view GROUP BY fat_level").collect()
    # Approximate quartiles, then count outliers with the 1.5 * IQR rule. float() keeps the arithmetic valid if the column is stored as DECIMAL.
    percentile_data = body_fat_df.select(percentile_approx(col("body_fat_rate"), 0.25).alias("q1"), percentile_approx(col("body_fat_rate"), 0.5).alias("median"), percentile_approx(col("body_fat_rate"), 0.75).alias("q3")).collect()[0]
    q1, median, q3 = (float(percentile_data[name]) for name in ("q1", "median", "q3"))
    iqr = q3 - q1
    outlier_detection = body_fat_df.filter((col("body_fat_rate") < q1 - 1.5 * iqr) | (col("body_fat_rate") > q3 + 1.5 * iqr)).count()
    gender_distribution = spark.sql("SELECT gender, AVG(body_fat_rate) as avg_fat_rate, STDDEV(body_fat_rate) as std_fat_rate FROM body_fat_view GROUP BY gender").collect()
    trend_analysis = spark.sql("SELECT DATE_FORMAT(measurement_date, 'yyyy-MM') as month, AVG(body_fat_rate) as monthly_avg FROM body_fat_view WHERE measurement_date >= DATE_SUB(CURRENT_DATE, 365) GROUP BY month ORDER BY month").collect()
    risk_assessment = body_fat_df.withColumn("risk_level", when(col("body_fat_rate") > 25, "high risk").when(col("body_fat_rate") > 20, "medium risk").otherwise("low risk")).groupBy("risk_level").count().collect()
    correlation_matrix = body_fat_df.select(corr("body_fat_rate", "weight").alias("weight_corr"), corr("body_fat_rate", "height").alias("height_corr"), corr("body_fat_rate", "age").alias("age_corr")).collect()[0]
    result_data = {
        "distribution": [{"level": row["fat_level"], "count": row["count"]} for row in distribution_stats],
        "percentiles": {"q1": q1, "median": median, "q3": q3},
        "outliers": outlier_detection,
        # STDDEV is NULL for a group with a single row, so fall back to 0.0.
        "gender_stats": [{"gender": row["gender"], "avg_rate": float(row["avg_fat_rate"]), "std_rate": float(row["std_fat_rate"]) if row["std_fat_rate"] is not None else 0.0} for row in gender_distribution],
        "trend": [{"month": row["month"], "avg": float(row["monthly_avg"])} for row in trend_analysis],
        "risk": [{"level": row["risk_level"], "count": row["count"]} for row in risk_assessment],
        "correlations": {"weight": float(correlation_matrix["weight_corr"]), "height": float(correlation_matrix["height_corr"]), "age": float(correlation_matrix["age_corr"])},
    }
    return JsonResponse(result_data)
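The outlier count in the view above relies on the 1.5 × IQR rule. As a small standalone sketch of that rule, the snippet below applies it to a short list using only the standard library. It computes exact quartiles with `statistics.quantiles`, whereas the Spark code uses `percentile_approx`, so results on real data may differ slightly. The function name `iqr_outliers` and the sample values are made up for this illustration.

```python
from statistics import quantiles


def iqr_outliers(values):
    """Return the values outside [Q1 - 1.5 * IQR, Q3 + 1.5 * IQR]."""
    # n=4 yields the three quartile cut points; "inclusive" treats the
    # data as the whole population rather than a sample of it.
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < lower or v > upper]


if __name__ == "__main__":
    # Hypothetical body fat rates with one obviously extreme reading.
    sample_rates = [10, 12, 14, 15, 16, 18, 20, 45]
    print(iqr_outliers(sample_rates))
```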

def age_structure_bodyfat_analysis(request):
    # Load the measurement table over JDBC (connection settings are hard-coded for demonstration only).
    body_fat_df = spark.read.format("jdbc").option("url", "jdbc:mysql://localhost:3306/bodyfat_db").option("dbtable", "body_fat_data").option("user", "root").option("password", "password").load()
    body_fat_df.createOrReplaceTempView("age_bodyfat_view")
    # Statistics per age group. The first bucket is labelled 18-24 but matches every age below 25.
    age_group_analysis = spark.sql("SELECT CASE WHEN age < 25 THEN '18-24' WHEN age < 35 THEN '25-34' WHEN age < 45 THEN '35-44' WHEN age < 55 THEN '45-54' ELSE '55+' END as age_group, AVG(body_fat_rate) as avg_fat_rate, MIN(body_fat_rate) as min_fat_rate, MAX(body_fat_rate) as max_fat_rate, COUNT(*) as sample_count FROM age_bodyfat_view GROUP BY age_group ORDER BY age_group").collect()
    age_gender_cross = spark.sql("SELECT CASE WHEN age < 25 THEN '18-24' WHEN age < 35 THEN '25-34' WHEN age < 45 THEN '35-44' WHEN age < 55 THEN '45-54' ELSE '55+' END as age_group, gender, AVG(body_fat_rate) as avg_fat_rate, STDDEV(body_fat_rate) as std_fat_rate FROM age_bodyfat_view GROUP BY age_group, gender ORDER BY age_group, gender").collect()
    age_progression = spark.sql("SELECT age, AVG(body_fat_rate) as avg_rate FROM age_bodyfat_view WHERE age BETWEEN 18 AND 65 GROUP BY age ORDER BY age").collect()
    # Quadratic fit of body fat rate against age on the driver. toPandas() collects every row, so this assumes the two columns fit in memory.
    polynomial_features = body_fat_df.select("age", "body_fat_rate").toPandas().dropna().astype(float)
    age_coeffs = np.polyfit(polynomial_features["age"], polynomial_features["body_fat_rate"], 2)
    predicted_values = np.polyval(age_coeffs, polynomial_features["age"])
    # Coefficient of determination: 1 - SS_res / SS_tot.
    r_squared = 1 - (np.sum((polynomial_features["body_fat_rate"] - predicted_values) ** 2) / np.sum((polynomial_features["body_fat_rate"] - np.mean(polynomial_features["body_fat_rate"])) ** 2))
    health_standards = body_fat_df.withColumn("health_status", when((col("age") < 30) & (col("body_fat_rate") < 15), "healthy").when((col("age").between(30, 50)) & (col("body_fat_rate") < 18), "healthy").when((col("age") > 50) & (col("body_fat_rate") < 20), "healthy").otherwise("needs attention")).groupBy("health_status").count().collect()
    age_risk_matrix = spark.sql("SELECT CASE WHEN age < 30 THEN 'young' WHEN age < 50 THEN 'middle-aged' ELSE 'senior' END as life_stage, CASE WHEN body_fat_rate < 15 THEN 'normal' WHEN body_fat_rate < 25 THEN 'elevated' ELSE 'excessive' END as fat_status, COUNT(*) as count FROM age_bodyfat_view GROUP BY life_stage, fat_status").collect()
    result_data = {
        "age_groups": [{"group": row["age_group"], "avg_rate": float(row["avg_fat_rate"]), "min_rate": float(row["min_fat_rate"]), "max_rate": float(row["max_fat_rate"]), "count": row["sample_count"]} for row in age_group_analysis],
        "gender_cross": [{"group": row["age_group"], "gender": row["gender"], "avg_rate": float(row["avg_fat_rate"]), "std_rate": float(row["std_fat_rate"]) if row["std_fat_rate"] else 0} for row in age_gender_cross],
        "progression": [{"age": row["age"], "avg_rate": float(row["avg_rate"])} for row in age_progression],
        "polynomial": {"coefficients": age_coeffs.tolist(), "r_squared": float(r_squared)},
        "health_distribution": [{"status": row["health_status"], "count": row["count"]} for row in health_standards],
        "risk_matrix": [{"stage": row["life_stage"], "status": row["fat_status"], "count": row["count"]} for row in age_risk_matrix],
    }
    return JsonResponse(result_data)
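The age view above fits a quadratic with `np.polyfit` and scores it with R². The following sketch shows, in plain Python, how a polynomial with highest-degree-first coefficients is evaluated (the same ordering `np.polyval` uses) and how R² is computed from actual and predicted values. The coefficients and sample data are invented for this example, and the helper names `poly_eval` and `r_squared` are hypothetical.

```python
def poly_eval(coeffs, x):
    """Evaluate a polynomial with highest-degree-first coefficients (Horner's rule)."""
    result = 0.0
    for c in coeffs:
        result = result * x + c
    return result


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


if __name__ == "__main__":
    # Made-up quadratic: rate = 0.01 * age^2 - 0.5 * age + 20
    coeffs = [0.01, -0.5, 20.0]
    ages = [20, 30, 40, 50]
    observed = [14.5, 13.8, 16.4, 19.7]  # hypothetical measurements
    predicted = [poly_eval(coeffs, age) for age in ages]
    print(predicted, r_squared(observed, predicted))
```

An R² close to 1 means the curve explains most of the variance in body fat rate; a value near 0 means it does little better than predicting the mean.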

def bmi_bodyfat_consistency_analysis(request):
    # Load the measurement table over JDBC (connection settings are hard-coded for demonstration only).
    body_fat_df = spark.read.format("jdbc").option("url", "jdbc:mysql://localhost:3306/bodyfat_db").option("dbtable", "body_fat_data").option("user", "root").option("password", "password").load()
    # BMI = weight / height^2, assuming weight is stored in kg and height in cm. Both category columns share the same labels so they can be compared directly.
    bmi_bodyfat_df = (
        body_fat_df.withColumn("bmi", col("weight") / (col("height") / 100) ** 2)
        .withColumn("bmi_category", when(col("bmi") < 18.5, "underweight").when(col("bmi") < 24, "normal").when(col("bmi") < 28, "overweight").otherwise("obese"))
        .withColumn("bodyfat_category", when(col("body_fat_rate") < 15, "underweight").when(col("body_fat_rate") < 20, "normal").when(col("body_fat_rate") < 25, "overweight").otherwise("obese"))
    )
    bmi_bodyfat_df.createOrReplaceTempView("consistency_view")
    consistency_matrix = spark.sql("SELECT bmi_category, bodyfat_category, COUNT(*) as count FROM consistency_view GROUP BY bmi_category, bodyfat_category ORDER BY bmi_category, bodyfat_category").collect()
    consistency_rate = spark.sql("SELECT SUM(CASE WHEN bmi_category = bodyfat_category THEN 1 ELSE 0 END) * 100.0 / COUNT(*) as consistency_percentage FROM consistency_view").collect()[0]["consistency_percentage"]
    correlation_analysis = bmi_bodyfat_df.select(corr("bmi", "body_fat_rate").alias("correlation")).collect()[0]["correlation"]
    # Top 20 mismatches. ABS(bmi - body_fat_rate) compares values in different units, so it is only a rough ranking.
    inconsistency_cases = spark.sql("SELECT bmi, body_fat_rate, bmi_category, bodyfat_category, age, gender FROM consistency_view WHERE bmi_category != bodyfat_category ORDER BY ABS(bmi - body_fat_rate) DESC LIMIT 20").collect()
    gender_consistency = spark.sql("SELECT gender, SUM(CASE WHEN bmi_category = bodyfat_category THEN 1 ELSE 0 END) * 100.0 / COUNT(*) as gender_consistency FROM consistency_view GROUP BY gender").collect()
    age_group_consistency = spark.sql("SELECT CASE WHEN age < 30 THEN 'young' WHEN age < 50 THEN 'middle-aged' ELSE 'senior' END as age_group, SUM(CASE WHEN bmi_category = bodyfat_category THEN 1 ELSE 0 END) * 100.0 / COUNT(*) as age_consistency FROM consistency_view GROUP BY age_group").collect()
    # Linear regression of body fat rate on BMI, computed on the driver. Rows with missing values are dropped first.
    regression_data = bmi_bodyfat_df.select("bmi", "body_fat_rate").toPandas().dropna().astype(float)
    slope, intercept = np.polyfit(regression_data["bmi"], regression_data["body_fat_rate"], 1)
    predicted_bodyfat = slope * regression_data["bmi"] + intercept
    rmse = np.sqrt(np.mean((regression_data["body_fat_rate"] - predicted_bodyfat) ** 2))
    # Flag samples whose residual exceeds two standard deviations of all residuals.
    outlier_threshold = 2 * np.std(regression_data["body_fat_rate"] - predicted_bodyfat)
    outliers = regression_data[np.abs(regression_data["body_fat_rate"] - predicted_bodyfat) > outlier_threshold]
    result_data = {
        "consistency_matrix": [{"bmi_cat": row["bmi_category"], "bodyfat_cat": row["bodyfat_category"], "count": row["count"]} for row in consistency_matrix],
        "overall_consistency": float(consistency_rate),
        "correlation": float(correlation_analysis),
        "inconsistent_samples": [{"bmi": float(row["bmi"]), "bodyfat": float(row["body_fat_rate"]), "bmi_cat": row["bmi_category"], "bodyfat_cat": row["bodyfat_category"], "age": row["age"], "gender": row["gender"]} for row in inconsistency_cases],
        "gender_analysis": [{"gender": row["gender"], "consistency": float(row["gender_consistency"])} for row in gender_consistency],
        "age_analysis": [{"group": row["age_group"], "consistency": float(row["age_consistency"])} for row in age_group_consistency],
        "regression": {"slope": float(slope), "intercept": float(intercept), "rmse": float(rmse)},
        "outliers_count": len(outliers),
    }
    return JsonResponse(result_data)
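The regression step in the consistency view fits a straight line, measures the RMSE, and flags points whose residual is more than two standard deviations from the fit. The sketch below reproduces that idea in plain Python with the closed-form least-squares formulas, which for a degree-1 fit should agree with `np.polyfit` up to floating-point error. The helper names and the sample points are assumptions made for this illustration.

```python
from math import sqrt


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def residual_outliers(xs, ys, k=2.0):
    """Return (rmse, points whose |residual| exceeds k standard deviations)."""
    slope, intercept = fit_line(xs, ys)
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    n = len(residuals)
    rmse = sqrt(sum(r * r for r in residuals) / n)
    mean_r = sum(residuals) / n
    # Population standard deviation, matching NumPy's default np.std.
    std_r = sqrt(sum((r - mean_r) ** 2 for r in residuals) / n)
    flagged = [(x, y) for x, y, r in zip(xs, ys, residuals) if abs(r) > k * std_r]
    return rmse, flagged


if __name__ == "__main__":
    # Hypothetical points on y = 2x, with one reading pushed far off the line.
    xs = list(range(10))
    ys = [0, 2, 4, 6, 8, 30, 12, 14, 16, 18]
    print(fit_line(xs, ys))
    print(residual_outliers(xs, ys))
```

Note that an extreme point also pulls the fitted line toward itself, so this simple rule can miss outliers when several of them cluster together.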

6. Selected Documentation

[Screenshot of the project documentation; image not available in this copy]

7. END

💕💕 To get the source code, contact 计算机编程果茶熊