Still stressing over your computer science capstone project? An ocean plastic pollution data analysis system to the rescue


💖💖 Author: 计算机毕业设计小途 💙💙 About me: I have long worked in computer science training and genuinely enjoy teaching. I'm proficient in Java, WeChat Mini Programs, Python, Golang, and Android, and my projects cover big data, deep learning, websites, mini programs, Android apps, and algorithms. I regularly take on custom project development, code walkthroughs, thesis-defense coaching, and documentation writing, and I also know some techniques for reducing similarity-check scores. I enjoy sharing solutions to problems I've hit during development and talking shop, so feel free to ask me about anything code related! 💛💛 A few words: thank you all for your attention and support! 💜💜 Website projects | Android/Mini Program projects | Big data projects | Deep learning projects

@TOC

Introduction to the Ocean Plastic Pollution Data Analysis and Visualization System

The big-data-based Ocean Plastic Pollution Data Analysis and Visualization System is an environmental monitoring and analysis platform built on a modern big data stack. Its core architecture is the Hadoop + Spark processing framework: large volumes of ocean pollution data are stored on the HDFS distributed file system, while Spark's in-memory computing and Spark SQL handle efficient data processing and analysis. The system supports two development stacks, Python + Django and Java + SpringBoot; the front end is built with Vue + ElementUI and uses the Echarts charting library for rich data visualizations, and back-end storage is managed in a MySQL database.

The functional modules cover a system home page, personal profile management, entry and management of ocean plastic pollution records, multi-dimensional analysis of pollution over time, analysis of pollution distribution by region, breakdowns of plastic source composition, and a comprehensive ocean pollution analysis, plus a dedicated large-screen visualization module that presents the distribution patterns, trends, and source makeup of ocean plastic pollution at a glance. By integrating Python data analysis libraries such as Pandas and NumPy, the system supports deeper mining and analysis of ocean plastic pollution data and provides a data-driven basis for marine environmental protection decisions, making it a technically solid, fully featured big data analysis application.
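To make the data layer of this architecture concrete, the sketch below shows, under stated assumptions, how raw pollution records stored on HDFS could be loaded with Spark, summarized with Spark SQL, and written to MySQL for the web layer to query. It is an illustrative example rather than the project's actual code; the HDFS path, table and column names, and database credentials are all assumed.

from pyspark.sql import SparkSession

# Build a Spark session; in a real Hadoop deployment the HDFS address comes from cluster config.
spark = (SparkSession.builder
         .appName("OceanPlasticPollutionETL")
         .getOrCreate())

# Hypothetical HDFS path and CSV layout: one row per sampling record.
raw_df = (spark.read
          .option("header", "true")
          .option("inferSchema", "true")
          .csv("hdfs://namenode:9000/ocean_pollution/raw/*.csv"))

# Register the data as a temporary view so it can be queried with Spark SQL.
raw_df.createOrReplaceTempView("pollution_raw")

# Example Spark SQL aggregation: average pollution level per plastic type.
summary_df = spark.sql("""
    SELECT plastic_type,
           COUNT(*)             AS sample_count,
           AVG(pollution_level) AS avg_pollution
    FROM pollution_raw
    GROUP BY plastic_type
    ORDER BY avg_pollution DESC
""")

# Persist the aggregated result to MySQL for the Django/Vue front end to read.
(summary_df.write
    .format("jdbc")
    .option("url", "jdbc:mysql://localhost:3306/ocean_pollution")
    .option("dbtable", "plastic_type_summary")
    .option("user", "root")
    .option("password", "password")
    .mode("overwrite")
    .save())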

Ocean Plastic Pollution Data Analysis and Visualization System: Demo Video

Demo video

Ocean Plastic Pollution Data Analysis and Visualization System: Screenshots

Login page

Ocean plastic pollution data management

Comprehensive ocean pollution analysis

Data dashboard (large-screen visualization)

Plastic source composition analysis

Pollution regional distribution analysis

Pollution time-scale analysis

User management

Ocean Plastic Pollution Data Analysis and Visualization System: Code Showcase

from pyspark.sql import SparkSession
from pyspark.sql.functions import col, sum as spark_sum, avg, count, when, date_format, desc
from datetime import datetime, timedelta
from django.http import JsonResponse
import json

# Shared Spark session used by all analysis views, with adaptive query execution enabled.
spark = (SparkSession.builder
         .appName("OceanPlasticPollutionAnalysis")
         .config("spark.sql.adaptive.enabled", "true")
         .config("spark.sql.adaptive.coalescePartitions.enabled", "true")
         .getOrCreate())

def ocean_pollution_data_management(request):
    if request.method == 'POST':
        # Parse the submitted pollution record and build a single-row Spark DataFrame.
        data = json.loads(request.body)
        pollution_data = spark.createDataFrame([
            (data.get('location_id'), data.get('longitude'), data.get('latitude'),
             data.get('pollution_level'), data.get('plastic_type'), data.get('collection_date'),
             data.get('pollution_source'), data.get('depth'), data.get('temperature'))
        ], ['location_id', 'longitude', 'latitude', 'pollution_level', 'plastic_type',
            'collection_date', 'pollution_source', 'depth', 'temperature'])
        # Append the new record to the MySQL table via JDBC.
        (pollution_data.write.format("jdbc").mode('append')
            .option("driver", "com.mysql.cj.jdbc.Driver")
            .option("url", "jdbc:mysql://localhost:3306/ocean_pollution")
            .option("dbtable", "pollution_data")
            .option("user", "root").option("password", "password")
            .save())
        # Reload the full table and compute summary statistics.
        pollution_df = (spark.read.format("jdbc")
            .option("driver", "com.mysql.cj.jdbc.Driver")
            .option("url", "jdbc:mysql://localhost:3306/ocean_pollution")
            .option("dbtable", "pollution_data")
            .option("user", "root").option("password", "password")
            .load())
        total_records = pollution_df.count()
        avg_pollution = pollution_df.agg(avg(col("pollution_level")).alias("avg_pollution")).collect()[0]["avg_pollution"]
        # Record count and average pollution level per plastic type.
        pollution_distribution = (pollution_df.groupBy("plastic_type")
            .agg(count("*").alias("count"), avg("pollution_level").alias("avg_level"))
            .orderBy(desc("count")))
        result_data = pollution_distribution.collect()
        processed_result = [{"plastic_type": row["plastic_type"], "count": row["count"], "avg_level": float(row["avg_level"])} for row in result_data]
        # Basic data-quality check: pollution level within [0, 100] and coordinates within valid ranges.
        validation_df = pollution_df.filter(
            (col("pollution_level") >= 0) & (col("pollution_level") <= 100) &
            (col("longitude").between(-180, 180)) & (col("latitude").between(-90, 90)))
        valid_count = validation_df.count()
        success_rate = (valid_count / total_records) * 100 if total_records > 0 else 0
        return JsonResponse({"status": "success", "total_records": total_records, "avg_pollution": float(avg_pollution), "pollution_distribution": processed_result, "data_quality": success_rate, "message": "海洋塑料污染数据管理成功"})
    elif request.method == 'GET':
        # Load the table and keep only records collected in the last 30 days.
        pollution_df = (spark.read.format("jdbc")
            .option("driver", "com.mysql.cj.jdbc.Driver")
            .option("url", "jdbc:mysql://localhost:3306/ocean_pollution")
            .option("dbtable", "pollution_data")
            .option("user", "root").option("password", "password")
            .load())
        recent_data = pollution_df.filter(col("collection_date") >= (datetime.now() - timedelta(days=30)).strftime('%Y-%m-%d'))
        # Per-location record counts and pollution statistics, worst locations first.
        location_stats = (recent_data.groupBy("location_id")
            .agg(count("*").alias("record_count"), avg("pollution_level").alias("avg_pollution"),
                 spark_sum("pollution_level").alias("total_pollution"))
            .orderBy(desc("avg_pollution")))
        location_result = location_stats.collect()
        formatted_locations = [{"location_id": row["location_id"], "record_count": row["record_count"], "avg_pollution": float(row["avg_pollution"]), "total_pollution": float(row["total_pollution"])} for row in location_result]
        return JsonResponse({"status": "success", "locations": formatted_locations, "message": "获取海洋污染数据成功"})

def pollution_regional_distribution_analysis(request):
    # Load all pollution records from MySQL.
    pollution_df = (spark.read.format("jdbc").option("driver", "com.mysql.cj.jdbc.Driver")
        .option("url", "jdbc:mysql://localhost:3306/ocean_pollution").option("dbtable", "pollution_data")
        .option("user", "root").option("password", "password").load())
    # Bucket each sample into a longitude zone and a latitude zone.
    regional_df = (pollution_df
        .withColumn("longitude_zone",
            when(col("longitude") < -60, "西大西洋")
            .when(col("longitude") < 20, "东大西洋")
            .when(col("longitude") < 100, "印度洋西部")
            .when(col("longitude") < 180, "太平洋"))
        .withColumn("latitude_zone",
            when(col("latitude") > 60, "北极海域")
            .when(col("latitude") > 23.5, "北温带")
            .when(col("latitude") > -23.5, "热带")
            .when(col("latitude") > -60, "南温带")
            .otherwise("南极海域")))
    # Aggregate pollution, depth, and temperature statistics per region.
    regional_stats = (regional_df.groupBy("longitude_zone", "latitude_zone")
        .agg(count("*").alias("sample_count"), avg("pollution_level").alias("avg_pollution"),
             spark_sum("pollution_level").alias("total_pollution"),
             avg("depth").alias("avg_depth"), avg("temperature").alias("avg_temperature"))
        .orderBy(desc("avg_pollution")))
    regional_result = regional_stats.collect()
    processed_regions = []
    for row in regional_result:
        if row["longitude_zone"] and row["latitude_zone"]:
            region_data = {"region": f"{row['longitude_zone']}-{row['latitude_zone']}", "sample_count": row["sample_count"], "avg_pollution": float(row["avg_pollution"]), "total_pollution": float(row["total_pollution"]), "avg_depth": float(row["avg_depth"]) if row["avg_depth"] else 0, "avg_temperature": float(row["avg_temperature"]) if row["avg_temperature"] else 0}
            processed_regions.append(region_data)
    # Average pollution load per sample for each longitude zone.
    pollution_density = (regional_df.groupBy("longitude_zone")
        .agg((spark_sum("pollution_level") / count("*")).alias("pollution_density"))
        .orderBy(desc("pollution_density")))
    density_result = pollution_density.collect()
    density_data = [{"zone": row["longitude_zone"], "density": float(row["pollution_density"])} for row in density_result if row["longitude_zone"]]
    # Hotspots: regions with the most samples above a pollution level of 70.
    hotspot_analysis = (regional_df.filter(col("pollution_level") > 70)
        .groupBy("longitude_zone", "latitude_zone")
        .agg(count("*").alias("hotspot_count"))
        .orderBy(desc("hotspot_count")))
    hotspot_result = hotspot_analysis.collect()
    hotspot_data = [{"region": f"{row['longitude_zone']}-{row['latitude_zone']}", "hotspot_count": row["hotspot_count"]} for row in hotspot_result[:10] if row["longitude_zone"] and row["latitude_zone"]]
    # Correlation between pollution level, depth, and temperature (computed in pandas).
    correlation_analysis = regional_df.select("pollution_level", "depth", "temperature").toPandas()
    correlation_matrix = correlation_analysis.corr().to_dict()
    return JsonResponse({"status": "success", "regional_distribution": processed_regions, "pollution_density": density_data, "pollution_hotspots": hotspot_data, "environmental_correlation": correlation_matrix, "message": "海洋污染区域分布分析完成"})

def ocean_pollution_comprehensive_analysis(request):
    # Load all pollution records from MySQL.
    pollution_df = (spark.read.format("jdbc").option("driver", "com.mysql.cj.jdbc.Driver")
        .option("url", "jdbc:mysql://localhost:3306/ocean_pollution").option("dbtable", "pollution_data")
        .option("user", "root").option("password", "password").load())
    # Overall summary statistics across every sample.
    comprehensive_stats = pollution_df.agg(
        count("*").alias("total_samples"), avg("pollution_level").alias("overall_avg_pollution"),
        spark_sum("pollution_level").alias("total_pollution_load"), avg("depth").alias("avg_sampling_depth"),
        avg("temperature").alias("avg_water_temperature")).collect()[0]
    # Contribution of each pollution source to the total pollution load.
    source_analysis = (pollution_df.groupBy("pollution_source")
        .agg(count("*").alias("source_count"), avg("pollution_level").alias("avg_source_pollution"),
             spark_sum("pollution_level").alias("total_source_pollution"))
        .orderBy(desc("total_source_pollution")))
    source_result = source_analysis.collect()
    source_contribution = [{"source": row["pollution_source"], "count": row["source_count"], "avg_pollution": float(row["avg_source_pollution"]), "total_pollution": float(row["total_source_pollution"]), "contribution_rate": (float(row["total_source_pollution"]) / float(comprehensive_stats["total_pollution_load"])) * 100} for row in source_result if row["pollution_source"]]
    # Share of each plastic type among all samples.
    plastic_type_analysis = (pollution_df.groupBy("plastic_type")
        .agg(count("*").alias("type_count"), avg("pollution_level").alias("avg_type_pollution"))
        .orderBy(desc("type_count")))
    plastic_result = plastic_type_analysis.collect()
    plastic_distribution = [{"plastic_type": row["plastic_type"], "count": row["type_count"], "avg_pollution": float(row["avg_type_pollution"]), "percentage": (row["type_count"] / comprehensive_stats["total_samples"]) * 100} for row in plastic_result if row["plastic_type"]]
    # Classify each sample into a severity band and count samples per band.
    severity_classification = (pollution_df
        .withColumn("severity_level",
            when(col("pollution_level") <= 30, "轻度污染")
            .when(col("pollution_level") <= 60, "中度污染")
            .when(col("pollution_level") <= 80, "重度污染")
            .otherwise("严重污染"))
        .groupBy("severity_level").agg(count("*").alias("level_count")).orderBy(desc("level_count")))
    severity_result = severity_classification.collect()
    severity_data = [{"severity": row["severity_level"], "count": row["level_count"], "percentage": (row["level_count"] / comprehensive_stats["total_samples"]) * 100} for row in severity_result]
    # Monthly averages to show how pollution levels change over time.
    temporal_trend = (pollution_df.withColumn("collection_month", date_format(col("collection_date"), "yyyy-MM"))
        .groupBy("collection_month")
        .agg(avg("pollution_level").alias("monthly_avg_pollution"), count("*").alias("monthly_samples"))
        .orderBy("collection_month"))
    temporal_result = temporal_trend.collect()
    trend_data = [{"month": row["collection_month"], "avg_pollution": float(row["monthly_avg_pollution"]), "sample_count": row["monthly_samples"]} for row in temporal_result]
    # Correlation between sampling depth and pollution level (computed in pandas).
    depth_correlation = pollution_df.select("depth", "pollution_level").toPandas()
    depth_corr_coefficient = depth_correlation.corr().iloc[0, 1] if len(depth_correlation) > 1 else 0
    comprehensive_summary = {"total_samples": comprehensive_stats["total_samples"], "overall_avg_pollution": float(comprehensive_stats["overall_avg_pollution"]), "total_pollution_load": float(comprehensive_stats["total_pollution_load"]), "avg_sampling_depth": float(comprehensive_stats["avg_sampling_depth"]), "avg_water_temperature": float(comprehensive_stats["avg_water_temperature"]), "depth_pollution_correlation": float(depth_corr_coefficient)}
    return JsonResponse({"status": "success", "comprehensive_summary": comprehensive_summary, "source_analysis": source_contribution, "plastic_distribution": plastic_distribution, "severity_classification": severity_data, "temporal_trends": trend_data, "message": "海洋污染综合分析完成"})
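
Since the views above follow Django's function-based view pattern, the snippet below is a small hypothetical sketch of how they might be wired up as API endpoints for the Vue/Echarts front end. The module path, URL patterns, and route names are assumptions for illustration, not the project's actual routing configuration.

# urls.py (hypothetical module; adjust to the actual Django app layout)
from django.urls import path

from . import views  # assumes the view functions above live in this app's views.py

urlpatterns = [
    # Data entry (POST) and recent per-location statistics (GET) share one endpoint.
    path('api/pollution/data/', views.ocean_pollution_data_management, name='pollution_data'),
    # Regional distribution analysis consumed by the map and bar charts.
    path('api/pollution/regional/', views.pollution_regional_distribution_analysis, name='pollution_regional'),
    # Comprehensive analysis backing the large-screen dashboard.
    path('api/pollution/comprehensive/', views.ocean_pollution_comprehensive_analysis, name='pollution_comprehensive'),
]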

Ocean Plastic Pollution Data Analysis and Visualization System: Documentation

Project documentation
