一、About the Author
💖💖Author: 计算机编程果茶熊 💙💙About me: I spent years teaching computer-science courses as a programming instructor, and I still enjoy teaching. I work across Java, WeChat Mini Programs, Python, Golang, Android, and several other IT areas. I take on custom project development, code walkthroughs, thesis-defense coaching, and documentation writing, and I know some techniques for reducing similarity scores. I like sharing fixes for problems I hit during development and talking shop, so feel free to ask me anything about code! 💛💛A word of thanks: thank you all for following and supporting me! 💜💜 Web projects · Android/Mini Program projects · Big data projects · Capstone topic ideas 💕💕Contact 计算机编程果茶熊 at the end of this article for the source code
二、System Overview
Big data framework: Hadoop + Spark (Hive supported via customization)
Languages: Java + Python (both versions available)
Database: MySQL
Back-end frameworks: SpringBoot (Spring + SpringMVC + MyBatis) + Django (both versions available)
Front end: Vue + Echarts + HTML + CSS + JavaScript + jQuery
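For the Django version of the stack, the analysis views shown later in this post could be exposed as REST endpoints with a routing module along these lines. This is a hypothetical sketch; the URL paths and module layout are illustrative, not taken from the project.

```python
# urls.py -- hypothetical routing for the three analysis endpoints
from django.urls import path
from . import views  # assumes the analysis views live in views.py

urlpatterns = [
    path('api/price-trend/', views.price_trend_analysis),
    path('api/production-yield/', views.production_yield_analysis),
    path('api/disaster-impact/', views.disaster_impact_analysis),
]
```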
The Grain Crop Data Visualization and Analysis System is a comprehensive agricultural data analysis platform built on big data technology. It uses the Hadoop + Spark framework as its foundation and is developed in Python, forming a complete data processing and analysis pipeline. The back end is built on Django for stable service support, while the front end uses the Vue + ElementUI + Echarts stack for intuitive data visualization. On the data-processing side, the system stores large volumes of agricultural data on the HDFS distributed file system, runs queries and computations efficiently through Spark SQL, performs fine-grained analysis with Pandas and NumPy, and manages structured business data in MySQL. The core features span five modules: price trend analysis, production and yield analysis, disaster impact analysis, macroeconomic correlation analysis, and integrated price-yield-benefit analysis. Together they provide an end-to-end workflow from data collection through processing and analysis to visualization, helping agricultural agencies and research institutions understand crop market dynamics, grasp production patterns, and formulate sound agricultural development strategies.
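The price-trend module's one-step forecast boils down to fitting a least-squares line to the monthly average prices and extrapolating one step. A minimal standalone NumPy sketch of that idea (the sample prices are made up for illustration):

```python
import numpy as np

def forecast_next_month(monthly_avg_prices):
    """One-step linear-trend forecast, mirroring the approach in the price module."""
    prices = np.asarray(monthly_avg_prices, dtype=float)
    if len(prices) == 0:
        return 0.0
    if len(prices) <= 3:
        # Too few points for a meaningful trend fit; fall back to the mean
        return float(np.mean(prices))
    # Slope of the least-squares line through (month index, price)
    slope = np.polyfit(np.arange(len(prices)), prices, 1)[0]
    return float(prices[-1] + slope)

# Hypothetical monthly averages for a steadily rising series
print(forecast_next_month([2.0, 2.1, 2.2, 2.3]))
```

With fewer than four data points the fit degenerates, which is why the module (and this sketch) falls back to the plain mean.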
三、Video Walkthrough
四、Feature Highlights
五、Code Highlights
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, avg, sum, count, when, year, month, lag, percent_rank, date_format
from pyspark.sql.window import Window
import pandas as pd
import numpy as np
from django.http import JsonResponse
from django.views.decorators.http import require_http_methods

spark = (SparkSession.builder
         .appName("GrainAnalysisSystem")
         .config("spark.sql.adaptive.enabled", "true")
         .config("spark.sql.adaptive.coalescePartitions.enabled", "true")
         .getOrCreate())

@require_http_methods(["GET"])
def price_trend_analysis(request):
    grain_type = request.GET.get('grain_type', 'wheat')
    start_date = request.GET.get('start_date')
    end_date = request.GET.get('end_date')
    # NOTE: request parameters should be validated before interpolation to avoid SQL injection
    price_df = spark.sql(f"SELECT date, price, market_location FROM grain_prices WHERE grain_type='{grain_type}' AND date BETWEEN '{start_date}' AND '{end_date}'")
    # date_format yields a sortable "yyyy-MM" key (plain string concatenation of Columns would not work here)
    monthly_prices = price_df.withColumn("year_month", date_format(col("date"), "yyyy-MM")) \
        .groupBy("year_month", "market_location") \
        .agg(avg("price").alias("avg_price"), count("price").alias("price_count"))
    # Month-over-month price change within each market
    price_trend = monthly_prices.withColumn("price_change", col("avg_price") - lag("avg_price", 1).over(Window.partitionBy("market_location").orderBy("year_month")))
    price_volatility = price_trend.withColumn("volatility_level", when(col("price_change") > 0.1, "high").when(col("price_change") < -0.1, "low").otherwise("stable"))
    regional_comparison = price_trend.groupBy("year_month").agg(avg("avg_price").alias("national_avg"), count("market_location").alias("market_count"))
    price_ranking = price_trend.withColumn("price_rank", percent_rank().over(Window.partitionBy("year_month").orderBy(col("avg_price").desc())))
    seasonal_pattern = price_df.withColumn("month", month(col("date"))).groupBy("month").agg(avg("price").alias("seasonal_avg"))
    price_forecasting_data = price_trend.select("year_month", "avg_price").orderBy("year_month").collect()
    prices_array = np.array([row.avg_price for row in price_forecasting_data])
    # Naive linear-trend forecast: fit a line to the monthly averages and extrapolate one step
    if len(prices_array) > 3:
        trend_slope = np.polyfit(range(len(prices_array)), prices_array, 1)[0]
        next_month_prediction = prices_array[-1] + trend_slope
    else:
        next_month_prediction = np.mean(prices_array) if len(prices_array) > 0 else 0
    result_data = {
        'trend_data': [{'month': row.year_month, 'price': row.avg_price, 'change': row.price_change} for row in price_trend.collect()],
        # Convert Row objects to plain dicts (Rows are not JSON-serializable);
        # row['count'] avoids clashing with the tuple method Row.count()
        'volatility_analysis': [{'level': row.volatility_level, 'count': row['count']} for row in price_volatility.groupBy("volatility_level").count().collect()],
        'regional_comparison': [{'month': row.year_month, 'avg': row.national_avg} for row in regional_comparison.collect()],
        'seasonal_pattern': [{'month': row.month, 'avg': row.seasonal_avg} for row in seasonal_pattern.collect()],
        'price_prediction': next_month_prediction
    }
    return JsonResponse(result_data)
@require_http_methods(["GET"])
def production_yield_analysis(request):
    region = request.GET.get('region', 'all')
    crop_type = request.GET.get('crop_type', 'wheat')
    analysis_year = request.GET.get('year', '2023')
    production_df = spark.sql(f"SELECT region, crop_type, planting_area, yield_per_hectare, total_production, production_cost FROM crop_production WHERE crop_type='{crop_type}' AND year='{analysis_year}'")
    if region != 'all':
        production_df = production_df.filter(col("region") == region)
    yield_efficiency = production_df.withColumn("efficiency_ratio", col("total_production") / col("planting_area")).withColumn("cost_per_unit", col("production_cost") / col("total_production"))
    regional_ranking = yield_efficiency.withColumn("yield_rank", percent_rank().over(Window.orderBy(col("yield_per_hectare").desc()))).withColumn("efficiency_rank", percent_rank().over(Window.orderBy(col("efficiency_ratio").desc())))
    productivity_analysis = yield_efficiency.groupBy("region").agg(avg("yield_per_hectare").alias("avg_yield"), sum("total_production").alias("regional_total"), avg("efficiency_ratio").alias("avg_efficiency"))
    # 3.5 is an assumed unit selling price; in production it should come from configuration or the prices table
    cost_benefit_analysis = yield_efficiency.withColumn("profit_margin", (col("total_production") * 3.5 - col("production_cost")) / col("production_cost"))
    climate_impact_df = spark.sql(f"SELECT c.region, c.total_production, w.temperature, w.precipitation, w.disaster_days FROM crop_production c JOIN weather_data w ON c.region = w.region WHERE c.crop_type='{crop_type}' AND c.year='{analysis_year}'")
    climate_correlation = climate_impact_df.select("total_production", "temperature", "precipitation", "disaster_days").toPandas()
    # Pearson correlation needs at least two samples
    if len(climate_correlation) > 1:
        temp_correlation = np.corrcoef(climate_correlation['total_production'], climate_correlation['temperature'])[0, 1]
        precip_correlation = np.corrcoef(climate_correlation['total_production'], climate_correlation['precipitation'])[0, 1]
        disaster_correlation = np.corrcoef(climate_correlation['total_production'], climate_correlation['disaster_days'])[0, 1]
    else:
        temp_correlation = precip_correlation = disaster_correlation = 0
    yield_prediction_data = productivity_analysis.select("avg_yield").collect()
    yield_values = np.array([row.avg_yield for row in yield_prediction_data])
    # Simple forecast: assume a 2% year-over-year yield improvement
    yield_forecast = np.mean(yield_values) * 1.02 if len(yield_values) > 0 else 0
    # Descending order puts the best yields at rank 0, so the top ~30% have yield_rank <= 0.3
    optimal_regions = regional_ranking.filter(col("yield_rank") <= 0.3).select("region", "yield_per_hectare", "efficiency_ratio").collect()
    result_data = {
        'yield_analysis': [{'region': row.region, 'yield': row.avg_yield, 'total': row.regional_total, 'efficiency': row.avg_efficiency} for row in productivity_analysis.collect()],
        'cost_benefit': [{'region': row.region, 'cost_per_unit': row.cost_per_unit, 'profit_margin': row.profit_margin} for row in cost_benefit_analysis.collect()],
        'climate_impact': {'temperature_correlation': temp_correlation, 'precipitation_correlation': precip_correlation, 'disaster_correlation': disaster_correlation},
        'yield_forecast': yield_forecast,
        'optimal_regions': [{'region': row.region, 'yield': row.yield_per_hectare, 'efficiency': row.efficiency_ratio} for row in optimal_regions]
    }
    return JsonResponse(result_data)
@require_http_methods(["GET"])
def disaster_impact_analysis(request):
    disaster_type = request.GET.get('disaster_type', 'all')
    affected_region = request.GET.get('region', 'all')
    time_range = request.GET.get('time_range', '2023')
    disaster_df = spark.sql(f"SELECT disaster_type, affected_region, disaster_date, affected_area, crop_loss_rate, economic_loss FROM disaster_records WHERE YEAR(disaster_date)='{time_range}'")
    if disaster_type != 'all':
        disaster_df = disaster_df.filter(col("disaster_type") == disaster_type)
    if affected_region != 'all':
        disaster_df = disaster_df.filter(col("affected_region") == affected_region)
    disaster_frequency = disaster_df.groupBy("disaster_type", "affected_region").agg(count("disaster_date").alias("frequency"), sum("affected_area").alias("total_affected_area"), avg("crop_loss_rate").alias("avg_loss_rate"))
    severity_analysis = disaster_df.withColumn("severity_level", when(col("crop_loss_rate") > 0.3, "severe").when(col("crop_loss_rate") > 0.1, "moderate").otherwise("mild"))
    severity_distribution = severity_analysis.groupBy("disaster_type", "severity_level").count()
    economic_impact = disaster_df.groupBy("affected_region").agg(sum("economic_loss").alias("total_economic_loss"), avg("economic_loss").alias("avg_economic_loss"))
    recovery_analysis_df = spark.sql(f"SELECT d.affected_region, d.disaster_date, d.crop_loss_rate, p.total_production as post_disaster_production FROM disaster_records d JOIN crop_production p ON d.affected_region = p.region WHERE YEAR(d.disaster_date)='{time_range}' AND p.year='{time_range}'")
    recovery_rate = recovery_analysis_df.withColumn("expected_production", col("post_disaster_production") / (1 - col("crop_loss_rate"))).withColumn("recovery_efficiency", col("post_disaster_production") / col("expected_production"))
    seasonal_disaster_pattern = disaster_df.withColumn("disaster_month", month(col("disaster_date"))).groupBy("disaster_month", "disaster_type").count().orderBy("disaster_month")
    # Composite risk score: how often a disaster strikes times how much it destroys
    risk_assessment = disaster_frequency.withColumn("risk_score", col("frequency") * col("avg_loss_rate")).withColumn("risk_level", when(col("risk_score") > 0.5, "high").when(col("risk_score") > 0.2, "medium").otherwise("low"))
    vulnerable_areas = risk_assessment.filter(col("risk_level") == "high").select("affected_region", "disaster_type", "risk_score").collect()
    historical_trend_df = spark.sql("SELECT YEAR(disaster_date) as year, disaster_type, COUNT(*) as yearly_count, AVG(crop_loss_rate) as yearly_avg_loss FROM disaster_records GROUP BY YEAR(disaster_date), disaster_type ORDER BY year")
    trend_data = historical_trend_df.collect()
    # Classify the long-term trend from the slope of yearly disaster counts
    if len(trend_data) > 2:
        yearly_counts = np.array([row.yearly_count for row in trend_data])
        trend_slope = np.polyfit(range(len(yearly_counts)), yearly_counts, 1)[0]
        disaster_trend = "increasing" if trend_slope > 0 else "decreasing"
    else:
        disaster_trend = "stable"
    mitigation_effectiveness = recovery_rate.groupBy("affected_region").agg(avg("recovery_efficiency").alias("avg_recovery_rate"))
    result_data = {
        'disaster_frequency': [{'type': row.disaster_type, 'region': row.affected_region, 'frequency': row.frequency, 'affected_area': row.total_affected_area, 'loss_rate': row.avg_loss_rate} for row in disaster_frequency.collect()],
        # row['count'] rather than row.count: Row inherits from tuple, whose count() method shadows the column
        'severity_analysis': [{'type': row.disaster_type, 'severity': row.severity_level, 'count': row['count']} for row in severity_distribution.collect()],
        'economic_impact': [{'region': row.affected_region, 'total_loss': row.total_economic_loss, 'avg_loss': row.avg_economic_loss} for row in economic_impact.collect()],
        'recovery_analysis': [{'region': row.affected_region, 'recovery_rate': row.recovery_efficiency} for row in recovery_rate.collect()],
        'seasonal_pattern': [{'month': row.disaster_month, 'type': row.disaster_type, 'count': row['count']} for row in seasonal_disaster_pattern.collect()],
        'risk_assessment': [{'region': row.affected_region, 'type': row.disaster_type, 'risk_score': row.risk_score} for row in vulnerable_areas],
        'disaster_trend': disaster_trend,
        'mitigation_effectiveness': [{'region': row.affected_region, 'recovery_rate': row.avg_recovery_rate} for row in mitigation_effectiveness.collect()]
    }
    return JsonResponse(result_data)
六、Documentation Highlights
七、END
💕💕Contact 计算机编程果茶熊 for the source code