💖💖 Author: 计算机毕业设计小途 💙💙 About me: I have long worked as a computer-science trainer and enjoy teaching. I work mainly in Java, WeChat Mini Programs, Python, Golang, and Android, and my projects cover big data, deep learning, websites, mini programs, Android apps, and algorithms. I regularly take on custom project development, code walkthroughs, thesis-defense coaching, and documentation writing, and I also know some techniques for lowering similarity-check scores. I enjoy sharing solutions to problems I hit during development and discussing technology, so feel free to ask me about anything code-related! 💛💛 A word of thanks: thank you all for your attention and support! 💜💜 Website projects · Android/mini-program projects · Big-data projects · Deep-learning projects
# Resale Housing Online Contract Signing Monthly Statistics Visualization and Analysis System: Introduction
The Big-Data-Based Resale Housing Online Contract Signing Monthly Statistics Visualization and Analysis System is a comprehensive real-estate data analysis platform that integrates data processing, statistical analysis, and visualization. Its core architecture is built on the Hadoop Distributed File System and the Spark processing framework: HDFS provides distributed storage for large volumes of resale-housing contract data, while Spark and Spark SQL handle data cleaning, transformation, and analytical computation. Two back-end implementations are offered, Python + Django and Java + Spring Boot; the front end is built with Vue and the ElementUI component library, and Echarts supplies the data visualizations. The core functional modules cover contract information management, agency efficiency and quality analysis, agency market-structure analysis, macro market-trend analysis, and market-structure risk analysis, supporting multi-dimensional mining and statistical analysis of real-estate market data. On the data-processing side, Pandas and NumPy are used for preprocessing and numerical computation, with MySQL providing persistent storage for structured data. A large-screen dashboard turns complex market data into intuitive charts and reports, giving housing authorities, agencies, and market researchers scientific decision support, and completing the full analysis chain from raw contract records to deep market insight.
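The monthly-aggregation step at the start of this analysis chain can be sketched without the big-data stack at all. The following plain-Python sketch shows what the system computes per month; the field names (`contract_date`, `contract_amount`, `contract_status`) mirror the system's schema, and the sample records are hypothetical:

```python
from collections import defaultdict

# Hypothetical sample of signed-contract records, mirroring the contract_info fields.
contracts = [
    {"contract_date": "2024-01-15", "contract_amount": 120.0, "contract_status": "completed"},
    {"contract_date": "2024-01-20", "contract_amount": 90.0,  "contract_status": "pending"},
    {"contract_date": "2024-02-03", "contract_amount": 150.0, "contract_status": "completed"},
]

def monthly_statistics(records):
    """Group contracts by yyyy-MM and compute totals and the completion rate."""
    buckets = defaultdict(lambda: {"total_contracts": 0, "total_amount": 0.0, "completed": 0})
    for r in records:
        month = r["contract_date"][:7]          # "2024-01-15" -> "2024-01"
        b = buckets[month]
        b["total_contracts"] += 1
        b["total_amount"] += r["contract_amount"]
        b["completed"] += r["contract_status"] == "completed"
    for b in buckets.values():
        b["completion_rate"] = b["completed"] / b["total_contracts"] * 100
    return dict(sorted(buckets.items()))

stats = monthly_statistics(contracts)
print(stats["2024-01"])
# {'total_contracts': 2, 'total_amount': 210.0, 'completed': 1, 'completion_rate': 50.0}
```

On the real system the same grouping is done by Spark over the full dataset; the dictionary-of-months result here corresponds to one row per `contract_month` in the Spark output.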
# Resale Housing Online Contract Signing Monthly Statistics Visualization and Analysis System: Demo Video
# Resale Housing Online Contract Signing Monthly Statistics Visualization and Analysis System: Demo Screenshots
# Resale Housing Online Contract Signing Monthly Statistics Visualization and Analysis System: Code Walkthrough
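Before reading the full Spark implementation, it may help to see the weighted scoring idea behind the market-risk module in isolation. This is a minimal plain-Python sketch: the per-dimension point values (price 30/20/10, volume 25/15/5, liquidity 25/15/5, supply-demand 20/10/5) come from the code below, while the example district profile is invented:

```python
# Per-dimension weights: points awarded for high/medium/low risk in each dimension.
RISK_WEIGHTS = {
    "price":         {"high": 30, "medium": 20, "low": 10},
    "volume":        {"high": 25, "medium": 15, "low": 5},
    "liquidity":     {"high": 25, "medium": 15, "low": 5},
    "supply_demand": {"high": 20, "medium": 10, "low": 5},
}

def overall_risk(levels):
    """Sum the weighted dimension scores and map the total to an overall level."""
    score = sum(RISK_WEIGHTS[dim][level] for dim, level in levels.items())
    level = "high" if score > 70 else ("medium" if score > 40 else "low")
    return score, level

# Hypothetical district: volatile prices, otherwise a calm market.
score, level = overall_risk({"price": "high", "volume": "low",
                             "liquidity": "medium", "supply_demand": "low"})
print(score, level)  # 55 medium
```

The Spark job performs this same computation column-wise with chained `when(...)` expressions over every district/property-type combination.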
```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import (col, sum, count, avg, max, min, stddev, lag,
                                   datediff, date_format, when, desc, row_number)
from pyspark.sql.window import Window
from datetime import datetime, timedelta

spark = (SparkSession.builder.appName("ResaleHousingContractAnalysis")
         .config("spark.sql.adaptive.enabled", "true")
         .config("spark.sql.adaptive.coalescePartitions.enabled", "true")
         .getOrCreate())

def read_table(dbtable):
    return (spark.read.format("jdbc").option("url", "jdbc:mysql://localhost:3306/house_contract")
            .option("dbtable", dbtable).option("user", "root").option("password", "123456").load())

def analyze_contract_monthly_statistics():
    contract_df = read_table("contract_info").withColumn(
        "contract_month", date_format(col("contract_date"), "yyyy-MM"))
    monthly_stats = contract_df.groupBy("contract_month").agg(
        count("*").alias("total_contracts"), sum("contract_amount").alias("total_amount"),
        avg("contract_amount").alias("avg_amount"),
        count(when(col("contract_status") == "completed", 1)).alias("completed_contracts"))
    # The month-over-month trend compares each month with the previous month via lag().
    monthly_trend = (monthly_stats
        .withColumn("completion_rate", col("completed_contracts") / col("total_contracts") * 100)
        .withColumn("prev_amount", lag("total_amount", 1).over(Window.orderBy("contract_month")))
        .withColumn("amount_trend", when(col("total_amount") > col("prev_amount"), "rising")
                    .when(col("total_amount") < col("prev_amount"), "falling").otherwise("stable")))
    area_analysis = contract_df.groupBy("contract_month", "district").agg(
        count("*").alias("area_contracts"), sum("contract_amount").alias("area_amount"))
    hot_areas = area_analysis.filter(col("area_contracts") > 50).orderBy(desc("area_amount"))
    # Year-over-year comparison is keyed on the calendar month (MM): "yyyy-MM" keys
    # from this year and last year would never match.
    one_year_ago = (datetime.now() - timedelta(days=365)).strftime("%Y-%m-%d")
    two_years_ago = (datetime.now() - timedelta(days=730)).strftime("%Y-%m-%d")
    current_year = (contract_df.filter(col("contract_date") >= one_year_ago)
        .groupBy(date_format(col("contract_date"), "MM").alias("calendar_month"))
        .agg(count("*").alias("current_year_contracts")))
    last_year = (contract_df
        .filter((col("contract_date") >= two_years_ago) & (col("contract_date") < one_year_ago))
        .groupBy(date_format(col("contract_date"), "MM").alias("calendar_month"))
        .agg(count("*").alias("last_year_contracts")))
    growth_analysis = current_year.join(last_year, "calendar_month", "left_outer").withColumn(
        "growth_rate",
        (col("current_year_contracts") - col("last_year_contracts")) / col("last_year_contracts") * 100)
    price_analysis = contract_df.groupBy("contract_month").agg(
        avg("unit_price").alias("avg_unit_price"), max("unit_price").alias("max_unit_price"),
        min("unit_price").alias("min_unit_price"))
    final_result = (monthly_trend.withColumn("calendar_month", col("contract_month").substr(6, 2))
        .join(price_analysis, "contract_month").join(growth_analysis, "calendar_month", "left_outer"))
    return final_result.orderBy("contract_month").toPandas().to_dict("records")

def analyze_agency_efficiency_quality():
    agency_df = read_table("agency_contract_view")
    efficiency_metrics = (agency_df.groupBy("agency_id", "agency_name")
        .agg(count("*").alias("total_transactions"), avg("processing_days").alias("avg_processing_time"),
             count(when(col("processing_days") <= 7, 1)).alias("fast_transactions"),
             sum("commission_amount").alias("total_commission"))
        .withColumn("efficiency_score", col("fast_transactions") / col("total_transactions") * 100))
    quality_metrics = (agency_df.groupBy("agency_id", "agency_name")
        .agg(count("*").alias("rated_transactions"),
             count(when(col("customer_rating") >= 4, 1)).alias("high_rating_count"),
             count(when(col("contract_status") == "completed", 1)).alias("successful_contracts"),
             count(when(col("complaint_flag") == 1, 1)).alias("complaint_count"),
             avg("customer_rating").alias("avg_rating"))
        .withColumn("quality_score", col("high_rating_count") / col("rated_transactions") * 50
                    + col("successful_contracts") / col("rated_transactions") * 30
                    - col("complaint_count") / col("rated_transactions") * 20))
    performance_analysis = (efficiency_metrics.join(quality_metrics, ["agency_id", "agency_name"])
        .withColumn("comprehensive_score", col("efficiency_score") * 0.4 + col("quality_score") * 0.6))
    performance_ranking = (performance_analysis
        .withColumn("efficiency_rank", row_number().over(Window.orderBy(desc("efficiency_score"))))
        .withColumn("quality_rank", row_number().over(Window.orderBy(desc("quality_score")))))
    top_agencies = performance_analysis.filter(col("comprehensive_score") > 80).orderBy(desc("comprehensive_score"))
    problem_agencies = performance_analysis.filter((col("efficiency_score") < 60) | (col("quality_score") < 60))
    monthly_agency_stats = (agency_df
        .withColumn("performance_month", date_format(col("contract_date"), "yyyy-MM"))
        .groupBy("agency_id", "agency_name", "performance_month")
        .agg(count("*").alias("monthly_transactions"), avg("processing_days").alias("monthly_avg_time"),
             avg("customer_rating").alias("monthly_rating")))
    agency_window = Window.partitionBy("agency_id").orderBy("performance_month")
    trend_analysis = monthly_agency_stats.withColumn("performance_trend",
        when(col("monthly_transactions") > lag("monthly_transactions", 1).over(agency_window),
             "improving").otherwise("declining"))
    commission_analysis = (agency_df.groupBy("agency_id", "agency_name")
        .agg(sum("commission_amount").alias("total_revenue"),
             avg("commission_rate").alias("avg_commission_rate"), count("*").alias("transaction_volume"))
        .withColumn("revenue_per_transaction", col("total_revenue") / col("transaction_volume")))
    final_agency_result = performance_analysis.join(commission_analysis, ["agency_id", "agency_name"])
    return final_agency_result.orderBy(desc("comprehensive_score")).toPandas().to_dict("records")

def analyze_market_structure_risk():
    market_df = read_table("market_analysis_view")
    price_risk = (market_df.groupBy("district", "property_type")
        .agg(avg("unit_price").alias("avg_price"), stddev("unit_price").alias("price_volatility"),
             count("*").alias("transaction_count"), max("unit_price").alias("max_price"),
             min("unit_price").alias("min_price"))
        .withColumn("price_risk_level", when(col("price_volatility") > col("avg_price") * 0.3, "high")
                    .when(col("price_volatility") > col("avg_price") * 0.15, "medium").otherwise("low")))
    monthly_volume = (market_df
        .withColumn("transaction_month", date_format(col("contract_date"), "yyyy-MM"))
        .groupBy("transaction_month", "district")
        .agg(count("*").alias("monthly_volume"), sum("contract_amount").alias("monthly_amount")))
    volume_risk = (monthly_volume.groupBy("district")
        .agg(stddev("monthly_volume").alias("volume_volatility"),
             avg("monthly_volume").alias("avg_monthly_volume"))
        .withColumn("volume_risk_level",
            when(col("volume_volatility") > col("avg_monthly_volume") * 0.4, "high")
            .when(col("volume_volatility") > col("avg_monthly_volume") * 0.2, "medium").otherwise("low")))
    # Market concentration: combined share of all agencies that each hold more than 5%.
    total_transactions = market_df.count()
    market_concentration = (market_df.groupBy("agency_id").agg(count("*").alias("agency_transactions"))
        .withColumn("market_share", col("agency_transactions") / total_transactions * 100))
    top_agencies_share = (market_concentration.filter(col("market_share") > 5)
        .agg(sum("market_share").alias("top_share")).collect()[0]["top_share"]) or 0.0
    concentration_risk = "high" if top_agencies_share > 60 else ("medium" if top_agencies_share > 40 else "low")
    liquidity_risk = (market_df
        .withColumn("days_on_market", datediff(col("contract_date"), col("listing_date")))
        .groupBy("district", "property_type")
        .agg(avg("days_on_market").alias("avg_days_on_market"),
             count(when(col("days_on_market") <= 30, 1)).alias("quick_sales"),
             count("*").alias("total_listings"))
        .withColumn("liquidity_ratio", col("quick_sales") / col("total_listings") * 100)
        .withColumn("liquidity_risk_level", when(col("liquidity_ratio") < 20, "high")
                    .when(col("liquidity_ratio") < 50, "medium").otherwise("low")))
    supply_demand_risk = (market_df.groupBy("district")
        .agg(count(when(col("transaction_type") == "supply", 1)).alias("supply_count"),
             count(when(col("transaction_type") == "demand", 1)).alias("demand_count"))
        .withColumn("supply_demand_ratio", col("supply_count") / col("demand_count"))
        .withColumn("supply_demand_risk_level",
            when((col("supply_demand_ratio") > 1.5) | (col("supply_demand_ratio") < 0.5), "high")
            .when((col("supply_demand_ratio") > 1.2) | (col("supply_demand_ratio") < 0.8), "medium")
            .otherwise("low")))
    comprehensive_risk = (price_risk.join(volume_risk, "district")
        .join(liquidity_risk, ["district", "property_type"]).join(supply_demand_risk, "district"))
    # Weighted score: price 30/20/10, volume 25/15/5, liquidity 25/15/5, supply-demand 20/10/5.
    risk_scored = comprehensive_risk.withColumn("risk_score",
        when(col("price_risk_level") == "high", 30).when(col("price_risk_level") == "medium", 20).otherwise(10)
        + when(col("volume_risk_level") == "high", 25).when(col("volume_risk_level") == "medium", 15).otherwise(5)
        + when(col("liquidity_risk_level") == "high", 25).when(col("liquidity_risk_level") == "medium", 15).otherwise(5)
        + when(col("supply_demand_risk_level") == "high", 20).when(col("supply_demand_risk_level") == "medium", 10).otherwise(5))
    final_risk = risk_scored.withColumn("overall_risk_level",
        when(col("risk_score") > 70, "high").when(col("risk_score") > 40, "medium").otherwise("low"))
    return final_risk.orderBy(desc("risk_score")).toPandas().to_dict("records")
```
# Resale Housing Online Contract Signing Monthly Statistics Visualization and Analysis System: Documentation

[Website Projects](https://blog.csdn.net/2501_92808674/category_13011385.html)
[Android / Mini-Program Projects](https://blog.csdn.net/2501_92808674/category_13011386.html)
[Big Data Projects](https://blog.csdn.net/2501_92808674/category_13011387.html)
[Deep Learning Projects](https://blog.csdn.net/2501_92808674/category_13011390.html)