Recommended by 985-University Advisors: A Graduation Project Guide to a Car Brand Complaint Data Analysis System Built with Python + Vue | Graduation Project | CS Capstone | Program Development | Hands-On Project


Preface

💖💖 Author: Programmer Xiaoyang (计算机程序员小杨) 💙💙 About me: I work in the computer field and specialize in Java, WeChat Mini Programs, Python, Golang, Android, and several other IT directions. I take on custom project development, code walkthroughs, thesis-defense coaching, and documentation writing, and I also know some techniques for reducing plagiarism-check rates. I love technology, enjoy exploring new tools and frameworks, and like solving real problems with code, so feel free to ask me anything about code! 💛💛 A word of thanks: thank you all for your attention and support! 💕💕 Contact 计算机程序员小杨 at the end of the post to get the source code 💜💜 Website projects · Android/Mini-Program projects · Big-data projects · Deep-learning projects · CS graduation-project topic selection 💜💜

I. Development Tools

Big-data framework: Hadoop + Spark (Hive is not used in this build; customization supported)
Development language: Python + Java (both versions supported)
Backend framework: Django + Spring Boot (Spring + SpringMVC + MyBatis) (both versions supported)
Frontend: Vue + ElementUI + ECharts + HTML + CSS + JavaScript + jQuery
Key technologies: Hadoop, HDFS, Spark, Spark SQL, Pandas, NumPy
Database: MySQL

II. System Overview

The car brand complaint data analysis system built with Python and Vue is a data analysis platform that combines big-data technology with a modern web development stack. It uses Hadoop + Spark as the big-data processing framework, implements data collection, cleaning, and analysis in Python, and exposes a stable API layer through the Django backend. The frontend is built on Vue + ElementUI + ECharts, giving users an intuitive, friendly data-visualization interface.

The core features cover car brand analysis, car-model comparison, problem-type statistics, and text mining, digging into the value of complaint data from multiple dimensions. With Spark SQL alongside data-processing tools such as Pandas and NumPy, the system can efficiently process large volumes of complaint data and generate visualizations such as brand complaint trend charts, per-model problem distribution charts, and complaint-hotspot word clouds. The administrator side supports user management and complaint-record management, while ordinary users can browse the analysis reports and visualization dashboards, providing useful data insights for automotive-industry practitioners, consumers, and researchers.
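The brand-level aggregations described above can be prototyped in plain Pandas before being ported to Spark SQL. A minimal sketch with a hand-made toy DataFrame; the column names (`brand_name`, `complaint_id`, `complaint_date`) follow the table schema used in the source code section, while the sample rows are invented purely for illustration:

```python
import pandas as pd

# Toy data: column names mirror the car_complaint_info schema,
# the rows themselves are invented for illustration.
df = pd.DataFrame({
    "brand_name": ["BrandA", "BrandA", "BrandB", "BrandA"],
    "complaint_id": [1, 2, 3, 4],
    "complaint_date": pd.to_datetime(
        ["2024-01-05", "2024-01-20", "2024-02-01", "2024-02-11"]),
})

# Complaints per brand, most-complained first
brand_counts = (df.groupby("brand_name")["complaint_id"]
                  .count()
                  .sort_values(ascending=False))

# Monthly complaint trend per brand; the Spark version achieves the
# same bucketing with date_format(col("complaint_date"), "yyyy-MM")
df["complaint_month"] = df["complaint_date"].dt.strftime("%Y-%m")
trend = df.groupby(["brand_name", "complaint_month"]).size()

print(brand_counts.to_dict())  # {'BrandA': 3, 'BrandB': 1}
```

The same groupBy/agg shape carries over almost one-to-one to the Spark code shown later, which becomes preferable once the complaint table no longer fits comfortably in memory.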

III. System Demo

(Demo video: Car Brand Complaint Data Analysis System Built with Python + Vue)

IV. System Interface

(Screenshots of the system interface)

V. System Source Code


from pyspark.sql import SparkSession
# Explicit imports instead of "import *"; note that Spark's max/min are
# imported deliberately and shadow the Python builtins, because the
# aggregations below use the Spark column functions.
from pyspark.sql.functions import (
    col, count, countDistinct, avg, max, min, desc, date_format,
)
from django.http import JsonResponse
import jieba
import re
from collections import Counter

# One shared SparkSession for all analysis views; adaptive query
# execution lets Spark tune shuffle partitions at runtime.
spark = (SparkSession.builder
         .appName("CarComplaintAnalysis")
         .config("spark.sql.adaptive.enabled", "true")
         .getOrCreate())

def brand_analysis_processing(request):
    # Load the complaint table from MySQL over JDBC
    complaint_data = (spark.read.format("jdbc")
                      .option("url", "jdbc:mysql://localhost:3306/car_complaint")
                      .option("driver", "com.mysql.cj.jdbc.Driver")
                      .option("dbtable", "car_complaint_info")
                      .option("user", "root").option("password", "123456").load())
    # Complaint totals and distinct model counts per brand, most-complained first
    brand_complaint_count = (complaint_data.groupBy("brand_name")
        .agg(count("complaint_id").alias("complaint_count"),
             countDistinct("car_model").alias("model_count"))
        .orderBy(desc("complaint_count")))
    # Monthly complaint trend per brand
    brand_trend_data = (complaint_data
        .withColumn("complaint_month", date_format(col("complaint_date"), "yyyy-MM"))
        .groupBy("brand_name", "complaint_month")
        .agg(count("complaint_id").alias("monthly_count"))
        .orderBy("brand_name", "complaint_month"))
    # Problem-category distribution within each brand
    brand_problem_distribution = (complaint_data.groupBy("brand_name", "problem_category")
        .agg(count("complaint_id").alias("problem_count"))
        .orderBy("brand_name", desc("problem_count")))
    # Restrict trend and category results to the ten most-complained brands
    top_brands = brand_complaint_count.limit(10).collect()
    top_names = [row.brand_name for row in top_brands]
    trend_result = brand_trend_data.filter(col("brand_name").isin(top_names)).collect()
    problem_result = brand_problem_distribution.filter(col("brand_name").isin(top_names)).collect()
    brand_analysis_result = {
        "brand_ranking": [{"brand": r.brand_name, "complaints": r.complaint_count,
                           "models": r.model_count} for r in top_brands],
        "trend_data": [{"brand": r.brand_name, "month": r.complaint_month,
                        "count": r.monthly_count} for r in trend_result],
        "problem_distribution": [{"brand": r.brand_name, "category": r.problem_category,
                                  "count": r.problem_count} for r in problem_result]}
    return JsonResponse(brand_analysis_result)

def car_model_analysis_processing(request):
    # Load the complaint table from MySQL over JDBC
    complaint_data = (spark.read.format("jdbc")
                      .option("url", "jdbc:mysql://localhost:3306/car_complaint")
                      .option("driver", "com.mysql.cj.jdbc.Driver")
                      .option("dbtable", "car_complaint_info")
                      .option("user", "root").option("password", "123456").load())
    # Per-model complaint totals, mean satisfaction, and distinct problem types
    model_complaint_stats = (complaint_data.groupBy("brand_name", "car_model")
        .agg(count("complaint_id").alias("total_complaints"),
             avg("satisfaction_score").alias("avg_satisfaction"),
             countDistinct("problem_category").alias("problem_types"))
        .orderBy(desc("total_complaints")))
    # Complaint counts per model and production year
    model_year_analysis = (complaint_data.groupBy("car_model", "production_year")
        .agg(count("complaint_id").alias("year_complaints"))
        .orderBy("car_model", "production_year"))
    # Complaint counts per model and severity level
    model_severity_analysis = (complaint_data.groupBy("car_model", "severity_level")
        .agg(count("complaint_id").alias("severity_count"))
        .orderBy("car_model", "severity_level"))
    popular_models = model_complaint_stats.limit(15).collect()
    popular_names = [row.car_model for row in popular_models]
    year_data = model_year_analysis.filter(col("car_model").isin(popular_names)).collect()
    severity_data = model_severity_analysis.filter(col("car_model").isin(popular_names)).collect()
    # Repair-cost and complaint-date range comparison for the top eight models
    model_comparison_data = (complaint_data.filter(col("car_model").isin(popular_names[:8]))
        .groupBy("car_model")
        .agg(avg("repair_cost").alias("avg_cost"),
             max("complaint_date").alias("latest_complaint"),
             min("complaint_date").alias("earliest_complaint"))
        .collect())
    model_analysis_result = {
        # Guard round() against NULL satisfaction averages from the database
        "model_stats": [{"brand": r.brand_name, "model": r.car_model,
                         "complaints": r.total_complaints,
                         "satisfaction": round(r.avg_satisfaction, 2) if r.avg_satisfaction is not None else None,
                         "problem_types": r.problem_types} for r in popular_models],
        "year_trend": [{"model": r.car_model, "year": r.production_year,
                        "complaints": r.year_complaints} for r in year_data],
        "severity_analysis": [{"model": r.car_model, "severity": r.severity_level,
                               "count": r.severity_count} for r in severity_data],
        "model_comparison": [{"model": r.car_model, "avg_cost": r.avg_cost,
                              "latest": str(r.latest_complaint),
                              "earliest": str(r.earliest_complaint)} for r in model_comparison_data]}
    return JsonResponse(model_analysis_result)

def text_mining_analysis_processing(request):
    # Load the complaint table from MySQL over JDBC (the URL must carry the
    # full "jdbc:mysql://" prefix; "jdbc://" alone is not a valid JDBC URL)
    complaint_data = (spark.read.format("jdbc")
                      .option("url", "jdbc:mysql://localhost:3306/car_complaint")
                      .option("driver", "com.mysql.cj.jdbc.Driver")
                      .option("dbtable", "car_complaint_info")
                      .option("user", "root").option("password", "123456").load())
    # Text mining runs driver-side: collect only the columns it needs
    complaint_texts = complaint_data.select(
        "complaint_content", "brand_name", "problem_category").collect()
    stop_words = {'的', '了', '和', '是', '在', '我', '有', '就', '不', '人', '都', '一',
                  '一个', '上', '也', '很', '到', '说', '要', '去', '你', '会', '着',
                  '没有', '看', '好', '自己', '这'}

    def tokenize(text):
        # Keep CJK and word characters, segment with jieba, drop single
        # characters and stop words
        cleaned = re.sub(r'[^\u4e00-\u9fff\w\s]', '', text)
        return [w for w in jieba.lcut(cleaned) if len(w) > 1 and w not in stop_words]

    # Single pass over the complaints: accumulate overall, per-brand, and
    # per-category word lists instead of re-segmenting the text three times
    all_words = []
    brand_keywords = {}
    problem_keywords = {}
    for row in complaint_texts:
        if not row.complaint_content:
            continue
        words = tokenize(row.complaint_content)
        all_words.extend(words)
        if row.brand_name:
            brand_keywords.setdefault(row.brand_name, []).extend(words)
        if row.problem_category:
            problem_keywords.setdefault(row.problem_category, []).extend(words)
    top_keywords = Counter(all_words).most_common(50)
    brand_word_analysis = {b: Counter(ws).most_common(10) for b, ws in brand_keywords.items()}
    category_keywords = {c: Counter(ws).most_common(8) for c, ws in problem_keywords.items()}
    text_mining_result = {
        "overall_keywords": [{"word": w, "frequency": f} for w, f in top_keywords],
        "brand_keywords": {b: [{"word": w, "frequency": f} for w, f in kws]
                           for b, kws in brand_word_analysis.items()},
        "category_keywords": {c: [{"word": w, "frequency": f} for w, f in kws]
                              for c, kws in category_keywords.items()}}
    return JsonResponse(text_mining_result)
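For the Vue frontend to call these three views, they still need URL routes. A minimal wiring sketch, assuming a hypothetical Django app named `analysis` whose `views.py` holds the functions above; the module path and URL prefixes are assumptions, not the project's actual layout:

```python
# urls.py -- hypothetical routing sketch; adjust "analysis.views"
# to the real app module of the project.
from django.urls import path
from analysis import views

urlpatterns = [
    path("api/brand-analysis/", views.brand_analysis_processing),
    path("api/model-analysis/", views.car_model_analysis_processing),
    path("api/text-mining/", views.text_mining_analysis_processing),
]
```

The ECharts components on the Vue side would then fetch these endpoints and feed the returned JSON straight into chart options.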

VI. Project Documentation

(Screenshot of the project documentation)

Closing

💛💛 Thank you all for your attention and support! 💕💕 Contact 计算机程序员小杨 to get the source code 💜💜