AI3013 Machine Learning Course Project

Description:
This is a GROUP project (each group should have 4-6 students). It aims at applying machine learning models and techniques (including but not limited to those covered in our lectures) to solve complex real-world tasks using Python.

Notice: This project should differ from the one you are undertaking in the Machine Learning Workshop Course.

Notice on Deep Learning Models: You may decide to work on deep learning models. However, since our course mainly focuses on machine learning models and techniques, a deep learning model will not be considered superior to other machine learning models if you merely reproduce a model designed by others. Also, training deep learning models can be very time-consuming, so make sure you have the necessary computing resources.

Project Requirement:

Problem Selection:
• Choose a real-world problem from a domain of interest (e.g., healthcare, finance, image recognition, natural language processing, etc.).
• Describe the problem, including data sources and the type of machine learning model that will be applied (e.g., regression, classification, clustering, etc.).

Dataset Selection:
• Choose a dataset from public repositories (e.g., UCI Machine Learning Repository, Kaggle) suitable for this topic.
• Ensure the dataset has a sufficient number of samples and features to allow for meaningful analysis and model comparison.
• Apply appropriate data preprocessing steps (e.g., handling missing values, encoding categorical features, scaling).

Model Theory and Implementation:
• Select and implement at least 2 machine learning models for comparison.
• Provide a comprehensive explanation of the theoretical background of the chosen models (e.g., loss functions, optimization techniques, and assumptions).
• Discuss the strengths and weaknesses of the chosen models.
• Include mathematical derivations where relevant (e.g., gradient descent for linear regression).
• Implement the selected models from scratch without using any existing machine learning libraries (e.g., scikit-learn, TensorFlow, Keras, etc.). The implementation should be done in Python using only basic libraries such as NumPy, Pandas, and Matplotlib.

Model Evaluation:
• Evaluate each model using suitable metrics (e.g., accuracy, precision, recall, F1 score, RMSE) for the problem.
• Use cross-validation to ensure model robustness and avoid overfitting.
• Analyze the behavior of the models based on the dataset, including bias-variance trade-offs, overfitting, and underfitting.

Analysis and Comparison:
• Compare the models in terms of:
  o Performance (accuracy, precision, etc.).
  o Computational complexity (training time, memory usage).
  o Suitability for the dataset (e.g., which model performs best, and why).
• Provide a comparison of the models' performances with appropriate visualizations (e.g., bar plots or tables comparing metrics).
• Discuss how the assumptions of each model affect its suitability for the problem.

Submission Requirement:
Upon completion, each group must submit the following materials:
- Progress report:
  a) Abstract
  b) Introduction: problem statement, motivation and background of the topic
  c) Related works and existing techniques of the topic
  d) Methodology
  e) Progress/Current Status
  f) Next Steps and Plan for Completion
- Project report. Your report should contain, but is not limited to, the following content:
  a) Abstract
  b) Introduction: problem statement, motivation and background of the topic
  c) Related works and existing techniques of the topic
  d) Methodology
  e) Experimental study and result analysis
  f) Future work and conclusion
  g) References
  h) Contribution of each team member
- A link to, and a description of, the dataset and the implementation code.
- Your final report should be a minimum of 9 pages and a maximum of 12 pages.
- For the final report, the similarity check result Must Not exceed 20%, and the AI-generated content check result Must Not exceed 25%.
- Put all files (including source code, presentation PPT and project report) into a ZIP file, then submit it on iSpace.

Deadlines:
• Team Information should be submitted by the end of Week 3.
• The Progress Report should be submitted by the end of Week 10.
• The Presentation will be arranged in Weeks 13 and 14 of this semester.
• The Final Project Report should be submitted by Friday of Week 15 (May 23, 2025).

Assessment:
In general, projects will be evaluated based on:
• Significance. (Did the authors choose an interesting or a "real" problem to work on, or only a small "toy" problem? Is this work likely to be useful and/or have impact?)
• The technical quality of the work. (Does the technical material make sense? Are the things tried reasonable? Are the proposed algorithms or applications clever and interesting? Do the students convey novel insight about the problem and/or algorithms?)
• The novelty of the work. (Do you have any novel contributions, e.g., a new model, technique, or method? Is this project applying a common technique to a well-studied problem, or is the problem or method relatively unexplored?)
• The workload of the project. (The workload of your project may depend on, but is not limited to, the following aspects: the complexity of the problem; the complexity of your method; the complexity of the dataset; whether you test your model on one or multiple datasets; whether you conduct a thorough experimental analysis of your model.)

Evaluation Percentage:
• Progress Report: 5%
• Final Report: 40%
• Presentation: 40% (Each group will have 15-20 minutes for the presentation, and each student must present for no less than 3 minutes.)
• Code: 15%

It is YOUR responsibility to make sure that:
• Your submitted files can be correctly opened.
• Your code can be compiled and run.

Late submission = 0; Plagiarism (cheating) = F
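
Illustration (not part of the requirements): the from-scratch requirement under Model Theory and Implementation and the cross-validation requirement under Model Evaluation might look like the minimal sketch below. It assumes a regression task and uses linear regression trained by batch gradient descent on mean squared error, evaluated with k-fold cross-validation using only NumPy. The function names (fit_linear_regression, rmse, k_fold_rmse), the hyperparameter values, and the synthetic data are hypothetical choices made for this example; they are not prescribed by the project specification.

```python
import numpy as np


def fit_linear_regression(X, y, lr=0.1, n_iters=2000):
    """Fit y ~ X @ w + b by batch gradient descent on mean squared error.

    lr and n_iters are illustrative defaults, not recommended values.
    """
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(n_iters):
        residual = X @ w + b - y                       # prediction error
        grad_w = (2.0 / n_samples) * (X.T @ residual)  # d(MSE)/dw
        grad_b = 2.0 * residual.mean()                 # d(MSE)/db
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def k_fold_rmse(X, y, k=5, seed=0):
    """Return the validation RMSE of each of k shuffled folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    scores = []
    for i in range(k):
        val_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        # Standardize with statistics from the training folds only.
        mean = X[train_idx].mean(axis=0)
        std = X[train_idx].std(axis=0) + 1e-12
        X_train = (X[train_idx] - mean) / std
        X_val = (X[val_idx] - mean) / std
        w, b = fit_linear_regression(X_train, y[train_idx])
        scores.append(rmse(y[val_idx], X_val @ w + b))
    return scores


# Synthetic data, used here only so the sketch runs end to end.
demo_rng = np.random.default_rng(42)
X_demo = demo_rng.normal(size=(200, 3))
true_w = np.array([2.0, -1.0, 0.5])
y_demo = X_demo @ true_w + 3.0 + demo_rng.normal(scale=0.1, size=200)

cv_scores = k_fold_rmse(X_demo, y_demo, k=5)
print("RMSE per fold:", np.round(cv_scores, 3))
print("Mean RMSE:", round(float(np.mean(cv_scores)), 3))
```

Computing the scaling statistics on the training folds only keeps information from the validation fold out of training; the same pattern applies to other preprocessing steps such as imputing missing values.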