COMP8410 Data Mining S1 2025



Postgraduate Assignment 1

Maximum marks: 100
Weight: 20% of the total marks for the course
Min to pass hurdle: 30%
Length: Maximum of 8 pages excluding cover page, bibliography and appendices.
Layout: A4. At least 11 point type size. Use of typeface, margins and headings consistent with a professional style.
Submission deadline: 9am, Tuesday 11th March
Submission mode: Electronic, PDF via Wattle, file-name includes u-number
Estimated time: 15 hours
Penalty for lateness: 100% after the deadline has passed
First posted: 17 February, 8am
Last modified: 17 February, 8am
Questions to: Wattle Discussion Forum

This assignment specification may be updated to reflect clarifications and modifications after it is first issued.

You are required to submit a single essay in the form of a single PDF file with a file-name that includes your University u-number ID. The first page must have a clearly identified title and author, with both name and university u-number, which may form a separate cover page. You may also attach supporting information as appendices in the same PDF file. Appendices will not be marked.

This is a single-person assignment and must be completed on your own. You must use quality reference material and carefully reference via in-text citations, including material provided to you in the course. Any material that you quote must have the source clearly referenced. It is unacceptable to present any portion of another author's work as your own. Anyone found doing so will be penalised in marks. In addition, ANU plagiarism procedures apply. This course introduces fundamental concepts that could potentially be addressed by certain Generative AI tools (e.g., ChatGPT). Hence, the use of any Generative AI tools is not permitted in graded assessments within this course.

You are strongly encouraged to start working on the assignment right away. You can submit as many times as you wish. Only the last submission at the due date will be assessed.

Task

You are to write a well-researched essay that critically evaluates the ethics and social impact of a data mining project.

1. Select a Data Mining project and describe it.

You are asked to select a data mining project. This must be either (A) a project from your workplace (i.e. where you were employed in a paid position and that did not contribute to education credit by any institution) or (B) the alternative project described below.

(A) A project from your workplace can be a past, completed project, a current, active project, or a future project in planning stages. You may select a scientific project, but it must be the case that the project raises sufficient genuine ethical questions for you to have something to write about in the assignment. For example, the project may use data corresponding to attributes of individual people or organisations that could be privacy-sensitive or for whom the mining results could entrench bias against them. The project must involve data mining or analytics; simple data collection and release, whether intentional or not, is not sufficient. The project must be conducted by your employer and its agents, and you must be sufficiently involved in a professional capacity to have access to organisational information or insight. You are required to declare the nature of your involvement, in an appendix or cover sheet if you wish.

For a workplace project, you are encouraged to attach non-confidential background material, written by others, concerning the project about which you write, where this may help to support the information provided in your essay. This should be clearly marked as an appendix and its source and status identified. Your assignment submission will be treated confidentially, but it will be available to ANU staff involved in the course for the purposes of marking. Please respect your employer’s expectations of confidentiality. If you cannot share sufficient information about your project in order to address the assignment questions, then please choose a different project or take the (B) Alternative option.

(B) Alternatively, select the data mining project presented in New Scientist No. 3525, 11 January 2025, page 8 titled “AI boost to cancer detection”. This story is further discussed in the editorial in the same issue. The story page includes a shorter story titled “Can AI listen to patients” which may be used to support your discussion if you choose. The stories and editorial are provided with this assignment spec. Additional references are provided in the stories and others can be found on the Web. Remember to cite carefully. If you are unable to provide all the required information about this project, then you should make an informed guess and explain your reasoning for the answer you give. Otherwise, cite your sources.

Most students: You are expected to choose the (B) Alternative here, although you may choose the Workplace project if you prefer and are involved in it as above.  Applied Data Analytics Students: You are expected to choose the (A) Workplace project but if it is difficult for you to find one (for example, if you are not employed, or you cannot share sufficient information about a workplace project), then you may select the Alternative project.

In your essay you will need to describe the project in terms of its aims, its methods, the source and nature of the data it uses, the authority for the organisation’s access to the data, and the expected use and impact of any results obtained. For the impact you should consider not only how the results are planned to be used, but also how they otherwise could be or have been used. In every case, you will need to consider whether the data was provided with informed consent, whether it is or could be seen to be of a personal nature, and whether the outcomes of the data mining will contribute to social improvement or improved services to consumers or the public. You will also need to describe any other aspects of the project that are necessary for you to address the other aspects of your essay.

2. Consider the ethical aspects of the project.

In 2022 UNESCO published the Recommendation on the Ethics of Artificial Intelligence (SHS/BIO/PI/2021/1) for voluntary application by Member States. The recommendation is broad in scope and far-reaching in implementation responsibilities over the whole AI system lifecycle. It includes a statement of values and 10 principles that should be respected by all actors in the AI lifecycle, including “data scientists, end-users, business enterprises, universities and public and private entities” (p10). In 2018, the Australian Government Office of the Australian Information Commissioner released the Guide to Data Analytics and the Australian Privacy Principles (APP). Meanwhile, the research community has been actively addressing the principle of explainability and progress is surveyed in Du, Liu and Hu (2020) “Techniques for Interpretable Machine Learning”, Communications of the ACM 63(1).

You are asked to discuss the ethical aspects of your data mining project with particular reference to both the UNESCO recommendation and the APP. You must consider the privacy of individuals where personal information is involved, such as credit card transactions, healthcare records, personal financial records, biological traits, criminal or justice investigations, ethnicity or lifestyle choices.

You may need to address complex issues, like whether the potential cost to a few may be outweighed by the benefit to many. You are not expected to provide simple, one-directional answers. While your project may raise many ethical issues, paying attention to the page limit, you are advised to broadly introduce those that you recognise but then to focus your discussion more deeply on some particular issue(s) you choose. For the (B) Alternative project, you are expected to include the issues raised in the stories in your discussion.

3. Recommend how the project should, could, or should have managed ethical issues related to data mining.

You are expected to form an opinion on the appropriate measures to put in place to address the ethical issues you have identified. You must place your opinion in the context of technological solutions available to address ethical issues in data mining. However, you are not asked to consider those methods in detail; a light coverage of the expected benefits of the approach is sufficient. The Du et al. paper will assist you with technical approaches to some ethical issues you may encounter. Other potential technical approaches are summarised in the course notes for Week 1. You are also specifically required to go beyond such technical solutions alone to consider procedural, governance or educational approaches to managing ethical issues.

While you are asked to provide your own point of view of measures that could or should be taken, you are also asked to explicitly critique alternative views, such as, perhaps, the measures that were put in place when the project was conducted, or measures that relate to the project that you can discover from the literature or Web sources. Alternatively, for the Workplace project (A) you could interview colleagues in your workplace (but not students of this course) in order to gain alternative points of view about what measures could be taken that are ethically acceptable and proportionate. You may also interview other people that are potentially affected by the results of the project. Consider attaching a transcript, recording or extracts from the interviews as appendices to your essay - such material, where relevant, will be considered as evidence of your research for the essay. You are free to conclude that ethical considerations would recommend against the project going ahead, or being applied in practice, but any conclusion you make must be supported by a well-reasoned argument.

General Comments

An abstract or executive summary is not required. A cover sheet is optional and does not contribute to the page count. No particular layout is specified, but you should follow a professional style, use no smaller than 11 point typeface and stay within the maximum specified page count. It is a strict maximum: long-winded or irrelevant content within the limit will be penalised and text beyond the limit will be treated as non-existent. Page margins, heading sizes, paragraph breaks and so forth are not specified but a professional style must be maintained. Appendices may be used and do not contribute to the page count, but appendices may be only quickly scanned or used for reference and will not be specifically marked.

Your essay is expected to be a well-researched piece of critical writing. You may find this resource from Sydney University helpful for information on what is expected, noting that critical writing necessarily includes elements of descriptive, analytical, and persuasive writing:

www.sydney.edu.au/students/wr…

You should pay close attention to references, to demonstrate the research component of your essay, to support your argument with expert opinion and evidence, and also to appropriately attribute the work of others including all reference documents made available to you (but not this assignment specification itself). No particular referencing style is required. However, you are expected to reference conventionally, conveniently, and consistently. Your references should be sufficient to unambiguously identify the source, to describe the nature of the source, and to retrieve the source in online and (if possible) traditional publisher formats.
