COMP3425 Data Mining S1 2025



Undergraduate Assignment 1

Maximum marks: 100
Weight: 20% of the total marks for the course
Min to pass hurdle: 30%
Length: Maximum of 8 pages excluding cover page, bibliography and appendices.
Layout: A4. At least 11 point type size. Use of typeface, margins and headings consistent with a professional style.
Submission deadline: 9am, Tuesday 11th March
Submission mode: Electronic, PDF via Wattle, file-name includes u-number
Estimated time: 15 hours
Penalty for lateness: 100% after the deadline has passed
First posted: 17 February, 8am
Last modified: 17 February, 8am
Questions to: Wattle Discussion Forum

This assignment specification may be updated to reflect clarifications and modifications after it is first issued.

You are required to submit a single report in the form of a single PDF file with a file-name that includes your University u-number ID. The first page must have a clearly identified title and author, with both name and university u-number, which may form a separate cover page. You may also attach supporting information as appendices in the same PDF file. Appendices will not be marked.

This is a single-person assignment and must be completed on your own. You must use quality reference material and carefully reference via in-text citations, including material provided to you in the course. Any material that you quote must have the source clearly referenced. It is unacceptable to present any portion of another author's work as your own. Anyone found doing so will be penalised in marks. In addition, ANU plagiarism procedures apply. This course introduces fundamental concepts that could potentially be addressed by certain Generative AI tools (e.g., ChatGPT). Hence, the use of any Generative AI tools is not permitted in graded assessments within this course.

You are strongly encouraged to start working on the assignment right away. You can submit as many times as you wish. Only the last submission at the due date will be assessed.

Task

The Australian Computer Society Code of Professional Conduct 2014 is expected to be applied by all Computing Professionals in Australia. It sets out six values but stresses the primacy of the public interest as the overriding value. In 2018, the Australian Government Office of the Australian Information Commissioner released the Guide to Data Analytics and the Australian Privacy Principles (APP). In 2022 UNESCO published the Recommendation on the Ethics of Artificial Intelligence (SHS/BIO/PI/2021/1) for voluntary application by Member States. The recommendation is broad in scope and far-reaching in implementation responsibilities over the whole AI system lifecycle. It includes a statement of values and 10 principles that should be respected by all actors in the AI lifecycle, including “data scientists, end-users, business enterprises, universities and public and private entities” (p10). These three documents must be read and are provided with this assignment specification.

You must also read the paper, Clarke R. (2018), “Guidelines for the Responsible Application of Data Analytics” Computer Law & Security Review 34, 3 (Jul-Aug 2018), that is provided with this assignment specification and hereafter referred to as the Guidelines. You must also read the paper, Du, Liu and Hu, (2020) “Techniques for Interpretable Machine Learning”, Communications of the ACM 63(1) that is also provided with the assignment specification.

You are to consider the application of the ACS code of conduct, the 10 UNESCO Principles, Clarke’s Guidelines and Du et al.’s Techniques to the following fictitious ad targeting scenario. You may also use the APP guide, where it is helpful.

Ad Targeting Scenario (from Clarke R. (2016) “Big Data, Big Risks”, Information Systems Journal 26, 1 (January 2016) 77-90, PrePrint at http://www.rogerclarke.com/EC/BDBR.html)

A social media service-provider accumulates a vast amount of social transaction data, and some economic transaction data, through activity on its own sites and those of strategic partners. It applies complex data analytics techniques to this data to infer attributes of individual digital personae. It projects third-party ads and its own promotional materials based on the inferred attributes of online identities and the characteristics of the material being projected.

The 'brute force' nature of the data consolidation and analysis means that no account is taken of the incidence of partial identities, conflated identities, obfuscated identities, and imaginary, fanciful, falsified and fraudulent profiles. This results in mis-placement of a significant proportion of ads, to the detriment mostly of advertisers, but to some extent also of individual consumers. It is challenging to conduct audits of ad-targeting effectiveness, and hence advertisers remain unaware of the low quality of the data and of the inferences. This approach to business is undermined by inappropriate content appearing on children's screens, and gambling and alcohol ads seen by partners in the browser-windows of nominally reformed gamblers and drinkers.

You must answer the following questions, clearly indicating which question you are answering within your submission. The page lengths suggested for each question here are for guidance only; the given page length limit for the overall assignment is mandatory.

Question 1. (1 page) Consider the ACS code of conduct. For each of the six values, taking account of any relevant sub-parts, discuss whether the value was demonstrated in the scenario and to what extent. If you assess any value as largely irrelevant to the scenario, then a very brief reason for this assessment is sufficient.

Question 2. (1/2 page) Consider the 10 UNESCO Principles [S III.2]. Looking closely at Principle Proportionality and Do No Harm [p20], discuss how this principle is applied (or not) in the scenario and identify any potential harm that might have ensued.

Question 3. (2 pages) Consider the numbered guidelines in Table 2 of Clarke’s Guidelines for the responsible application of data analytics. From every segment (1 General, 2 Data Acquisition, 3 Data Analysis, and 4 Use of the Inferences) choose one guideline that you consider would have been applied in the scenario. Its application may not be explicit in the scenario description, but it should be relevant and important to the scenario, and you should be able to argue that it was applied properly and therefore did not contribute to the negative consequences of the scenario. Explain its role in the scenario, including how it would have contributed to positive outcomes. Justify why it is more relevant than every one of the other guidelines that you consider would have been applied in the same segment. Argue how it is more or less relevant than any guidelines in the same segment that you consider may have been disregarded in the scenario. Be careful to consider the intention of the guidelines rather than an overly literal interpretation; you may rephrase the chosen guideline for the scenario context where beneficial. For further explanation of this point, see Section 3 in Clarke’s Guidelines.

Question 4. (1 page) (a) Choose one numbered guideline (e.g. guideline 3.3) in Table 2 of the Guidelines that you consider to have been disregarded in the scenario. You may choose any guideline that you did not choose for Question 3. Discuss how the failure to consider the guideline could have contributed to the negative outcome of the scenario. (b) In addition, identify any other potential consequences that could have occurred due to the failure to consider that same guideline. For this purpose, the consequences you identify are not necessarily explicit within the scenario description. You might find it helpful to think of this activity as contributing to a risk assessment process prior to your hypothetical involvement in the analysis work of the scenario.

Question 5. (1 page) Consider the paper by Du et al., Techniques for Interpretable Machine Learning. Discuss whether and how intrinsic and post-hoc interpretability techniques could be applied to the scenario and what benefits could ensue.

General Comments

An abstract or executive summary is not required. A cover sheet is optional and does not contribute to the page count. No particular layout is specified, but you should follow a professional style, use no smaller than 11 point typeface, and stay within the maximum specified page count. Page margins, heading sizes, paragraph breaks and so forth are not specified but a professional style must be maintained. Text beyond the page limit or word count limit will be treated as non-existent. Appendices may be used and do not contribute to the page count, but appendices might be only quickly scanned or used for reference and will not be specifically marked.

You must properly attribute the source documents provided for your assignment (but not this assignment specification itself) and any other reference materials you choose to use.

You are not required to use additional materials. No particular referencing style is required. However, you are expected to reference conventionally, conveniently, and consistently. Your references should be sufficient to unambiguously identify the source, to describe the nature of the source, and also to retrieve the source in online and (if possible) traditional publisher formats.

An assessment rubric is provided. The rubric will be used to mark your assignment. You are advised to use it to supplement your understanding of what is expected for the assignment and to direct your effort towards the most rewarding parts of the work.

Your assignment submission will be treated confidentially, but it will be available to ANU staff involved in the course for the purposes of marking.