CE306-Information Retrieval Report Writing - IT Computer Science Assignment Help

Assignment Task


Task 

The context of your task
To properly evaluate a system, you usually need to construct test collections. Given information needs and documents, you need to collect relevance assessments. This is a time-consuming and expensive process involving human beings (in this case you). For tiny collections, exhaustive judgments of relevance for each query and document pair can be obtained. For large modern collections, it is usual for relevance to be assessed only for a subset of the documents for each query. The most standard approach is pooling, where relevance is assessed over a subset of the collection that is formed from the top k documents returned by many different IR systems.

 

Your task
This task comes in stages. Marks are given for each stage. The stages are as follows:

  • Assessing relevance (20%) Suppose you have come to the Colchester campus to attend your graduation with your parents. You are travelling by car and want to check the car park policy for graduation. The search engine you used gives you the following results. Your task is to make a binary judgement (relevant/non-relevant) on each document and explain why. (You need to click each link and check the contents of the webpage.)
  • Pooling (10%) Now suppose three IR systems (IR1, IR2 and IR3) have been developed and we need to create a test collection using pooling. In total we have 20 documents (1 - 20). A program f(x) = x % 2 serves as the human assessor in this task, i.e., f(1) = 1 (relevant) and f(2) = 0 (non-relevant). For this task we construct the pool by putting together the top 10 retrieval results of each system. What relevance labels will we get for each document? (N.B. Documents outside the pool are automatically considered non-relevant.)

           Rank    1    2    3    4    5    6    7    8    9    10    11    12    13    14    15    16    17    18    19    20
           IR1    7    15    13    9    16    19    8    18    17    1    5    2    14    20    4    10    3    11    6    12
           IR2    20    8    9    1    16    3    17    18    4    2    14    7    13    5    11    10    15    6    19    12
           IR3    6    1    16    7    4    19    18    3    20    14    8    5    9    12    13    15    17    2    11    10
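
The pooling step described above can be sketched in Python as follows. This is a minimal illustration rather than a model answer: only the three rankings, the pool depth of 10 and the assessor f(x) = x % 2 come from the task, while the names RANKINGS, POOL_DEPTH, assessor, build_pool and pooled_labels are hypothetical helpers introduced here.

```python
# Rankings copied from the table in the task (best rank first).
RANKINGS = {
    "IR1": [7, 15, 13, 9, 16, 19, 8, 18, 17, 1, 5, 2, 14, 20, 4, 10, 3, 11, 6, 12],
    "IR2": [20, 8, 9, 1, 16, 3, 17, 18, 4, 2, 14, 7, 13, 5, 11, 10, 15, 6, 19, 12],
    "IR3": [6, 1, 16, 7, 4, 19, 18, 3, 20, 14, 8, 5, 9, 12, 13, 15, 17, 2, 11, 10],
}
POOL_DEPTH = 10  # top-k cut-off stated in the task


def assessor(doc_id):
    """Stand-in for the human assessor: f(x) = x % 2 (1 = relevant)."""
    return doc_id % 2


def build_pool(rankings, depth):
    """Union of the top `depth` documents from every system."""
    pool = set()
    for ranking in rankings.values():
        pool.update(ranking[:depth])
    return pool


def pooled_labels(rankings, depth, all_docs):
    """Judge pooled documents; everything outside the pool gets label 0."""
    pool = build_pool(rankings, depth)
    return {doc: (assessor(doc) if doc in pool else 0) for doc in all_docs}


labels = pooled_labels(RANKINGS, POOL_DEPTH, range(1, 21))
for doc in sorted(labels):
    print(doc, labels[doc])
```

Note that a document can satisfy f(x) = 1 and still end up labelled non-relevant if no system ranks it in its top 10, which is exactly the bias that pooling introduces.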

 

  • P/R@5 (15%) Once you have your relevance labels from pooling, you can explore how the evaluation results differ across the IR systems. First, compute P@5 and R@5 for all three systems.
  • Average Precision (15%) Next, compute the average precision for all three systems. Again, use the relevance labels from pooling.
  • Discounted Cumulative Gain (15%) Next, compute the DCG for all systems. Here we use binary judgements, i.e., rel = 0 for non-relevant and rel = 1 for relevant documents, with the label decided by the pooling step.
  • Precision-Recall Curves (15%) Now draw the precision-recall curves for the three systems. Suppose we need to select one system for a scholarly search application that requires 80% recall. Which system would you choose?
  • Web Search (10%) Finally, you need to choose a system for web search. Which metric will you use to make your decision, and why? According to the metric you choose, which system will you use?
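
The rank-based metrics named in the list above can be sketched as small Python functions. This is an illustrative sketch under stated assumptions, not a definitive implementation: the function names are hypothetical, the toy ranking and relevant set are made up and are not the task data, average precision is normalised by the total number of relevant documents, and DCG uses the rel_i / log2(i + 1) discount, which is only one of several common variants (check which one your module uses).

```python
import math


def precision_at_k(ranking, relevant, k):
    """Fraction of the top-k documents that are relevant."""
    hits = sum(1 for doc in ranking[:k] if doc in relevant)
    return hits / k


def recall_at_k(ranking, relevant, k):
    """Fraction of all relevant documents found in the top k."""
    hits = sum(1 for doc in ranking[:k] if doc in relevant)
    return hits / len(relevant) if relevant else 0.0


def average_precision(ranking, relevant):
    """Mean of the precision values at each rank holding a relevant document.

    Assumption: normalised by the total number of relevant documents.
    """
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def dcg(ranking, relevant, k=None):
    """Binary-gain DCG. Assumption: discount is 1 / log2(rank + 1)."""
    top = ranking if k is None else ranking[:k]
    return sum(
        1.0 / math.log2(rank + 1)
        for rank, doc in enumerate(top, start=1)
        if doc in relevant
    )


def pr_points(ranking, relevant):
    """(recall, precision) pairs at each rank where a relevant document appears."""
    points, hits = [], 0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            points.append((hits / len(relevant), hits / rank))
    return points


# Hypothetical toy data: document 7 is relevant but never retrieved.
toy_ranking = [3, 8, 5, 2, 9, 4]
toy_relevant = {3, 5, 7, 9}

print(precision_at_k(toy_ranking, toy_relevant, 5))  # 0.6
print(recall_at_k(toy_ranking, toy_relevant, 5))     # 0.75
print(average_precision(toy_ranking, toy_relevant))
print(dcg(toy_ranking, toy_relevant))
print(pr_points(toy_ranking, toy_relevant))
```

The pairs returned by pr_points can be plotted directly as a precision-recall curve. To compare systems at a fixed recall requirement, one reasonable approach is to read off the precision at the first point whose recall meets that level.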

 
