INFORMATION RETRIEVAL
Code: 289AA | Credits: 6 | Semester: 1
Lecturers: Ferragina Paolo
Learning outcomes
Knowledge
The student who successfully completes the course will have the ability to design a simple search engine or one of the numerous text mining tools which are at the core of modern Web applications.
Assessment criteria of knowledge
The student will be assessed on his/her demonstrated ability to discuss the main course contents using the appropriate terminology.
Methods:
- Final oral exam
- Final written exam
- Test on the use of the Lucene and/or Elasticsearch software
Further information is available on the home page of this course.
Skills
Students will be able to evaluate a search engine and make design and software choices related to IR applications.
Assessment criteria of skills
Via written and oral exams
Behaviors
Students will be able to understand and evaluate the pros and cons of IR tools and to identify which algorithmic solutions best fit the IR problem at hand.
Assessment criteria of behaviors
Via written and oral exams
Prerequisites
Basics of Algorithms, Maths and Programming
Teaching methods
Delivery: face to face
Learning activities:
- attending lectures
Attendance: Advised
Teaching methods:
- Lectures
Further details are available on the home page of the course.
Syllabus
Study, design and analysis of IR systems that efficiently and effectively process, mine, search, cluster and classify documents coming from textual as well as any other unstructured domain. In the lectures, we will:
- study and analyze the main components of a modern search engine: Crawler, Parser, Compressor, Indexer, Query resolver, Query and Document annotator, Results Ranker;
- dig into some basic algorithmic techniques which are now ubiquitous in any IR application for data compression, indexing and sketching;
- describe a few other IR tools which are used either as components of a search engine or as independent tools, and which build upon the previous algorithmic techniques, such as: Classification, Clustering, Recommendation, Random Sampling, Locality Sensitive Hashing.
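As a rough illustration of how the Indexer, Compressor and Query resolver components listed above fit together, the following Python sketch builds a tiny inverted index, compresses a posting list with gap encoding plus a common variable-byte code, and answers a conjunctive query by intersecting posting lists. It is not taken from the course material: the function names, the whitespace tokenizer and the sample documents are all assumptions made for the example.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercased token to the sorted list of ids of docs containing it."""
    index = defaultdict(list)
    for doc_id, text in enumerate(docs):
        # Naive whitespace tokenizer (an assumption; real parsers do much more).
        for token in set(text.lower().split()):
            index[token].append(doc_id)
    return dict(index)


def vbyte_encode(postings):
    """Gap-encode a sorted posting list, then variable-byte encode each gap."""
    out = bytearray()
    prev = 0
    for doc_id in postings:
        gap = doc_id - prev
        prev = doc_id
        chunk = [gap & 0x7F]          # least significant 7 bits first
        gap >>= 7
        while gap:
            chunk.append(gap & 0x7F)
            gap >>= 7
        chunk[0] |= 0x80              # high bit marks the final byte of a gap
        out.extend(reversed(chunk))   # emit most significant byte first
    return bytes(out)


def vbyte_decode(data):
    """Invert vbyte_encode, returning the original sorted posting list."""
    postings, value, prev = [], 0, 0
    for byte in data:
        value = (value << 7) | (byte & 0x7F)
        if byte & 0x80:               # final byte of this gap
            prev += value
            postings.append(prev)
            value = 0
    return postings


def intersect(a, b):
    """Merge-style intersection of two sorted posting lists."""
    i = j = 0
    result = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            result.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return result


def search(index, query):
    """Return ids of docs containing every query term (boolean AND)."""
    lists = [index.get(term, []) for term in query.lower().split()]
    if not lists:
        return []
    result = lists[0]
    for postings in lists[1:]:
        result = intersect(result, postings)
    return result


if __name__ == "__main__":
    docs = [
        "web search engines crawl the web",
        "text compression and indexing",
        "search engines use indexing",
    ]
    index = build_index(docs)
    print(search(index, "search indexing"))
    print(vbyte_decode(vbyte_encode([824, 829, 215406])))
```

Storing gaps rather than absolute document ids keeps most values small, which is why a byte-aligned code can save space on long posting lists. Production systems such as Lucene use more elaborate encodings.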
Bibliography
C.D. Manning, P. Raghavan, H. Schütze. Introduction to Information Retrieval. Cambridge University Press, 2008.
I.H. Witten, A. Moffat, T.C. Bell. Managing Gigabytes, Chapter 2 “Text compression”. Morgan Kaufmann, Second edition, 1999.
Course web page
http://didawiki.di.unipi.it/doku.php/magistraleinformatica/ir/ir16/start
Source: ESSETRE and Portale esami