and searching in massive text collections
• Information retrieval (search engines): Analysis, organization, storage, and retrieval of information
  ◦ Search engine architecture
  ◦ Retrieval models
  ◦ Search engine evaluation
  ◦ Knowledge graphs and semantic search
• Text mining (text analytics): Deriving high-quality information from textual data by analyzing trends and patterns
  ◦ Text classification
  ◦ Text clustering
  ◦ Topic analysis
divided into Groups A and B
  ◦ Group A: last names starting A-L
  ◦ Group B: last names starting M-Z (M-Å)
  ◦ Group assignments can be found on Canvas
• Each class/lab is given in two identical editions, one for each group
• Class attendance is logged (for COVID)
• You will only be permitted to enter the classroom in your assigned timeslot
(weeks 35-42)
  ◦ Mondays and Tuesdays are classes for discussion and exercises (led by me)
    • Video lectures are made available at least 24 hours before the class. You are expected to watch them before the class!
  ◦ Wednesdays are labs for getting help on the obligatory assignments (led by the TA)
• Trial exam (week 43)
• Group project work (weeks 44-46)
  ◦ Complete a project in groups of 2-3 and write a report that will be graded
• Bring your own device (laptop)
  ◦ Python 3.6+ (Anaconda distribution)
  ◦ GitHub account and Git client (e.g., GitHub Desktop)
A-F
• Project work
  ◦ 50% from individual assignments
  ◦ 50% from group project work
  ◦ The project work grade must be better than F in order to pass the course!
• Written exam
  ◦ Digital exam (Inspera)
  ◦ Open book
  ◦ Mixture of exercises, multiple choice, and essay questions
during the lecture period, with a deadline 2 weeks after release (there may be exceptions)
• Points vary based on difficulty
• Single delivery (no resubmissions, corrections, etc.)
• Deadlines are strict: no extensions, no exceptions!
• Assignments account for 50% of the project work final grade
• Wednesday labs are dedicated to working on assignments; this is the time and place to get help
you
  ◦ You need to fill out the sign-up form, if you haven’t already done so!
• Starter files are pushed to your private repository
• You need to complete the tasks; you know you’re done when your code passes all the tests
  ◦ Some tasks may have additional “hidden” tests for grading
• You need to commit and push your changes
  ◦ The easiest way to verify is to check in the GitHub web interface whether your latest version has been submitted
There will be a pool of options to select a project from
• The task will be to tackle a problem, perform experiments, and write a report about the findings
• Groups can get weekly feedback during the group project period
  ◦ Dedicated 15-minute weekly slots with the lecturer during class hours (Mon/Tue) to discuss progress/ideas
  ◦ Feedback on the draft report from the teaching assistant during lab hours (Wed)
  ◦ Both in-person and remote (Zoom) options will be available
• More information will follow later
• Accounts for 50% of the project work final grade
is [email protected]
• Wednesday labs are for working on the assignments. This is the time to get help!
• If you need to talk to the lecturer, make an appointment via email. No unannounced drop-ins!
discussion and related exercises
• See the exercise workflow on GitHub under exercises: https://github.com/kbalog/ir-course/tree/master/exercises
• Make sure you have Python 3.6+ (Anaconda distribution highly recommended) and the ipython-unittest package installed
• Complete today’s exercise (under exercises/20200824)
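A quick way to check the setup requirements above is a short script like the following. It is only a sketch: the helper names `python_ok` and `module_available` are made up for illustration, and it assumes that the ipython-unittest package is importable under the module name `ipython_unittest`.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 6)  # minimum version named on the slide
# Assumption: the ipython-unittest package installs a module with this name.
MODULE_NAME = "ipython_unittest"


def python_ok(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


def module_available(name):
    """Return True if a top-level module called `name` can be located."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    print("Python version OK:", python_ok())
    print(MODULE_NAME, "installed:", module_available(MODULE_NAME))
```

If either line prints `False`, update Python or install the missing package before starting the exercise.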