### CS5131 T - Module part: Web-Mining Agents (WebMininga)

##### Duration:

1 Semester

##### Frequency of offer:

each winter semester

##### Credit Points:

8

##### Course of study, specific field and term:

- Master IT-Security 2019, module part, 1st or 2nd semester
- Master Computer Science 2019, module part, arbitrary semester
- Master Entrepreneurship in Digital Technologies 2020, module part, arbitrary semester
- Master Entrepreneurship in Digital Technologies 2014, module part, arbitrary semester
- Master Computer Science 2014, module part, arbitrary semester
- Certificate in Artificial Intelligence, module part, 1st semester

##### Classes and lectures:

- Web-Mining Agents (lecture, 4 SWS)
- Web-Mining Agents (exercise, 1 SWS)
- Web-Mining Agents (practical course, 1 SWS)

##### Workload:

- 120 hours private studies
- 90 hours in-classroom work
- 30 hours exam preparation

##### Contents of teaching:

- Probabilities and generative models for discrete data
- Gaussian models, Bayesian and frequentist statistics, regression
- Probabilistic graphical models (e.g., Bayesian networks), learning parameters and structures of probabilistic graphical models (BME, MAP, ML, EM algorithm), probabilistic classification, probabilistic relational models
- Probabilistic reasoning over time (dynamic Bayesian networks, Markov assumption, transition model, sensor model, inference problems: filtering, prediction, smoothing, most-likely explanation, hidden Markov models, Kalman filters, exact inferences and approximations, learning dynamic Bayesian networks)
- Structural causal models (interventions, instrumental variables, counterfactuals)
- Mixture models, latent linear models (LDA, LSI, PCA), sparse linear models
- Decision making under uncertainty (utility theory, decision networks, value of information, sequential decision problems, value iteration, policy iteration, MDPs, decision-theoretic agents, POMDPs, reduction to multidimensional continuous MDPs, dynamic decision networks)
- Game theory, decisions with multiple agents (Nash equilibrium, Bayes-Nash equilibrium), social choice (voting, preferences, paradoxes, Arrow's theorem), mechanism design (controlled autonomy), rules of encounter
- Building and exchanging symbolic annotations for web data (from named entity recognition to discourse representations)
- Information association, retrieval, query answering and recommendation
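As an illustration of one of the inference problems listed above (filtering in a hidden Markov model), the following is a minimal sketch; the transition and sensor probabilities are invented for illustration and are not taken from the course material.

```python
# Exact filtering in a two-state hidden Markov model:
# predict with the transition model, then weight by the sensor model.

T = [[0.7, 0.3],   # T[i][j] = P(X_t = j | X_{t-1} = i), transition model
     [0.3, 0.7]]
O = [[0.9, 0.1],   # O[i][e] = P(e | X = i), sensor model
     [0.2, 0.8]]

def forward(belief, evidence):
    # Prediction step: P(X_t | e_{1:t-1}) = sum over previous states
    predicted = [sum(T[i][j] * belief[i] for i in range(2)) for j in range(2)]
    # Update step: weight by the sensor model, then normalize
    updated = [O[j][evidence] * predicted[j] for j in range(2)]
    z = sum(updated)
    return [u / z for u in updated]

belief = [0.5, 0.5]        # uniform prior over the hidden state
for e in [0, 0, 1]:        # an example observation sequence
    belief = forward(belief, e)
```

Prediction, smoothing, and most-likely explanation follow the same pattern with different recursions over the transition and sensor models.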

##### Qualification-goals/Competencies:

- Knowledge: Students can explain the agent abstraction, define rational behavior in the context of web mining, and give details about the design of mining agents (goals, utilities, environments). They can describe the main features of environments. The notion of adversarial agent cooperation can be discussed in terms of decision problems and algorithms for solving these problems. For dealing with uncertainty in real-world scenarios, students can summarize how Bayesian networks can be employed as a knowledge representation and reasoning formalism in static and dynamic settings. In addition, students can define decision-making procedures in simple and sequential settings, with and without complete access to the state of the environment. In this context, students can describe techniques for solving (partially observable) Markov decision problems, and they can recall techniques for measuring the value of information. Students can identify techniques for simultaneous localization and mapping, and can explain planning techniques for achieving desired states. Students can explain coordination problems and decision making in a multi-agent setting in terms of different types of equilibria, social choice functions, voting protocols, and mechanism design techniques. Students can explain the difference between instance-based and model-based learning approaches, and they can enumerate basic machine learning techniques for each of the two approaches, either on the basis of static data or on the basis of incrementally incoming data. For dealing with uncertainty, students can describe suitable representation formalisms, and they can explain how axioms, features, parameters, or structures used in these formalisms can be learned automatically with different algorithms. Students are also able to sketch different clustering techniques. They can depict how the performance of learned classifiers can be improved by ensemble learning, and they can summarize how this relates to computational learning theory. Algorithms for reinforcement learning can also be explained by students.
- Skills: Students can select an appropriate agent architecture for concrete agent application scenarios. For simplified agent applications, students can derive decision trees and apply basic optimization techniques. For those applications they can also create Bayesian networks/dynamic Bayesian networks and apply Bayesian reasoning for simple queries. Students can also name and apply different sampling techniques for simplified agent scenarios. For simple and complex decision making, students can compute the best action or policies for concrete settings. In multi-agent situations, students will apply techniques for finding different equilibria, e.g., Nash equilibria. For multi-agent decision making, students will apply different voting protocols and compare and explain the results. Students derive decision trees and, in turn, propositional rule sets from static data as well as from temporal or streaming data. Students present and apply the basic idea of first-order inductive learning. They apply the BME, MAP, ML, and EM algorithms for learning parameters of Bayesian networks and compare the different algorithms. They also know how to carry out Gaussian mixture learning. Students can describe basic clustering techniques and explain the basic components of those techniques. Students compare related machine learning techniques, e.g., k-means clustering and nearest-neighbor classification. They can distinguish various ensemble learning techniques and compare the different goals of those techniques.
- Social competence: Students work in groups to solve small exercise and project assignments and present them in short talks in the plenum. In the associated practical course, the students develop a larger project using up-to-date programming languages and software tools for data science applications.
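The skill of computing best actions or policies for concrete settings can be illustrated with value iteration on a small MDP; the states, transitions, and rewards below are invented for illustration only.

```python
# Value iteration on a tiny 3-state MDP with one absorbing goal state.

states = [0, 1, 2]
actions = [0, 1]
# P[s][a] = list of (next_state, probability) pairs, the transition model
P = {
    0: {0: [(0, 0.9), (1, 0.1)], 1: [(1, 1.0)]},
    1: {0: [(0, 0.5), (2, 0.5)], 1: [(2, 1.0)]},
    2: {0: [(2, 1.0)], 1: [(2, 1.0)]},   # state 2 is absorbing
}
R = {0: 0.0, 1: 0.0, 2: 1.0}             # reward received in each state
gamma = 0.9                              # discount factor

# Repeatedly apply the Bellman update until (approximately) converged
V = {s: 0.0 for s in states}
for _ in range(100):
    V = {s: R[s] + gamma * max(sum(p * V[s2] for s2, p in P[s][a])
                               for a in actions)
         for s in states}

# Extract the greedy policy from the converged value function
policy = {s: max(actions,
                 key=lambda a: sum(p * V[s2] for s2, p in P[s][a]))
          for s in states}
```

Policy iteration alternates the same evaluation and greedy-improvement steps; POMDP solvers generalize this to belief states.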

##### Grading through:

- exam type depends on main module

##### Responsible for this module:

- see main module

##### Teacher:

- Institute of Information Systems
- Prof. Dr. rer. nat. habil. Ralf Möller
- PD Dr. Özgür Özçep

##### Literature:

- I. Witten, E. Frank, M. Hall: Data Mining: Practical Machine Learning Tools and Techniques - Morgan Kaufmann, 2011
- D. Koller, N. Friedman: Probabilistic Graphical Models: Principles and Techniques - MIT Press, 2009
- K. Murphy: Machine Learning: A Probabilistic Perspective - MIT Press, 2012
- S. Russell, P. Norvig: Artificial Intelligence: A Modern Approach - Pearson Education, 2010
- Y. Shoham, K. Leyton-Brown: Multiagent Systems: Algorithmic, Game-Theoretic, and Logical Foundations - Cambridge University Press, 2009
- References to journal articles on special themes are given in the lecture

##### Language:

- offered only in English

##### Notes

Prerequisites for attending the module:

- None

Prerequisites for the exam:

- Successful completion of homework assignments during the semester.

The competences of the following modules are required for this module (no strict prerequisites):

- Algorithms and Data Structures (CS1001)
- Linear Algebra and Discrete Structures I+II (MA1000, MA1500)
- Databases (CS2700)
- Stochastics (MA2510) or Statistics (PY1800)
- Logic (CS1002)
- Artificial Intelligence (CS3204)
- Information Systems (CS4130)

##### Last updated:

10.9.2020