- University: Salahaddin University-Erbil
- Department: Software Engineering Dept.
- My Status: Lecturer
- Level: MSc
- Year: 2020
Data Mining studies algorithms and computational paradigms that allow computers to find patterns and regularities in databases, perform prediction and forecasting, and generally improve their performance through interaction with data. It is currently regarded as the key element of a more general process called Knowledge Discovery, which deals with extracting useful knowledge from raw data. The knowledge discovery process includes data selection, cleaning, coding, the application of different statistical and machine learning techniques, and visualization of the generated structures. The course will cover all of these steps and will illustrate the whole process with examples. Special emphasis will be given to Machine Learning methods, as they provide the real knowledge discovery tools. Important related technologies, such as data warehousing and on-line analytical processing (OLAP), will also be discussed. The students will use recent Data Mining software. Enrollment in this course is limited to 15 students.
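As a rough illustration of the knowledge discovery steps named above (selection, cleaning, coding, mining, visualization), the following minimal sketch runs them on a tiny in-memory dataset. The records, field names, and function names are all hypothetical and invented for this example; the "mining" step is only a per-group rate, a stand-in for the statistical and machine learning techniques covered in the course, and it is not the software used in the course.

```python
from collections import Counter

# Hypothetical raw records, invented purely for illustration.
RAW = [
    {"id": 1, "age": 25, "region": "North", "bought": "yes"},
    {"id": 2, "age": None, "region": "North", "bought": "no"},
    {"id": 3, "age": 41, "region": "South", "bought": "yes"},
    {"id": 4, "age": 37, "region": "north ", "bought": "yes"},
    {"id": 5, "age": 19, "region": "South", "bought": "no"},
]


def select(records, fields):
    """Data selection: keep only the fields relevant to the task."""
    return [{f: r[f] for f in fields} for r in records]


def clean(records):
    """Cleaning: drop rows with missing values and normalise text fields."""
    cleaned = []
    for r in records:
        if any(v is None for v in r.values()):
            continue  # discard incomplete rows
        cleaned.append(
            {k: v.strip().lower() if isinstance(v, str) else v for k, v in r.items()}
        )
    return cleaned


def encode(records, target):
    """Coding: map the yes/no target to 1/0."""
    return [{**r, target: 1 if r[target] == "yes" else 0} for r in records]


def mine(records, key, target):
    """Mining (simplified): rate of positive targets per group."""
    totals, hits = Counter(), Counter()
    for r in records:
        totals[r[key]] += 1
        hits[r[key]] += r[target]
    return {g: hits[g] / totals[g] for g in totals}


def visualize(rates, width=20):
    """Visualization: a plain-text bar chart, one line per group."""
    return [
        f"{g:<8}{'#' * round(rate * width)} {rate:.2f}"
        for g, rate in sorted(rates.items())
    ]


def run_pipeline(raw=RAW):
    data = select(raw, ["age", "region", "bought"])
    data = encode(clean(data), "bought")
    return mine(data, "region", "bought")


if __name__ == "__main__":
    for line in visualize(run_pipeline()):
        print(line)
```

Each function corresponds to one step of the process, so any stage can be swapped for a more realistic technique without touching the others.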
- An Introduction to Machine Learning.
- Understanding how data mining works.
- Understanding how computers can generate information from data.
- Comparing different machine learning algorithms to be able to choose the best one for the task at hand.
On successful completion of the module, students should be able to:
- Identify and implement appropriate solutions to low-, mid-, and high-level data mining problems.
- Represent problems as mathematical models and apply appropriate machine learning and optimization techniques to solve those problems.
- Apply data mining algorithms and explain their operation.
- Recommend appropriate statistical representations of static and dynamic objects and apply these to solve detection, classification and/or tracking problems.
- Numeric Data Analysis
- Contingency Table Analysis
- Graph Analysis
- Kernel Methods
- High Dimensional Analysis
- Principal Component Analysis
- Closed Itemset Mining
- Non-Derivable Itemsets
- Sequence Support
- Monte Carlo Sampling for Itemset Support
- Expectation Maximization Clustering
- Density-based Clustering
- Decision Trees
- Support Vector Machines
- Classification Assessment