Advanced Electronics and Communication Technology (English) with thesis | |||||
Master | TR-NQF-HE: Level 7 | QF-EHEA: Second Cycle | EQF-LLL: Level 7 |
Course Code: | AIE509 | ||||||||
Course Name: | Data Mining and Big Data | ||||||||
Course Semester: | Fall | ||||||||
Course Credits: |
|
||||||||
Language of instruction: | EN | ||||||||
Course Requisites: | |||||||||
Does the Course Require Work Experience?: | No | ||||||||
Type of course: | Department Elective | ||||||||
Course Level: |
|
||||||||
Mode of Delivery: | Face to face | ||||||||
Course Coordinator : | Prof. Dr. PINAR YILDIRIM | ||||||||
Course Lecturer(s): |
Asst. Prof. Dr. SİNA ALP |
||||||||
Course Assistants: |
Course Objectives: | The purpose of the course is to educate students about the main concepts and methods of data mining. The course contains these topics: classification, clustering, association algorithms and data mining studies in different areas. |
Course Content: | The purpose of the course is to educate students about the main concepts and methods of data mining. The course contains these topics: classification, clustering, association algorithms and data mining studies in different areas. |
The students who have succeeded in this course;
|
Week | Subject | Related Preparation |
1) | • What Motivated Data Mining? Why Is It Important? • So, What Is Data Mining? • Data Mining—On What Kind of Data? • Data Mining Functionalities—What Kinds of Patterns Can Be Mined? | Reading chapter 1 |
2) | • What is data? • Attributes. • Types of attributes. • Discrete and continuous variables. • Types of data set. • Record data. • Data matrix. • Document data. • Transaction data. • Graph data. • Chemical data. • Ordered data. • Why data preprocessing? • Why is data dirty? • Why is data preprocessing important? • Multi-dimensional measure of data quality. • Major tasks in data preprocessing. • Data quality. • Noise. • Outliers. | Reading chapter 2. |
3) | • Missing values. • Duplicate data. • Mining data descriptive characteristics. • Measuring the central tendency. • Symmetric vs. skewed data. • Properties of normal distribution curve. • Histogram analysis. • Positively and negatively correlated data. • Not correlated data. • Data cleaning. • How to handle missing data? • How to handle noisy data? • Simple discretization methods: Binning. • Regression. • Cluster analysis. • Data cleaning as a process. • Aggregation. • Sampling. • Types of sampling. • Sample size. | Reading chapter 2. |
4) | • Classification. • Illustrating classification task. • Examples of classification task. • Classification techniques. • Example of a decision tree. • Another example of decision tree. • Apply model to test data. • Decision tree induction. • Issues: data preparation. • Issues: evaluating classification methods. • Algorithm for decision tree induction (ID3/C4.5). • Attribute selection measure: Information gain. • Decision tree example. | Reading chapter 8. |
5) | • Numeric variables and missing values. • Overfitting and tree pruning. • Enhancements to basic decision tree induction. • Model evaluation. • Metrics for performance evaluation. • Limitation of accuracy. • Cost matrix. • Calculation of accuracy. • Cost-sensitive measures. • Model evaluation. • Methods for performance evaluation. • Methods of estimation. • ROC (Receiver Operating Characteristic). • Instance Based Classification. • Nearest neighbor classification. • k-Nearest neighbor algorithm example. | Reading chapter 8-9. |
6) | • What is cluster analysis? • Applications of cluster analysis. • What is not cluster analysis? • Notion of a cluster can be ambiguous. • Types of clustering. • Characteristics of the input data are important. • Clustering algorithms. • Hierarchical clustering. • Agglomerative clustering algorithm. • Cluster distance measures. • Single link (min) hierarchical clustering. • Single link (min) hierarchical clustering example. | Reading chapter 10. |
7) | • Complete link (max) hierarchical clustering example. • K-means clustering. • Importance of choosing initial centroids. • Limitations of k-means. • Overcoming k-means limitations. • K-means clustering example. | Reading chapter 10. |
8) | Midterm 1 | |
9) | • Association rule mining. • Frequent itemset. • Association rule. • Association rule mining task. • Apriori algorithm. • Apriori algorithm example. | Reading chapter 6. |
10) | • Statistical classification models. • Bayes theorem and classifier. • Bayes classifier example. • Continuous variables. | Reading chapter 6. |
11) | • Text and web mining. • Natural language processing. • Part-of-speech tagging. • Word sense disambiguation. • Text databases and IR. • Indexing techniques. • Types of text data mining. • Text classification. • Document clustering. • Text categorization. • Categorization methods. • Vector space model. | |
12) | Midterm 2 | |
13) | Project presentations | |
14) | Project presentations | |
15) | Final exam |
Course Notes / Textbooks: | Data Mining: Concepts and Techniques, J. Han and M. Kamber, Morgan Kaufmann Publishers, 2012, ISBN 978-0-12-381479-1 |
References: | Internet resources |
Learning Outcomes | 1 |
2 |
3 |
4 |
5 |
||||||
---|---|---|---|---|---|---|---|---|---|---|---|
Program Outcomes | |||||||||||
1) By carrying out scientific research in their field, graduates evaluate, interpret, and apply their findings in depth and breadth. | |||||||||||
2) Graduates have extensive knowledge about current techniques and methods applied in engineering and their limitations. | |||||||||||
3) Graduates can complete and apply knowledge using scientific methods with limited or incomplete data, and can use information from different disciplines together. | |||||||||||
4) Graduates are aware of new and evolving practices of their profession, examining and learning new knowledge as necessary. | |||||||||||
5) Graduates can define and formulate problems related to the field, develop methods to solve them and apply innovative methods in solutions. | |||||||||||
6) Graduates develop new and/or original ideas and methods; design complex systems or processes and develop innovative / alternative solutions in their designs. | |||||||||||
7) Graduates design and apply theoretical, experimental and model-based research; analyze and investigate the complex problems encountered in this process. | |||||||||||
8) Lead in multidisciplinary teams, develop solution approaches in complex situations, work independently and take responsibility. | |||||||||||
9) Graduates communicate verbally and in writing in a foreign language at a minimum of European Language Portfolio B2 General Level. | |||||||||||
10) Transfers the processes and outcomes of their work in a systematic and explicit manner, either written or verbally, in the national or international contexts of that area. | |||||||||||
11) Recognize the social, environmental, health, safety, legal aspects of engineering applications, as well as project management and business life practices, and are aware of the limitations they place on engineering applications. | |||||||||||
12) Consider social, scientific and ethical values in the collection, interpretation, announcement of data and in all professional activities. |
No Effect | 1 Lowest | 2 Low | 3 Average | 4 High | 5 Highest |
Program Outcomes | Level of Contribution | |
1) | By carrying out scientific research in their field, graduates evaluate, interpret, and apply their findings in depth and breadth. | |
2) | Graduates have extensive knowledge about current techniques and methods applied in engineering and their limitations. | |
3) | Graduates can complete and apply knowledge using scientific methods with limited or incomplete data, and can use information from different disciplines together. | |
4) | Graduates are aware of new and evolving practices of their profession, examining and learning new knowledge as necessary. | |
5) | Graduates can define and formulate problems related to the field, develop methods to solve them and apply innovative methods in solutions. | |
6) | Graduates develop new and/or original ideas and methods; design complex systems or processes and develop innovative / alternative solutions in their designs. | |
7) | Graduates design and apply theoretical, experimental and model-based research; analyze and investigate the complex problems encountered in this process. | |
8) | Lead in multidisciplinary teams, develop solution approaches in complex situations, work independently and take responsibility. | |
9) | Graduates communicate verbally and in writing in a foreign language at a minimum of European Language Portfolio B2 General Level. | |
10) | Transfers the processes and outcomes of their work in a systematic and explicit manner, either written or verbally, in the national or international contexts of that area. | |
11) | Recognize the social, environmental, health, safety, legal aspects of engineering applications, as well as project management and business life practices, and are aware of the limitations they place on engineering applications. | |
12) | Consider social, scientific and ethical values in the collection, interpretation, announcement of data and in all professional activities. |
Lesson | |
Project preparation |
Written Exam (Open-ended questions, multiple choice, true-false, matching, fill in the blanks, sequencing) | |
Individual Project |
Semester Requirements | Number of Activities | Level of Contribution |
Project | 1 | % 10 |
Midterms | 2 | % 50 |
Final | 1 | % 40 |
Total | % 100 | |
PERCENTAGE OF SEMESTER WORK | % 60 | |
PERCENTAGE OF FINAL WORK | % 40 | |
Total | % 100 |
Activities | Number of Activities | Workload |
Course Hours | 16 | 48 |
Project | 2 | 70 |
Midterms | 12 | 130 |
Final | 4 | 62 |
Total Workload | 310 |