Data Mining and Knowledge Discovery in Databases

In this project option, you are expected to conduct a comprehensive literature search and survey,
select and study a specific topic in one subject area of data mining and KDD, and write a technical
paper on the selected topic entirely by yourself. The technical paper can be either a detailed,
comprehensive survey of a specific topic or a report of original research that you have carried out
yourself.

Requirements and Instructions for the Technical Paper:
1. The paper should state its objective clearly: the subject, scope, domain, and the goals to be
achieved.

2. The paper should address important, advanced, and critical issues in a specific area of data
mining and KDD. Your research paper should emphasize not only breadth of coverage, but also
depth of coverage in the specific area.

3. The research paper should present measurable conclusions and future research directions (this is
your contribution).

4. It may be beneficial to review or browse through about 10 to 15 relevant technical articles
before you decide on the topic of the research project.

5. The research paper should reflect the quality expected of academic research.

7. The paper should include an adequate abstract or introduction, and a reference list.

8. Please write the paper in your own words. If you use statements from other articles, cite them
and give the names of the references and the sources of the reference materials.

Suggested Topics for KDD Research (not limited to these)

Theory and Fundamental Issues in KDD:
Data and knowledge representation for KDD
Database Models for knowledge discovery and data mining
Definitions, formalisms, and theoretical issues in KDD
Fundamental advances in search, retrieval, and discovery methods
Modeling of structured, unstructured and multimedia data for KDD
Metrics for evaluation of KDD results
Probabilistic modeling and uncertainty management in KDD

Data Mining Methods and Algorithms:
Algorithms for learning classification rules, characteristic rules, association rules
Algorithms for association rule mining
Algorithms for clustering, prediction, etc.
Algorithmic complexity, efficiency and scalability issues in KDD
High dimensional datasets and data preprocessing
Parallel and distributed data mining techniques
Probabilistic and statistical models and methods in KDD
Supervised and unsupervised discovery and predictive modeling
Using prior domain knowledge and re-use of discovered knowledge
Measurement of rule interestingness and quality

KDD Process and Human Interaction:
Models of the KDD process
Methods for evaluating subjective relevance and utility
Data and knowledge visualization
Interactive data exploration and discovery
Privacy-preserving data mining and security

Applications of KDD in Business, Science, Medicine and Engineering:
Application of KDD methods for mining knowledge in text, image,
audio, sensor, numeric, categorical or mixed-format data, and semi-structured data
Big-data mining and data analytics
Mining multimedia, hypertext, spatial, and temporal databases
Mining bioinformatics data
Applications of KDD for semantic query optimization
Knowledge discovery and data mining tools
Resource and knowledge discovery using the Internet

Suggested Checklist for the Written Report:

Your written report should try to include the following items:

a. Introduction and objectives of the research.

b. Current state of the art and existing methodologies in the specific area.

c. Barriers, issues, and open problems in the area.

d. Existing, expected, or proposed solutions, methods, and algorithms, if any, at the time the
project is due.

e. Detailed, step-by-step examples to illustrate concepts, principles, theories, algorithms,
methodologies, etc.

f. Research results, if any, at the time the report is due.

g. Analysis and comparison of research methods, algorithms, and expected results, if any, at the
time the report is due.

h. Conclusions and future research directions, if any, at the time the report is due.

i. Reference list.

Please use recent sources. Also, try to stick to the
GSCIS Dissertation Guide and to IEEE or ACM journal articles.