Thursday, October 15, 2009
CS 1004 – DATA WAREHOUSING AND MINING, APRIL / MAY 2008 |
PART A – (10 x 2 = 20 marks) |
1. Compare OLTP and OLAP systems. |
2. What is Data Warehouse Metadata? |
3. What is Dimensionality Reduction? |
4. What is Concept Description? |
5. List two interestingness measures for association rules. |
6. What are Iceberg queries? |
7. What is classification? |
8. What is cluster analysis? |
9. What is Web Usage Mining? |
10. What is Visual Data Mining? |
PART B – (5 x 16 = 80 marks) |
11. (a) Briefly compare the following concepts. Explain your points with an example. |
(i) Snowflake schema, fact constellation, starnet query model [Marks 5] |
(ii) Data cleaning, data transformation, refresh [Marks 5] |
(iii) Discovery-driven cube, multifeature cube, virtual warehouse [Marks 6] |
Or |
(b) What are the differences between the three main types of data usage: information
processing, analytical processing and data mining? Discuss the motivation behind OLAP
mining. [Marks 16] |
12. (a) For class characterization, what are the main differences
between a data cube based implementation and a relational implementation such as
attribute-oriented induction? Discuss which method is more efficient and under what
conditions this is so. [Marks 16] |
Or |
(b) (i) List and discuss the various data mining primitives. [Marks 8] |
(ii) With relevant examples, discuss the role of statistics in data mining. [Marks 8] |
13. (a) Explain, with an algorithm, how to mine single-dimensional Boolean association
rules from a transactional database. Give a relevant example. [Marks 16] |
Or |
(b) With an algorithm, explain constraint-based association mining. Give a relevant example. |
[Marks 16] |
14. (a) What are Bayesian classifiers? Explain in detail about: |
(i) Naïve Bayesian classification [Marks 8] |
(ii) Linear and multiple regression. [Marks 8] |
Or |
(b) Why is outlier mining important? Briefly describe the different approaches behind
statistical-based outlier detection, distance-based outlier detection and deviation-based
outlier detection. [Marks 16] |
15. (a) (i) What is multidimensional analysis? Discuss the same with an example. [Marks 6] |
(ii) Discuss how data mining is done in spatial databases. [Marks 10] |
Or |
(b) (i) Discuss data mining in multimedia databases. [Marks 10] |
(ii) What is time series analysis? Discuss the same with an example. [Marks 6] |