Thursday, October 15, 2009
| CS 1004 – DATA WAREHOUSING AND MINING, APRIL / MAY 2008 |
| PART A – (10 x 2 = 20 marks) |
| 1. Compare OLTP and OLAP systems. |
| 2. What is Data Warehouse Metadata? |
| 3. What is Dimensionality Reduction? |
| 4. What is Concept Description? |
| 5. List two interestingness measures for association rules. |
| 6. What are Iceberg queries? |
| 7. What is classification? |
| 8. What is cluster analysis? |
| 9. What is Web Usage Mining? |
| 10. What is Visual Data Mining? |
| PART B – (5 x 16 = 80 marks) |
| 11. (a) Briefly compare the following concepts. Explain your points with an example. |
| (i) Snowflake schema, fact constellation, starnet query model [Marks 5] |
| (ii) Data cleaning, data transformation, refresh [Marks 5] |
| (iii) Discovery-driven cube, multifeature cube, virtual warehouse [Marks 6] |
| Or |
| (b) What are the differences between the three main types of data usage: information |
| processing, analytical processing and data mining? Discuss the motivation behind OLAP |
| mining. [Marks 16] |
| 12. (a) For class characterization, what are the main differences |
| between a data cube based implementation and a relational implementation such as |
| attribute-oriented induction? Discuss which method is most efficient and under what |
| conditions this is so. [Marks 16] |
| Or |
| (b) (i) List and discuss the various data mining primitives. [Marks 8] |
| (ii) With relevant examples discuss the role of statistics in data mining. [Marks 8] |
| 13. (a) Explain, with an algorithm, how to mine single-dimensional Boolean association |
| rules from a transactional database. Give a relevant example. [Marks 16] |
| Or |
| (b) With an algorithm, explain constraint-based association mining. Give a relevant example. |
| [Marks 16] |
| 14. (a) What are Bayesian classifiers? Explain in detail about: |
| (i) Naïve Bayesian classification [Marks 8] |
| (ii) Linear and multiple regression. [Marks 8] |
| Or |
| (b) Why is outlier mining important? Briefly describe the different approaches behind |
| statistical-based outlier detection, distance-based outlier detection and deviation-based |
| outlier detection. [Marks 16] |
| 15. (a) (i) What is multidimensional analysis? Discuss the same with an example. [Marks 6] |
| (ii) Discuss how data mining is done in spatial databases. [Marks 10] |
| Or |
| (b) (i) Discuss data mining in multimedia databases. [Marks 10] |
| (ii) What is time series analysis? Discuss the same with an example. [Marks 6] |



