NCTU Course Syllabus in Fall Semester, 2020
Course Name:
Smart Data Analytics (SDA) I
Instructors:
Class time:
Friday, 13:20-14:10, 14:20-15:10, 15:30-16:20.
Classroom:
A427, NCTU
Prerequisites:
Good programming skills on your own laptop (Mac preferred) and presentation skills in Keynote or PowerPoint (PPTX). Basic knowledge of applied multivariate statistical analysis, at the level of the following reference:
Härdle WK, Simar L (2019) Applied Multivariate Statistical Analysis, 5th ed., Springer Verlag, Heidelberg.
(https://www.springer.com/gp/book/9783030260057#otherversion=9783030260064)
Course content:
The evolution from analogue to digital technologies continues to dominate the attention of decision makers today. Many tools in industrial production processes have been automated or replaced by highly complex mechanisms with pre-programmed decision making. The shift to digital modes of operation increasingly shapes the lives of individuals, often in unexpected ways.
The Smart Data Analytics (SDA) course presents tools and concepts for unstructured data with a strong focus on applications and implementations. It presents decision analytics in a way that is understandable for non-mathematicians and for practitioners who face day-to-day number crunching and statistical data analysis. All practical examples can be recalculated and modified: the software and Quantlets are available at www.quantlet.de. The SDA course equips the practitioner with ready-to-use practical tools for smart data analytics.
Students gain insight into modern internet-based computational statistics methods. They are trained in practically relevant methods, data forms, and Gestalt. The use of GitHub and network techniques is taught. Hands-on, computer-oriented skills and the possibilities of empirical research are demonstrated. We present hands-on practical examples from finance, cryptocurrencies, and network analysis.
Data are everywhere, and the ubiquitous availability of huge amounts of data makes it necessary to develop smart data analytics. Out of the plethora of tools available across many scientific disciplines, this course offers the everyday data analyst easy access to all levels of analysis without requiring deep programming knowledge. SDA provides a wide variety of exercises. In addition, a full set of slides is provided, making it easier for participants to reanalyze the presented material.
The R and Python programming languages are becoming the lingua franca of computational data analysis. They are the common smart data analysis software platforms used inside corporations and in academia. Both are OS-independent, free, open-source programs that are popularized and improved by hundreds of volunteers all over the world.
SDA I in the fall semester covers Units 1-4. SDA II in the spring semester covers Units 5-8.
Unit 1 | What do we see? | ● Basic concepts ● Data Management ● Structuring Data elements
Unit 2 | Data Analysis | ● Sentiment extraction ● Stemming, lemmatizing ● DTM (Dynamic Topic Modeling)
Unit 3 | Modern Data Analysis | ● Cluster Analysis and Classification ● Understanding Crypto Currencies ● CRIX, a CRypto currency IndeX
Unit 4 | Modern Data Analytics | ● R and Python tools ● Text mining and scoring ● Applications & Empirics
Unit 5 | Smart Data Analytics | ● Network Centrality, Herding effects ● LSTM Neural Networks ● SVMs and Probability of Default
Unit 6 | Smart Data Analytics | ● Financial Risk Meter ● Scagnostics ● Hierarchical Clustering
Unit 7 | Very Smart Data Analytics | ● Fraud and scam detection ● Options on cryptos ● LDA (Latent Dirichlet Allocation)
Unit 8 | We do Smart Data Analytics | ● Machine learning in Economics ● Deep Learning of Forecasts ● Generalized Random Forests
References:
Franke J, Härdle WK, Hafner C (2019) Statistics of Financial Markets: An Introduction. 5th ed., Springer Verlag, Heidelberg.
(https://www.springer.com/gp/book/9783030137502)
Härdle WK, Simar L (2019) Applied Multivariate Statistical Analysis, 5th ed., Springer Verlag, Heidelberg.
(https://www.springer.com/gp/book/9783030260057#otherversion=9783030260064)
Chen CYH, Härdle WK, Overbeck L (2017) Applied Quantitative Finance. 3rd extended ed., Springer Verlag, Heidelberg.
(https://www.springer.com/gp/book/9783662544853)
Härdle WK, Okhrin O, Okhrin Y (2017) Basics of Computational Statistics, Springer Verlag, Heidelberg.
(https://link.springer.com/book/10.1007/978-3-319-55336-8)
Härdle WK, Lu HHS, Shen X (eds) (2018) Handbook of Big Data Analytics, Springer Verlag, Heidelberg.
(https://www.springer.com/gp/book/9783319182834)
All examples are presented in R or Python.
The Quantlets are available here:
www.quantlet.de
The CRIX is here:
thecrix.de
The FRM (Financial Risk Meter) is here:
https://firamis.de/frm/