Introduction to Computer Science for Economists

1. Description

This course provides an introduction to computer science concepts and techniques that are relevant to economists. The course aims to equip students with the fundamental knowledge and skills needed to apply computational methods to economic analysis, modeling, and data analysis. It provides the prerequisite knowledge for students who want to follow specialized courses in data science and machine learning. The course focuses on the following topics:

  1. Introduction to programming. The course introduces students to programming concepts and practices, including data types, control structures, and functions. The focus is on programming in R, a widely used language in data science, statistical learning, and computational economics.
  2. Data analysis. The course covers techniques for data analysis, including exploratory data analysis, statistical inference, and data preprocessing and feature engineering for machine learning, using libraries such as dplyr and TensorFlow.
  3. Visualization. The course introduces students to the basic concepts of graphics and visualization, including commonly used types of figures, annotations, and labeling, using the ggplot2 library.

2. Objectives

The course's first objective is to introduce students to programming with R. It does so by guiding the students in exploring popular R libraries that are commonly used in data analysis, visualization, and statistical learning. On successful completion of the course, students will be able to

  • understand and apply programming concepts and practices to solve economic problems,
  • analyze and interpret economic data using appropriate tools and techniques, and
  • apply common visualization techniques to economic data.

Second, the course aims to teach students to work effectively in teams on economic data science projects and to communicate technical ideas clearly and concisely. To emulate the conditions of collaborative data science projects, the course gives students coding challenges and exercises with economic data that they solve in groups.

Finally, the course aims to enable the students to independently enrich and expand their knowledge of R statistical software and its libraries after completing the course. To this end, the course provides a comprehensive overview of the open-source R statistical software ecosystem.

3. Prerequisites

The course is intended for first-year post-graduate students in economics who want to specialize further in computational economics, data science, and economic applications of machine learning. Programming with R is introduced from scratch, and no previous experience with the language is required. Experience with other languages such as Python can be helpful, but it is not a prerequisite. Knowledge of basic statistics, however, is a prerequisite. Students are expected to know the usual statistical measures of central tendency and dispersion, histograms, and the regression concepts introduced in typical statistics and econometrics courses of B.Sc. programs in economics or business.

4. Reading Material


The two most important sources for the course are [RDS] and [ISLR]. The first book is more appropriate for an introductory level. The second book covers both introductory and intermediate topics. A third textbook, [DLR], is more advanced and requires some programming experience and a deeper understanding of the R programming language.

[RDS] Wickham, Hadley, Mine Çetinkaya-Rundel, and Garrett Grolemund. 2023. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. 2nd ed. O’Reilly Media.

A free book with a modern introduction to R with an emphasis on the most prevalent data science activities.

[ISLR] James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani. 2014. An Introduction to Statistical Learning: With Applications in R. Springer.

An intermediate book with a comprehensive hands-on approach to statistical learning with R.

[DLR] Chollet, François, Tomasz Kalinowski, and J. J. Allaire. 2022. Deep Learning with R. 2nd ed. Manning.

Deep learning for classification, computer vision, and time series applications using Keras and TensorFlow in R.

The discussion topics of the course are available online on the course's blog. The blog contains interactive components (figures, videos, etc.) that can be used when studying. The blog also has indices for all the concepts, learning activities, and applications of the course, with links to the sections of the material where they are introduced.

The slides of the course are available online. The content of the slides is a subset of the content available on the blog. The slides are optimized for presenting the course's main ideas in the lectures and are not meant to be used as reading material. Instead, students can use either the blog's topics or the handout to revise what we cover in the lectures.

A cumulative version of the course's material is available as an online handout on a single web page (loading all the material can take a while on older cell phones or tablets). Those who prefer PDF material can download the handout in PDF format from the Downloads section of the course's blog, though this comes at the expense of figure interactivity.

5. Organization

The course discusses the following topics in more detail.

  1. Preamble (1.5 hours)
    1. Introduction to programming concepts and practices
    2. Overview of the R programming language
    3. Why R?
    4. A preview of the R ecosystem. What can one do with R?
    5. Setting up the development environment
  2. Introduction to Programming (2.5 hours)
    1. Data types and operators
    2. Control structures (if-else, loops)
    3. Functions and libraries
    4. Lists and vectors
    5. Input-output operations
    6. Group coding challenge
    7. Best practices in R programming
  3. Data Analysis and Transformation (4 hours)
    1. Introduction to data analysis and transformation
    2. Overview of the dplyr library
    3. Reading and writing data
    4. Data manipulation and transformation
    5. Aggregation and grouping operations
    6. Merging and joining data
    7. Group data analysis exercise
    8. Best practices in data analysis
  4. Data Visualization (3.5 hours)
    1. Introduction to data visualization
    2. Overview of the ggplot2 library
    3. Basic plotting with ggplot2
    4. Aesthetics
    5. Coordinate systems and geometric transformations
    6. Annotations and labels
    7. Group data visualization exercise
    8. Best practices in data visualization
  5. Epilogue (0.5 hours)
    1. Where to go from here?
    2. A preview of statistical and machine learning with R

6. Evaluation

There is no final exam for the course. Throughout the course, students will have the opportunity to apply their newly acquired knowledge through programming exercises and short data analysis projects. The course emphasizes hands-on learning in teams and practical problem-solving.

Introduction to Computer Science for Economists, Summer Semester 2024

Pantelis Karapanagiotis

karapanagiotis [at] ebs [dot] edu

This version was created on 2023-09-26 Tue 14:28