R is an open-source statistical language designed with a focus on data analysis. While its historical roots are in statistical applications, it is currently experiencing rapid growth in popularity in all fields where data matters: from data science, through bioinformatics and finance, to machine learning. Key strengths contributing to this growth include its rich library ecosystem (over 6 thousand packages at the time of writing), often authored by leading researchers in their fields and providing early access to the latest techniques, and its beautiful, high-quality visualizations, which support seamless exploratory data analysis and produce stunning presentations. All of this is available in an interactive environment, resulting in high productivity through fast iteration times.
At the same time, there are no free lunches in programming: the dynamic, interactive nature of R does have its costs, including a significant impact on run-time performance. In an era of growing data sizes and increasingly realistic models, this concern is only becoming more important.
In this talk we provide an introduction to Rcpp, a library that allows smooth integration of R with C++, combining the productivity benefits of R for data science with the performance of C++. First released in 2005, today it’s the most popular language extension for R, used by over 400 packages. We’ll also discuss the challenges (as well as possible solutions) involved in integrating modern C++ code, and demonstrate the use of popular C++ libraries in practice. We’ll conclude the talk with the RInside package, which allows embedding R in C++.