RStudio Connect Admin Training Workshop (Virtual) for R/Pharma 2021
● Part 1: Get Set Up for Success
● Part 2: The Admin Experience
● Part 3: Preview of things to come
What is the purpose of RStudio Connect?
[Figure: publishing methods (push-button, CLI, git-backed), scheduling, resource control, organizational tags, collaborators]
All the work you do in R & Python: “Data Products”
○ Dash, Streamlit, Bokeh
○ R Markdown
○ Jupyter Notebooks
○ Static content: sites, plots, graphs
● Web APIs (RSC Standard & Enterprise)
○ Flask, FastAPI
○ Tableau Analytic Extensions: plumbertableau, fastAPItableau
○ Have all privileges, but must
explicitly grant themselves content
○ Actions are audited
○ Special access to the Admin tab and
certain content settings
○ Can upload new content items
○ Can see content items
● Content Collaborators
○ Can publish new versions
○ Manage settings
○ Download source bundles
● Content Viewers
○ Can only see and interact
with the content itself
RStudio Connect Overview
● Demo of Publishing Mechanisms
○ Push-button (RStudio IDE, Jupyter Notebooks)
○ Git-backed (manifest generation)
○ Programmatic (Azure DevOps example)
● Demo of Application Permissions Management
● Demo of Admin Dashboard Functionality
○ Process listing (updated)
○ Audit Logs
○ Unpublished Content
○ Scheduled Content Calendar
Supported Linux Distributions
● RHEL/CentOS 7 & 8*
● Ubuntu 18.04 LTS & 20.04 LTS
● SLES 12 SP5
● SLES 15 SP2 / openSUSE 15.2
*Distributions such as Rocky Linux and AlmaLinux can be used as long as they stay 1:1
binary compatible with RHEL 8. CentOS Stream is not supported by RStudio.
Reference Architectures: Single Server
Reference Architectures: Cluster
RStudio Connect & Docker
RStudio products are designed to
live on long-running Linux
servers. RStudio products are
entirely compatible with treating
a container like the underlying
Linux server to better encapsulate
dependencies and diminish server
In this model, each RStudio
product is placed in its own
long-running container and
treated as a standalone instance
of the product. Multiple
containers can be load-balanced
and treated as a cluster. These
containers can be managed by a
Kubernetes cluster, should you
There are some specific
considerations for running
RStudio products in containers,
which are detailed in this article.
Types of Evaluations
1. Not very useful to Admins: RStudio Hosted Evaluation
2. Useful but you have to DIY: 45-day Evaluation Key
Configuration from Scratch
Authentication Decision Making
Provider | | Authorization via Groups | Current-User
Password (built-in) | Yes (can also be disabled) | Groups must be managed locally in Connect | No
PAM | No | Groups must be managed locally in Connect | Yes (per-app basis)
LDAP/AD | No | Groups can come from the Provider or be local to |
SAML | No | Groups can come from the Provider or be local to |
OIDC - Google | No | Groups must be managed locally in Connect | No
OIDC - Others | No | Groups can come from the Provider or be local to |
Proxied through an | No | Groups can come from the Provider or be local to |
The Key to Configuration: Publisher Relations
The most successful RStudio Connect Installations require
open dialog between admins and publishers.
● Do you know who your Publishers are?
● Do your Publishers know who you are?
● Do you know what types of content they will be publishing?
● Do you know what versions of R and Python they'll need?
● Do you know how they plan to connect to data sources?
This is also your chance to make any Dev/SecOps policies and expectations known.
More Publisher Discussion Topics
Connect Access Restrictions
● Will you place limits on allowed viewership? (all, logged_in, acl)
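The three viewership levels named above can be pictured as a simple check. This is a minimal sketch of the idea only, not Connect's implementation: the function name `can_view` and its parameters are hypothetical, and it ignores groups, owners, and administrators.

```python
from typing import Optional, Set


def can_view(access_type: str, user: Optional[str], acl: Set[str]) -> bool:
    """Return True if `user` may view content with the given access type.

    `user` is None for an anonymous visitor. `acl` is the set of usernames
    explicitly granted access (only consulted for the "acl" level).
    """
    if access_type == "all":
        return True                               # anyone, no login required
    if access_type == "logged_in":
        return user is not None                   # any authenticated user
    if access_type == "acl":
        return user is not None and user in acl   # named users only
    raise ValueError(f"unknown access type: {access_type!r}")
```

Agreeing with Publishers on a default level up front avoids surprises when content is first shared.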
Vanity URL Management
● Will Publishers be allowed to set vanity URLs for content?
Content Organization: Tags
● Work with your Publishers to set up a Tag Schema for content organization
Resource Management & Budgeting
● R is single threaded
● Load balance between processes
● Application owners can set Minimum and Maximum processes
● Use Scheduler.MinProcessesLimit to cap resources if this
becomes a problem
● Scheduler.MaxProcessesLimit is also available
● The maximum amount of time to wait for an app to start
Scheduler.InitTimeout = 60s
● The minimum time to keep a worker process alive after it goes idle
Scheduler.IdleTimeout = 5s
After the last user disconnects from a process, RStudio Connect waits 5s before
that process is reaped.
You might want to increase Scheduler.IdleTimeout if you have a process that
is resource-intensive to start up.
● Applications.ScheduleConcurrency (default: 2)
● Maximum number of scheduled reports to run in parallel
● Setting this to zero will disable scheduled execution
This lets you control (throttle) scheduled content execution
● If all your publishers schedule reports to run at midnight, Connect
will iterate through them as quickly as possible.
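The throttling effect of a concurrency limit can be illustrated with a bounded worker pool. This is a sketch using Python's standard library, not how Connect schedules work internally; `render_report` and the report names are hypothetical stand-ins.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

SCHEDULE_CONCURRENCY = 2  # mirrors the default of 2 quoted above

_lock = threading.Lock()
_running = 0
peak_running = 0


def render_report(name: str) -> str:
    """Stand-in for rendering one scheduled report."""
    global _running, peak_running
    with _lock:
        _running += 1
        peak_running = max(peak_running, _running)
    time.sleep(0.05)  # pretend to do work
    with _lock:
        _running -= 1
    return name


# Six reports that are all scheduled for the same time.
reports = [f"report-{i}" for i in range(6)]

# At most SCHEDULE_CONCURRENCY reports run at once; the rest wait in a queue.
with ThreadPoolExecutor(max_workers=SCHEDULE_CONCURRENCY) as pool:
    finished = list(pool.map(render_report, reports))
```

After the run, `peak_running` never exceeds the limit, even though all six reports were due at once.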
Disk Usage Resource Management
Things Connect stores on disk:
● Content bundles (uploaded compressed bundles from users)
● Unzipped bundles for running applications
● Package cache
○ One copy of each version of each package specific to the R (or Python) version
● Metrics (RAM and CPU usage)
● R/Python process information/logs
Content Bundle Retention
Throttle the number of bundles retained for each content item
● Applications.BundleRetentionLimit (default 0, which retains everything)
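The effect of a retention limit can be sketched as follows. This assumes larger bundle IDs are newer and ignores details such as which bundle is currently active; `bundles_to_keep` is a hypothetical helper, not part of Connect.

```python
from typing import List


def bundles_to_keep(bundle_ids: List[int], retention_limit: int = 0) -> List[int]:
    """Return the bundle IDs that survive pruning, newest first.

    Assumes larger IDs are newer. A limit of 0 retains everything,
    matching the default described above.
    """
    newest_first = sorted(bundle_ids, reverse=True)
    if retention_limit <= 0:
        return newest_first
    return newest_first[:retention_limit]
```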
If you experience problems with large bundles:
● Ask publishers not to package large sets of data in the content bundle and
provision data on the server separately
Process Information Retention
● Maximum number of jobs preserved on disk for any one application:
○ Jobs.MaxCompleted (default: 1000)
● Maximum age of a completed job retained on disk:
○ Jobs.OldestCompleted (default: 30d)
On-disk job metadata is removed if either the MaxCompleted or
OldestCompleted restrictions are violated.
Adjust this retention window based on your auditing requirements.
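The way the two job limits combine can be sketched like this. It is an illustration of the rule as stated on the slide (a job is dropped if it breaks either limit), not Connect's actual cleanup code; `jobs_to_keep` and its parameter names are hypothetical.

```python
from datetime import datetime, timedelta
from typing import List


def jobs_to_keep(
    finished_at: List[datetime],
    now: datetime,
    max_completed: int = 1000,
    oldest_completed: timedelta = timedelta(days=30),
) -> List[datetime]:
    """Return completion times of jobs that pass both retention limits."""
    cutoff = now - oldest_completed
    recent = [t for t in finished_at if t >= cutoff]  # drop jobs that are too old
    recent.sort(reverse=True)                         # newest first
    return recent[:max_completed]                     # drop jobs beyond the count cap
```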
How will your publishers deploy to Connect?
Three ways to publish content to RStudio Connect:
Publishing methods for Code Promotion
Git-backed Code Promotion
Importance of an Environment Management Strategy
Environment management takes work. Here are some cases where the reward is
worth the effort:
● When you are working on a long-term project, and need to safely upgrade
● In cases where you and your team need to collaborate on the same project, using a
common source of truth.
● If you need to validate and control the packages you’re using.
● When you are ready to deploy a data product to production, such as a Shiny app, R
Markdown document, or plumber API.
Many organizations find value in hosting their own package repository.
Hosting an internal repository allows organizations to:
● Share and version their internal packages
● Access and govern packages from external sources
● Audit package use
Validated Environment Management
❏ Review the curated resources and recommendations for
Using R for Validated Work
❏ Can you recreate your environment?
❏ Can you trust the things in your environment?
❏ Learn about the Validated Environment Strategy
❏ Learn about Internal Package Repositories
Reproducibility & Environment Strategy Maps
To select a strategy, you need to answer two questions:
● Who is responsible for managing the environment?
● How open is the environment?
● Replace the RStudio logo and favicon with your own.
● Direct logged-in users to a landing page of your choice when they first enter
● Generate custom content landing pages with R code using connectwidgets.
● Customize what anonymous and logged-out users see when they visit your server.
● Control email settings such as sender display name, “from” address, sender
address headers, and subject prefix.
● Hide the Documentation tab from viewers.
Branding Configuration Settings
Email Customization Settings
Custom Landing Pages
Create a custom landing
page that all anonymous
or logged-out users will
Use the Server.LandingDir
configuration setting to specify the
path to a directory that contains
index.html and all assets (CSS,
Other Types of Custom Landing Pages
Landing Pages for Logged-in Users
● Server.RootRedirect (Default: the
Server.DashboardPath value) The URL logged-in
users will be redirected to when visiting the
public URL used to access the server.
● Server.DashboardPath (Default: "/connect")
The URL path name to be used where RStudio
Connect's dashboard is hosted.
One option for creating a
custom landing page is to make
a content showcase with the
connectwidgets R package.
Unsupported Customizations (November 2021)
● RStudio Connect dashboard color palette
● Hiding Tags from viewers
● Removal of footer text that says “Powered by RStudio Connect”
● Removal of RStudio copyright information
Special Considerations for Consultancies (External Users)
● Branding and Landing Page Customization
● Managing multiple clients
○ User Isolation: Authorization.ViewersCanOnlySeeThemselves, Server.HideEmailAddresses
○ Viewer Restrictions: Server.ViewerKiosk. When enabled, users with the viewer role will not be allowed to
submit permission requests for content access or to request elevated role privileges.
● Multiple authentication providers
○ Federated authentication: RStudio Connect will authenticate against an external identity provider
(usually via SAML), and the provider will federate identity management to all the different
Federated Identity Management
Golden Rules of RStudio Connect Configuration
❏ Check your configuration file: Is Server.Address set?
❏ Verify your email server configuration: Send a test email
❏ Maintain an open dialog with your publisher users
❏ Before you start publishing content:
❏ Make an informed decision about your authentication provider
❏ Make an informed decision about your package repository
❏ Life is better with Package Manager or an Internal Repository
● RStudio Connect uses the license-manager to determine if a valid
license is available:
sudo /opt/rstudio-connect/bin/license-manager status
● The Connect dashboard will display a notification to admins and
publishers when the license is within 15 days of expiration.
● You can disable this with Licensing.ExpirationUIWarning
● Adding Users
○ Accounts can be either created / pre-provisioned or auto-registered. Details and
capabilities differ by authentication provider.
○ Example: Server API driven user provisioning
● Locking Users
○ Forbids login and publishing
○ Removes user from your license count
○ Example: Server API documentation
● Removing Users
○ Last resort option
○ Could require content ownership migration
● Local Groups
○ Manage through the UI: “People” tab
○ Manage with the RStudio Connect Server API
○ Disable local group support with: Authorization.UserGroups (existing groups
must be removed)
● Remote Groups
○ Management is the responsibility of the external authentication provider
○ Group memberships are locally synchronized through successful login events
Note! Having a mix of Local and Remote groups on your server is not recommended.
Migrate completely from one mode to the other when making a change.
RStudio Connect API Keys
● Programmatically access content on RStudio Connect and use
the Server API
● API Keys are associated with users, not content
● Server API documentation
● Server API Cookbook
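A request that authenticates with an API Key might be built as follows. This is a sketch under assumptions: the `/__api__/v1/content` path and the `Authorization: Key ...` header format should be confirmed against the Server API documentation for your version, and the server URL and key are placeholders. The request is built but not sent.

```python
import urllib.request


def build_content_request(server_url: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for the content listing."""
    url = server_url.rstrip("/") + "/__api__/v1/content"
    # The key identifies a user, so the response reflects that user's permissions.
    return urllib.request.Request(url, headers={"Authorization": f"Key {api_key}"})


# Placeholder values; replace with your own server address and key.
req = build_content_request("https://connect.example.com/", "YOUR_API_KEY")
```

Passing `req` to `urllib.request.urlopen` would send it. Keep keys out of source control, for example by reading them from an environment variable.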
Setting up Programmatic Deployments
DEMO: Azure DevOps Pipelines for content deployments
● Publishing Methods Explained
● Publishing to RStudio Connect with GitHub Actions
RStudio Connect provides several methods for posting custom HTML
messages to the User Interface:
● Server.PublicWarning - Visible on the unauthenticated landing
● Server.LoggedInWarning - Visible above recent content when
○ Useful for things like scheduling maintenance windows
End of Support for Python 2 (January 2022)
Starting January 2022, RStudio Connect will no longer support Python 2.
Factors that have gone into our decision include the following:
● Python 3 is now widely adopted and is the actively-developed version of the
● In January 2021, the pip 21.0 release officially dropped support for Python 2.
● A large number of projects pledged to drop support for Python 2 in 2020 including
TensorFlow, scikit-learn, Apache Spark, pandas, XGBoost, NumPy, Bokeh,
Matplotlib, IPython, and Jupyter notebook.
Exercise: Use the RStudio Connect Server
API to audit the versions of R/Python in use
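One possible starting point for this exercise is a tally over content records that have already been fetched from the Server API. The field names `r_version` and `py_version` are assumed to match the content listing response, and the sample records are made up for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, Optional


def tally_versions(items: Iterable[Dict[str, Optional[str]]]) -> Dict[str, Counter]:
    """Count how many content items use each R and Python version."""
    tallies = {"r_version": Counter(), "py_version": Counter()}
    for item in items:
        for field in tallies:
            version = item.get(field)
            if version:  # skip items that do not use this language
                tallies[field][version] += 1
    return tallies


# Made-up records standing in for a parsed API response.
sample = [
    {"name": "app-a", "r_version": "4.1.1", "py_version": None},
    {"name": "app-b", "r_version": "4.1.1", "py_version": "3.8.10"},
    {"name": "api-c", "r_version": None, "py_version": "3.8.10"},
    {"name": "doc-d", "r_version": "3.6.3", "py_version": None},
]

versions = tally_versions(sample)
```

Sorting each counter with `most_common()` shows which versions carry the most content and which may be safe to retire.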
Other Server API Project Ideas
● Build a report that examines access control list details for each content item on your
RStudio Connect server Example
● Audit all the unpublished (orphaned) content items on your RStudio Connect
● Audit all the vanity URLs currently in use on your RStudio Connect server Example
● Audit all the tags currently in use on the server, and list all the tagged content items
Content Usage Data & Tracking
● Records information
about each visit and the
length of that visit
● Records information
about each visit: user,
Managing RStudio Connect Upgrades
● RStudio Connect versions are supported for 18 months
● We recommend upgrading at least once a year.
● Most upgrades should require less than five minutes unless
breaking changes have occurred in the interim and require
● Consult the release notes before undergoing an upgrade.
Performing an Upgrade
Download and run the installation script
The installation script works across all supported Linux distributions,
validates the GPG key of the downloaded package, and includes
support for offline use.
curl -Lo rsc-installer.sh https://cdn.rstudio.com/connect/installer/installer-v1.9.5.sh
sudo -E bash ./rsc-installer.sh 2021.10.0
RStudio Product Support
Submit a Support Ticket: https://support.rstudio.com/hc/en-us/requests/new
Generate a server diagnostic report:
If you are on RStudio Connect version 1.7.2 or later, run the following command on the
server and send us the output:
sudo /opt/rstudio-connect/scripts/run-diagnostics.sh /path/to/output/dir
RStudio Connect Investments
Short Term Future
Vision: Data scientists own the publication, execution, management, and distribution of their work in a
safe and sophisticated manner, fully sanctioned by their IT admins.
Strategic Goals: Increase the types of content available to share, improve content discovery and
management, and facilitate production deployments.
Administrators can enable
remote content execution on a
Kubernetes back-end while
maintaining easy self-serve
● Publishers are able to
drive viewer engagement
on their work
● Publishers can manage
● Feature parity for Python
● Extend Cloud Native
capabilities to ease
● Improvements to
BI Integration: Extend
Tableau dashboards with
R, Shiny and Python
Invitation to the Beta Program for Off-Host Execution
● Begins in December, runs until the GA launch in early 2022
● Beta will not have feature parity with RStudio Connect local execution
● A Kubernetes cluster where you have full cluster-admin privileges
● A PostgreSQL database that meets Connect’s requirements
● An NFS server that meets Connect’s shared storage requirements
● Willingness to provide feedback on the installation/configuration process
● Publishers who are willing to provide feedback