Using Python and R in VS Code Like a Data Scientist

Embracing the Python and R communities together requires carrying our mindset and skills smoothly from one community to the other; I believe one can get the best of both worlds in VS Code.

In the past, I tried either calling Python functions from an R environment or calling R functions from Python. I did not find a good package that manages this well. I believe the best way of leveraging both Python and R is to share data via IO.
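Here is a minimal sketch of what sharing data via IO could look like on the Python side, assuming a plain CSV file is the exchange format. The file name, column names, and values are hypothetical, and the helpers `write_table` and `read_table` are names I made up for illustration.

```python
import csv
import tempfile
from pathlib import Path


def write_table(path, rows, fieldnames):
    """Write a list of dicts to a CSV file that R can load with read.csv()."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


def read_table(path):
    """Read a CSV file, for example one written in R with
    write.csv(df, path, row.names = FALSE)."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


# Hypothetical data and file name, for illustration only.
rows = [
    {"ticker": "AAA", "ret": 0.012},
    {"ticker": "BBB", "ret": -0.004},
]
shared = Path(tempfile.mkdtemp()) / "returns.csv"
write_table(shared, rows, ["ticker", "ret"])

# On the R side, the same file could be loaded with: df <- read.csv("returns.csv")
loaded = read_table(shared)
print(loaded[0]["ticker"], float(loaded[0]["ret"]))
```

CSV keeps the example dependency-free; values come back as strings on the Python side, so numeric columns need an explicit conversion after reading.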

Best of both worlds

Many years ago, I spent nights and nights googling whether I should learn R or Python first. Since I studied business and finance as my first degree, R became my first choice. Later, I switched to Python when I studied mathematics and got into the world of machine learning.

This year, a project I am working on requires me to use Python and R at the same time. I have already written a post about setting up the environment with Docker. In this post, I will share some tips on how I leverage the best of both.

Pure scripts

Jupyter Notebook used to be my top choice for quick data analysis. However, I found it distracting to jump among different code chunks and markdown blocks. It also becomes quite slow once your analysis grows through many iterations. I think the most valuable thing that Jupyter provides is the kernel, which allows you to run code interactively.

Thanks to VS Code, we can run our code interactively from a plain script. This lets me write my thoughts down purely as a script while keeping the analysis attached to it and flowing dynamically.

Figures 1 to 3 show what I mean by the above statement (please zoom in on the images to appreciate the implications).

Figure 1. Programming in a Python script and running the code from the command line.
Figure 2. Programming in a Python script and running the code in the interactive window, with the #%% marker delimiting code chunks.
Figure 3. Programming in an R script and running the code interactively from a Jupyter kernel, which needs a shortcut setup (see the next section).
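To make the #%% idea concrete, here is a minimal sketch of a script split into code chunks. The data values and variable names are hypothetical. Run from the command line, the file behaves like any other script; in VS Code, each chunk can be sent to the interactive window on its own.

```python
# %% Load data (hypothetical values, for illustration only)
import statistics

returns = [0.012, -0.004, 0.007, 0.001]

# %% Summarise
mean_return = statistics.mean(returns)
spread = statistics.stdev(returns)

# %% Report
summary = f"mean={mean_return:.4f}, sd={spread:.4f}"
print(summary)
```

Because the markers are ordinary comments, the script stays a plain .py file that works with version control and the command line.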

Pure scripts for R too

To make the feature shown in Figure 3 work, you need to set up a shortcut in VS Code. Just add the following keybinding to your keybindings.json.

Remark: if it does not work, try updating VS Code, and when you reopen the editor, select the Python interpreter first, because sending commands to the interactive window needs a Python interpreter as the macro environment. I have tested this on macOS and on Windows with a remote Linux server.

[
    {
        // I use shift and space to run 
        "key": "shift+space",
        // only run selection
        "command": "jupyter.execSelectionInteractive",
        // run whole file 
        // "command": "jupyter.runFileInteractive",
        "when": "editorTextFocus && editorLangId == 'r'"
    }
]
Figure 4. Instructions for finding the keyboard setup.

Using GNU Make to manage the workflow

(Baker, 2020)

  1. Baker, P. (2020). Using GNU make to manage the workflow of data analysis projects. Journal of Statistical Software, 94, 1–46.