Simon Couch - Practical AI for data science
Abstract: While most discourse about AI focuses on glamorous, ungrounded applications, data scientists spend most of their days tackling unglamorous problems with sensitive data. Integrated thoughtfully, LLMs are useful in practice for many everyday data science tasks, even when restricted to secure deployments that protect proprietary information. At Posit, our work on ellmer and related R packages has focused on enabling these practical uses. This talk will outline three practical AI use cases (structured data extraction, tool calling, and coding) and offer guidance on getting started with LLMs when your data and code are confidential.
Presented at the 2025 R/Pharma Conference Europe/US Track.
Resources mentioned in the presentation:
- {vitals}: Large Language Model Evaluations https://vitals.tidyverse.org/
- {mcptools}: Model Context Protocol for R https://posit-dev.github.io/mcptools/
- {btw}: A complete toolkit for connecting R and LLMs https://posit-dev.github.io/btw/
- {gander}: High-performance, low-friction Large Language Model chat for data scientists https://simonpcouch.github.io/gander/
- {chores}: A collection of large language model assistants https://simonpcouch.github.io/chores/
- {predictive}: A frontend for predictive modeling with tidymodels https://github.com/simonpcouch/predictive
- {kapa}: RAG-based search via the kapa.ai API https://github.com/simonpcouch/kapa
- Databot https://positron.posit.co/dat