Resources

Web API Updates for R | RStudio Webinar - 2017

This is a recording of an RStudio webinar. You can subscribe to receive invitations to future webinars at https://www.rstudio.com/resources/webinars/ . We try to host a couple each month with the goal of furthering the R community's understanding of R and RStudio's capabilities. We are always interested in receiving feedback, so please don't hesitate to comment or reach out with a personal message.

Transcript

This transcript was generated automatically and may contain errors.

Thank you, Anne, for that intro. I appreciate it. As she mentioned, I'm an engineer at RStudio, and this is going to be an expanded version of the talk I gave at RStudio Conf, so I can provide more detail on working with APIs in R. We will begin with an overview, a quick review of the Web API basics, move on to tools for accessing those APIs. I have a couple of examples prepared and a practical application that we have built in-house using these techniques, and at the end, I'll give you some resources for finding out more information if you're interested in expanding your knowledge in this area.

Web API basics

So, API basics. There is a lot of data accessible over the Internet these days, and a lot of it is made available via APIs, which are application programming interfaces. These allow programs to talk to each other based on a communication contract: if you ask me a question in a way I understand, I will give you an answer. There are two important parts to HTTP communication over the Web: the request, which is data sent to the server from the client, and the response, which is the data that the server sends back to that client.

Now, this is a very oversimplified view of the client-server communication, but the basics of it are ask a question and get a response, and this is predicated on the fact that the client and server need to understand each other's language, so to speak. The client can only send a request to the server that the server knows how to handle, and then the server will send that response back to the client, and the client will need to know how to parse that response to get the data out based on the question that was asked.

Web APIs usually provide read and/or write access to data stores, and they allow you to access data on an ongoing basis. This is significant since it allows you to write scripts that poll an API to get the most up-to-date data. Not only will your analyses be current, but when the data changes, your scripts don't have to. Depending on how you design that script, data frames and plots will update automatically when your data changes without you having to download a new CSV or import data manually.

So as an example of this request-response, I've put down here a curl request. curl is a utility in the Unix world that allows you to make HTTP requests. In this case, I'm calling the OMDb API, the Online Movie Database API, and I'm giving it a request with two query parameters, t for title and r for response format. So in this case, I'm saying: OMDb, give me the data for the movie named Clue, and send it back to me in JSON format. The server understands this request because you've put it into the format that it expects, and so it sends a response that contains the data you asked for: the title, the year, the rating, et cetera. Again, this is a very simplistic example, but it gives you an idea of how simple it is to ask a question, receive an answer, and pull that data into the scripts that you're writing every day.
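The same request can be sketched from R using httr, which is introduced later in the talk. Note one assumption: the OMDb API now requires an API key, so the apikey parameter and its placeholder value are not part of the original slide.

```r
library(httr)

# Equivalent of: curl "http://www.omdbapi.com/?t=Clue&r=json"
# OMDb now requires a (free) API key; "YOUR_KEY" is a placeholder.
resp <- GET("http://www.omdbapi.com/",
            query = list(t = "Clue", r = "json", apikey = "YOUR_KEY"))
content(resp, as = "parsed")  # a list with Title, Year, Rated, and so on
```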

Web APIs are organized into resources that are often in a particular hierarchy. In this example, there's a map that if you follow down from the top, you can get information about accounts, a particular account by its ID, the lists that a particular account has, the ID of those lists, the campaigns, the subscribers, the web forms, et cetera. So these are often organized in maps or webs of information, and if you know how to get to those different levels of information, you can construct a URL and ask for very specific things all the way down at the bottom, like clicks and tracked events in this case.
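To make that hierarchy concrete, here is a small sketch of how you might build URLs for deeper and deeper resources. The host and resource names are hypothetical, standing in for the map on the slide:

```r
# Hypothetical API: each level of the hierarchy extends the URL path.
base    <- "https://api.example.com"
account <- paste(base, "accounts", "42", sep = "/")   # one account, by ID
lists   <- paste(account, "lists", sep = "/")         # that account's lists
clicks  <- paste(lists, "7", "clicks", sep = "/")     # clicks for list 7
clicks  # "https://api.example.com/accounts/42/lists/7/clicks"
```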

Now, that presumes that you understand a little bit about the API that you're trying to use. Fortunately, most of them have decent documentation. This example is from the Star Wars API. You can see across the left the kinds of information you can learn from this documentation. There's a getting started guide here on the first page, but the things that you need to understand are the resources towards the bottom, so that you know what you can get and how to get it. For example, this being the Star Wars API, you can get information about people, films, starships, planets, vehicles, et cetera. This particular documentation is very good: it'll tell you exactly what URL to use to get to that information, and it'll give you an example response as well. So it will tell you how to ask the question and what format the answer will be in, so that you can then parse it appropriately in your script.

Tools for accessing APIs in R

So to access these APIs and pull this information into R, you have a couple of different options. In some cases, people have already written packages that wrap those API calls for a given service. The examples I've provided here are aws.s3, for S3 buckets in the Amazon Web Services universe; RGoogleAnalytics, which will get you information about Google Analytics based on the ID code that you have for your website; acs, for American Community Survey data; et cetera. So some people have already wrapped these APIs into packages that are very easy to use. If you can use them, by all means do; they make your life a lot easier. You do need to be aware that the people writing them may not have the same goals as you do: they've written a package to solve their problem and then made it public. So if you have other questions that you need to ask, or if there are particular endpoints that aren't covered in that package, you may need to write that yourself. Or there could be no package for the data source that you're trying to reach, in which case you'll write it yourself with some other tools.

I recommend httr (pronounced "hitter") for making the requests, and then jsonlite or xml2 for parsing the response. The most common response formats for API calls are JSON and XML, and these two packages are extremely good at parsing that data for you. Once you have the data in your script, you can wrangle it the way you normally do, using the various packages in the tidyverse to get your data into an orderly rectangle so that you can use it for your analyses and for your plots.

So httr's request functions wrap the HTTP verbs, and fortunately the names match, so they're very easy to figure out. GET is the one you'll probably be using most commonly if you're interested in getting information from the web into your scripts. POST is to write back to a data store of some kind. PATCH and PUT are for updating. DELETE is fairly obvious. And HEAD is identical to GET but without the body: you'd just be getting the metadata around the request without getting the body data. This can be useful as you're developing, to understand everything around the data itself in a much cheaper call.

So to make a request, you first load httr with the library() function, then call GET() with the URL. In this case I'm using the Star Wars API, calling planets with ID one. So what I'm trying to do here is get information about planet number one in the Star Wars API database. This will give you a response object, and if you print that out, you get some really useful information: the actual URL that was used after any redirects, the HTTP status, the file or content type, and the size of that response.
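In code, that first request looks something like this. The talk predates the Star Wars API moving hosts, so the swapi.dev URL here is an assumption:

```r
library(httr)

# Ask the Star Wars API for planet number one.
resp <- GET("https://swapi.dev/api/planets/1/")
resp  # printing shows the final URL, status, content type, and size
```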

So you can use various helpers in httr to dig into the response object. For example, the status_code() helper will pull out just the code itself. 200 is what you want to see; that means OK, successful. Anything else may be a problem that you would need to handle in your code. Using the headers() helper will get you everything, all the metadata that came through in the header of the response: date, content type, connection, transfer encoding, et cetera. And then what most people are interested in is the body itself. This is the content of the response, and you would use the content() function from httr to pull that out. Here I'm just looking at the structure of it, so you can see that it is a list and the various amounts of data that come back. You'll see some of it is nested, and some of it is URLs, so you're going to need to understand all this in order to handle it appropriately in your scripts.
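Those helpers look like this in practice, again against the planet-one endpoint (host assumed, as above):

```r
library(httr)

resp <- GET("https://swapi.dev/api/planets/1/")

status_code(resp)   # just the numeric HTTP status; 200 means success
headers(resp)       # all response metadata: date, content-type, and so on
str(content(resp))  # the body, parsed into a (partly nested) list
```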

Handling the HTTP response

So there are three main parts that you'll probably be most interested in in your HTTP response. The first is status. You can get the full status information with the http_status() function; it will tell you everything that it possibly can about the status. In most cases, though, you're only interested in the code, and the status_code() helper will come back with just the number, so you can make decisions in your code based on that number. You can automatically throw a warning or raise an error if the request did not succeed, if the status was not 200, with the warn_for_status() and stop_for_status() helpers in the httr package. I highly recommend these. They convert HTTP errors into R errors, making them much easier to handle in your script, and they'll either warn or stop, just as their names imply.
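Here is a sketch of how those status helpers fit into a script. The nonexistent-resource URL is just a convenient way to provoke an error, and the host is assumed:

```r
library(httr)

resp <- GET("https://swapi.dev/api/planets/999999/")  # likely a 404

# stop_for_status() turns an HTTP error into a regular R error,
# which you can then handle with the usual tools.
tryCatch(
  stop_for_status(resp),
  error = function(e) message("Request failed: ", conditionMessage(e))
)

warn_for_status(resp)  # same check, but a warning instead of an error
```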

The second of the components that you'll want to handle are the headers themselves. Date, content type, connection, allow, things like that. There can be custom headers in here. It really depends on the API that you're working with. The one that you'll probably be using most often is the content type. Most APIs have a pretty consistent content type in their response. In this case, they're using JSON. So you may only have to worry about this once. But if you're calling an unfamiliar API or you're calling an API that changes frequently, it may behoove you to take a look at that content type to make sure that what you're parsing in your script matches what you're being sent.

You can also get cookies if you're interested; it really depends on your API and what you're trying to get out of it. But the thing, again, that most people are interested in is the body. You use the content() function in httr to get the response body: you pass it your response object and a modifier. In this case I want a character vector, so I'm going to pass in the word "text", and it'll give me back all of that information that we just saw earlier in the structure output. If you have non-text responses, you can get the raw content with the "raw" modifier. Or you can give it "parsed": httr has some default parsers for common file types, including JSON, XML, and a couple of others. This can be really useful if you want a quick and dirty look at the data, but honestly, I tend to want to do the parsing explicitly, so I don't use this very often. It's there if you want it, though.
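The three modifiers side by side, as a sketch (host assumed):

```r
library(httr)

resp <- GET("https://swapi.dev/api/planets/1/")

content(resp, as = "text")    # the body as a character vector
content(resp, as = "raw")     # the body as raw bytes, for non-text responses
content(resp, as = "parsed")  # httr's built-in parser (JSON here)
```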

Star Wars API examples

So let's turn now to some examples from the Star Wars universe. I'm going to be working with the Star Wars API in these examples, and we're just going to walk through some of the basics of how to deal with APIs in R. The first thing I'm going to do is pull in httr; jsonlite, because I know that the response is going to be in JSON; and magrittr, for piping, to make the code easier to read as I'm passing this data around. And our first goal is to get the data for the planet Alderaan.

So the components of this request in httr are the same as they are in HTTP. You need a verb or a method, in this case GET; a URL endpoint to hit, in this case planets; and a parameter, in this case search. Not all API endpoints need parameters, but here I'm explicitly going to do a search, and in this API the key-value pair is search plus the text that you want to search on. So if I run this, you'll see that the httr method GET is in use. It's very straightforward and very readable, and I can take a look at that in a second. But I also wanted to point out here that there are different ways to do your parameters. If you have many parameters, say six, this URL gets crazy long and hard to read. So there's another format you can use with GET() where you pass in a query list and give it your key-value pairs in that list. We don't need to do that here since I've only got one, but it's worth pointing out, especially if you're looking for very detailed data that takes a lot of parameters in the search query or in the set of parameters you're sending to the API.
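Both parameter styles, sketched against the (assumed) swapi.dev host:

```r
library(httr)

# One parameter inline in the URL...
resp <- GET("https://swapi.dev/api/planets/?search=Alderaan")

# ...or as a query list, which stays readable with many parameters.
resp <- GET("https://swapi.dev/api/planets/",
            query = list(search = "Alderaan"))
```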

So now that I have that response, I'm going to take a look at its names. I've got URL, status code, headers, cookies, the content itself, the request, and things like that. So I'm getting back a lot of information from this API, which is great. I'm going to take a look specifically at the status code: it's a 200. Excellent. That means it succeeded. And then I'm going to take a look at the headers, specifically the content type, and it's JSON, which is what we expect. Again, if you're looking at an API that you don't know very well, these fields can help you quite a bit in terms of deciding how to handle the data that comes back.

Now, if I want the text of the response, I'm going to use the content() function in httr, pass it the object I just created, tell it to give me text, and specify the encoding. httr will default to UTF-8, but it throws a little warning, so I usually just go ahead and put it in there explicitly. If you take a look at the resulting object, it gives you a count, a next, a previous, and the results themselves, and this contains the data that you're interested in for the most part. But it's not very easy to read, and it's not very easy to get that data back out; it's not very R-formatted at the moment. So we're going to need to parse this.
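Pulling the body out as text, with the encoding given explicitly (host assumed, as before):

```r
library(httr)

resp <- GET("https://swapi.dev/api/planets/",
            query = list(search = "Alderaan"))

# Explicit encoding avoids httr's "defaulting to UTF-8" message.
json_text <- content(resp, as = "text", encoding = "UTF-8")
```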

You can do so with httr, as I mentioned earlier, by passing that into the content() function with the "parsed" modifier. I'm going to go ahead and run that and then take a look at it. It's easier to read than the raw text, but it's not necessarily easy to get to some of the information in the results themselves. The count is pretty straightforward, and the results, if you take a look at the structure, are what we would expect; it all looks fine. You've got some nested elements here under residents and films, and those are actual links, which makes it really easy to get that data programmatically. So it's all there, and it's all usable, and it's great. But the code to access this stuff is a little bit ugly. It's not the end of the world, and I've seen worse, but it's not always that easy to figure out how that structure works.

So instead, I use the jsonlite package to parse this. I'm going to take that text content that we had up here, which you can see down here in the console, and pass it to the fromJSON() function in jsonlite. I just ran that code, and then we're going to look at the object. It, again, gives you the count and the next and the previous, and we'll talk about that in a second. But the results are much easier to handle, in my opinion, certainly much easier than the nested list.
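jsonlite's fromJSON() simplifies the results into a data frame by default. Here is a sketch using a trimmed-down literal standing in for the text SWAPI returns, so it runs without a network call:

```r
library(jsonlite)

# A trimmed-down stand-in for the JSON text the Alderaan search returns.
json_text <- '{"count": 1, "next": null, "previous": null,
               "results": [{"name": "Alderaan", "climate": "temperate"}]}'

parsed <- fromJSON(json_text)
parsed$count          # 1
parsed$results        # a data frame, one row per matching planet
parsed$results$name   # "Alderaan"
```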

So now that you've got that parsed JSON content, I'm going to look specifically at the results and get the names. And now it's a standard rectangular object that we can deal with a lot more easily. When you want to pull that data out, this code is a lot easier to work with, and a lot easier to read, in my opinion, than the nested-list version. So I do tend to use jsonlite just to make my life easier.