
Davis Vaughan | It's about time | RStudio (2022)
Dealing with date-times is hard. Dealing with date-times without the proper tooling is even harder! clock is an R package that aims to provide comprehensive and safe handling of date-times. It goes beyond the date and date-time types that base R provides, implementing new types for year-month, year-quarter, ISO year-week, and many other date-like formats, all with up to nanosecond precision. In this talk, you'll see how clock emphasizes "safety first" when manipulating date-times, and how these new date-time types can be used in your own work. Talk materials are available at https://speakerdeck.com/davisvaughan/2022-rstudio-conf-its-about-time Session: Lightning Talks
Transcript
This transcript was generated automatically and may contain errors.
I am here to talk about time, which is obviously everyone's favorite subject. In particular, I'm actually here to talk about a package called clock. So clock is a date-time manipulation library, kind of in the same way that lubridate is a date-time manipulation library. It does the things you might expect: add dates, subtract dates, format and parse them, all kinds of other manipulation.
But if you get anything out of this talk, it's really that clock is not here to replace lubridate in any way. The only idea would be that, in the end, clock might be a back-end for lubridate, in the same way that dtplyr and dbplyr are different types of back-ends for dplyr. And I'm not even going to spend the rest of this talk talking about features that overlap with lubridate. Instead, I want to talk about things that are pretty unique to clock.
One of those is safety, and one of those is calendars, because I only have five minutes. I'm going to do that with one date, January 30th of this year.
Safety in clock
Safety is built into clock from the ground up to hopefully avoid issues like time zone issues and invalid date issues, things that are pretty common when you're working with time series and just drive you up the wall. So let's jump into safety. Here's a timeline. This is January 30th, our date in question, marked in blue on our timeline. It continues through to February on the next line. You'll see this gap between February and March, because February only has 28 days, but January had 31, so it doesn't necessarily map one-to-one.
If I were to ask you this seemingly innocuous question, please add one month to this date, what would you get? Well, if we were to ask lubridate, it gives you a somewhat reasonable answer of NA. There is no February 30th, so nothing in February maps one-to-one from January 30th. And there's nothing particularly wrong with this, except for the fact that it's not the most useful answer. Generally, you'll be running this code, and it happens silently, and then five steps downstream, all of a sudden you discover there are some NAs here. I didn't have those to begin with. Where did those come from? And you have to backtrack up through your calculations to figure out why they appeared.
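A minimal sketch of the lubridate behavior described above. It assumes the date in question is 2022-01-30 (the talk was given in 2022) and that the lubridate package is installed; the variable names are illustrative only.

```r
library(lubridate)

# The date from the talk: January 30th, 2022
x <- as.Date("2022-01-30")

# February 30th does not exist, so adding a one-month period
# returns NA, with no error or warning
one_month_later <- x + months(1)
one_month_later
#> [1] NA
```

Because nothing is signalled here, the NA tends to surface only later in a pipeline, which is the debugging problem the talk describes.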
If you were to ask clock this question with add_months(), it actually gives you an error in this special case by default. It says, whoa, hold up, there's something wrong here. Go look at location one. If you had a vector, it might be location five, seven, whatever. And check out the invalid argument to learn more about this case. You go and you look at the documentation, and you come out with the idea that maybe I could set invalid = "previous". That allows you to say, give me the previous valid date when I have this kind of problem. That's the end of February. I think that's a pretty reasonable result in this case. But you also might want to say, depending on your specific problem, invalid = "next" to map forward to the beginning of March instead. If you actually do like the lubridate behavior, that's fine, you can say invalid = "NA", and any time that occurs, you get an NA instead.
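The options above can be sketched as follows. This assumes the date is 2022-01-30 and that the clock package is installed; the error is wrapped in try() only so that the script keeps running, and the variable names are illustrative.

```r
library(clock)

x <- as.Date("2022-01-30")

# Default behavior: an error, because 2022-02-30 is not a valid date
default_result <- try(add_months(x, 1), silent = TRUE)
inherits(default_result, "try-error")
#> [1] TRUE

# Resolve the invalid date explicitly
previous_valid <- add_months(x, 1, invalid = "previous")  # end of February
next_valid     <- add_months(x, 1, invalid = "next")      # start of March
as_missing     <- add_months(x, 1, invalid = "NA")        # lubridate-style NA

previous_valid
#> [1] "2022-02-28"
next_valid
#> [1] "2022-03-01"
as_missing
#> [1] NA
```

Note that the NA option is passed as the string "NA", not the missing value NA.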
Calendar types
So that's about safety. Let's talk about calendars. A calendar is just a way to represent a unique point in time. With our date in question, we could use a calendar called year-month-day to represent this date using three components: the year, the month, and the day of the month. But this isn't the only way you could represent this date. You could also use the year and the day of the year. Or you could use one of the many other calendar types that are built into clock. If you're a finance person, you might be particularly interested in year-quarter-day, which uses a true fiscal year to represent your date.
These are really nice because they're all convertible to each other. You can work with any particular calendar type and, say you need to get the quarter out, you convert to year-quarter-day, you do your manipulation over there, and you convert back. They're also convertible with Date and POSIXct, since those are the date-time types that you're most likely to start out with.
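A short sketch of the calendar representations mentioned above, assuming the date is 2022-01-30 and the clock package is installed. The February fiscal year start is a hypothetical choice made only for illustration.

```r
library(clock)

x <- as.Date("2022-01-30")

# Three calendar views of the same day
ymd <- as_year_month_day(x)    # year, month, day of the month
yd  <- as_year_day(x)          # year, day of the year
yqd <- as_year_quarter_day(x)  # year, quarter, day of the quarter

get_month(ymd)
#> [1] 1
get_day(yd)
#> [1] 30
get_quarter(yqd)
#> [1] 1

# A fiscal year that starts in February (hypothetical start month):
# January then falls in the last quarter of the fiscal year
fiscal <- as_year_quarter_day(x, start = clock_months$february)
get_quarter(fiscal)
#> [1] 4
```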
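The convert, manipulate, convert back workflow might look like the sketch below. It assumes the date is 2022-01-30, a fiscal year starting in January (clock's default), and the clock package being installed; adding one quarter is just an example manipulation.

```r
library(clock)

x <- as.Date("2022-01-30")

# Convert to the quarter-based calendar and pull the quarter out
yqd <- as_year_quarter_day(x)
quarter <- get_quarter(yqd)
quarter
#> [1] 1

# Manipulate in that calendar: move forward one quarter,
# keeping the same day of the quarter
next_quarter <- add_quarters(yqd, 1)

# Convert back to a base R Date
next_quarter_date <- as_date(next_quarter)
next_quarter_date
#> [1] "2022-04-30"
```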
The other thing that I find really fun about these calendar types is that they have what's known as variable precision. These are all day precision calendar types at this point, but we could narrow that down to month precision as needed, and you've got a built-in year-month type in clock. Similarly, you could have a built-in year-quarter type. You can actually go the other way, too. You can widen it out all the way to nanoseconds if you need it.
The last thing I'll say is that clock is completely compatible with some of the other packages I've created that you might be familiar with, called slider and ivs. slider is one for rolling averages, so you can use clock types as the index to say give me a rolling average looking back four or five quarters. ivs is a relatively new package, you might not have heard of this one yet, but it deals with date ranges, and you can use clock types as the components of those ranges.
So to sum up, lubridate is not going anywhere, don't worry, but please try clock for enhanced safety and these powerful new types. Thank you.
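Variable precision can be sketched with calendar_narrow() and calendar_widen(). This assumes the date is 2022-01-30 and the clock package is installed; the variable names are illustrative.

```r
library(clock)

ymd <- as_year_month_day(as.Date("2022-01-30"))

# Narrow day precision down to a year-month value
ym <- calendar_narrow(ymd, "month")
format(ym)
#> [1] "2022-01"

# Narrow a year-quarter-day value down to a year-quarter value
yq <- calendar_narrow(as_year_quarter_day(ymd), "quarter")
format(yq)
#> [1] "2022-Q1"

# Widen day precision out to nanoseconds; the new components
# (hour through nanosecond) are filled with their smallest values
ns <- calendar_widen(ymd, "nanosecond")
get_nanosecond(ns)
#> [1] 0
```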

