Convert date and time strings into standard ISO 8601 formatted strings, with output temporal resolution matching inputs, and full support of timezone offsets.
NOTE: This function does not convert to all ISO 8601 formats. Currently supported formats include calendar dates, times, time zones, and valid combinations of these. Week dates, ordinal dates, and time intervals are not yet supported.
iso8601_convert(x, orders, tz = NULL, truncated = 0, exact = FALSE, train = TRUE, drop = FALSE, return.format = FALSE)
x | (character) A vector of date and time, date, or time strings. |
---|---|
orders | (character) From lubridate 1.7.4.90000 documentation: A vector of date-time formats. Each order string is a series of formatting characters as listed in `base::strptime()` but might not include the "%" prefix. For example, "ymd" will match all the possible dates in year, month, day order. Formatting orders might include arbitrary separators. These are discarded. See the `lubridate::parse_date_time` documentation for the list of recognized formats. |
tz | (character) Time zone offset with respect to UTC. Acceptable formats are: +hh, -hh, +hh:mm, -hh:mm. Use "Z" to denote UTC. An invalid time zone offset raises an error. Acceptable time zone offsets are listed at https://en.wikipedia.org/wiki/List_of_tz_database_time_zones. NOTE: This argument differs from the `tz` argument of `lubridate::parse_date_time`. |
truncated | (integer) From lubridate 1.7.4.90000 documentation: Number of formats that can be missing. The most common type of irregularity in date-time data is the truncation due to rounding or unavailability of the time stamp. If the `truncated` parameter is non-zero, then truncated formats are also checked. For example, if the format order is "ymdHMS" and `truncated = 3`, then incomplete date-times like `2012-06-01 12:23`, `2012-06-01 12`, and `2012-06-01` are parsed. The above definition was slightly modified from lubridate 1.7.4.90000 documentation. |
exact | (logical) From lubridate 1.7.4.90000 documentation: If `TRUE`, the `orders` parameter is interpreted as an exact `base::strptime()` format and no training or guessing are performed (i.e. `train`, `drop` parameters are irrelevant). The above definition was copied directly from lubridate 1.7.4.90000 documentation. |
train | (logical) From lubridate 1.7.4.90000 documentation: Whether to train formats on a subset of the input vector. The result of this is that supplied orders are sorted according to performance on this training set, which commonly results in increased performance. Please note that even when `train = FALSE` (and `exact = FALSE`) guessing of the actual formats is still performed on a pseudo-random subset of the original input vector. This might result in an `All formats failed to parse` error. The above definition was copied directly from lubridate 1.7.4.90000 documentation. |
drop | (logical) From lubridate 1.7.4.90000 documentation: Whether to drop formats that didn't match on the training set. If `FALSE`, formats unmatched on the training set are tried as a last resort at the end of the parsing queue. Applies only when `train = TRUE`. Setting this parameter to `TRUE` might slightly speed up parsing in situations involving many formats. Prior to v1.7.0 this parameter was implicitly `TRUE`, which resulted in occasional surprising behavior when rare patterns were not present in the training set. The above definition was copied directly from lubridate 1.7.4.90000 documentation. |
return.format | (logical) Should format specifiers be returned with the output data? This argument supports identification of where differences in output temporal resolution occur. |
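
The offset formats accepted by `tz` (+hh, -hh, +hh:mm, -hh:mm, or "Z") can be illustrated with a small base R check. This is only a sketch: `is_offset_shaped` is a hypothetical helper, not part of the package, and it tests the shape of the string only. A well-shaped string can still be rejected by `iso8601_convert` if it is not a real UTC offset (for instance `'+18'` in the examples).

```r
# Hypothetical helper (NOT part of the package): checks only the *shape* of a
# UTC offset string, i.e. "Z", +hh, -hh, +hh:mm, or -hh:mm.
# It does not check that the offset is actually in use anywhere.
is_offset_shaped <- function(tz) {
  grepl("^(Z|[+-][0-9]{2}(:[0-9]{2})?)$", tz)
}

is_offset_shaped(c("Z", "-05", "+05:30", "0500", "+5", "-05:3"))
#> [1]  TRUE  TRUE  TRUE FALSE FALSE FALSE
```

A shape check like this is a necessary condition at best; the function's own validation remains the authority on which offsets are accepted.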
(character) A vector of dates and times in the ISO 8601 standard in the temporal resolution of the input date and time strings. The ISO 8601 standard format output by this function is a combination of calendar dates, times, time zone offsets, and valid combinations of these.
(data frame) If `return.format` is `TRUE` then a data frame is returned containing the input data, converted data, and formats of the converted data. This supports identification of where differences in output temporal resolution occur.
`iso8601_convert` leverages the power of `lubridate::parse_date_time` to parse dates and times, then uses regular expressions on the `orders` argument to identify the temporal resolution of the input data, and then outputs the converted data in this same resolution. Most of the arguments available to `lubridate::parse_date_time` can be used with `iso8601_convert`.
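
The idea of reading the temporal resolution off the `orders` string can be sketched in base R. This is an illustration under stated assumptions, not the package's actual implementation: `order_to_iso_format` is a hypothetical helper that looks for the letters y/Y, m/b, d, H, M, and S in a single order string and builds a matching `format()` template.

```r
# Hypothetical sketch (base R only), NOT the package's actual code:
# derive an ISO 8601 output template from the letters present in one order string.
order_to_iso_format <- function(order) {
  has <- function(ch) grepl(ch, order, fixed = TRUE)  # case-sensitive lookup

  # Date part: year only, year-month, or full calendar date
  date_part <- if (has("y") || has("Y")) {
    if (has("d")) "%Y-%m-%d" else if (has("m") || has("b")) "%Y-%m" else "%Y"
  } else ""

  # Time part: hours, hours:minutes, or hours:minutes:seconds
  time_part <- if (has("H")) {
    if (has("S")) "%H:%M:%S" else if (has("M")) "%H:%M" else "%H"
  } else ""

  # ISO 8601 joins a date and a time with a literal "T"
  sep <- if (nzchar(date_part) && nzchar(time_part)) "T" else ""
  paste0(date_part, sep, time_part)
}

parsed <- as.POSIXct("2012-05-01 13:29:54", tz = "UTC")
format(parsed, order_to_iso_format("ymd_HMS"))
#> [1] "2012-05-01T13:29:54"
format(parsed, order_to_iso_format("ymd_HM"))
#> [1] "2012-05-01T13:29"
format(parsed, order_to_iso_format("dmy_H"))
#> [1] "2012-05-01T13"
```

The same parsed value is printed at three resolutions because the template, not the parsed date-time, carries the resolution; this is why inputs matched by different orders can yield outputs of differing precision.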
# Convert dates and times of varying temporal resolution
iso8601_convert(x = '2012', orders = 'y')
#> Warning: Some data failed to parse. Consider updating your list of orders.
#> [1] NA
iso8601_convert(x = '01/05/2012', orders = 'dmy')
#> [1] "2012-05-01"
iso8601_convert(x = '01-May-2012', orders = 'dby')
#> [1] "2012-05-01"
iso8601_convert(x = '132954', orders = 'HMS')
#> [1] "13:29:54"
iso8601_convert(x = '132954', orders = 'HMS', tz = '-05')
#> [1] "13:29:54-05"
iso8601_convert(x = '20120501 132954', orders = 'Ymd HMS', tz = '-05')
#> [1] "2012-05-01T13:29:54-05"

# Some variation of input format is supported as long as orders are defined.
# NOTE: Output resolution matches input resolution.
iso8601_convert(x = c('2012-05-01 13:29:54', '2012-05-01 13:29', '1/5/2012 13'), orders = c('ymd_HMS', 'ymd_HM', 'dmy_H'))
#> Warning: Converted data contains multiple levels of temporal resolution. Use the argument "return.format = T" to see where.
#> [1] "2012-05-01T13:29:54" "2012-05-01T13:29" "2012-05-01T13"

# Force output resolution to be the same
iso8601_convert(x = c('2012-05-01 13:29:54', '2012-05-01 13:29', '1/5/2012 13'), orders = c('ymd_HMS', 'ymd_HMS', 'dmy_HMS'), truncated = 3)
#> [1] "2012-05-01T13:29:54" "2005-12-20T01:13:29" "2012-05-01T13:00:00"

if (FALSE) {
# Invalid time zones result in an error
iso8601_convert(x = '2012-05-01 13:29', orders = 'ymd HM', tz = '+18')
}