Title: | Download Datasets from the Swiss National Science Foundation (SNF, FNS, SNSF) |
---|---|
Description: | Download and read datasets from the Swiss National Science Foundation (SNF, FNS, SNSF; <https://snf.ch>). The package is lightweight and without dependencies. Downloaded data can optionally be cached, to avoid repeated downloads of the same files. There are also utilities for comparing different versions of datasets, i.e. to report added, removed and changed entries. |
Authors: | Silvia Martens [ctb], Enrico Schumann [aut, cre] |
Maintainer: | Enrico Schumann <[email protected]> |
License: | GPL-3 |
Version: | 0.1.1 |
Built: | 2024-11-04 05:34:18 UTC |
Source: | https://github.com/enricoschumann/snsfdatasets |
Download datasets from the Swiss National Science Foundation (SNF, FNS, SNSF) in CSV format.
    fetch_datasets(dataset, dest.dir = NULL, detect.dates = TRUE, ...)
    compare_datasets(filename.old, filename.new, match.column = "GrantNumber", ...)
    read_dataset(filename, detect.dates = TRUE, ...)
dataset |
a character vector of dataset names. If it has more than one element, the datasets are only downloaded, but not read. Currently supported are:
|
dest.dir |
a directory; if |
detect.dates |
logical: if |
filename.old |
string: the filename |
filename.new |
string: the filename |
filename |
string: the filename |
match.column |
string: the name of the column to use for matching entries in old and new file |
... |
arguments to be passed to |
fetch_datasets downloads datasets in CSV format from the SNSF's website and stores them, with a date prefix, in directory dest.dir. If the latter is NULL, a temporary directory is used (through tempdir), but it is much better to use a more persistent storage location. If a file with today's date exists in dest.dir, that file is read, and nothing is downloaded. If more than one dataset is specified, those files are downloaded (if not current in dest.dir) but not read.

For downloading, function download.file is used. If it fails, fetch_datasets returns NULL. Settings can be passed via ... . See download.file for options; in particular, see the hints about timeout.
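The caching rule described above can be sketched in base R. This is only an illustration, not the package's code: the helper names dated_filename and is_current are hypothetical, and the file-name pattern (date, two underscores, dataset name) is an assumption, since the exact prefix format is not stated here.

```r
## Hypothetical helper, NOT part of the package: builds a date-prefixed
## file name. The pattern "<YYYYMMDD>__<dataset>.csv" is an assumption.
dated_filename <- function(dataset, dest.dir, date = Sys.Date()) {
    file.path(dest.dir,
              paste0(format(date, "%Y%m%d"), "__", dataset, ".csv"))
}

## TRUE if a copy with today's date already exists in dest.dir,
## in which case nothing would need to be downloaded
is_current <- function(dataset, dest.dir) {
    file.exists(dated_filename(dataset, dest.dir))
}

## a fresh, empty directory standing in for a persistent location
dest.dir <- tempfile("SNSF-cache-sketch")
dir.create(dest.dir)

fresh <- is_current("OutputdataAward", dest.dir)   ## no file yet

## simulate a completed download by creating an empty placeholder file
file.create(dated_filename("OutputdataAward", dest.dir))
cached <- is_current("OutputdataAward", dest.dir)  ## file now present
```

With a persistent dest.dir, a second call on the same day would find the dated file and skip the download; a temporary directory loses that benefit once the R session ends.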
compare_datasets matches the old and new datasets via the specified match.column and reports added lines (in new, but not in old file), removed lines (in old, but not in new file), and changed lines (in both files, but with differing content).
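The matching logic can be illustrated with a few lines of base R. The function compare_sketch and the toy data below are hypothetical and are not the package's implementation; the sketch also assumes that both data frames have the same columns and that it is the new version of a changed line that gets reported.

```r
## Toy data: grant 1 was removed, grant 4 was added, grant 3 changed
old <- data.frame(GrantNumber = c(1, 2, 3),
                  Title = c("A", "B", "C"),
                  stringsAsFactors = FALSE)
new <- data.frame(GrantNumber = c(2, 3, 4),
                  Title = c("B", "C2", "D"),
                  stringsAsFactors = FALSE)

## Hypothetical sketch of a key-based comparison
compare_sketch <- function(old, new, match.column = "GrantNumber") {
    key.old <- old[[match.column]]
    key.new <- new[[match.column]]

    added   <- new[!key.new %in% key.old, , drop = FALSE]
    removed <- old[!key.old %in% key.new, , drop = FALSE]

    ## rows present in both files, aligned by key
    common <- intersect(key.old, key.new)
    o <- old[match(common, key.old), , drop = FALSE]
    n <- new[match(common, key.new), , drop = FALSE]

    ## collapse each row into one string and compare the strings
    differs <- do.call(paste, c(o, sep = "\r")) !=
               do.call(paste, c(n, sep = "\r"))

    list(added = added,
         removed = removed,
         changed = n[differs, , drop = FALSE])
}

res <- compare_sketch(old, new)
```

Here res$added holds grant 4, res$removed holds grant 1, and res$changed holds the new version of grant 3.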
read_dataset is a simple wrapper of read.table with appropriate settings.
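What such a wrapper might look like is sketched below. The function read_sketch is hypothetical, and the settings shown (semicolon as separator, double quotes, no comment character) are assumptions chosen for the toy file; the settings that read_dataset actually uses are not listed here.

```r
## Toy file standing in for a downloaded dataset
csv <- tempfile(fileext = ".csv")
writeLines(c('"GrantNumber";"Title"',
             '1;"Study A"',
             '2;"Study B; part 2"'),
           csv)

## Hypothetical wrapper: the separator and all other settings
## are assumptions, not the package's documented defaults
read_sketch <- function(filename, ...) {
    read.table(filename,
               header = TRUE,
               sep = ";",
               quote = "\"",
               comment.char = "",
               stringsAsFactors = FALSE,
               check.names = FALSE,
               ...)
}

d <- read_sketch(csv)
str(d)
```

Fixing quote and comment.char matters for free-text columns such as titles, which may contain separators or hash signs.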
A data.frame for fetch_datasets and read_dataset. For compare_datasets, a list of three components named added, removed and changed.
Silvia Martens and Enrico Schumann
download.file; options (timeout, in particular)
    ## requires internet connection, and file may be large
    dataset <- "OutputdataAward"
    SNSF.dir <- tempdir()
    ## This is just an example.
    ## In practice it's much more useful to
    ## store files in a persistent location,
    ## such as '~/Downloads/SNSFdatasets'.
    data <- fetch_datasets(dataset = dataset, dest.dir = SNSF.dir)

    ## all award titles
    table(data[["Award_Title"]])