| Title: | pacta.portfolio.import |
|---|---|
| Description: | For more information visit <https://rmi.org/>. |
| Authors: | CJ Yetman [aut, cre, ctr] (ORCID: <https://orcid.org/0000-0001-5099-9500>), Jackson Hoffart [aut, ctr] (ORCID: <https://orcid.org/0000-0002-8600-5042>), Jacob Kastl [aut, ctr], Alex Axthelm [aut, ctr] (ORCID: <https://orcid.org/0000-0001-8579-8565>), RMI [cph, fnd] |
| Maintainer: | CJ Yetman |
| License: | MIT + file LICENSE |
| Version: | 0.0.2 |
| Built: | 2026-05-24 07:20:03 UTC |
| Source: | https://github.com/rmi-pacta/pacta.portfolio.import |
This function will return a named vector giving the names of the headers in the portfolio CSV that match the proper header names expected by pacta.portfolio.analysis. The name of each element will be the proper column name it matches to.
determine_headers(filepath)

| filepath | A character vector containing an absolute or relative path to a single portfolio CSV |
|---|---|
A named character vector containing the names of the headers in the portfolio CSV that match the proper header names expected by pacta.portfolio.analysis. The name of each element will be the proper column name it matches to.
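The shape of the returned value can be illustrated with a small base R sketch. The helper `match_headers_sketch()` and its matching rule (compare names after lower-casing and dropping non-alphanumeric characters) are assumptions made for illustration; the package's actual matching logic may differ.

```r
# Hypothetical helper: not part of pacta.portfolio.import.
# Assumed matching rule: compare names after lower-casing and removing
# every non-alphanumeric character.
match_headers_sketch <- function(found, expected) {
  normalise <- function(x) tolower(gsub("[^[:alnum:]]", "", x))
  idx <- match(normalise(expected), normalise(found))
  matched <- found[idx]
  names(matched) <- expected
  # Drop expected names that had no counterpart in the file
  matched[!is.na(matched)]
}

# Headers as they might appear in a portfolio CSV (made-up example)
found <- c("investor name", "PORTFOLIO_NAME", "isin", "Market Value", "currency")
expected <- c("Investor.Name", "Portfolio.Name", "ISIN", "MarketValue", "Currency")

headers <- match_headers_sketch(found, expected)
print(headers)
```

Each element's value is the header as written in the file, and its name is the proper column name it corresponds to, so `headers[["MarketValue"]]` gives the file's own spelling of that column.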
This function will return a data frame with numerous specifications for every
CSV file passed in the files argument.
get_csv_specs(files, expected_colnames = c("Investor.Name", "Portfolio.Name", "ISIN", "MarketValue", "Currency"))

| files | A character vector containing absolute or relative paths to portfolio CSVs, or a directory containing portfolio CSVs |
|---|---|
| expected_colnames | A character vector containing the names of the columns expected in the portfolio CSVs |
A data frame (invisibly) containing one row per portfolio CSV with columns for each identified specification
This function will guess the delimiter of a delimited file for a vector of
filenames or filepaths and return the delimiter as a string. It defaults to
the following delimiters in order if the others are not valid: ",", ";",
tab, "|", ":". If the file is inaccessible or binary, it will return NA
for that element. If you pass anything that is not a character vector or a
single column data.frame to the filepaths argument, this function will
give an error.
guess_delimiter(filepaths)

| filepaths | A character vector |
|---|---|
A character vector the same length as filepaths.
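One way to picture ordered delimiter guessing is the base R sketch below. The helper `guess_delimiter_sketch()` and its validity rule (a candidate must appear at least once, and equally often, on each of the first few lines) are assumptions for illustration, not the package's implementation.

```r
# Hypothetical helper: not the package's implementation.
# Assumed rule: a candidate is valid if it appears at least once and the
# same number of times on each of the first few lines.
guess_delimiter_sketch <- function(filepath,
                                   candidates = c(",", ";", "\t", "|", ":")) {
  if (!file.exists(filepath) || dir.exists(filepath)) return(NA_character_)
  lines <- readLines(filepath, n = 5L, warn = FALSE)
  if (length(lines) == 0L) return(NA_character_)
  for (delim in candidates) {
    # Count literal occurrences of the candidate on every line
    counts <- lengths(regmatches(lines, gregexpr(delim, lines, fixed = TRUE)))
    if (counts[1] > 0L && all(counts == counts[1])) return(delim)
  }
  NA_character_
}

# A semicolon-delimited file that uses "," as the decimal mark
tmp <- tempfile(fileext = ".csv")
writeLines(c("ISIN;MarketValue;Currency", "US0378331005;1000,50;USD"), tmp)
missing_file <- file.path(tempdir(), "no_such_file.csv")

result <- vapply(c(tmp, missing_file), guess_delimiter_sketch,
                 character(1), USE.NAMES = FALSE)
print(result)
```

The comma is rejected here because it does not occur on the header line, so the semicolon is chosen; the missing file yields `NA`, matching the behaviour described for inaccessible files.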
This function will guess the file encoding of a vector of filenames or
filepaths and return the file encoding as a string. It primarily uses
stringi::stri_enc_detect() to guess the encoding. Additionally, it
searches for known CP850 and CP1252 characters and will return the
appropriate encoding if found, because ICU/stringi cannot detect them. If a
file is a binary file, it will return "binary". If a file is inaccessible
it will return NA for that element.
guess_file_encoding(filepaths, threshold = 0.2)

| filepaths | A character vector |
|---|---|
| threshold | A single element numeric (minimum confidence level of the guess [0-1]) |
A character vector the same length as filepaths.
This function will guess the numerical marks in the market_value column of
a portfolio CSV. It will return a single character string containing the
guessed decimal or thousands grouping mark, depending on the value passed to
type, for each portfolio CSV passed in filepaths.
guess_numerical_mark(filepaths, type = "decimal")

| filepaths | A character vector |
|---|---|
| type | A single character string, either "decimal" or "grouping" |
A character vector the same length as filepaths containing a single
character string defining the guessed numerical mark for each portfolio CSV
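The difference between the two values of type can be sketched on raw strings. The helper `guess_mark_sketch()` and its heuristic (in a value containing both "." and ",", whichever comes last is the decimal mark) are assumptions for illustration; the package works on files and may use a different rule.

```r
# Hypothetical helper: works on raw strings, not on a portfolio CSV.
# Assumed heuristic: in a value that contains both "." and ",", the mark
# that appears last is the decimal mark and the other is the grouping mark.
guess_mark_sketch <- function(values, type = c("decimal", "grouping")) {
  type <- match.arg(type)
  last_pos <- function(x, mark) {
    # Position of the last occurrence of mark, or -1 if absent
    vapply(gregexpr(mark, x, fixed = TRUE),
           function(m) max(as.integer(m)), integer(1))
  }
  dot <- last_pos(values, ".")
  comma <- last_pos(values, ",")
  both <- dot > 0L & comma > 0L
  if (!any(both)) return(NA_character_)
  decimal <- if (sum(dot[both] > comma[both]) >= sum(both) / 2) "." else ","
  grouping <- if (decimal == ".") "," else "."
  if (type == "decimal") decimal else grouping
}

# Made-up market values written in a continental European style
values <- c("1.234,56", "12.000,00", "987,65")
dec <- guess_mark_sketch(values, type = "decimal")
grp <- guess_mark_sketch(values, type = "grouping")
print(c(decimal = dec, grouping = grp))
```

When no value contains both marks, this sketch returns `NA` rather than guessing, which is a design choice of the sketch only.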
This function checks that each element of a vector of filenames or filepaths is an accessible file, meaning that it exists, is a file, has read access, and is not empty. Dropbox files that are visible but not downloaded locally will be empty files, and they will not pass this validation.
is_file_accessible(filepaths)

| filepaths | A character vector |
|---|---|
A logical vector the same length as filepaths.
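The four conditions can be approximated with base R file utilities. The helper `is_file_accessible_sketch()` below is a hypothetical stand-in, not the package's code, and `file.access()` can be unreliable on some platforms.

```r
# Hypothetical helper combining the four checks with base R.
is_file_accessible_sketch <- function(filepaths) {
  exists <- file.exists(filepaths)
  is_file <- exists & !dir.exists(filepaths)
  readable <- file.access(filepaths, mode = 4) == 0  # mode 4 = read permission
  size <- file.size(filepaths)                        # NA if the path is missing
  non_empty <- !is.na(size) & size > 0
  unname(exists & is_file & readable & non_empty)
}

full <- tempfile(fileext = ".csv")
writeLines("ISIN,MarketValue,Currency", full)
empty <- tempfile(fileext = ".csv")
file.create(empty)  # zero-byte file, like a Dropbox placeholder
missing_file <- file.path(tempdir(), "no_such_file.csv")

accessible <- is_file_accessible_sketch(c(full, empty, tempdir(), missing_file))
print(accessible)
```

Only the first path passes: the second is empty, the third is a directory, and the fourth does not exist.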
This function validates that files in a vector of filenames or filepaths are
readable files and returns TRUE or FALSE for each one.
is_readable_file(filepaths)

| filepaths | A character vector |
|---|---|
A logical vector the same length as filepaths.
This function will guess if a file is a text file for a vector of filenames
or filepaths and return TRUE or FALSE for each. It guesses that a file is
text if it doesn't find any nul bytes in the first 2048 bytes of the file.
This is an imperfect guess, but it is very likely that a binary/non-text file
will have a nul byte near the beginning of the file. This might guess that a
file that is not intended to be read in as text is a text file, but at least
you will likely be able to read in the file as text without error. If the
file is inaccessible, either because it is empty, you don't have permission
to read it, it's a directory, or it doesn't exist, this function will return
FALSE. If you pass anything that is not a character vector or a single
column data.frame to the filepaths argument, this function will give an
error.
is_text_file(filepaths)

| filepaths | A character vector |
|---|---|
A logical vector the same length as filepaths.
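The nul-byte heuristic described above can be sketched in a few lines of base R. The helper `is_text_file_sketch()` is hypothetical and handles a single path; it is meant to show the idea, not to reproduce the package's implementation.

```r
# Hypothetical helper: a base R sketch of the nul-byte heuristic.
is_text_file_sketch <- function(filepath, n_bytes = 2048L) {
  size <- file.size(filepath)
  # Missing, empty, directory, or unreadable paths are never text
  if (is.na(size) || size == 0 || dir.exists(filepath) ||
      file.access(filepath, mode = 4) != 0) {
    return(FALSE)
  }
  bytes <- readBin(filepath, what = "raw", n = n_bytes)
  !any(bytes == as.raw(0))
}

text_file <- tempfile(fileext = ".csv")
writeLines("ISIN,MarketValue,Currency", text_file)

binary_file <- tempfile(fileext = ".bin")
writeBin(as.raw(c(0x50, 0x4b, 0x00, 0x03)), binary_file)  # contains a nul byte

is_text <- vapply(c(text_file, binary_file, tempdir()),
                  is_text_file_sketch, logical(1), USE.NAMES = FALSE)
print(is_text)
```

Reading only the first 2048 bytes keeps the check cheap even for large files, at the cost of missing nul bytes that appear later.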
This function checks whether each currency code in a vector is a valid code that exists in the ISO 4217 alpha code specification.
is_valid_currency_code(currency_codes)

| currency_codes | A character vector |
|---|---|
A logical vector the same length as currency_codes.
This function checks whether each CUSIP in a vector is a valid CUSIP code.
is_valid_cusip(cusips)

| cusips | A character or numeric vector (automatically collapses a single column data.frame to a vector) |
|---|---|
A logical vector the same length as cusips.
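For readers unfamiliar with CUSIPs, the sketch below applies the standard CUSIP check-digit scheme (double every second character value, sum the digits, take the complement modulo 10). The helper `is_valid_cusip_sketch()` is hypothetical, handles character input only, and is not confirmed to match the package's validation rules.

```r
# Hypothetical helper: the standard CUSIP check-digit scheme, which may
# differ in detail from the package's own validation.
is_valid_cusip_sketch <- function(cusips) {
  cusips <- toupper(as.character(cusips))
  well_formed <- !is.na(cusips) & grepl("^[0-9A-Z*@#]{8}[0-9]$", cusips)
  # Character values: 0-9 -> 0-9, A-Z -> 10-35, * -> 36, @ -> 37, # -> 38
  alphabet <- c(0:9, LETTERS, "*", "@", "#")
  check_one <- function(cusip) {
    chars <- strsplit(cusip, "")[[1]]
    values <- match(chars[1:8], alphabet) - 1L
    even <- seq_along(values) %% 2L == 0L
    values[even] <- values[even] * 2L
    total <- sum(values %/% 10L + values %% 10L)
    (10L - total %% 10L) %% 10L == as.integer(chars[9])
  }
  result <- rep(FALSE, length(cusips))
  result[well_formed] <- vapply(cusips[well_formed], check_one, logical(1))
  result
}

cusip_check <- is_valid_cusip_sketch(
  c("037833100", "594918104", "02079K305", "037833101", "ABC")
)
print(cusip_check)
```

The first three codes carry correct check digits; the fourth has its check digit altered and the fifth is not nine characters long.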
This function checks whether each ISIN in a vector is a valid code that
conforms to the ISO 6166 specification, returning TRUE or FALSE for each one.
It checks the basic structure (2 alpha characters, 9 alpha-numeric characters,
1 check digit) and also validates the check digit using the Luhn algorithm.
is_valid_isin(isins)

| isins | A character vector |
|---|---|
A logical vector the same length as isins.
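The two checks described above (structure, then Luhn check digit) can be sketched in base R as follows. The helper `is_valid_isin_sketch()` is a hypothetical illustration of the technique, not the package's source; it assumes the usual ISIN convention of expanding letters to two-digit numbers (A = 10 through Z = 35) before applying Luhn.

```r
# Hypothetical helper: structure check plus Luhn check digit.
is_valid_isin_sketch <- function(isins) {
  isins <- toupper(as.character(isins))
  # 2 letters, 9 alphanumerics, 1 check digit
  well_formed <- !is.na(isins) & grepl("^[A-Z]{2}[A-Z0-9]{9}[0-9]$", isins)
  luhn_ok <- function(isin) {
    chars <- strsplit(isin, "")[[1]]
    # Letters expand to two digits: A = 10, ..., Z = 35
    expanded <- paste(match(chars, c(0:9, LETTERS)) - 1L, collapse = "")
    digits <- rev(as.integer(strsplit(expanded, "")[[1]]))
    # Double every second digit, counting from the right
    second <- seq_along(digits) %% 2L == 0L
    digits[second] <- digits[second] * 2L
    sum(digits %/% 10L + digits %% 10L) %% 10L == 0L
  }
  result <- rep(FALSE, length(isins))
  result[well_formed] <- vapply(isins[well_formed], luhn_ok, logical(1))
  result
}

isin_check <- is_valid_isin_sketch(
  c("US0378331005", "GB0002634946", "US0378331006", "0378331005")
)
print(isin_check)
```

The third code differs from the first only in its check digit, so it passes the structure test but fails Luhn; the fourth lacks the two-letter prefix and fails the structure test.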
This function will read in one or more portfolio CSVs. It works around a number of common issues, like alternate column names, alternate delimiters, alternate decimal and grouping marks, file encodings besides ASCII or UTF-8, etc.
read_portfolio_csv(filepaths, combine = TRUE)

| filepaths | A character vector or single column data frame (strings should be valid file paths to CSV files or a directory that contains CSV files) |
|---|---|
| combine | A single element logical (default TRUE) |
If combine is TRUE, returns a tbl_df with all of the readable
data from the portfolio CSVs combined. If combine is FALSE, returns a
list of tbl_dfs, one for each readable portfolio CSV.
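The effect of combine can be pictured with a base R stand-in. The helper `read_csvs_sketch()` is hypothetical: it uses `read.csv()` and plain data frames instead of tbl_dfs, and it performs none of the workarounds described above.

```r
# Hypothetical stand-in: base R data frames instead of tbl_df, and none of
# the header, delimiter, or encoding workarounds.
read_csvs_sketch <- function(filepaths, combine = TRUE) {
  readable <- filepaths[file.exists(filepaths)]
  frames <- lapply(readable, read.csv, stringsAsFactors = FALSE)
  if (combine) do.call(rbind, frames) else frames
}

# Two made-up single-row portfolios
a <- tempfile(fileext = ".csv")
b <- tempfile(fileext = ".csv")
write.csv(data.frame(ISIN = "US0378331005", MarketValue = 1000, Currency = "USD"),
          a, row.names = FALSE)
write.csv(data.frame(ISIN = "GB0002634946", MarketValue = 2500, Currency = "GBP"),
          b, row.names = FALSE)
missing_file <- file.path(tempdir(), "no_such_file.csv")

combined <- read_csvs_sketch(c(a, b, missing_file), combine = TRUE)
separate <- read_csvs_sketch(c(a, b, missing_file), combine = FALSE)
print(combined)
print(length(separate))
```

With combine set to TRUE the readable files are stacked into one table; with FALSE each readable file keeps its own table, and the unreadable path is skipped in both cases.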