This section provides an overview of the preparatory steps that need to be taken before running the PACTA for Banks analysis. It includes information on the required input data sets and the required software.
The PACTA for Banks analysis requires a number of input data sets to run. Some of these can be obtained from external sources, while others need to be prepared by the user. Furthermore, some of the input data sets are optional.
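The main input data sets required for the analysis are the asset-based company data (ABCD), the scenario data, and the raw loan book. A list of misclassified loans and a manual sector classification can optionally be provided as well.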
The asset-based company data (ABCD) set provides information on the production profiles and emission intensities of companies active in the following real-economy sectors: automotive (light-duty vehicle) manufacturing, aviation, cement production, coal mining, upstream oil & gas extraction, power generation, and steel production. The ABCD is typically obtained from third-party data providers. However, it is possible to prepare the ABCD yourself or to complement an external data set with entries that it does not cover out of the box.
The ABCD data set must be an XLSX file and contains the following columns:
company_id: <character>
name_company: <character>
lei: <character>
sector: <character>
technology: <character>
production_unit: <character>
year: <integer>
production: <numeric>
emission_factor: <numeric>
plant_location: <character>
is_ultimate_owner: <logical>
emission_factor_unit: <character>
For more detail about the necessary structure of this dataset, see the data dictionary for abcd.
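Before running the analysis, it can help to confirm that a prepared ABCD table has the expected columns and types. The following is a minimal sketch in base R; `check_columns()` is a hypothetical helper written for this example (it is not part of {pacta.loanbook}), and the single data row is made up for illustration.

```r
# Expected ABCD columns and the R type each one should have.
abcd_spec <- c(
  company_id = "character", name_company = "character", lei = "character",
  sector = "character", technology = "character", production_unit = "character",
  year = "integer", production = "numeric", emission_factor = "numeric",
  plant_location = "character", is_ultimate_owner = "logical",
  emission_factor_unit = "character"
)

# Hypothetical helper: returns a character vector of problems (empty if none).
check_columns <- function(data, spec) {
  problems <- character(0)
  missing_cols <- setdiff(names(spec), names(data))
  if (length(missing_cols) > 0) {
    problems <- c(problems, paste("missing column:", missing_cols))
  }
  for (col in intersect(names(spec), names(data))) {
    # Note: an integer column also passes a "numeric" requirement.
    ok <- switch(spec[[col]],
      character = is.character(data[[col]]),
      integer   = is.integer(data[[col]]),
      numeric   = is.numeric(data[[col]]),
      logical   = is.logical(data[[col]])
    )
    if (!ok) {
      problems <- c(problems, paste0("column ", col, " is not <", spec[[col]], ">"))
    }
  }
  problems
}

# One made-up row, for illustration only.
abcd_example <- data.frame(
  company_id = "C001", name_company = "Example Power Co", lei = NA_character_,
  sector = "power", technology = "coalcap", production_unit = "MW",
  year = 2025L, production = 120.5, emission_factor = 0.9,
  plant_location = "DE", is_ultimate_owner = TRUE,
  emission_factor_unit = "tonnes of CO2 per hour per MW",
  stringsAsFactors = FALSE
)

problems <- check_columns(abcd_example, abcd_spec)
```

An empty `problems` vector means every expected column is present with the expected type.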
While PACTA is data agnostic and allows using data from any provider that offers the appropriate format, one option to obtain ABCD for this analysis is to buy the data from the data provider Asset Impact. Further information on how to obtain ABCD for PACTA and documentation of the individual sectors and data points can be found on the Asset Impact website.
The scenario data set provides information on the trajectories of technologies/fuel types and of emission intensity pathways for each of (or a subset of) the sectors covered in PACTA.
For sectors with technology level trajectories, the data set provides the TMSR and SMSP pathways based on the Market Share Approach, an allocation rule that implies all companies active in a sector have to adjust their production in a way that keeps market shares constant and solves for the aggregate climate transition scenario. For more information on how to calculate the TMSR and the SMSP, see the PACTA for Banks documentation.
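The allocation rule can be illustrated with a small worked example. This is a minimal sketch with made-up numbers, assuming that the TMSR is a technology's scenario production relative to its start-year value and that the SMSP is the change in a technology's scenario production relative to the sector's total start-year production. The PACTA for Banks documentation remains the authoritative source for both definitions, and all variable names below are hypothetical.

```r
# Made-up scenario values for one technology in one sector.
years <- c(2025, 2030)
scenario_technology <- c(100, 80)  # scenario production of the technology
scenario_sector_start <- 400       # scenario production of the whole sector in 2025

# Assumed definitions (see the lead-in):
# TMSR: technology production relative to its own start-year value.
tmsr <- scenario_technology / scenario_technology[1]
# SMSP: change in technology production relative to start-year sector production.
smsp <- (scenario_technology - scenario_technology[1]) / scenario_sector_start

# Made-up company values in the start year.
company_technology_start <- 10
company_sector_start <- 50

# Company-level production targets implied by each pathway.
target_tmsr <- company_technology_start * tmsr
target_smsp <- company_technology_start + company_sector_start * smsp

data.frame(year = years, tmsr, smsp, target_tmsr, target_smsp)
```

Under these assumptions, the company's 2030 target for the technology is 8 units when the TMSR is applied and 7.5 units when the SMSP is applied.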
The target market share scenario data set must be a CSV file and contains the following columns:
scenario: <character>
sector: <character>
technology: <character>
region: <character>
year: <integer>
tmsr: <numeric>
smsp: <numeric>
scenario_source: <character>
For more detail about the necessary structure of this dataset, see the data dictionary for scenario.
For sectors that do not have technology-level pathways, PACTA uses the Sectoral Decarbonization Approach (SDA), an allocation rule that implies that all companies in a sector have to converge their physical emission intensity to a future scenario value, e.g. in the year 2050. This implies that more polluting companies have to reduce their physical emission intensity more drastically than companies using cleaner technology. It does not have any direct implications for the number of units produced by any company. For further information on calculating the SDA in PACTA, please see the PACTA for Banks documentation.
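The convergence idea can be sketched numerically. The example below is a simplified illustration with made-up numbers. It assumes that each company's gap to the scenario's end-year intensity shrinks in proportion to the scenario's own progress towards that end-year value. This is not necessarily the exact calculation used in PACTA, for which the PACTA for Banks documentation is the reference. All variable names are hypothetical.

```r
# Made-up sector scenario: emission intensity per unit of output.
years <- c(2025, 2030, 2050)
scenario_intensity <- c(0.60, 0.50, 0.10)
end_value <- scenario_intensity[length(scenario_intensity)]

# Share of the scenario's total reduction that is still outstanding in each year
# (1 in the start year, 0 in the convergence year).
remaining <- (scenario_intensity - end_value) / (scenario_intensity[1] - end_value)

# Hypothetical helper: target pathway for a company with a given start intensity.
sda_target <- function(company_start) {
  end_value + (company_start - end_value) * remaining
}

target_high_emitter <- sda_target(0.80)  # more polluting company
target_low_emitter  <- sda_target(0.40)  # cleaner company

data.frame(year = years, target_high_emitter, target_low_emitter)
```

Both pathways end at the same 2050 value of 0.10, so the more polluting company has to cut 0.70 per unit of output while the cleaner one cuts 0.30. Production volumes do not enter the calculation.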
The SDA scenario data set must be a CSV file and contains the following columns:
scenario_source: <character>
scenario: <character>
sector: <character>
region: <character>
year: <numeric>
emission_factor_unit: <character>
emission_factor: <numeric>
For more detail about the necessary structure of this dataset, see the data dictionary for scenario.
While the raw input values of the scenarios are based on models from external third party organisations - such as the World Energy Outlook by the International Energy Agency (IEA), the Global Energy and Climate Outlook by the Joint Research Center of the European Commission (JRC), or the One Earth Climate Model by the Institute for Sustainable Futures (ISF) - the input data set for PACTA must be prepared using additional steps, which are documented publicly on the following GitHub repositories:
Since RMI has taken over stewardship of PACTA, the prepared scenario files can also be accessed as CSV downloads in the “Methodology and Supporting Documents” section of the PACTA website. The files are usually updated annually based on the latest scenario publications and as a general rule, the year of the publication defines the initial year of the scenario data set. This is commonly also used as the start year of the analysis.
The raw loan book is the financial data set that you would like to analyze. It contains information on the loans that you have provided to companies. As a bank, the data required will be available in your internal systems.
The raw loan book must be prepared as CSV files and contain at a minimum the following columns:
id_loan: <character>
id_direct_loantaker: <character>
name_direct_loantaker: <character>
id_ultimate_parent: <character>
name_ultimate_parent: <character>
loan_size_outstanding: <numeric>
loan_size_outstanding_currency: <character>
loan_size_credit_limit: <numeric>
loan_size_credit_limit_currency: <character>
sector_classification_system: <character>
sector_classification_direct_loantaker: <character>
lei_direct_loantaker: <character>
isin_direct_loantaker: <character>
For more detail about the necessary structure of this dataset, see the data dictionary for loanbook.
For detailed descriptions of how to prepare raw loan books, see the “Training Materials” section of the PACTA for Banks documentation. The “User Guide 2”, the “Data Dictionary”, and the “Loan Book Template” files can all be helpful in preparing your data.
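As a starting point, the following minimal sketch in base R builds a one-row loan book with the minimum columns, writes it to a CSV file, and reads it back. The loan, the company names, the classification code, and the file name are all made up for illustration.

```r
# Minimum set of columns for a raw loan book.
loanbook_columns <- c(
  "id_loan", "id_direct_loantaker", "name_direct_loantaker",
  "id_ultimate_parent", "name_ultimate_parent",
  "loan_size_outstanding", "loan_size_outstanding_currency",
  "loan_size_credit_limit", "loan_size_credit_limit_currency",
  "sector_classification_system", "sector_classification_direct_loantaker",
  "lei_direct_loantaker", "isin_direct_loantaker"
)

# One made-up loan, for illustration only.
loanbook <- data.frame(
  id_loan = "L1",
  id_direct_loantaker = "DL1",
  name_direct_loantaker = "Example Steel GmbH",
  id_ultimate_parent = "UP1",
  name_ultimate_parent = "Example Holding AG",
  loan_size_outstanding = 1000000,
  loan_size_outstanding_currency = "EUR",
  loan_size_credit_limit = 1500000,
  loan_size_credit_limit_currency = "EUR",
  sector_classification_system = "NACE",
  sector_classification_direct_loantaker = "2410",  # hypothetical code
  lei_direct_loantaker = NA_character_,
  isin_direct_loantaker = NA_character_,
  stringsAsFactors = FALSE
)

# Stop early if any required column is missing.
stopifnot(length(setdiff(loanbook_columns, names(loanbook))) == 0)

# Write the CSV and read it back, keeping identifiers and codes as text.
path <- file.path(tempdir(), "loanbook_example.csv")
write.csv(loanbook, path, row.names = FALSE, na = "")
loanbook_read <- read.csv(
  path,
  colClasses = c(
    id_loan = "character",
    sector_classification_direct_loantaker = "character"
  )
)
```

Reading the classification code as text rather than as a number avoids losing leading zeros or other formatting that some classification systems rely on.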
The misclassified loans CSV file should have just one column, id_loan, and therefore be structured as follows:
id_loan: <character>
The user can provide a list of loans that have been misclassified in the raw loan book. The aim here is specifically to remove false positives, that is, loans that are classified in scope of one of the PACTA sectors, but where manual research shows that the companies do not actually operate within the PACTA scope. Such a false positive may be due to erroneous data entry in the raw loan book, for example. Removing these loans from the falsely indicated sector in the calculation of the match success rate will give a more accurate picture of what match success rate can really be reached.
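The sketch below shows what such a file can look like and illustrates the idea of excluding the flagged loans. It uses base R and made-up loan identifiers. It illustrates the concept only and is not how {pacta.loanbook} processes the file internally.

```r
# Made-up loan book excerpt: three loans flagged as in scope of a PACTA sector.
loanbook <- data.frame(
  id_loan = c("L1", "L2", "L3"),
  name_direct_loantaker = c("Example Power Co", "Example Bakery Ltd", "Example Steel GmbH"),
  stringsAsFactors = FALSE
)

# Manual research shows that loan L2 is a false positive.
misclassified <- data.frame(id_loan = "L2", stringsAsFactors = FALSE)

# Write the single-column CSV file and read it back as text.
path <- file.path(tempdir(), "loans_misclassified_example.csv")
write.csv(misclassified, path, row.names = FALSE)
misclassified_read <- read.csv(path, colClasses = "character")

# Loans that remain in scope once the false positives are set aside.
in_scope <- loanbook[!loanbook$id_loan %in% misclassified_read$id_loan, ]
```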
The manual sector classification data set must be prepared as a CSV file and contain the following columns:
sector: <character>
borderline: <logical>
code: <character>
code_system: <character>
In case the user cannot obtain sector classification codes from any of the classification systems featured in sector_classifications (currently GICS, ISIC, NACE, NAICS, PSIC, and SIC), the user can instead provide a manually created sector classification file for matching the loan book to the ABCD. Any such manually prepared sector classification file must follow the format of sector_classifications. It is recommended to use the built-in sector classifications if possible, as mapping your own sector classification to the PACTA sectors can be complex and time-consuming.
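A manually prepared file with these four columns could be created as in the following minimal sketch in base R. The codes, the code system name, and the file name are hypothetical placeholders for a bank's internal classification.

```r
# Made-up internal classification: each internal code is mapped to a sector.
manual_classification <- data.frame(
  sector = c("power", "not in scope"),
  borderline = c(FALSE, FALSE),
  code = c("INT-001", "INT-002"),  # hypothetical internal codes
  code_system = "INTERNAL",        # hypothetical system name
  stringsAsFactors = FALSE
)

# Check the column names and types against the required structure.
stopifnot(
  identical(names(manual_classification), c("sector", "borderline", "code", "code_system")),
  is.character(manual_classification$sector),
  is.logical(manual_classification$borderline),
  is.character(manual_classification$code),
  is.character(manual_classification$code_system)
)

path <- file.path(tempdir(), "manual_sector_classification_example.csv")
write.csv(manual_classification, path, row.names = FALSE)
```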
Using the {pacta.loanbook} package for the PACTA for Banks analysis requires the following software to be installed on your system:
R is the programming language that the {pacta.loanbook} package is written in. You can download R from the Comprehensive R Archive Network (CRAN).
RStudio is an integrated development environment (IDE) for R developed by Posit. It is not strictly required to run the analysis, but it can be helpful for managing your project and running the analysis. Generally, RStudio is very widely used among the R community and probably the easiest way to interact with most R tools, such as the {pacta.loanbook} suite of packages. RStudio Desktop is an open source tool and free of charge. You can download RStudio from the Posit RStudio website.
{pacta.loanbook} package
The {pacta.loanbook} R package is the main software tool that you will use to run the PACTA for Banks analysis.
You can install the {pacta.loanbook} R package from any CRAN mirror by running the following command in R:
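A minimal sketch of the CRAN installation, using base R's `install.packages()` with the package name as published on CRAN:

```r
# Install the released version of {pacta.loanbook} from the configured CRAN mirror.
install.packages("pacta.loanbook")
```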
You can install the development version of the {pacta.loanbook} R package from GitHub with:
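A sketch of the GitHub installation, assuming the package source lives in the `RMI-PACTA/pacta.loanbook` repository (the repository path is an assumption, so check it on GitHub before running):

```r
# install.packages("pak")  # run once if {pak} is not installed yet

# Install the development version from GitHub (repository path assumed, see above).
pak::pak("RMI-PACTA/pacta.loanbook")
```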
We use the pak package as a simple tool to install packages from GitHub.
Note that if you choose to install the {pacta.loanbook} R package from GitHub, you will need to have git installed locally. You can find more information on how to do this using the following resources:
If you only plan to use GitHub to install this package or other packages as shown above, you will not have to have a deep understanding of all the git commands, so there is no need to be overwhelmed by the complexity of git.
The {pacta.loanbook} R package depends on a number of other R packages. These dependencies will be installed automatically when you install the {pacta.loanbook} R package. The required packages are: {cli}, {data.table}, {dplyr}, {ggplot2}, {ggrepel}, {glue}, {lifecycle}, {magrittr}, {purrr}, {r2dii.analysis}, {r2dii.data}, {r2dii.match}, {r2dii.plot}, {rlang}, {rstudioapi}, {scales}, {stringdist}, {stringi}, {stringr}, {tibble}, {tidyr}, {tidyselect}, {zoo}
{pacta.loanbook} R package?
The most common ways to install R packages are via CRAN or GitHub. Public institutions often have restrictions on the installation of packages from GitHub, so you may need to install the package from CRAN. In some cases, your institution may mirror CRAN in its internal application registry, so you may need to install the package from there. Should you have any issues with the installation from the internal application registry, it is best to reach out to your IT department. If you cannot obtain the package in any of these ways, please reach out to the package maintainers directly to explore other options.
In principle, all dependencies required to run the {pacta.loanbook} R package will be installed automatically when you install the package. However, if you encounter any issues with the installation of the required packages, you can install them manually by running the following command in R, where ... should be replaced with the package names from the list above, separated by commas:
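With the placeholder filled in from the dependency list above, a manual installation could look like the following sketch. It goes one step beyond a plain `install.packages()` call by first checking which packages are already available, so that only the missing ones are installed. The variable names are hypothetical.

```r
# Dependencies of {pacta.loanbook}, taken from the list above.
dependencies <- c(
  "cli", "data.table", "dplyr", "ggplot2", "ggrepel", "glue", "lifecycle",
  "magrittr", "purrr", "r2dii.analysis", "r2dii.data", "r2dii.match",
  "r2dii.plot", "rlang", "rstudioapi", "scales", "stringdist", "stringi",
  "stringr", "tibble", "tidyr", "tidyselect", "zoo"
)

# Keep only the packages that cannot be loaded yet.
is_available <- vapply(dependencies, requireNamespace, logical(1), quietly = TRUE)
to_install <- dependencies[!is_available]

# Install whatever is missing from the configured CRAN mirror.
if (length(to_install) > 0) {
  install.packages(to_install)
}
```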
Before running the PACTA for Banks analysis, you should make sure that you have completed the following preparatory steps: