Title: | Synthetic Experience Tracking Insurance Claims |
---|---|
Description: | Creation of an individual claims simulator which generates various features of non-life insurance claims. An initial set of test parameters, designed to mirror the experience of an Auto Liability portfolio, were set up and applied by default to generate a realistic test data set of individual claims (see vignette). The simulated data set then allows practitioners to back-test the validity of various reserving models and to prove and/or disprove certain actuarial assumptions made in claims modelling. The distributional assumptions used to generate this data set can be easily modified by users to match their experiences. Reference: Avanzi B, Taylor G, Wang M, Wong B (2020) "SynthETIC: an individual insurance claim simulator with feature control" <arXiv:2008.05693>. |
Authors: | Benjamin Avanzi [aut], William Ho [aut], Greg Taylor [aut], Melantha Wang [aut, cre], Bernard Wong [aut] |
Maintainer: | Melantha Wang <[email protected]> |
License: | GPL-3 |
Version: | 1.1.0 |
Built: | 2025-02-21 03:45:53 UTC |
Source: | https://github.com/agi-lab/synthetic |
Performs assertions on inputted relativities, will raise an error if any checks
fail. Currently checks on the: necessary column names used in dataframe,
required factors and levels based on inputted factors
, inputted relativities
being non-negative numbers.
check_relativity(factors, relativity)
check_relativity(factors, relativity)
factors |
named list of vectors, containing the name of the covariates and associated factors in vector form. |
relativity |
relativity dataframe as defined in |
Simulates and returns the closure/settlement delays of each of the claims occurring in each of the periods, assuming a Weibull distribution.
claim_closure(frequency_vector, claim_size_list, rfun, paramfun, ...)
claim_closure(frequency_vector, claim_size_list, rfun, paramfun, ...)
frequency_vector |
a vector of claim frequencies for all the periods. |
claim_size_list |
list of claim sizes. |
rfun |
optional alternative random sampling function; if not specified, assumes Weibull. |
paramfun |
parameters for the random sampling function, as a function of
|
... |
other arguments/parameters to be passed onto |
Claim settlement delay represents the delay from claim notification to closure. The epoch of closure is the sum of occurrence time, notification delay and settlement delay.
By default, it is assumed that the settlement delay follows a Weibull
distribution. The default Weibull parameters have been set up such that
the mean settlement delay (in quarters, but automatically converted to the
relevant time_unit
as defined in set_parameters
) is
porportional to
(where
ref_claim
is a packagewise-global variable that user is required to define
by set_parameters
) up to a scaling factor "", which is
dependent on
occurrence_perid
. Specifically,
if
claim_size
< (0.10 * ref_claim) and occurrence_period
21, and
otherwise. The CoV of the settlement delay is constant at 60%, independent of the size and occurrence period of the claim.
Note that this function can create out-of-bound settlement dates. In these
cases, the simulated epoch of occurrence of the transaction is maintained
throughout the simulation of details of the claim concerned. Adjustments will
only be made for the tabulation of results in claim_output
and
payment inflation.
Of course, like any other SynthETIC
modules, the user may wish to
sample from a different distribution rfun
and/or a different set of
parameters. The paramfun
should return the distribution parameters in a
vector, e.g. for gamma distribution paramfun
should return a value in
the format of c(shape = , scale = )
, for exponential distribution this
should return c(rate = )
. See Examples. If a rfun
is specified without
a paramfun
, SynthETIC
will try to proceed without parameterisation (i.e.
directly calling rfun
with claim size and occurrence period), and if it
fails, then return an error message.
A list of settlement delays such that the th component of
the list gives the settlement delays for all claims that occurred in period
.
n_vector <- c(90, 79, 102, 78, 86, 88, 116, 84, 93, 104) # Try a constant Weibull distribution # (i.e. independent of claim size and occurrence period) setldel_param <- function(claim_size, occurrence_period) { mean <- 10; cv <- 0.70 shape <- get_Weibull_parameters(mean, cv)[1, ] scale <- get_Weibull_parameters(mean, cv)[2, ] c(shape = shape, scale = scale) } setldel <- claim_closure(n_vector, claim_size(n_vector), paramfun = setldel_param) setldel[[1]] # show settlement delay of claims originating from period 1
n_vector <- c(90, 79, 102, 78, 86, 88, 116, 84, 93, 104) # Try a constant Weibull distribution # (i.e. independent of claim size and occurrence period) setldel_param <- function(claim_size, occurrence_period) { mean <- 10; cv <- 0.70 shape <- get_Weibull_parameters(mean, cv)[1, ] scale <- get_Weibull_parameters(mean, cv)[2, ] c(shape = shape, scale = scale) } setldel <- claim_closure(n_vector, claim_size(n_vector), paramfun = setldel_param) setldel[[1]] # show settlement delay of claims originating from period 1
Returns the number of insurance claims occurring in each of the periods.
claim_frequency( I = 40, E = 12000, freq = 0.03, simfun, type = c("r", "p"), ... )
claim_frequency( I = 40, E = 12000, freq = 0.03, simfun, type = c("r", "p"), ... )
I |
number of claims development periods considered. |
E |
effective annual exposure associated with each period (vector). |
freq |
expected frequency per unit exposure for each period (vector). |
simfun |
optional alternative sampling distribution; see Details. |
type |
the type of |
... |
other arguments/parameters to be passed onto |
Unless otherwise specified, claim_frequency()
assumes the
claim frequency follows a Poisson distribution with mean equal to the
product of exposure E
associated with period and expected
claim frequency
freq
per unit exposure for that period.
If no arguments are provided, by default claim_frequency()
assumes
a total of 40 development periods, constant exposure rate at 12000 per year
and constant frequency at 0.03 per unit of exposure.
If one wishes to use an alternative sampling distribution for claim
frequency, they could declare such specification through the simfun
argument. The simfun
argument takes both random generation functions
(type = "r"
, the default) and cumulative distribution functions
(type = "p"
). For the latter, claim_frequency()
will first search for the
existence of the corresponding r
-function. If it notes the existence of
such an r
-function (e.g. rpois
for simfun = ppois
), it will directly
apply the r
-function to optimise simulation efficiency. Otherwise, the
function uses a numerical inverse transform method for simulation (see
simulate_cdf
), which may not be the most efficient and can
potentially result in errors if an appropriate range
is not specified in
the optional arguments.
Pre-defined distribution functions such as ppois
are supported.
no_period <- 40 exposure <- c(rep(12000, no_period)) exp_freq <- c(rep(0.03, no_period)) # returns the same result as claim_frequency() claim_frequency(I = no_period, E = exposure, freq = exp_freq) # use a pre-defined random generation function claim_frequency(I = 10, simfun = rpois, lambda = 80) # or equivalently, through a distribution function claim_frequency(I = 10, simfun = ppois, type = "p", lambda = 80)
no_period <- 40 exposure <- c(rep(12000, no_period)) exp_freq <- c(rep(0.03, no_period)) # returns the same result as claim_frequency() claim_frequency(I = no_period, E = exposure, freq = exp_freq) # use a pre-defined random generation function claim_frequency(I = 10, simfun = rpois, lambda = 80) # or equivalently, through a distribution function claim_frequency(I = 10, simfun = ppois, type = "p", lambda = 80)
Simulates and returns the notification/reporting delays of each of the claims occurring in each of the periods, according to a user-specified distribution (by default a Weibull).
claim_notification(frequency_vector, claim_size_list, rfun, paramfun, ...)
claim_notification(frequency_vector, claim_size_list, rfun, paramfun, ...)
frequency_vector |
a vector of claim frequencies for all the periods. |
claim_size_list |
list of claim sizes. |
rfun |
optional alternative random sampling function; if not specified, assumes Weibull. |
paramfun |
parameters for the random sampling function, as a function of
|
... |
other arguments/parameters to be passed onto |
Claim notification delay represents the delay from claim occurrence
to reporting. SynthETIC
assumes the (removable) dependence of notification
delay on claim size and occurrence period of the claim, thus requiring the
user to specify a paramfun
of claim_size
and occurrrence_period
(with
the option to add more arguments as needed).
The paramfun
should return the distribution parameters in a vector,
e.g. for gamma distribution paramfun
should return a value in the format of
c(shape = , scale = )
, for exponential distribution this should return
c(rate = )
. See Examples. If a rfun
is specified without a paramfun
,
SynthETIC
will try to proceed without parameterisation (i.e. directly
calling rfun
with claim size and occurrence period), and if it fails,
return an error message.
By default, it is assumed that the notification delay follows a Weibull distribution, and that the mean notification delay (in quarters) is given by
automatically converted to the relevant time_unit
defined by user at the
top of their script through set_parameters
. Note that the
ref_claim
in the equation is another package-wise global variable that the
user needs to define through set_parameters
as it determines
the monetary scale of the simulator. The CoV (Coefficient of Variation)
of the notification delay is assumed to be constant at 70%, independent of
the size and occurrence period of the claim.
Of course, the user may wish to sample from a different distribution rfun
and/or a different set of parameters. An example is given below.
A list of notification delays such that the th component of
the list gives the notification delays for all claims that occurred in
period
.
n_vector <- c(90, 79, 102, 78, 86, 88, 116, 84, 93, 104) # Try a constant Weibull distribution # (i.e. independent of claim size and occurrence period) notidel_param <- function(claim_size, occurrence_period) { mean <- 2; cv <- 0.70 shape <- get_Weibull_parameters(mean, cv)[1, ] scale <- get_Weibull_parameters(mean, cv)[2, ] c(shape = shape, scale = scale) } notidel <- claim_notification(n_vector, claim_size(n_vector), paramfun = notidel_param) notidel[[1]] # show notification for claims originating from period 1
n_vector <- c(90, 79, 102, 78, 86, 88, 116, 84, 93, 104) # Try a constant Weibull distribution # (i.e. independent of claim size and occurrence period) notidel_param <- function(claim_size, occurrence_period) { mean <- 2; cv <- 0.70 shape <- get_Weibull_parameters(mean, cv)[1, ] scale <- get_Weibull_parameters(mean, cv)[2, ] c(shape = shape, scale = scale) } notidel <- claim_notification(n_vector, claim_size(n_vector), paramfun = notidel_param) notidel[[1]] # show notification for claims originating from period 1
Returns the occurrence times of each of the claims occurring in each of
the periods, assuming the occurrence time of any claim in period
is uniformly distributed between times
and
.
claim_occurrence(frequency_vector)
claim_occurrence(frequency_vector)
frequency_vector |
a vector of claim frequencies for all the periods. |
A list of occurrence times such that the th component of the
list gives the claim occurrence time for all claims that occurred in period
.
n_vector <- c(90, 79, 102, 78, 86, 88, 116, 84, 93, 104) # occurrence time for all claims originating from period 1 claim_occurrence(n_vector)[[1]]
n_vector <- c(90, 79, 102, 78, 86, 88, 116, 84, 93, 104) # occurrence time for all claims originating from period 1 claim_occurrence(n_vector)[[1]]
Outputs the full (or past) square of claim payments by occurrence period and
development period. The upper left triangle represents the past, and the
lower right triangle the unseen future.
Users can modify the aggregate level by providing an aggregate_level
argument to the function. For example, setting aggregate_level = 4
when
working with calendar quarters produces a payment square by occurrence and
development year.
Users will also have the option to decide whether to include the out-of-bound
transactions to the maximum DQ, or to leave them in a separate "tail" cell,
see Details.
claim_output( frequency_vector, payment_time_list, payment_size_list, aggregate_level = 1, incremental = TRUE, future = TRUE, adjust = TRUE )
claim_output( frequency_vector, payment_time_list, payment_size_list, aggregate_level = 1, incremental = TRUE, future = TRUE, adjust = TRUE )
frequency_vector |
a vector of claim frequencies for all the periods. |
payment_time_list |
(compound) list of payment times (both the continous time scale and the discrete period versions work). |
payment_size_list |
(compound) list of payment size pattern (can be either with or without inflation). |
aggregate_level |
number of periods to be aggregated together; must be a divisor of the total number of periods under consideration (default 1). |
incremental |
logical; if true returns the incremental payment square, else returns the cumulative payment square. |
future |
logical; if true shows the full claim triangle (i.e. including claim payments in future periods), else shows only the past triangle (default TRUE). |
adjust |
logical; if true accumulates all payments beyond the max development period to the last development period, else shows a separate "tail" column for all the out-of-bound transactions. |
Remark on out-of-bound payment times: This function allows adjustment
for out-of-bound transaction dates, by forcing payments that were projected
to fall out of the maximum development period to be paid at the exact end of
the maximum development period allowed (when we set adjust = TRUE
, which is
the default behaviour). For example, if we consider 40 periods of development
and a claim incurred in the interval (20, 21] was projected to have a payment
at time 62.498210, then for the purpose of tabulation, we can
treat such a payment as if it occurred at time 60 (adjust = TRUE
);
leave the payment in the "tail" cell, so the user can see the proportion of
payments beyond the maximum development period (adjust = FALSE
).
An array of claims payments.
attach(test_claims_object) # a square of cumulative claims payments by accident and development quarters CL <- claim_output(frequency_vector, payment_time_list, payment_size_list, aggregate_level = 1, incremental = FALSE) detach(test_claims_object)
attach(test_claims_object) # a square of cumulative claims payments by accident and development quarters CL <- claim_output(frequency_vector, payment_time_list, payment_size_list, aggregate_level = 1, incremental = FALSE) detach(test_claims_object)
Simulates and returns the inter-partial delays (i.e. the delay of one partial payment relative to the previous) of each payment for each of the claims occurring in each of the periods.
claim_payment_delay( frequency_vector, claim_size_list, no_payments_list, settlement_list, rfun, paramfun, ... )
claim_payment_delay( frequency_vector, claim_size_list, no_payments_list, settlement_list, rfun, paramfun, ... )
frequency_vector |
a vector of claim frequencies for all the periods. |
claim_size_list |
list of claim sizes. |
no_payments_list |
list of number of partial payments. |
settlement_list |
list of settlement delays. |
rfun |
optional alternative random sampling function; see Details for default. |
paramfun |
parameters for the random sampling function, as a function of
|
... |
other arguments/parameters to be passed onto |
Returns a compound list structure such that the th component
of the
th sub-list gives the payment delay pattern (as a vector) for
the
th claim of occurrence period
.
The default rfun
is split into 2 cases.
Case 1: claims with at least 4 partial payments. The simulation takes
2 steps.
First we sample the last payment delay from a Weibull distribution with
mean = 1 quarter (automatically converted to the relevant time_unit
, a
global variable that the user is required to define at the top of their code)
and CoV = 20%. Then we sample the remaining payment delays from a second
Weibull distribution with CoV at 35% and mean = target mean settlement delay
(see claim_closure
) divided by the number of payments.
Case 2: claims with less than 4 partial payments. Proceed as in Case 1 but without separating out the simulation of the last payment delay (i.e. ignore step 1).
# set up n_vector <- claim_frequency(I = 10) claim_sizes <- claim_size(n_vector) no_payments <- claim_payment_no(n_vector, claim_sizes) setldel <- claim_closure(n_vector, claim_sizes) # with default setting pmtdel <- claim_payment_delay(n_vector, claim_sizes, no_payments, setldel) pmtdel[[1]][[1]] # payment delays for claim 1 of occurrence period 1 # with some custom rfun # simplistic case: payments times are uniformly distributed my_func <- function(n, setldel) { prop <- runif(n) prop <- prop / sum(prop) setldel * prop } mypayments <- claim_payment_delay(n_vector, claim_sizes, no_payments, setldel, rfun = my_func) # inter-partial delays for claim 1 of occurrence period 1 mypayments[[1]][[1]]
# set up n_vector <- claim_frequency(I = 10) claim_sizes <- claim_size(n_vector) no_payments <- claim_payment_no(n_vector, claim_sizes) setldel <- claim_closure(n_vector, claim_sizes) # with default setting pmtdel <- claim_payment_delay(n_vector, claim_sizes, no_payments, setldel) pmtdel[[1]][[1]] # payment delays for claim 1 of occurrence period 1 # with some custom rfun # simplistic case: payments times are uniformly distributed my_func <- function(n, setldel) { prop <- runif(n) prop <- prop / sum(prop) setldel * prop } mypayments <- claim_payment_delay(n_vector, claim_sizes, no_payments, setldel, rfun = my_func) # inter-partial delays for claim 1 of occurrence period 1 mypayments[[1]][[1]]
Converts the (compound) list of constant-dollar-value payment sizes to a
(compound) list of inflated payment sizes by applying inflation rates on a
continuous time scale.
Compare with claim_payment_size()
which generates the constant dollar
amount of partial payment sizes. Note that the constant dollar values are as
of time 0.
claim_payment_inflation( frequency_vector, payment_size_list, payment_time_list, occurrence_list, claim_size_list, base_inflation_vector, si_occurrence_function, si_payment_funtion )
claim_payment_inflation( frequency_vector, payment_size_list, payment_time_list, occurrence_list, claim_size_list, base_inflation_vector, si_occurrence_function, si_payment_funtion )
frequency_vector |
a vector of claim frequencies for all the occurrence periods. |
payment_size_list |
(compound) list of payment size pattern (without inflation). |
payment_time_list |
(compound) list of payment times on a continuous time scale. |
occurrence_list |
(compound) list of occurrence times on a continuous time scale. |
claim_size_list |
list of claim sizes. |
base_inflation_vector |
vector showing quarterly base inflation rates (quarterly effective) for all the periods under consideration (default at nil base inflation). |
si_occurrence_function |
function of |
si_payment_funtion |
function of |
Returns a compound list structure such that the th component
of the
th sub-list gives the inflated payment pattern (as a
vector) for the
th claim of occurrence period
.
By default we assume
Nil base inflation.
No superimposed inflation by (continuous) occurrence time for the first 20
quarters (converted to the relevant time_unit
); beyond 20 quarters, the
inflation index is given by
where ref_claim
is a package-wise global variable that user is required to
define at the top of their code using set_parameters
. The
interpretation is that, due to some external change to the insurance scheme
at the end of occurrence quarter 20, the smallest claims will reduce by up to
40% in size. This change will not impact claims exceeding 0.25*ref_claim
in
size. The reduction varies linearly between these claim sizes.
Superimposed inflation by (continuous) payment time operates at a period rate of
where is
equivalent to a 30% p.a. inflation rate (converted to the relevant
time_unit
). The interpretation is that, for claims of small size the
payment time superimposed inflation tends to be very high (30% p.a.);
whereas for claims exceeding ref_claim
in dollar values as of ,
the payment time superimposed inflation is nil. The rate of inflation
varies linearly between claim sizes of zero and
ref_claim
.
Remark on continuous inflation: We note that SynthETIC
works with
exact transaction times, so time has been measured continuously throughout
the program. This allows us to apply inflation on a continous time scale too.
For example, we asked the users to provide base inflation as a vector of
quarterly base inflation rates, quarterly effective for all the periods under
consideration. This data is generally available online (e.g. the Australian
quarterly inflation is available on RBA's website - see
link). We
then interpolate the quarterly inflation rates to compute the addition of
inflation by exact times. In the case of above, if we observed quarterly
inflation rates of 0.6%, 0.5%, 0.7% and 0.3% for one particular year, then
the base inflation applied to a payment at time quarters will
be
.
Remark on out-of-bound payment times: This function includes adjustment for out-of-bound transaction dates, by forcing payments that were projected to fall out of the maximum development period to be paid at the exact end of the maximum development period allowed. For example, if we consider 40 periods of development and a claim incurred in the interval (20, 21] was projected to have a payment at time 62.498210, then we would treat such a payment as if it occurred at time 60 for the purpose of inflation.
# remove SI occurrence and SI payment SI_occurrence <- function(occurrence_time, claim_size) {1} SI_payment <- function(payment_time, claim_size) {1} # base inflation constant at 0.02 p.a. effective # (length is 80 to cover the maximum time period) base_inflation_vector <- rep((1 + 0.02)^(1/4) - 1, times = 80) attach(test_claims_object) payment_inflated_list <- claim_payment_inflation( frequency_vector, payment_size_list, payment_time_list, occurrence_list, claim_size_list, base_inflation_vector, SI_occurrence, SI_payment ) detach(test_claims_object) # undo the attach # inflated payments for claim 1 of occurrence period 1 payment_inflated_list[[1]][[1]]
# remove SI occurrence and SI payment SI_occurrence <- function(occurrence_time, claim_size) {1} SI_payment <- function(payment_time, claim_size) {1} # base inflation constant at 0.02 p.a. effective # (length is 80 to cover the maximum time period) base_inflation_vector <- rep((1 + 0.02)^(1/4) - 1, times = 80) attach(test_claims_object) payment_inflated_list <- claim_payment_inflation( frequency_vector, payment_size_list, payment_time_list, occurrence_list, claim_size_list, base_inflation_vector, SI_occurrence, SI_payment ) detach(test_claims_object) # undo the attach # inflated payments for claim 1 of occurrence period 1 payment_inflated_list[[1]][[1]]
Simulates and returns the number of partial payments required to settle each of the claims occurring in each of the periods.
claim_payment_no(frequency_vector, claim_size_list, rfun, paramfun, ...)
claim_payment_no(frequency_vector, claim_size_list, rfun, paramfun, ...)
frequency_vector |
a vector of claim frequencies for all the periods. |
claim_size_list |
list of claim sizes. |
rfun |
optional alternative random sampling function; see Details for default. |
paramfun |
parameters for the random sampling function, as a function of
|
... |
other arguments/parameters to be passed onto |
Returns a list structure such that the th component of the
list gives the number of partial payments required to settle each of the
claims that occurred in period
. It is assumed that at least one
payment is required i.e. no claims are settled without any single cash
payment.
Let represent the number of partial payments associated with a
particular claim. The default
simulate_no_pmt_function
is set up
such that if claim_size
claim_size_benchmark_1
,
if claim_size_benchmark_1
<
claim_size
claim_size_benchmark_2
,
if claim_size
> claim_size_benchmark_2
then is geometric with minimum 4 and mean
Alternative sampling distributions are supported through rfun
(the random
generation function) and paramfun
(which returns the parameters of rfun
as a function of claim_size
). The paramfun
should return the distribution
parameters in a vector, e.g. for gamma distribution paramfun
should
return a value in the format of c(shape = , scale = )
. If a rfun
is
specified without a paramfun
, SynthETIC
will try to proceed without
parameterisation (i.e. directly calling rfun
with claim_size
), and if it
fails, then return an error message.
n_vector <- claim_frequency(I = 10) # with default simulation function no_payments <- claim_payment_no(n_vector, claim_size(n_vector)) no_payments[[1]] # number of payments for claims incurred in period 1 # modify the lower benchmark value claim_payment_no(n_vector, claim_size(n_vector), claim_size_benchmark_1 = 5000)
n_vector <- claim_frequency(I = 10) # with default simulation function no_payments <- claim_payment_no(n_vector, claim_size(n_vector)) no_payments[[1]] # number of payments for claims incurred in period 1 # modify the lower benchmark value claim_payment_no(n_vector, claim_size(n_vector), claim_size_benchmark_1 = 5000)
Simulates and returns the constant dollar amount of each partial payment (i.e.without inflation) for each of the claims occurring in each of the periods.
claim_payment_size( frequency_vector, claim_size_list, no_payments_list, rfun, paramfun, ... )
claim_payment_size( frequency_vector, claim_size_list, no_payments_list, rfun, paramfun, ... )
frequency_vector |
a vector of claim frequencies for all the periods. |
claim_size_list |
list of claim sizes. |
no_payments_list |
list of number of partial payments. |
rfun |
optional alternative random sampling function; see Details for default. |
paramfun |
parameters for the random sampling function, as a function of
|
... |
other arguments/parameters to be passed onto |
Returns a compound list structure such that the th component
of the
th sub-list gives the payment pattern (as a vector) for the
th claim of occurrence period
.
The default rfun
is set up in three steps. First
we sample the complement of the proportion of total claim size
represented by the last two payments, from a Beta distribution with mean
where
ref_claim
is a package-wise global variable that we ask the user to define
at the top of their code using set_parameters
. CoV is assumed
constant at 20%.
Next we simulate the proportion of last_two_pmts paid in the second last payment (settlement of the claim) from a Beta distribution with mean = 0.90 and CoV = 3%.
Lastly we sample the remaining payment proportions from a Beta distribution with mean
and CoV = 10%, which is followed by a normalisation such that the proportions add up to 1.
In the cases where there are only 2 or 3 partial payments, proceed as if there were 4 or 5 payments respectively with last_two_payments = 0. The trivial case is when the claim is settled with a single payment, which must be of the same amount as the total claim size.
Alternative sampling distributions are supported through rfun
(the random
generation function) and paramfun
(which returns the parameters of rfun
as a function of claim_size
). The paramfun
should return the distribution
parameters in a vector, e.g. for gamma distribution paramfun
should
return a value in the format of c(shape = , scale = )
. If a rfun
is
specified without a paramfun
, SynthETIC
will try to proceed without
parameterisation (i.e. directly calling rfun
with claim_size
), and if it
fails, then return an error message.
Why did we set up a payment pattern as above?
The payment pattern is set up to reflect the typical pattern of a claim from an Auto liability line of business, which usually consists of:
(possibly) some small payments such as police reports, medical consultations and reports;
some more substantial payments such as hospitalisation, specialist medical procedures, equipment (e.g. prosthetics);
a final settlement with the claimant (usually the second last payment);
a smaller final payment, usually covering legal costs.
Claims in a number of other lines of business exhibit a similar structure, albeit with possible differences in the types of payment made.
# set up n_vector <- claim_frequency(I = 10) claim_sizes <- claim_size(n_vector) no_payments <- claim_payment_no(n_vector, claim_sizes) # with default rfun payments <- claim_payment_size(n_vector, claim_sizes, no_payments) # partial payment sizes for claim 1 of occurrence period 1 payments[[1]][[1]] # with some custom rfun # simplistic case: (stochastically) equal amounts my_func <- function(n, claim_size) { prop <- runif(n) prop <- prop / sum(prop) claim_size * prop } mypayments <- claim_payment_size(n_vector, claim_sizes, no_payments, my_func) # partial payment sizes for claim 1 of occurrence period 1 mypayments[[1]][[1]]
# set up n_vector <- claim_frequency(I = 10) claim_sizes <- claim_size(n_vector) no_payments <- claim_payment_no(n_vector, claim_sizes) # with default rfun payments <- claim_payment_size(n_vector, claim_sizes, no_payments) # partial payment sizes for claim 1 of occurrence period 1 payments[[1]][[1]] # with some custom rfun # simplistic case: (stochastically) equal amounts my_func <- function(n, claim_size) { prop <- runif(n) prop <- prop / sum(prop) claim_size * prop } mypayments <- claim_payment_size(n_vector, claim_sizes, no_payments, my_func) # partial payment sizes for claim 1 of occurrence period 1 mypayments[[1]][[1]]
Converts the list of inter-partial delays to a list of payment times in
continuous time scale. Set discrete = TRUE
to get the payment times in
calendar periods.
claim_payment_time( frequency_vector, occurrence_list, notification_list, payment_delay_list, discrete = FALSE )
claim_payment_time( frequency_vector, occurrence_list, notification_list, payment_delay_list, discrete = FALSE )
frequency_vector |
a vector of claim frequencies for all the periods. |
occurrence_list |
list of claim occurrence times. |
notification_list |
list of notification delays. |
payment_delay_list |
(compound) list of inter partial delays. |
discrete |
logical; if TRUE returns integer-valued payment times (default FALSE). |
Returns a compound list structure such that the th component
of the
th sub-list gives the payment time pattern (as a vector) for
the
th claim of occurrence period
.
Note that, as in the case of claim_closure
, this function can
result in out-of-bound payment dates (i.e. payment times beyond the maximum
number of development periods under consideration). In these cases, we retain
the original simulated values for the simulation of other quantities, but we
will make adjustments for such claims in the tabulation of results in
claim_output
and the payment inflation function
claim_payment_inflation
.
Simulates and returns the size of each of the claims occurring in each of
the periods, given its cumulative distribution function.
Note that claim_size()
aims to model the claim sizes
without inflation.
claim_size(frequency_vector, simfun, type = c("r", "p"), ...)
claim_size(frequency_vector, simfun, type = c("r", "p"), ...)
frequency_vector |
a vector of claim frequencies for all the periods. |
simfun |
optional alternative sampling distribution; see Details. |
type |
the type of |
... |
other arguments/parameters to be passed onto |
By default claim_size()
assumes a left truncated power
normal distribution: , left truncated
at 30. The truncation is done via resampling for rejected values.
Users can opt to use alternative distributions if desired. As discussed in
claim_frequency
, users can declare such specification through
the simfun
argument, which takes both random generation functions
(type = "r"
, the default) and cumulative distribution functions
(type = "p"
). See Examples.
For the latter, claim_size()
will first search for the existence of the
corresponding r
-function. If it notes the existence of such an r
-function
(e.g. rweibull
for simfun = pweibull
), it will directly apply the
r
-function to optimise simulation efficiency. Otherwise, the function uses
a numerical inverse transform method for simulation (see
simulate_cdf
), which may not be the most efficient and can
potentially result in errors if an appropriate range
is not specified in
the optional arguments.
A list of claim sizes such that the th component of the
list gives the sizes for all claims that occurred in period
.
n_vector <- c(90, 79, 102, 78, 86, 88, 116, 84, 93, 104) claim_size(n_vector)[[1]] # gives the sizes for all # all claims incurred in period 1 # use some custom pre-defined distribution function claim_size(n_vector, stats::rweibull, shape = 4, scale = 100000)[[1]] # equivalently claim_size(n_vector, stats::pweibull, "p", shape = 4, scale = 100000)[[1]]
n_vector <- c(90, 79, 102, 78, 86, 88, 116, 84, 93, 104) claim_size(n_vector)[[1]] # gives the sizes for all # all claims incurred in period 1 # use some custom pre-defined distribution function claim_size(n_vector, stats::rweibull, shape = 4, scale = 100000)[[1]] # equivalently claim_size(n_vector, stats::pweibull, "p", shape = 4, scale = 100000)[[1]]
Adjusts claim sizes given a set of covariates. Note that this function firstly simulates covariate levels for each claim, see simulate_covariates
.
claim_size_adj(covariate_obj, claim_size)
claim_size_adj(covariate_obj, claim_size)
covariate_obj |
a |
claim_size |
a list in the same output as |
Returns a nested named list:
covariates_data
which is a named list of covariate relativities (covariates
), the simulated covariate levels (data
) and the claim IDs.
claim_size_adj
which is a list of adjusted claim sizes such that the th component of the list gives the sizes for all claims that occurred in period
.
Adjusts claim sizes given the covariates related to each claim. The relative adjustment of each claim size is given by the severity relativities of the covariates.
claim_size_adj.fit(covariates_data, claim_size)
claim_size_adj.fit(covariates_data, claim_size)
covariates_data |
a |
claim_size |
a list in the same output as |
claims
ObjectConstructs a claims
object which stores all the simulated quantities.
claims( frequency_vector, occurrence_list, claim_size_list, notification_list, settlement_list, no_payments_list, payment_size_list, payment_delay_list, payment_time_list, payment_inflated_list )
claims( frequency_vector, occurrence_list, claim_size_list, notification_list, settlement_list, no_payments_list, payment_size_list, payment_delay_list, payment_time_list, payment_inflated_list )
frequency_vector |
a vector of claim frequencies for all the periods. |
occurrence_list |
list of claim occurrence times. |
claim_size_list |
list of claim sizes. |
notification_list |
list of notification delays. |
settlement_list |
list of settlement delays. |
no_payments_list |
list of number of partial payments. |
payment_size_list |
(compound) list of payment size pattern (without inflation). |
payment_delay_list |
(compound) list of inter partial delays. |
payment_time_list |
(compound) list of payment times on a continuous time scale. |
payment_inflated_list |
(compound) list of payment size pattern (after inflation). |
Returns a claims
object (mainly a reformat of the arguments as a
list object), with the 10 components as listed above.
covariates
ObjectConstructs a covariates
object which stores all covariate inputs.
All covariates will be assumed discrete.
Continuous covariates will have been discretized.
covariates(factors)
covariates(factors)
factors |
named list of vectors, containing the name of the covariates and associated factors in vector form. |
Creating a covariates
object will provide template relativities for the
frequency and severity relativities. It is encouraged to use the setter
functions set.covariates_relativity
to set these values to ensure that
all necessary inputs are provided.
Returns a covariates
object.
factors <- list( "Legal Representation" = c("Y", "N"), "Injury Severity" = as.character(1:6), "Age of Claimant" = c("0-15", "15-30", "30-50", "50-65", "over 65") ) covariate_obj <- covariates(factors)
factors <- list( "Legal Representation" = c("Y", "N"), "Injury Severity" = as.character(1:6), "Age of Claimant" = c("0-15", "15-30", "30-50", "50-65", "over 65") ) covariate_obj <- covariates(factors)
covariates_data
ObjectConstructs a covariates_data
object which stores the dataset of known covariate levels of each factor.
covariates_data(covariates, data, covariates_id = NULL)
covariates_data(covariates, data, covariates_id = NULL)
covariates |
a |
data |
a dataset of covariate values, with columns equal to each of the covariate factors and rows related to individual claim observations. |
covariates_id |
optional list of list of ids, in the same format as a |
Returns a covariates_data
object.
Calculates the relativities (freq
or sev
) of a set of covariate values.
covariates_relativity( covariates_data, freq_sev = c("freq", "sev"), by_ids = FALSE )
covariates_relativity( covariates_data, freq_sev = c("freq", "sev"), by_ids = FALSE )
covariates_data |
a |
freq_sev |
one of |
by_ids |
optional boolean to calculate reorder the result based off claim observations instead of observations in the covariates dataset. Defaults to FALSE. |
Returns the observed coefficient of variation (CoV) of a given sample
x
.
If na.rm
is true then missing values are removed before
computation proceeds, as in the case of the mean()
function.
cv(x, na.rm = TRUE)
cv(x, na.rm = TRUE)
x |
a numeric vector. |
na.rm |
a logical value indicating whether NA values should be stripped before the computation proceeds. |
The coefficient of variation is defined as is defined as the
ratio of the standard deviation to the mean. It shows the extent of
variability in relation to the mean of the population. cv()
estimates the CoV of a given sample by computing the ratio of
the sample standard deviation (see stats::sd
) to the sample mean.
cv(1:10)
cv(1:10)
Generates a dataset of claims records that takes the same structure as
test_claim_dataset
included in this package, with each row
representing a unique claim.
generate_claim_dataset( frequency_vector, occurrence_list, claim_size_list, notification_list, settlement_list, no_payments_list )
generate_claim_dataset( frequency_vector, occurrence_list, claim_size_list, notification_list, settlement_list, no_payments_list )
frequency_vector |
a vector of claim frequencies for all the periods. |
occurrence_list |
list of claim occurrence times. |
claim_size_list |
list of claim sizes. |
notification_list |
list of notification delays. |
settlement_list |
list of settlement delays. |
no_payments_list |
list of number of partial payments. |
A dataframe that takes the same structure as
test_claim_dataset
.
# demo only, in practice might generate claim dataset before simulating # the partial payments # this code generates the built-in test_claim_dataset attach(test_claims_object) claim_dataset <- generate_claim_dataset( frequency_vector, occurrence_list, claim_size_list, notification_list, settlement_list, no_payments_list ) detach(test_claims_object)
# demo only, in practice might generate claim dataset before simulating # the partial payments # this code generates the built-in test_claim_dataset attach(test_claims_object) claim_dataset <- generate_claim_dataset( frequency_vector, occurrence_list, claim_size_list, notification_list, settlement_list, no_payments_list ) detach(test_claims_object)
Generates a dataset of partial payment records that takes the same structure
as test_transaction_dataset
included in this package, with each row
representing a unique payment.
generate_transaction_dataset(claims, adjust = FALSE)
generate_transaction_dataset(claims, adjust = FALSE)
claims |
an |
adjust |
if TRUE then the payment times will be forced to match with the maximum development period under consideration; default FALSE (which will produce out-of-bound payment times). |
A dataframe that takes the same structure as
test_transaction_dataset
.
# this generates the built-in test_transaction_dataset transact_data <- generate_transaction_dataset(test_claims_object)
# this generates the built-in test_transaction_dataset transact_data <- generate_transaction_dataset(test_claims_object)
Returns the Beta parameters given the mean and the CoV of the target Beta distribution.
get_Beta_parameters(target_mean, target_cv)
get_Beta_parameters(target_mean, target_cv)
target_mean |
mean of the target Beta distribution (between 0 and 1). |
target_cv |
CoV of the target Beta distribution. |
get_Beta_parameters(target_mean = 0.5, target_cv = 0.20) get_Beta_parameters(target_mean = 0.5, target_cv = c(0.10, 0.20, 0.30))
get_Beta_parameters(target_mean = 0.5, target_cv = 0.20) get_Beta_parameters(target_mean = 0.5, target_cv = c(0.10, 0.20, 0.30))
Returns the Weibull shape and scale parameters given the mean and the CoV of the target Weibull distribution.
get_Weibull_parameters(target_mean, target_cv)
get_Weibull_parameters(target_mean, target_cv)
target_mean |
mean of the target Weibull distribution. |
target_cv |
CoV of the target Weibull distribution. |
get_Weibull_parameters(target_mean = 100000, target_cv = 0.60) get_Weibull_parameters(target_mean = c(100000, 200000, 300000), target_cv = 0.60)
get_Weibull_parameters(target_mean = 100000, target_cv = 0.60) get_Weibull_parameters(target_mean = c(100000, 200000, 300000), target_cv = 0.60)
Generates a plot of cumulative claims paid (as a percentage of total amount incurred) as a function of development time for each occurrence period.
plot_transaction_dataset( transactions, occurrence_time_col = "occurrence_time", payment_time_col = "payment_time", payment_size_col = "payment_inflated", by_year = FALSE, adjust = TRUE )
plot_transaction_dataset( transactions, occurrence_time_col = "occurrence_time", payment_time_col = "payment_time", payment_size_col = "payment_inflated", by_year = FALSE, adjust = TRUE )
transactions |
a dataset of partial payment records. |
occurrence_time_col |
name of column that stores the time of occurrence of the claims (on a continuous scale). |
payment_time_col |
name of column that stores the time of partial payments of the claims (on a continuous scale). |
payment_size_col |
name of column that stores the size of partial payments of the claims. |
by_year |
if |
adjust |
if |
plot_transaction_dataset(test_transaction_dataset) # Plot claim development without end-of-development-period correction plot_transaction_dataset(test_transaction_dataset, adjust = FALSE) # Plot claim development without inflation effects plot_transaction_dataset(test_transaction_dataset, payment_size_col = "payment_size")
plot_transaction_dataset(test_transaction_dataset) # Plot claim development without end-of-development-period correction plot_transaction_dataset(test_transaction_dataset, adjust = FALSE) # Plot claim development without inflation effects plot_transaction_dataset(test_transaction_dataset, payment_size_col = "payment_size")
Generates a plot of cumulative claims paid (as a percentage of total amount incurred) as a function of development time for each occurrence period.
## S3 method for class 'claims' plot(x, by_year = FALSE, inflated = TRUE, adjust = TRUE, ...)
## S3 method for class 'claims' plot(x, by_year = FALSE, inflated = TRUE, adjust = TRUE, ...)
x |
an object of class |
by_year |
if |
inflated |
if |
adjust |
if |
... |
other graphical parameters. |
plot(test_claims_object) plot(test_claims_object, adjust = FALSE)
plot(test_claims_object) plot(test_claims_object, adjust = FALSE)
Constructs a template for the covariate relativities based on the inputed covariates. Note that only non-zero relativities and one cross-factor relativity is needed for each factor pair.
relativity_template(factors)
relativity_template(factors)
factors |
named list of vectors, containing the name of the covariates and associated factors in vector form. |
Suppose that there are covariates labelled
,
,
and that covariate
can assume one and only one of
values,
,
. The total number of available covariate
values is
.
Now set up an matrix
, consisting of sub-matrices
,
of dimension
. The
diagonal blocks
will quantify first-order relativities on claims
attributes, and the off-diagonal blocks
,
will quantify
second-order effects. Let
denote the
element of
. This element operates as a multiplier of the claim attribute
when covariates
and
take values
and
respectively. Since
can assume only one of the values
,
for
, and so
is diagonal for
all
. Moreover,
, so that
is symmetric and
.
Returns a dataframe object, with five columns:
factor_i |
Factor i. |
factor_j |
Factor j. |
level_ik |
Level within Factor i. |
level_jl |
Level within Factor j. |
relativity |
Relativity between level_ik and level_jl , defaults to NA .
|
factors <- list( "Legal Representation" = c("Y", "N"), "Injury Severity" = as.character(1:6), "Age of Claimant" = c("0-15", "15-30", "30-50", "50-65", "over 65") ) relativity_freq <- relativity_template(factors) relativity_sev <- relativity_template(factors) # Default Values relativity_freq$relativity <- c( 1, 1, 0.95, 1, 1, 1, 1, 1, 0.05, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0.53, 0.3, 0.1, 0.05, 0.01, 0.01, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0.183, 0.192, 0.274, 0.18, 0.171 ) relativity_sev$relativity <- c( 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0.6, 1.2, 2.5, 5, 8, 0.4, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0.97, 0.95, 1, 1, 1, 0.95, 0.9, 1, 1, 1, 1, 1, 1.25, 1.15, 1, 0.85, 0.7 ) head(relativity_freq) head(relativity_sev) test_covariates_obj <- covariates(factors) test_covariates_obj <- set.covariates_relativity( covariates = test_covariates_obj, relativity = relativity_freq, freq_sev = "freq" ) test_covariates_obj <- set.covariates_relativity( covariates = test_covariates_obj, relativity = relativity_sev, freq_sev = "sev" )
factors <- list( "Legal Representation" = c("Y", "N"), "Injury Severity" = as.character(1:6), "Age of Claimant" = c("0-15", "15-30", "30-50", "50-65", "over 65") ) relativity_freq <- relativity_template(factors) relativity_sev <- relativity_template(factors) # Default Values relativity_freq$relativity <- c( 1, 1, 0.95, 1, 1, 1, 1, 1, 0.05, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0.53, 0.3, 0.1, 0.05, 0.01, 0.01, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0.183, 0.192, 0.274, 0.18, 0.171 ) relativity_sev$relativity <- c( 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0.6, 1.2, 2.5, 5, 8, 0.4, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0.97, 0.95, 1, 1, 1, 0.95, 0.9, 1, 1, 1, 1, 1, 1.25, 1.15, 1, 0.85, 0.7 ) head(relativity_freq) head(relativity_sev) test_covariates_obj <- covariates(factors) test_covariates_obj <- set.covariates_relativity( covariates = test_covariates_obj, relativity = relativity_freq, freq_sev = "freq" ) test_covariates_obj <- set.covariates_relativity( covariates = test_covariates_obj, relativity = relativity_sev, freq_sev = "sev" )
Returns the current values of ref_claim
and time_unit
parameters, two
packagewise-global variables used by all simulation functions within this
package.
return_parameters(print = FALSE)
return_parameters(print = FALSE)
print |
logical; if TRUE prints a message. |
Returns and (optionally) prints the current values of ref_claim
and time_unit
parameters.
cur <- return_parameters() cur set_parameters(ref_claim = 200000, time_unit = 1/12) # monthly reserving return_parameters(print = FALSE)
cur <- return_parameters() cur set_parameters(ref_claim = 200000, time_unit = 1/12) # monthly reserving return_parameters(print = FALSE)
Sets ref_claim
and time_unit
parameters for all the simulation functions
within the SynthETIC
package.
set_parameters(ref_claim = 2e+05, time_unit = 1/4)
set_parameters(ref_claim = 2e+05, time_unit = 1/4)
ref_claim |
a reference value for the claim sizes (default 200000). |
time_unit |
time unit to work with, given as a fraction of a year; default calendar quarters (1/4). |
Those variables will be available to multiple functions in this
package, but are kept local to the package environment (i.e. not accessible
from the global environment). To extract the current values of the variables,
use return_parameters
.
We introduce the reference value ref_claim
partly as a measure of the
monetary unit and/or overall claims experience. The default distributional
assumptions were set up with an Australian Auto Liability portfolio in mind.
ref_claim
then allows users to easily simulate a synthetic portfolio with
similar claim pattern but in a different currency, for example. We also
remark that users can alternatively choose to interpret ref_claim
as a
monetary unit. For example, one can set ref_claim <- 1000
and think of all
amounts in terms of $1,000. However, in this case the default simulation
functions will not work and users will need to supply their own set of
functions and set the values as multiples of ref_claim
rather than
fractions as in the default setting.
We also require the user to input a time_unit
(which should be given as a
fraction of year), so that the default input parameters apply to contexts
where the time units are no longer in quarters. In the default setting we
have a time_unit
of 1/4 i.e. we work with calendar quarters.
The default input parameters will update automatically with the choice of
the two variables ref_claim
and time_unit
, which ensures that the
simulator produce sensible results in contexts other than the default
setting. We remark that both ref_claim
and time_unit
only affect the
default simulation functions, and users can also choose to set up their own
modelling assumptions for any of the modules to match their experiences even
better. In the latter case, it is the responsibility of the user to ensure
that their input parameters are compatible with their time units and claims
experience. For example, if the time units are quarters, then claim
occurrence rates must be quarterly.
See the vignette for this package for a full list of functions impacted by those two variables.
set_parameters(ref_claim = 200000, time_unit = 1/12) # monthly reserving
set_parameters(ref_claim = 200000, time_unit = 1/12) # monthly reserving
covariates
object.Sets the claims relativity for a covariates
object.
set.covariates_relativity(covariates, relativity, freq_sev = c("freq", "sev"))
set.covariates_relativity(covariates, relativity, freq_sev = c("freq", "sev"))
covariates |
an object of type |
relativity |
|
freq_sev |
one of |
Returns a covariates
object.
covariates
, relativity_template
Generates sample numbers at random from any probability distribution
given its cumulative distribution function. Pre-defined distribution
functions such as pnorm
are supported.
See here
for the algorithm.
simulate_cdf(n, cdf, range = c(-1e+200, 1e+200), ...)
simulate_cdf(n, cdf, range = c(-1e+200, 1e+200), ...)
n |
number of observations. |
cdf |
cumulative distribution function to be sampled from. |
range |
support of the given |
... |
other arguments/parameters to be passed onto |
simulate_cdf(10, pnorm) simulate_cdf(10, pbeta, shape1 = 2, shape2 = 2)
simulate_cdf(10, pnorm) simulate_cdf(10, pbeta, shape1 = 2, shape2 = 2)
Simulates covariates for each claim. The relative occurrence of each combination of covariates is given the frequency relativities of the covariates.
simulate_covariates( covariates, frequency_vector = 1, claim_size_list = list(1) )
simulate_covariates( covariates, frequency_vector = 1, claim_size_list = list(1) )
covariates |
a |
frequency_vector |
a vector in the same output as |
claim_size_list |
optional if |
Returns a covariates_data
object.
A dataset of 3,624 rows containing individual claims features.
test_claim_dataset
test_claim_dataset
A data frame with 3,624 rows and 7 variables:
claim number, which uniquely characterises each claim.
integer; period of ocurrence of the claim.
double; time of occurrence of the claim.
size of the claim (in constant dollar values).
notification delay of the claim, i.e. time from occurrence to notification.
settlement delay of the claim, i.e. time from notification to settlement.
number of partial payments required for the claim.
# see a distribution of payment counts table(test_claim_dataset$no_payment)
# see a distribution of payment counts table(test_claim_dataset$no_payment)
The test_claim_dataset
where the default set of covariates have been applied to adjust claim sizes.
test_claim_dataset_cov
test_claim_dataset_cov
An object of class data.frame
with 3624 rows and 7 columns.
# see a distribution of payment counts table(test_claim_dataset_cov$no_payment)
# see a distribution of payment counts table(test_claim_dataset_cov$no_payment)
A list containing a sample output from each of the simulation modules, in
sequential order of the running of the modules. This is the test data
generated when run with seed 20200131
at the top of the code.
test_claims_object
test_claims_object
A claims
object with 10 components:
vector; number of claims for each occurrence period, see also claim_frequency()
.
list; claim occurrence times for all claims that occurred in each of the period, see also claim_occurrence()
.
list; claim sizes for all claims that occurred in each of the period, see also claim_size()
.
list; notification delays for all claims that occurred in each of the period, see also claim_notification()
.
list; settlement delays for all claims that occurred in each of the period, see also claim_closure()
.
list; number of partial payments for all claims that occurred in each of the period, see also claim_payment_no()
.
(compound) list; sizes of partial payments (without inflation) for all claims that occurred in each of the period, see also claim_payment_size()
.
(compound) list; inter partial delays for all claims that occurred in each of the period, see also claim_payment_delay()
.
(compound) list; payment times (on a continuous time scale) for all claims that occurred in each of the period, see also claim_payment_time()
.
(compound) list; sizes of partial payments (with inflation) for all claims that occurred in each of the period, see also claim_payment_inflation()
.
Claim occurrence: claim_frequency
, claim_occurrence
Claim size: claim_size
Claim notification: claim_notification
Claim closure: claim_closure
Claim payment count: claim_payment_no
Claim payment size (without inflation): claim_payment_size
Claim payment time: claim_payment_delay
, claim_payment_time
Claim inflation: claim_payment_inflation
test_claims_object$frequency_vector
test_claims_object$frequency_vector
The test_claims_object
where the default set of covariates have been applied to adjust claim sizes.
test_claims_object_cov
test_claims_object_cov
An object of class claims
of length 10.
test_claims_object$frequency_vector
test_claims_object$frequency_vector
An object detailing the set of covariates for each claim in the default setting of SynthETIC
.
test_covariates_dataset
test_covariates_dataset
A covariates_data
object with 3 components:
data.frame; a dataset of covariate levels
covariates; a covariates
object
list; indices of the covariate-level dataset for each claim
test_covariates_dataset$data
test_covariates_dataset$data
A list containing the frequency and severity relativities for three factors.
test_covariates_obj
test_covariates_obj
A covariates
object with 3 components:
list; levels within each factor.
data.frame; first and second order frequency relativities between all the levels of each factor
data.frame; first and second order severity relativities between all the levels of each factor
test_covariates_obj$factors
test_covariates_obj$factors
A dataset of 18,983 records of partial payments associated with the 3,624
claims in test_claim_dataset
.
test_transaction_dataset
test_transaction_dataset
A data frame with 18,983 rows and 12 variables:
claim number, which uniquely characterises each claim.
payment number, identification number of partial payments in respect of a particular claim_no
.
integer; period of ocurrence of the claim.
double; time of occurrence of the claim.
size of the claim (in constant dollar values).
notification delay of the claim, i.e. time from occurrence to notification.
settlement delay of the claim, i.e. time from notification to settlement.
double; time of payment (on a continuous time scale).
integer; time of payment (in calendar period).
size of the payment in constant dollar terms.
actual size of the payment (i.e. with inflation).
inter partial delay associated with the payment.
The test_transaction_dataset
where the default set of covariates have been applied to adjust claim sizes.
test_transaction_dataset_cov
test_transaction_dataset_cov
An object of class data.frame
with 18983 rows and 12 columns.
Converts a vector of simulated quantities (e.g. claim occurrence times, claim
sizes) to a list format consistent with what is used for SynthETIC
simulation; to be used when user wishes to replace one or more of the
SynthETIC
modules with their own.
to_SynthETIC(x, frequency_vector, level = c("clm", "pmt"), no_payments_list)
to_SynthETIC(x, frequency_vector, level = c("clm", "pmt"), no_payments_list)
x |
a vector of simulated quantities for all the claims. |
frequency_vector |
a vector of claim frequencies for all the periods. |
level |
level of the data provided; one of |
no_payments_list |
list of number of partial payments; only required
if |
It is assumed that the simulated quantities in x
is provided in
chronological order, e.g. if there are 30 claims in period 1 and x
is
on a "clm"
level, then the first 30 elements of x
should give the
measures for those 30 claims. Likewise, if x
is on a "pmt"
level, and
the first claim in period 1 has 5 payments, then the first 5 elements of x
should give the measures for those 5 payments.
A list of quantities such that the th component of the
list gives the corresponding measure for all claims that occurred in period
.
freq <- claim_frequency() my_claims <- rweibull(sum(freq), shape = 4, scale = 100000) claim_sizes <- to_SynthETIC(my_claims, freq)
freq <- claim_frequency() my_claims <- rweibull(sum(freq), shape = 4, scale = 100000) claim_sizes <- to_SynthETIC(my_claims, freq)