| audit {rattle} | R Documentation |
The audit dataset is an artificially constructed dataset that has some of the characteristics of a true audit dataset for modelling productive and non-productive audits. It is used to illustrate binary classification.
The target variable is Adjusted, an integer which is either 0
(for non-productive audits) or 1 (for productive audits). Productive
audits are those that result in an adjustment being made to a client's
claims. The dollar value of those adjustments is also recorded (as the
Adjustment column).
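As a minimal sketch of how the target and the dollar values relate, the following tabulates the 0/1 target and averages the adjustment over productive audits. The helper `summarise_audits` is hypothetical and not part of rattle. The sketch assumes the data can be loaded from either the rattle or the rattle.data package, and that the two columns can be found by the name suffixes `Adjusted` and `Adjustment`; exact column names may differ between releases.

```r
# Hypothetical helper (not part of rattle): summarise a 0/1 target column
# and the dollar adjustments recorded for the productive audits.
summarise_audits <- function(df, target, risk) {
  productive <- df[[target]] == 1
  list(
    counts          = table(df[[target]]),      # audits per class
    productive_rate = mean(productive),         # share of productive audits
    mean_adjustment = mean(df[[risk]][productive])
  )
}

# Assumption: the dataset ships in 'rattle' or in 'rattle.data',
# depending on the installed release. Load it into a scratch environment.
env <- new.env()
for (pkg in c("rattle", "rattle.data")) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    suppressWarnings(data("audit", package = pkg, envir = env))
  }
}

if (exists("audit", envir = env, inherits = FALSE)) {
  audit <- env$audit
  # Assumption: match by suffix, since some releases may prefix the names.
  target <- grep("Adjusted$", names(audit), value = TRUE)[1]
  risk   <- grep("Adjustment$", names(audit), value = TRUE)[1]
  if (!is.na(target) && !is.na(risk)) {
    print(summarise_audits(audit, target, risk))
  }
}
```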
The independent variables include Age, type of
Employment, level of Education, Marital status,
Occupation, level of Income, Sex, amount of
Deductions being claimed, Hours worked per week, and
country in which they have a bank Account.
An identifier is included as the ID variable.
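The binary classification use mentioned above could be sketched with a logistic regression from base R. This is only an illustration, not the modelling approach rattle itself prescribes. The helper `fit_audit_glm` is hypothetical, and the sketch assumes the identifier column is named `ID`, that the dollar value column ends in `Adjustment`, and that the bank account column contains `Account` in its name. The identifier and the dollar value are dropped because neither is a legitimate predictor (the dollar value would leak the target); the account country is dropped only to keep the sketch small.

```r
# Hypothetical helper (not part of rattle): logistic regression of `target`
# on every column of `df` that is not listed in `drop`.
fit_audit_glm <- function(df, target, drop = character()) {
  predictors <- setdiff(names(df), c(target, drop))
  glm(reformulate(predictors, response = target),
      data = df, family = binomial)
}

# Assumption: the dataset ships in 'rattle' or in 'rattle.data',
# depending on the installed release.
env <- new.env()
for (pkg in c("rattle", "rattle.data")) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    suppressWarnings(data("audit", package = pkg, envir = env))
  }
}

if (exists("audit", envir = env, inherits = FALSE)) {
  audit  <- env$audit
  target <- grep("Adjusted$", names(audit), value = TRUE)[1]
  # Columns excluded from the predictors (names matched by assumption).
  drop   <- grep("^ID$|Adjustment$|Account", names(audit), value = TRUE)
  if (!is.na(target)) {
    # Rows with missing values are omitted by glm's default na.action.
    fit <- fit_audit_glm(audit, target, drop)
    print(coef(fit))
  }
}
```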
The dataset is quite small, consisting of just 2000 entities. Its primary purpose is to illustrate modelling in Rattle, so a minimally sized dataset is suitable.
A data frame.