Assume we have test, xgb_base (which is an xgboost model), and data with the name my_pred.csv.
The official target metric on the site is an unusual one, so I wrapped it into a function.
Assume you have finished your model and have four columns in the dataset named dataset:
id is the id of the phv machine (stored as short_name in dataset).
t is the time at which every y and x variable is recorded.
p is the real power; in this contest, it is the target we want to predict.
phat is the predicted power; we want it to approach the real one.
library(add2evaluation)
data(dataset)
library(lubridate)
#>
#> Attaching package: 'lubridate'
#> The following object is masked from 'package:base':
#>
#> date
library(psych)
library(tidyverse)
#> ─ Attaching packages ────────────────────────────── tidyverse 1.2.1 ─
#> ✔ ggplot2 3.1.0 ✔ purrr 0.2.5
#> ✔ tibble 2.1.1 ✔ dplyr 0.8.0.1
#> ✔ tidyr 0.8.2 ✔ stringr 1.4.0
#> ✔ readr 1.1.1 ✔ forcats 0.3.0
#> ─ Conflicts ─────────────────────────────── tidyverse_conflicts() ─
#> ✖ ggplot2::%+%() masks psych::%+%()
#> ✖ ggplot2::alpha() masks psych::alpha()
#> ✖ lubridate::as.difftime() masks base::as.difftime()
#> ✖ lubridate::date() masks base::date()
#> ✖ dplyr::filter() masks stats::filter()
#> ✖ lubridate::intersect() masks base::intersect()
#> ✖ dplyr::lag() masks stats::lag()
#> ✖ lubridate::setdiff() masks base::setdiff()
#> ✖ lubridate::union() masks base::union()
dataset %>%
  describe()
#> Warning in FUN(newX[, i], ...): no non-missing arguments to min; returning Inf
#> Warning in FUN(newX[, i], ...): no non-missing arguments to max; returning -Inf
#>            vars      n mean   sd median trimmed  mad   min   max range
#> short_name    1 183093 2.26 1.18   2.00    2.20 1.48  1.00  4.00  3.00
#> t             2 183093  NaN   NA     NA     NaN   NA   Inf  -Inf  -Inf
#> p             3 183093 4.85 9.50   0.00    2.26 0.17 -0.40 48.83 49.23
#> phat          4 183093 4.84 8.88   0.17    2.47 0.47 -3.55 46.97 50.52
#>            skew kurtosis   se
#> short_name 0.33    -1.40 0.00
#> t            NA       NA   NA
#> p          2.60     6.38 0.02
#> phat       2.39     5.11 0.02
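The NaN, Inf and -Inf entries in the t row, together with the two warnings, suggest that t is not stored as a number (for example, as character timestamps). That is an inference from the output, not something stated here. A minimal sketch of one way to summarise such a column separately is below, using base R only; the toy data frame, its values, and the timestamp format are assumptions made for illustration.

```r
# Toy stand-in for `dataset`; values and timestamp format are hypothetical.
toy <- data.frame(
  short_name = c(1, 1, 2, 2),
  t = c("2018-01-01 08:00:00", "2018-01-01 08:15:00",
        "2018-01-01 08:00:00", "2018-01-01 08:15:00"),
  p = c(0.0, 4.2, 0.1, 5.0),
  phat = c(0.2, 4.0, 0.0, 5.3),
  stringsAsFactors = FALSE
)

# Parse the character timestamps into date-times (format is an assumption).
toy$t <- as.POSIXct(toy$t, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Summarise the time column on its own: earliest and latest record.
t_range <- range(toy$t)

# Summarise only the numeric columns, leaving t out.
num_summary <- summary(toy[, c("short_name", "p", "phat")])
```

Keeping t out of the numeric summary avoids the warnings, and range() on the parsed column gives the time span directly.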
phv_metric(
  id = dataset$short_name,
  t = dataset$t,
  y = dataset$p,
  yhat = dataset$phat
)
#> [1] 0.1192559
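Because the contest metric is unusual, it can help to look at a familiar error measure next to it. The sketch below computes a plain RMSE overall and per machine id. This is not the contest metric and not what phv_metric computes; it is only a generic sanity check. The toy data frame and the helper name rmse are hypothetical.

```r
# Toy values for illustration only; not taken from the real dataset.
toy <- data.frame(
  short_name = c(1, 1, 2, 2),
  p = c(0, 4, 1, 5),
  phat = c(1, 4, 1, 3)
)

# Plain root mean squared error (a generic check, NOT the contest metric).
rmse <- function(y, yhat) sqrt(mean((y - yhat)^2))

# Error over all rows.
overall_rmse <- rmse(toy$p, toy$phat)

# Error per machine id, to spot a machine that is predicted badly.
per_id_rmse <- sapply(
  split(toy, toy$short_name),
  function(d) rmse(d$p, d$phat)
)
```

With the real data, the same two calls would take dataset$p and dataset$phat, grouped by dataset$short_name.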