Overview

Data Structures
Divisions   Metrics                Values
size        observations           1,087
size        variables              56
size        values                 60,872
size        memory size (MB)       1
duplicated  duplicate observation  0
missing     complete observation   0
missing     missing observation    1,087
missing     missing variables      50
missing     missing values         29,435
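
The size and missing-value figures above are tied together by simple arithmetic. The following is a minimal consistency check that uses only the numbers reported in the table; the variable names are illustrative and not part of the report.

```python
# Figures copied from the Data Structures table above.
observations = 1087
variables = 56
missing_values = 29435

# Total number of cells = rows x columns.
total_values = observations * variables

# Share of all cells that are missing.
missing_share = missing_values / total_values

print(total_values)            # 60872, matching the reported "values" count
print(f"{missing_share:.1%}")  # 48.4%
```

Roughly half of all cells are missing, which fits the table's report of 1,087 observations with missing data and 0 complete observations.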
Data Types
Divisions  Metrics          Values
data type  numerics         10
data type  integers         0
data type  factors/ordered  0
data type  characters       45
data type  Dates            1
data type  POSIXcts         0
data type  others           0
Job Information
Divisions  Metrics       Values
dataset    dataset       df
dataset    dataset type  spec_tbl_df
dataset    target        not defined
job        samples       1,087 / 1,087 (100%)
job        created       2023-05-07 21:43:44
job        created by    Brian Julius






Univariate Analysis

Descriptive Statistics

Descriptive statistics and visualization of individual variables
Variables           Type       Missing (%)  Distincts (Ratio)  Zeros  Negatives  Outliers
coaster_name        character  0 (0)        990 (0.911)        -      -          -
Length              character  134 (12.33)  570 (0.524)        -      -          -
Speed               character  150 (13.8)   244 (0.224)        -      -          -
Location            character  0 (0)        280 (0.258)        -      -          -
Status              character  213 (19.6)   16 (0.015)         -      -          -
Opening date        character  250 (23)     657 (0.604)        -      -          -
Type                character  0 (0)        98 (0.09)          -      -          -
Manufacturer        character  59 (5.43)    103 (0.095)        -      -          -
Height restriction  character  256 (23.55)  101 (0.093)        -      -          -
Model               character  343 (31.55)  318 (0.293)        -      -          -
1–10 of 56 rows
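
The Missing (%) and Distincts (Ratio) figures above are consistent with dividing each count by the 1,087 observations. The sketch below shows that calculation for a single column. It assumes `None` marks a missing value and that missing values are not counted as a distinct value; the report does not state how it treats that case, and the function name is hypothetical.

```python
def missing_and_distinct(values):
    """Return (missing_count, missing_pct, distinct_count, distinct_ratio).

    Assumption for this sketch: None represents a missing value and is
    excluded from the distinct count.
    """
    n = len(values)
    missing = sum(v is None for v in values)
    distinct = len({v for v in values if v is not None})
    return missing, round(100 * missing / n, 2), distinct, round(distinct / n, 3)


# Toy column: 5 rows, 1 missing, 3 distinct non-missing values.
print(missing_and_distinct(["A", "B", "A", None, "C"]))  # (1, 20.0, 3, 0.6)

# Reproducing two table entries from their counts (1,087 rows):
print(round(100 * 134 / 1087, 2))  # 12.33  (Length, Missing %)
print(round(990 / 1087, 3))        # 0.911  (coaster_name, Distincts ratio)
```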


Normality Test

Normality test results of numeric variables
Variables         Mean     Min        Q1        Median    Q3       Max       Balance
Inversions        1.55     0          0         0         3        14        Right-Skewed
year_introduced   1994.99  1884       1989      2000      2010     2022      Left-Skewed
latitude          38.37    -48.2617   35.03105  40.2898   44.7996  63.2309   Left-Skewed
longitude         -41.6    -123.0357  -84.5522  -76.6536  2.7781   153.4265  Right-Skewed
speed1_value      53.85    5          40        50        63       240       Right-Skewed
speed_mph         48.62    5          37.3      49.7      58       149.1     Balanced
height_value      89.58    4          44        79        113      3937      Right-Skewed
height_ft         102      13.1       51.8      91.2      131.2    377.3     Right-Skewed
Inversions_clean  1.33     0          0         0         2        14        Right-Skewed
Gforce_clean      3.82     0.8        3.4       4         4.5      12        Balanced
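
One row above stands out: the maximum of height_value (3937) is far above its third quartile (113). A common way to judge such a value is Tukey's 1.5 × IQR fence. The sketch below applies that rule to the reported quartiles; the rule is an assumption chosen for illustration, since the report does not state which criterion it uses, and `tukey_fences` is a hypothetical helper.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return the (lower, upper) outlier fences for the given quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Quartiles of height_value as reported in the table above.
low, high = tukey_fences(44, 113)
print(low, high)    # -59.5 216.5
print(3937 > high)  # True: the maximum lies well beyond the upper fence
```

A value this far outside the fence is worth inspecting in the source data before the column is used for modelling.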


Bivariate Analysis

Compare Numerical Variables

Relationship between two numerical variables
First Variable   Second Variable   Correlation Coefficient
Inversions       year_introduced    0.21100
Inversions       latitude          -0.00982
Inversions       longitude          0.06159
Inversions       speed1_value       0.16342
Inversions       speed_mph          0.25221
Inversions       height_value       0.09481
Inversions       height_ft          0.17133
Inversions       Inversions_clean   1.00000
Inversions       Gforce_clean       0.35687
year_introduced  latitude          -0.07098
1–10 of 45 rows
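
The 45 rows correspond to every unordered pair of the 10 numeric variables. The sketch below enumerates those pairs and gives a plain implementation of Pearson's r. Pearson is an assumption here, since the report does not name the correlation method, and `pearson` is a hypothetical helper rather than the function the report used.

```python
from itertools import combinations
from math import sqrt

numeric_vars = [
    "Inversions", "year_introduced", "latitude", "longitude", "speed1_value",
    "speed_mph", "height_value", "height_ft", "Inversions_clean", "Gforce_clean",
]

# Every unordered pair of distinct variables, in table order.
pairs = list(combinations(numeric_vars, 2))
print(len(pairs))  # 45
print(pairs[9])    # ('year_introduced', 'latitude'), the tenth row shown above


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)
```

For example, `pearson([1, 2, 3, 4], [2, 4, 6, 8])` is 1 up to floating-point rounding, because the second sequence is an exact linear function of the first.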


Compare Categorical Variables

The number of categorical variables is less than 2.


Multivariate Analysis

Correlation Analysis

Correlation Matrix

Correlation coefficient matrix of numeric variables
Rows: First Variable. Columns: Second Variable. Diagonal entries are omitted (-).
First Variable    Inversions        year_introduced   latitude          longitude         speed1_value      speed_mph         height_value      height_ft         Inversions_clean  Gforce_clean
Inversions        -                 0.211             -0.010            0.062             0.163             0.252             0.095             0.171             1.000             0.357
year_introduced   0.211             -                 -0.071            0.176             0.210             0.205             0.088             0.232             0.229             -0.067
latitude          -0.010            -0.071            -                 -0.298            -0.122            -0.064            -0.004            0.011             -0.014            0.043
longitude         0.062             0.176             -0.298            -                 0.301             0.051             -0.093            0.160             0.087             0.016
speed1_value      0.163             0.210             -0.122            0.301             -                 0.852             0.089             0.815             0.176             0.380
speed_mph         0.252             0.205             -0.064            0.051             0.852             -                 0.241             0.829             0.266             0.489
height_value      0.095             0.088             -0.004            -0.093            0.089             0.241             -                 1.000             0.108             0.337
height_ft         0.171             0.232             0.011             0.160             0.815             0.829             1.000             -                 0.164             0.475
Inversions_clean  1.000             0.229             -0.014            0.087             0.176             0.266             0.108             0.164             -                 0.345
Gforce_clean      0.357             -0.067            0.043             0.016             0.380             0.489             0.337             0.475             0.345             -
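
A matrix of this size is easier to read once the strongest relationships are pulled out. The sketch below lists every pair whose absolute coefficient is at least 0.8. The coefficients are copied from the matrix above; the 0.8 cutoff and the `strong_pairs` helper are illustrative assumptions, not part of the report.

```python
names = [
    "Inversions", "year_introduced", "latitude", "longitude", "speed1_value",
    "speed_mph", "height_value", "height_ft", "Inversions_clean", "Gforce_clean",
]

# Each row holds the 9 off-diagonal coefficients, in column order with the
# variable's own column skipped, as reported in the matrix above.
rows = {
    "Inversions":       [0.211, -0.010, 0.062, 0.163, 0.252, 0.095, 0.171, 1.000, 0.357],
    "year_introduced":  [0.211, -0.071, 0.176, 0.210, 0.205, 0.088, 0.232, 0.229, -0.067],
    "latitude":         [-0.010, -0.071, -0.298, -0.122, -0.064, -0.004, 0.011, -0.014, 0.043],
    "longitude":        [0.062, 0.176, -0.298, 0.301, 0.051, -0.093, 0.160, 0.087, 0.016],
    "speed1_value":     [0.163, 0.210, -0.122, 0.301, 0.852, 0.089, 0.815, 0.176, 0.380],
    "speed_mph":        [0.252, 0.205, -0.064, 0.051, 0.852, 0.241, 0.829, 0.266, 0.489],
    "height_value":     [0.095, 0.088, -0.004, -0.093, 0.089, 0.241, 1.000, 0.108, 0.337],
    "height_ft":        [0.171, 0.232, 0.011, 0.160, 0.815, 0.829, 1.000, 0.164, 0.475],
    "Inversions_clean": [1.000, 0.229, -0.014, 0.087, 0.176, 0.266, 0.108, 0.164, 0.345],
    "Gforce_clean":     [0.357, -0.067, 0.043, 0.016, 0.380, 0.489, 0.337, 0.475, 0.345],
}


def strong_pairs(names, rows, threshold=0.8):
    """Return (first, second, r) for each unordered pair with |r| >= threshold."""
    found = []
    for i, first in enumerate(names):
        others = [n for n in names if n != first]
        for second, r in zip(others, rows[first]):
            # Keep only the upper triangle so each pair is reported once.
            if names.index(second) > i and abs(r) >= threshold:
                found.append((first, second, r))
    return found


found = strong_pairs(names, rows)
for first, second, r in found:
    print(f"{first} ~ {second}: {r:.3f}")
```

With the values above this prints five pairs: Inversions with Inversions_clean, speed1_value with speed_mph, speed1_value with height_ft, speed_mph with height_ft, and height_value with height_ft. Judging by the column names, several of these pair a raw column with what looks like a cleaned or unit-converted version of it, which would explain coefficients at or near 1.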


Correlation Plot

[Figure: correlation coefficient matrix plot]

Created by dlookr package