Overview

Dataset statistics

Number of variables: 12
Number of observations: 430
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 64.4 KiB
Average record size in memory: 153.3 B

Variable types

Categorical: 1
Numeric: 11

Alerts

Date has a high cardinality: 430 distinct values [High cardinality]
Year is highly correlated with Product_Supplied and 5 other fields [High correlation]
Product_Supplied is highly correlated with Year and 5 other fields [High correlation]
Refinery_Input is highly correlated with Year and 7 other fields [High correlation]
Operable_Dist_Capacity is highly correlated with Year and 5 other fields [High correlation]
Operating_Dist_Capacity is highly correlated with Year and 5 other fields [High correlation]
Idle_Dist_Capacity is highly correlated with Refinery_Input and 1 other field [High correlation]
Percent_Util is highly correlated with Refinery_Input and 1 other field [High correlation]
Future_Price is highly correlated with Year and 5 other fields [High correlation]
Spot_Price is highly correlated with Year and 5 other fields [High correlation]
Year is highly correlated with Product_Supplied and 5 other fields [High correlation]
Product_Supplied is highly correlated with Year and 3 other fields [High correlation]
Refinery_Input is highly correlated with Year and 6 other fields [High correlation]
Operable_Dist_Capacity is highly correlated with Year and 5 other fields [High correlation]
Operating_Dist_Capacity is highly correlated with Year and 5 other fields [High correlation]
Idle_Dist_Capacity is highly correlated with Percent_Util [High correlation]
Percent_Util is highly correlated with Refinery_Input and 1 other field [High correlation]
Future_Price is highly correlated with Year and 4 other fields [High correlation]
Spot_Price is highly correlated with Year and 4 other fields [High correlation]
Year is highly correlated with Refinery_Input and 4 other fields [High correlation]
Product_Supplied is highly correlated with Refinery_Input [High correlation]
Refinery_Input is highly correlated with Year and 3 other fields [High correlation]
Operable_Dist_Capacity is highly correlated with Year and 4 other fields [High correlation]
Operating_Dist_Capacity is highly correlated with Year and 4 other fields [High correlation]
Idle_Dist_Capacity is highly correlated with Percent_Util [High correlation]
Percent_Util is highly correlated with Idle_Dist_Capacity [High correlation]
Future_Price is highly correlated with Year and 3 other fields [High correlation]
Spot_Price is highly correlated with Year and 3 other fields [High correlation]
Year is highly correlated with Total_Production and 7 other fields [High correlation]
Total_Production is highly correlated with Year and 7 other fields [High correlation]
Product_Supplied is highly correlated with Year and 6 other fields [High correlation]
Refinery_Input is highly correlated with Year and 7 other fields [High correlation]
Operable_Dist_Capacity is highly correlated with Year and 8 other fields [High correlation]
Operating_Dist_Capacity is highly correlated with Year and 7 other fields [High correlation]
Idle_Dist_Capacity is highly correlated with Operable_Dist_Capacity and 1 other field [High correlation]
Percent_Util is highly correlated with Year and 5 other fields [High correlation]
Future_Price is highly correlated with Year and 6 other fields [High correlation]
Spot_Price is highly correlated with Year and 6 other fields [High correlation]
Date is uniformly distributed [Uniform]
Date has unique values [Unique]
Total_Production has unique values [Unique]

Reproduction

Analysis started: 2022-02-01 00:06:59.529551
Analysis finished: 2022-02-01 00:07:50.169279
Duration: 50.64 seconds
Software version: pandas-profiling v3.1.1
Download configuration: config.json

Variables

Date
Categorical

HIGH CARDINALITY
UNIFORM
UNIQUE

Distinct: 430
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 27.4 KiB

Value               Count
Apr-1987            1
Nov-2011            1
Aug-2021            1
May-2001            1
Jan-1991            1
Other values (425)  425

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 3440
Distinct characters: 33
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 430
Unique (%): 100.0%

Sample

1st row: Jan-1986
2nd row: Feb-1986
3rd row: Mar-1986
4th row: Apr-1986
5th row: May-1986
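
The profiler treats Date as a categorical string, which is why it raises the high-cardinality and uniqueness alerts for it. As a minimal sketch, assuming every label follows the `Mon-YYYY` pattern seen in the sample rows, the values can be split into a numeric year and month with the standard library. The helper name `parse_month_label` and the `MONTHS` lookup are illustrative, not part of the report.

```python
# English month abbreviations mapped to month numbers (1-12).
MONTHS = {
    name: number
    for number, name in enumerate(
        ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
         "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"],
        start=1,
    )
}

def parse_month_label(label):
    """Split a 'Mon-YYYY' label into a (year, month) tuple of ints."""
    month_name, year_text = label.split("-")
    return int(year_text), MONTHS[month_name]

# First five Date values, copied from the sample rows.
sample = ["Jan-1986", "Feb-1986", "Mar-1986", "Apr-1986", "May-1986"]
year_month = [parse_month_label(label) for label in sample]
print(year_month)
```

A lookup table is used instead of `datetime.strptime` with `%b` so that the result does not depend on the active locale.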

Common Values

Value               Count  Frequency (%)
Apr-1987            1      0.2%
Nov-2011            1      0.2%
Aug-2021            1      0.2%
May-2001            1      0.2%
Jan-1991            1      0.2%
Apr-2016            1      0.2%
Jan-1993            1      0.2%
Sep-2021            1      0.2%
Aug-2014            1      0.2%
May-2004            1      0.2%
Other values (420)  420    97.7%

Length

Histogram of lengths of the category
Value               Count  Frequency (%)
may-1999            1      0.2%
oct-1994            1      0.2%
apr-1999            1      0.2%
may-2021            1      0.2%
oct-2018            1      0.2%
oct-2008            1      0.2%
nov-2013            1      0.2%
mar-2009            1      0.2%
jun-1987            1      0.2%
apr-2008            1      0.2%
Other values (420)  420    97.7%

Most occurring characters

Value               Count  Frequency (%)
-                   430    12.5%
0                   430    12.5%
9                   336    9.8%
1                   334    9.7%
2                   320    9.3%
a                   108    3.1%
J                   108    3.1%
u                   108    3.1%
e                   107    3.1%
8                   96     2.8%
Other values (23)   1063   30.9%

Most occurring categories

Value               Count  Frequency (%)
Decimal Number      1720   50.0%
Lowercase Letter    860    25.0%
Dash Punctuation    430    12.5%
Uppercase Letter    430    12.5%

Most frequent character per category

Lowercase Letter
Value               Count  Frequency (%)
a                   108    12.6%
u                   108    12.6%
e                   107    12.4%
p                   72     8.4%
r                   72     8.4%
n                   72     8.4%
c                   71     8.3%
l                   36     4.2%
g                   36     4.2%
y                   36     4.2%
Other values (4)    142    16.5%

Decimal Number
Value               Count  Frequency (%)
0                   430    25.0%
9                   336    19.5%
1                   334    19.4%
2                   320    18.6%
8                   96     5.6%
6                   48     2.8%
7                   48     2.8%
3                   36     2.1%
4                   36     2.1%
5                   36     2.1%

Uppercase Letter
Value               Count  Frequency (%)
J                   108    25.1%
M                   72     16.7%
A                   72     16.7%
S                   36     8.4%
F                   36     8.4%
O                   36     8.4%
N                   35     8.1%
D                   35     8.1%

Dash Punctuation
Value               Count  Frequency (%)
-                   430    100.0%

Most occurring scripts

Value               Count  Frequency (%)
Common              2150   62.5%
Latin               1290   37.5%

Most frequent character per script

Latin
Value               Count  Frequency (%)
a                   108    8.4%
J                   108    8.4%
u                   108    8.4%
e                   107    8.3%
p                   72     5.6%
r                   72     5.6%
M                   72     5.6%
A                   72     5.6%
n                   72     5.6%
c                   71     5.5%
Other values (12)   428    33.2%

Common
Value               Count  Frequency (%)
-                   430    20.0%
0                   430    20.0%
9                   336    15.6%
1                   334    15.5%
2                   320    14.9%
8                   96     4.5%
6                   48     2.2%
7                   48     2.2%
3                   36     1.7%
4                   36     1.7%

Most occurring blocks

Value               Count  Frequency (%)
ASCII               3440   100.0%

Most frequent character per block

ASCII
Value               Count  Frequency (%)
-                   430    12.5%
0                   430    12.5%
9                   336    9.8%
1                   334    9.7%
2                   320    9.3%
a                   108    3.1%
J                   108    3.1%
u                   108    3.1%
e                   107    3.1%
8                   96     2.8%
Other values (23)   1063   30.9%

Year
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 36
Distinct (%): 8.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2003.418605
Minimum: 1986
Maximum: 2021
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.5 KiB

Quantile statistics

Minimum: 1986
5-th percentile: 1987
Q1: 1994.25
Median: 2003
Q3: 2012
95-th percentile: 2019.55
Maximum: 2021
Range: 35
Interquartile range (IQR): 17.75

Descriptive statistics

Standard deviation: 10.35552747
Coefficient of variation (CV): 0.005168928472
Kurtosis: -1.200417279
Mean: 2003.418605
Median Absolute Deviation (MAD): 9
Skewness: 0.001086441007
Sum: 861470
Variance: 107.2369491
Monotonicity: Increasing
Histogram with fixed size bins (bins=36)

Common values
Value               Count  Frequency (%)
2003                12     2.8%
2020                12     2.8%
2001                12     2.8%
2000                12     2.8%
1999                12     2.8%
1998                12     2.8%
1997                12     2.8%
1996                12     2.8%
1995                12     2.8%
1994                12     2.8%
Other values (26)   310    72.1%

Extreme values (10 smallest)
Value               Count  Frequency (%)
1986                12     2.8%
1987                12     2.8%
1988                12     2.8%
1989                12     2.8%
1990                12     2.8%
1991                12     2.8%
1992                12     2.8%
1993                12     2.8%
1994                12     2.8%
1995                12     2.8%

Extreme values (10 largest)
Value               Count  Frequency (%)
2021                10     2.3%
2020                12     2.8%
2019                12     2.8%
2018                12     2.8%
2017                12     2.8%
2016                12     2.8%
2015                12     2.8%
2014                12     2.8%
2013                12     2.8%
2012                12     2.8%

Month
Real number (ℝ≥0)

Distinct: 12
Distinct (%): 2.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.476744186
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3.25
Median: 6
Q3: 9
95-th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 5.75

Descriptive statistics

Standard deviation: 3.446990323
Coefficient of variation (CV): 0.5322103551
Kurtosis: -1.211707003
Mean: 6.476744186
Median Absolute Deviation (MAD): 3
Skewness: 0.005611039523
Sum: 2785
Variance: 11.88174229
Monotonicity: Not monotonic
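
The Month column can be rebuilt exactly from the value counts reported for this variable (months 1 to 10 occur 36 times each, months 11 and 12 occur 35 times each), which makes it a convenient check on how the summary figures relate to one another. The sketch below uses only the standard library. It assumes the report uses the sample (n - 1) standard deviation and linearly interpolated quantiles; both assumptions are consistent with the figures above.

```python
import statistics

# Rebuild the 430 Month values from the reported frequency counts.
counts = {month: 36 for month in range(1, 11)}
counts.update({11: 35, 12: 35})
months = sorted(m for m, c in counts.items() for _ in range(c))

mean = statistics.mean(months)
std = statistics.stdev(months)          # sample standard deviation (n - 1)
variance = statistics.variance(months)  # sample variance (n - 1)
cv = std / mean                         # coefficient of variation

# "inclusive" interpolates linearly between order statistics.
q1, median, q3 = statistics.quantiles(months, n=4, method="inclusive")
iqr = q3 - q1

# Median absolute deviation: the median of |x - median|.
mad = statistics.median(abs(m - median) for m in months)

print(f"mean={mean:.6f} std={std:.6f} cv={cv:.6f}")
print(f"Q1={q1} median={median} Q3={q3} IQR={iqr} MAD={mad}")
```

The coefficient of variation is the standard deviation divided by the mean, and the interquartile range is Q3 minus Q1, so both can be verified from the other reported figures.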
Histogram with fixed size bins (bins=12)

Common values
Value               Count  Frequency (%)
10                  36     8.4%
9                   36     8.4%
8                   36     8.4%
7                   36     8.4%
6                   36     8.4%
5                   36     8.4%
4                   36     8.4%
3                   36     8.4%
2                   36     8.4%
1                   36     8.4%
Other values (2)    70     16.3%

Extreme values (10 smallest)
Value               Count  Frequency (%)
1                   36     8.4%
2                   36     8.4%
3                   36     8.4%
4                   36     8.4%
5                   36     8.4%
6                   36     8.4%
7                   36     8.4%
8                   36     8.4%
9                   36     8.4%
10                  36     8.4%

Extreme values (10 largest)
Value               Count  Frequency (%)
12                  35     8.1%
11                  35     8.1%
10                  36     8.4%
9                   36     8.4%
8                   36     8.4%
7                   36     8.4%
6                   36     8.4%
5                   36     8.4%
4                   36     8.4%
3                   36     8.4%

Total_Production
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 430
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 220299.7558
Minimum: 119208
Maximum: 400219
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.5 KiB

Quantile statistics

Minimum: 119208
5-th percentile: 154744.3
Q1: 173878.25
Median: 202056.5
Q3: 255772.25
95-th percentile: 349502.25
Maximum: 400219
Range: 281011
Interquartile range (IQR): 81894

Descriptive statistics

Standard deviation: 59441.36299
Coefficient of variation (CV): 0.2698203762
Kurtosis: 0.4264539035
Mean: 220299.7558
Median Absolute Deviation (MAD): 33502.5
Skewness: 1.031730681
Sum: 94728895
Variance: 3533275634
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value               Count  Frequency (%)
200702              1      0.2%
177488              1      0.2%
355670              1      0.2%
177372              1      0.2%
249843              1      0.2%
357344              1      0.2%
342747              1      0.2%
252374              1      0.2%
180712              1      0.2%
167272              1      0.2%
Other values (420)  420    97.7%

Extreme values (10 smallest)
Value               Count  Frequency (%)
119208              1      0.2%
126417              1      0.2%
140894              1      0.2%
141200              1      0.2%
143301              1      0.2%
145708              1      0.2%
146698              1      0.2%
146868              1      0.2%
147061              1      0.2%
149297              1      0.2%

Extreme values (10 largest)
Value               Count  Frequency (%)
400219              1      0.2%
397298              1      0.2%
396329              1      0.2%
395900              1      0.2%
388984              1      0.2%
386725              1      0.2%
377169              1      0.2%
376362              1      0.2%
371949              1      0.2%
369644              1      0.2%

Product_Supplied
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 429
Distinct (%): 99.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 574218.2163
Minimum: 436455
Maximum: 671648
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.5 KiB

Quantile statistics

Minimum: 436455
5-th percentile: 499344.05
Q1: 537078.5
Median: 579533
Q3: 610163.5
95-th percentile: 641859.5
Maximum: 671648
Range: 235193
Interquartile range (IQR): 73085

Descriptive statistics

Standard deviation: 45528.3279
Coefficient of variation (CV): 0.07928750187
Kurtosis: -0.6464705951
Mean: 574218.2163
Median Absolute Deviation (MAD): 36325.5
Skewness: -0.2964651742
Sum: 246913833
Variance: 2072828641
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value               Count  Frequency (%)
597976              2      0.5%
627838              1      0.2%
634167              1      0.2%
579867              1      0.2%
617749              1      0.2%
577827              1      0.2%
622884              1      0.2%
557349              1      0.2%
529703              1      0.2%
589098              1      0.2%
Other values (419)  419    97.4%

Extreme values (10 smallest)
Value               Count  Frequency (%)
436455              1      0.2%
453209              1      0.2%
457486              1      0.2%
473425              1      0.2%
477269              1      0.2%
478339              1      0.2%
480896              1      0.2%
481482              1      0.2%
484163              1      0.2%
485354              1      0.2%

Extreme values (10 largest)
Value               Count  Frequency (%)
671648              1      0.2%
666357              1      0.2%
664448              1      0.2%
662039              1      0.2%
658099              1      0.2%
655895              1      0.2%
651864              1      0.2%
651790              1      0.2%
651274              1      0.2%
649104              1      0.2%

Refinery_Input
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 411
Distinct (%): 95.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15007.91395
Minimum: 11759
Maximum: 18041
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.5 KiB

Quantile statistics

Minimum: 11759
5-th percentile: 13074.9
Q1: 14049.25
Median: 15093.5
Q3: 15803.75
95-th percentile: 17042.1
Maximum: 18041
Range: 6282
Interquartile range (IQR): 1754.5

Descriptive statistics

Standard deviation: 1238.984141
Coefficient of variation (CV): 0.08255538676
Kurtosis: -0.5409443404
Mean: 15007.91395
Median Absolute Deviation (MAD): 876
Skewness: 0.01498363446
Sum: 6453403
Variance: 1535081.701
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value               Count  Frequency (%)
14538               3      0.7%
14783               2      0.5%
13383               2      0.5%
13097               2      0.5%
13356               2      0.5%
14637               2      0.5%
15638               2      0.5%
15000               2      0.5%
14693               2      0.5%
15768               2      0.5%
Other values (401)  409    95.1%

Extreme values (10 smallest)
Value               Count  Frequency (%)
11759               1      0.2%
12068               1      0.2%
12237               1      0.2%
12417               1      0.2%
12583               1      0.2%
12603               1      0.2%
12637               1      0.2%
12725               1      0.2%
12742               1      0.2%
12753               1      0.2%

Extreme values (10 largest)
Value               Count  Frequency (%)
18041               1      0.2%
17969               1      0.2%
17833               1      0.2%
17749               1      0.2%
17699               1      0.2%
17688               1      0.2%
17687               1      0.2%
17659               1      0.2%
17562               1      0.2%
17527               1      0.2%

Operable_Dist_Capacity
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 242
Distinct (%): 56.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16841.56279
Minimum: 15028
Maximum: 18976
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.5 KiB

Quantile statistics

Minimum: 15028
5-th percentile: 15186
Q1: 15686.25
Median: 16764
Q3: 17736
95-th percentile: 18619.2
Maximum: 18976
Range: 3948
Interquartile range (IQR): 2049.75

Descriptive statistics

Standard deviation: 1161.536555
Coefficient of variation (CV): 0.06896845441
Kurtosis: -1.331455769
Mean: 16841.56279
Median Absolute Deviation (MAD): 1056
Skewness: 0.06030327715
Sum: 7241872
Variance: 1349167.17
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value               Count  Frequency (%)
18808               12     2.8%
16747               11     2.6%
17736               10     2.3%
17594               8      1.9%
18598               7      1.6%
16978               6      1.4%
17672               6      1.4%
17820               5      1.2%
15722               5      1.2%
18976               4      0.9%
Other values (232)  356    82.8%

Extreme values (10 smallest)
Value               Count  Frequency (%)
15028               2      0.5%
15058               1      0.2%
15105               1      0.2%
15121               1      0.2%
15129               1      0.2%
15133               1      0.2%
15137               1      0.2%
15139               1      0.2%
15140               1      0.2%
15142               1      0.2%

Extreme values (10 largest)
Value               Count  Frequency (%)
18976               4      0.9%
18808               12     2.8%
18641               1      0.2%
18622               3      0.7%
18621               2      0.5%
18617               2      0.5%
18603               3      0.7%
18601               2      0.5%
18598               7      1.6%
18571               1      0.2%

Operating_Dist_Capacity
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 364
Distinct (%): 84.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16419.06512
Minimum: 14375
Maximum: 18698
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.5 KiB

Quantile statistics

Minimum: 14375
5-th percentile: 14824.45
Q1: 15117
Median: 16564
Q3: 17227.5
95-th percentile: 18450.2
Maximum: 18698
Range: 4323
Interquartile range (IQR): 2110.5

Descriptive statistics

Standard deviation: 1210.49159
Coefficient of variation (CV): 0.07372475723
Kurtosis: -1.228651902
Mean: 16419.06512
Median Absolute Deviation (MAD): 1110
Skewness: 0.1045932304
Sum: 7060198
Variance: 1465289.889
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value               Count  Frequency (%)
16711               5      1.2%
16921               4      0.9%
16643               4      0.9%
16904               4      0.9%
16134               3      0.7%
17150               3      0.7%
15097               3      0.7%
15081               3      0.7%
17464               3      0.7%
15117               3      0.7%
Other values (354)  395    91.9%

Extreme values (10 smallest)
Value               Count  Frequency (%)
14375               1      0.2%
14411               1      0.2%
14517               1      0.2%
14538               1      0.2%
14550               1      0.2%
14607               1      0.2%
14639               1      0.2%
14649               1      0.2%
14662               1      0.2%
14691               1      0.2%

Extreme values (10 largest)
Value               Count  Frequency (%)
18698               1      0.2%
18692               2      0.5%
18621               1      0.2%
18567               1      0.2%
18561               2      0.5%
18549               1      0.2%
18528               2      0.5%
18526               1      0.2%
18523               1      0.2%
18520               1      0.2%

Idle_Dist_Capacity
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 314
Distinct (%): 73.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 422.527907
Minimum: 32
Maximum: 2651
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.5 KiB

Quantile statistics

Minimum: 32
5-th percentile: 75
Q1: 158.25
Median: 321.5
Q3: 617.75
95-th percentile: 949.2
Maximum: 2651
Range: 2619
Interquartile range (IQR): 459.5

Descriptive statistics

Standard deviation: 337.8876697
Coefficient of variation (CV): 0.7996813088
Kurtosis: 9.433540024
Mean: 422.527907
Median Absolute Deviation (MAD): 186.5
Skewness: 2.156327909
Sum: 181687
Variance: 114168.0773
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value               Count  Frequency (%)
135                 5      1.2%
139                 5      1.2%
138                 5      1.2%
146                 5      1.2%
152                 4      0.9%
153                 4      0.9%
75                  4      0.9%
81                  4      0.9%
57                  4      0.9%
35                  4      0.9%
Other values (304)  386    89.8%

Extreme values (10 smallest)
Value               Count  Frequency (%)
32                  2      0.5%
35                  4      0.9%
36                  2      0.5%
37                  2      0.5%
45                  1      0.2%
49                  1      0.2%
50                  1      0.2%
57                  4      0.9%
73                  2      0.5%
74                  2      0.5%

Extreme values (10 largest)
Value               Count  Frequency (%)
2651                1      0.2%
2569                1      0.2%
2331                1      0.2%
1488                1      0.2%
1483                1      0.2%
1478                1      0.2%
1283                1      0.2%
1244                1      0.2%
1107                1      0.2%
1103                1      0.2%

Percent_Util
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 163
Distinct (%): 37.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 89.15976744
Minimum: 70.2
Maximum: 99.9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.5 KiB

Quantile statistics

Minimum: 70.2
5-th percentile: 81.3
Q1: 86
Median: 89.5
Q3: 92.7
95-th percentile: 96.31
Maximum: 99.9
Range: 29.7
Interquartile range (IQR): 6.7

Descriptive statistics

Standard deviation: 4.918490392
Coefficient of variation (CV): 0.05516490826
Kurtosis: 0.5971587334
Mean: 89.15976744
Median Absolute Deviation (MAD): 3.3
Skewness: -0.5674442978
Sum: 38338.7
Variance: 24.19154773
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value               Count  Frequency (%)
90.6                10     2.3%
86.5                8      1.9%
92.5                7      1.6%
92.6                7      1.6%
88.9                6      1.4%
95                  6      1.4%
90.2                6      1.4%
85.8                6      1.4%
87.1                6      1.4%
89.1                6      1.4%
Other values (153)  362    84.2%

Extreme values (10 smallest)
Value               Count  Frequency (%)
70.2                1      0.2%
70.8                1      0.2%
72                  1      0.2%
74.6                1      0.2%
75.3                1      0.2%
75.9                1      0.2%
76.4                1      0.2%
76.9                1      0.2%
77.9                1      0.2%
78.6                1      0.2%

Extreme values (10 largest)
Value               Count  Frequency (%)
99.9                1      0.2%
99.6                1      0.2%
99.2                1      0.2%
99.1                1      0.2%
98.9                1      0.2%
98.4                1      0.2%
97.8                1      0.2%
97.5                3      0.7%
97.2                1      0.2%
97.1                3      0.7%

Future_Price
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 306
Distinct (%): 71.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 44.57232558
Minimum: 11.3
Maximum: 134
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.5 KiB

Quantile statistics

Minimum: 11.3
5-th percentile: 14.945
Q1: 19.9
Median: 33.05
Q3: 63.5
95-th percentile: 100.555
Maximum: 134
Range: 122.7
Interquartile range (IQR): 43.6

Descriptive statistics

Standard deviation: 28.69879024
Coefficient of variation (CV): 0.6438701562
Kurtosis: -0.3805579035
Mean: 44.57232558
Median Absolute Deviation (MAD): 15.65
Skewness: 0.834881001
Sum: 19166.1
Variance: 823.620561
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value               Count  Frequency (%)
19.7                7      1.6%
21.3                5      1.2%
18.8                4      0.9%
15.5                4      0.9%
20                  4      0.9%
49.9                4      0.9%
19.9                4      0.9%
59.4                4      0.9%
17.8                4      0.9%
18                  4      0.9%
Other values (296)  386    89.8%

Extreme values (10 smallest)
Value               Count  Frequency (%)
11.3                1      0.2%
11.6                1      0.2%
12                  1      0.2%
12.5                1      0.2%
12.6                1      0.2%
12.8                1      0.2%
13                  1      0.2%
13.4                2      0.5%
13.7                1      0.2%
13.8                1      0.2%

Extreme values (10 largest)
Value               Count  Frequency (%)
134                 1      0.2%
133.5               1      0.2%
125.5               1      0.2%
116.7               1      0.2%
112.5               1      0.2%
110                 1      0.2%
106.5               1      0.2%
106.2               2      0.5%
105.4               1      0.2%
105.1               1      0.2%

Spot_Price
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 309
Distinct (%): 71.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 44.55837209
Minimum: 11.3
Maximum: 133.9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.5 KiB

Quantile statistics

Minimum: 11.3
5-th percentile: 14.945
Q1: 19.9
Median: 33.3
Q3: 63.65
95-th percentile: 100.665
Maximum: 133.9
Range: 122.6
Interquartile range (IQR): 43.75

Descriptive statistics

Standard deviation: 28.68330756
Coefficient of variation (CV): 0.6437243151
Kurtosis: -0.3744934178
Mean: 44.55837209
Median Absolute Deviation (MAD): 15.9
Skewness: 0.8375564287
Sum: 19160.1
Variance: 822.7321325
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value               Count  Frequency (%)
21.3                6      1.4%
20.1                6      1.4%
19.7                5      1.2%
17.9                5      1.2%
18                  5      1.2%
19.9                5      1.2%
20.3                4      0.9%
59                  3      0.7%
71                  3      0.7%
49.8                3      0.7%
Other values (299)  385    89.5%

Extreme values (10 smallest)
Value               Count  Frequency (%)
11.3                1      0.2%
11.6                1      0.2%
12                  1      0.2%
12.5                1      0.2%
12.6                1      0.2%
12.8                1      0.2%
13                  1      0.2%
13.4                1      0.2%
13.5                1      0.2%
13.7                1      0.2%

Extreme values (10 largest)
Value               Count  Frequency (%)
133.9               1      0.2%
133.4               1      0.2%
125.4               1      0.2%
116.7               1      0.2%
112.6               1      0.2%
109.5               1      0.2%
106.6               1      0.2%
106.3               1      0.2%
106.2               1      0.2%
105.8               1      0.2%

Interactions

Correlations

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
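
As a minimal sketch of that calculation, the block below ranks two columns (tied values share the mean of their ranks) and then applies the covariance-over-standard-deviations formula to the ranks. The data are the Refinery_Input and Percent_Util values from the first ten rows in the Sample section; the function names are illustrative and this is not the profiler's own code.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1 to n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Covariance divided by the product of standard deviations.

    The 1 / (n - 1) factors cancel, so plain sums are used.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def spearman(x, y):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))

# First ten rows of the dataset (Jan-1986 to Oct-1986).
refinery_input = [12583, 12068, 11759, 12603, 13314,
                  13347, 13009, 13392, 13191, 12753]
percent_util = [81.4, 77.9, 75.9, 81.5, 86.0, 86.3, 84.1, 86.8, 85.5, 82.6]

rho = spearman(refinery_input, percent_util)
print(f"rho = {rho:.4f}")
```

In these ten rows the two columns happen to have identical rank orderings, so ρ comes out at 1 up to floating-point rounding, even though the raw values are not exactly linearly related.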

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
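
The following sketch applies that formula with the standard library, using the Future_Price and Spot_Price values from the first ten rows in the Sample section. It also rescales one variable to illustrate the invariance under changes of location and scale; the scale factor 2 and offset 5 are arbitrary choices for the illustration.

```python
import statistics

def pearson_r(x, y):
    """Sample covariance of x and y divided by the product of their sample standard deviations."""
    n = len(x)
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    return covariance / (statistics.stdev(x) * statistics.stdev(y))

# First ten rows of the dataset (Jan-1986 to Oct-1986).
future_price = [23.0, 15.5, 12.6, 12.8, 15.3, 13.4, 11.6, 15.1, 14.9, 14.9]
spot_price = [22.9, 15.5, 12.6, 12.8, 15.4, 13.4, 11.6, 15.1, 14.9, 14.9]

r = pearson_r(future_price, spot_price)

# Shifting and scaling one variable (with a positive factor) leaves r unchanged.
r_rescaled = pearson_r(future_price, [2.0 * s + 5.0 for s in spot_price])

print(f"r = {r:.6f}, r after rescaling = {r_rescaled:.6f}")
```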

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
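
The pair-counting definition can be sketched directly, as below. This is the plain form with no tie correction (often called τ-a); library implementations commonly apply a tie-corrected variant, so the figures behind the report may differ slightly wherever ties occur. The data are the Refinery_Input and Idle_Dist_Capacity values from the first ten rows in the Sample section, and the function name is illustrative.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move the same way
        elif direction < 0:
            discordant += 1   # the variables move in opposite ways
        # direction == 0 is a tie and counts as neither
    total_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / total_pairs

# First ten rows of the dataset (Jan-1986 to Oct-1986).
refinery_input = [12583, 12068, 11759, 12603, 13314,
                  13347, 13009, 13392, 13191, 12753]
idle_dist_capacity = [820, 947, 968, 923, 679, 816, 868, 624, 565, 608]

tau = kendall_tau_a(refinery_input, idle_dist_capacity)
print(f"tau-a = {tau:.3f}")
```

The double loop over pairs is quadratic in the number of observations, which is fine for 430 rows but would need a faster algorithm for large datasets.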

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available for Phik.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
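
As a rough sketch of what these two views summarise, the block below counts missing cells per column and renders a small text nullity matrix. The three complete records are taken from the first rows of the dataset; the fourth record, with a missing Spot_Price, is hypothetical and is added only to show what a gap looks like, since the report itself finds no missing cells.

```python
def missing_per_column(records):
    """Count missing (None) cells for each column."""
    columns = list(records[0])
    return {c: sum(1 for r in records if r.get(c) is None) for c in columns}

def nullity_matrix(records):
    """One text line per record: '#' marks a present cell, '.' a missing one."""
    columns = list(records[0])
    return ["".join("." if r.get(c) is None else "#" for c in columns)
            for r in records]

# Three rows from the dataset (all cells present).
rows = [
    {"Date": "Jan-1986", "Future_Price": 23.0, "Spot_Price": 22.9},
    {"Date": "Feb-1986", "Future_Price": 15.5, "Spot_Price": 15.5},
    {"Date": "Mar-1986", "Future_Price": 12.6, "Spot_Price": 12.6},
]

# Hypothetical record with a gap, for illustration only (not in the dataset).
rows_with_gap = rows + [{"Date": "Apr-1986", "Future_Price": 12.8, "Spot_Price": None}]

print(missing_per_column(rows))
print(missing_per_column(rows_with_gap))
print("\n".join(nullity_matrix(rows_with_gap)))
```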

Sample

First rows

     Date      Year  Month  Total_Production  Product_Supplied  Refinery_Input  Operable_Dist_Capacity  Operating_Dist_Capacity  Idle_Dist_Capacity  Percent_Util  Future_Price  Spot_Price
0    Jan-1986  1986  1      283248            498728            12583           15459                   14639                    820                 81.4          23.0          22.9
1    Feb-1986  1986  2      256855            453209            12068           15485                   14538                    947                 77.9          15.5          15.5
2    Mar-1986  1986  3      279413            504565            11759           15485                   14517                    968                 75.9          12.6          12.6
3    Apr-1986  1986  4      265917            478339            12603           15473                   14550                    923                 81.5          12.8          12.8
4    May-1986  1986  5      273964            495789            13314           15484                   14805                    679                 86.0          15.3          15.4
5    Jun-1986  1986  6      258700            481482            13347           15465                   14649                    816                 86.3          13.4          13.4
6    Jul-1986  1986  7      268448            505514            13009           15475                   14607                    868                 84.1          11.6          11.6
7    Aug-1986  1986  8      259580            515167            13392           15430                   14807                    624                 86.8          15.1          15.1
8    Sep-1986  1986  9      249843            477269            13191           15435                   14870                    565                 85.5          14.9          14.9
9    Oct-1986  1986  10     260984            514674            12753           15435                   14827                    608                 82.6          14.9          14.9

Last rows

     Date      Year  Month  Total_Production  Product_Supplied  Refinery_Input  Operable_Dist_Capacity  Operating_Dist_Capacity  Idle_Dist_Capacity  Percent_Util  Future_Price  Spot_Price
420  Jan-2021  2021  1      342747            576457            14975           18143                   17735                    408                 82.5          52.1          52.0
421  Feb-2021  2021  2      273646            488438            12804           18090                   17526                    564                 70.8          59.1          59.0
422  Mar-2021  2021  3      345946            595319            14834           18090                   17035                    1055                82.0          62.4          62.3
423  Apr-2021  2021  4      336905            583781            15633           18128                   17553                    574                 86.2          61.7          61.7
424  May-2021  2021  5      351346            622903            16130           18128                   17843                    285                 89.0          65.2          65.2
425  Jun-2021  2021  6      338645            616115            16743           18128                   17910                    218                 92.4          71.4          71.4
426  Jul-2021  2021  7      351228            616714            16482           18129                   17943                    187                 90.9          72.4          72.5
427  Aug-2021  2021  8      347393            635828            16377           18130                   17914                    216                 90.3          67.7          67.7
428  Sep-2021  2021  9      324654            606706            15797           18130                   15800                    2331                87.1          71.5          71.6
429  Oct-2021  2021  10     355670            616639            15581           18132                   17133                    999                 85.9          81.2          81.5