Overview

Dataset statistics

Number of variables: 15
Number of observations: 45726
Missing cells: 29682
Missing cells (%): 4.3%
Duplicate rows: 10
Duplicate rows (%): < 0.1%
Total size in memory: 24.0 MiB
Average record size in memory: 550.6 B
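
The percentages above can be reproduced from the counts. The following is a minimal sketch, assuming that missing cells are measured against variables × observations; the helper name `pct` is illustrative and not part of pandas-profiling. The per-variable missing counts are the ones reported in the Variables section.

```python
def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage."""
    return 100.0 * part / whole

n_variables = 15
n_observations = 45726
duplicate_rows = 10

# Missing counts per variable, as listed in the Variables section.
missing_by_variable = {
    "reclat": 7315,
    "reclong": 7315,
    "GeoLocation": 7315,
    "reclat_city": 7315,
    "mass (g)": 131,
    "year": 291,
}
missing_cells = sum(missing_by_variable.values())

# Assumption: the denominator is the total number of cells.
total_cells = n_variables * n_observations
missing_pct = pct(missing_cells, total_cells)
duplicate_pct = pct(duplicate_rows, n_observations)

print(f"Missing cells: {missing_cells} ({missing_pct:.1f}%)")
print(f"Duplicate rows: {duplicate_pct:.3f}%")
```

The per-variable counts add up to the 29682 missing cells shown in the overview.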

Variable types

Categorical: 7
Numeric: 5
DateTime: 1
Boolean: 1
Unsupported: 1

Alerts

source has constant value "NASA" (Constant)
Dataset has 10 (< 0.1%) duplicate rows (Duplicates)
name has a high cardinality: 45716 distinct values (High cardinality)
recclass has a high cardinality: 466 distinct values (High cardinality)
GeoLocation has a high cardinality: 17100 distinct values (High cardinality)
reclat is highly correlated with reclong and 1 other field (High correlation, reported 3 times)
reclong is highly correlated with reclat and 1 other field (High correlation, reported 2 times)
reclat_city is highly correlated with reclat and 1 other field (High correlation, reported 2 times)
reclong is highly correlated with reclat (High correlation)
reclat_city is highly correlated with reclat (High correlation)
id is highly correlated with reclat and 2 other fields (High correlation)
fall is highly correlated with reclat and 1 other field (High correlation)
reclat is highly correlated with id and 3 other fields (High correlation)
reclong is highly correlated with id and 2 other fields (High correlation)
reclat_city is highly correlated with id and 3 other fields (High correlation)
reclat has 7315 (16.0%) missing values (Missing)
reclong has 7315 (16.0%) missing values (Missing)
GeoLocation has 7315 (16.0%) missing values (Missing)
reclat_city has 7315 (16.0%) missing values (Missing)
mass (g) is highly skewed (γ1 = 76.91847245) (Skewed)
name is uniformly distributed (Uniform)
unhashable is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
reclat has 6438 (14.1%) zeros (Zeros)
reclong has 6214 (13.6%) zeros (Zeros)
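
The high-cardinality alerts can be related to the "Distinct (%)" figures reported per variable. The sketch below assumes the percentage is taken over non-missing observations; that denominator is inferred from the reported numbers rather than from the pandas-profiling source, and `distinct_pct` is a hypothetical helper.

```python
N_OBSERVATIONS = 45726

def distinct_pct(n_distinct: int, n_missing: int = 0,
                 n_observations: int = N_OBSERVATIONS) -> float:
    """Share of distinct values among the non-missing observations, in percent."""
    return 100.0 * n_distinct / (n_observations - n_missing)

# (distinct, missing) pairs as reported for each variable.
reported = {
    "name": (45716, 0),
    "recclass": (466, 0),
    "GeoLocation": (17100, 7315),
}
for variable, (n_distinct, n_missing) in reported.items():
    print(f"{variable}: {distinct_pct(n_distinct, n_missing):.1f}% distinct")
```

Under this assumption GeoLocation comes out at 44.5%, which agrees with the value shown in its variable summary.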

Reproduction

Analysis started: 2022-09-06 19:04:25.019697
Analysis finished: 2022-09-06 19:04:32.055931
Duration: 7.04 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json

Variables

name
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 45716
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 MiB

Value                 Count
Aachen                2
Abee                  2
Acapulco              2
Achiras               2
Adhi Kot              2
Other values (45711)  45716

Length

Max length: 28
Median length: 25
Mean length: 17.78248699
Min length: 2

Characters and Unicode

Total characters: 813122
Distinct characters: 96
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
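
As an illustration of those character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It is not how pandas-profiling itself computes these tables; the three sample strings are names that appear in this report, and `category_counts` is a hypothetical helper.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories over all characters in `values`."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts

sample = ["Aachen", "Adhi Kot", "Adzhi-Bogdo (stone)"]
counts = category_counts(sample)

# Two-letter codes: Lu = Uppercase Letter, Ll = Lowercase Letter,
# Zs = Space Separator, Pd = Dash Punctuation,
# Ps = Open Punctuation, Pe = Close Punctuation.
for code, n in counts.most_common():
    print(code, n)
```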

Unique

Unique: 45706
Unique (%): > 99.9%

Sample

1st row: Aachen
2nd row: Aarhus
3rd row: Abee
4th row: Acapulco
5th row: Achiras

Common Values

Value                 Count  Frequency (%)
Aachen                2      < 0.1%
Abee                  2      < 0.1%
Acapulco              2      < 0.1%
Achiras               2      < 0.1%
Adhi Kot              2      < 0.1%
Adzhi-Bogdo (stone)   2      < 0.1%
Agen                  2      < 0.1%
Aguada                2      < 0.1%
Aguila Blanca         2      < 0.1%
Aarhus                2      < 0.1%
Other values (45706)  45706  > 99.9%

Length

[Figure: Histogram of lengths of the category]

Value                 Count  Frequency (%)
yamato                7269   5.7%
range                 6575   5.2%
africa                4502   3.6%
northwest             4499   3.5%
hills                 3995   3.2%
queen                 3445   2.7%
alexandra             3444   2.7%
mountains             3004   2.4%
al                    2663   2.1%
grove                 2496   2.0%
Other values (37726)  84860  66.9%

Most occurring characters

Value              Count   Frequency (%)
(space)            81032   10.0%
a                  72715   8.9%
e                  48167   5.9%
n                  38392   4.7%
0                  34943   4.3%
r                  33097   4.1%
i                  32658   4.0%
l                  31873   3.9%
t                  30898   3.8%
o                  30428   3.7%
Other values (86)  378919  46.6%

Most occurring categories

Value              Count   Frequency (%)
Lowercase Letter   440949  54.2%
Decimal Number     205415  25.3%
Uppercase Letter   84942   10.4%
Space Separator    81032   10.0%
Close Punctuation  295     < 0.1%
Open Punctuation   295     < 0.1%
Dash Punctuation   98      < 0.1%
Other Punctuation  96      < 0.1%

Most frequent character per category

Lowercase Letter
Value              Count  Frequency (%)
a                  72715  16.5%
e                  48167  10.9%
n                  38392  8.7%
r                  33097  7.5%
i                  32658  7.4%
l                  31873  7.2%
t                  30898  7.0%
o                  30428  6.9%
s                  20972  4.8%
m                  12393  2.8%
Other values (39)  89356  20.3%

Uppercase Letter
Value              Count  Frequency (%)
A                  14120  16.6%
M                  11173  13.2%
R                  7599   8.9%
Y                  7327   8.6%
N                  5796   6.8%
H                  5676   6.7%
G                  4682   5.5%
L                  4630   5.5%
D                  3777   4.4%
Q                  3478   4.1%
Other values (21)  16684  19.6%

Decimal Number
Value  Count  Frequency (%)
0      34943  17.0%
9      24444  11.9%
8      22179  10.8%
1      21986  10.7%
2      19839  9.7%
7      19347  9.4%
3      17379  8.5%
4      16001  7.8%
5      14812  7.2%
6      14485  7.1%

Other Punctuation
Value  Count  Frequency (%)
'      67     69.8%
.      29     30.2%

Space Separator
Value    Count  Frequency (%)
(space)  81032  100.0%

Close Punctuation
Value  Count  Frequency (%)
)      295    100.0%

Open Punctuation
Value  Count  Frequency (%)
(      295    100.0%

Dash Punctuation
Value  Count  Frequency (%)
-      98     100.0%

Most occurring scripts

Value   Count   Frequency (%)
Latin   525891  64.7%
Common  287231  35.3%

Most frequent character per script

Latin
Value              Count   Frequency (%)
a                  72715   13.8%
e                  48167   9.2%
n                  38392   7.3%
r                  33097   6.3%
i                  32658   6.2%
l                  31873   6.1%
t                  30898   5.9%
o                  30428   5.8%
s                  20972   4.0%
A                  14120   2.7%
Other values (70)  172571  32.8%

Common
Value             Count  Frequency (%)
(space)           81032  28.2%
0                 34943  12.2%
9                 24444  8.5%
8                 22179  7.7%
1                 21986  7.7%
2                 19839  6.9%
7                 19347  6.7%
3                 17379  6.1%
4                 16001  5.6%
5                 14812  5.2%
Other values (6)  15269  5.3%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  812638  99.9%
None   484     0.1%

Most frequent character per block

ASCII
Value              Count   Frequency (%)
(space)            81032   10.0%
a                  72715   8.9%
e                  48167   5.9%
n                  38392   4.7%
0                  34943   4.3%
r                  33097   4.1%
i                  32658   4.0%
l                  31873   3.9%
t                  30898   3.8%
o                  30428   3.7%
Other values (58)  378435  46.6%

None
Value              Count  Frequency (%)
é                  204    42.1%
ş                  125    25.8%
Ö                  63     13.0%
á                  11     2.3%
ö                  11     2.3%
ä                  10     2.1%
ó                  8      1.7%
ü                  8      1.7%
ñ                  8      1.7%
ã                  5      1.0%
Other values (18)  31     6.4%

id
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 45716
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 26883.9062
Minimum: 1
Maximum: 57458
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 357.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2388.75
Q1: 12681.25
median: 24256.5
Q3: 40653.5
95-th percentile: 54890.75
Maximum: 57458
Range: 57457
Interquartile range (IQR): 27972.25
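
The last two rows are derived from the others: the range is maximum minus minimum, and the IQR is Q3 minus Q1. A minimal sketch using the values reported for id (the function name `spread` is illustrative):

```python
def spread(minimum: float, q1: float, q3: float, maximum: float):
    """Return (range, interquartile range) from summary quantiles."""
    return maximum - minimum, q3 - q1

# Quantiles reported for the id variable.
value_range, iqr = spread(minimum=1, q1=12681.25, q3=40653.5, maximum=57458)
print(value_range, iqr)
```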

Descriptive statistics

Standard deviation: 16863.44557
Coefficient of variation (CV): 0.6272691713
Kurtosis: -1.160130804
Mean: 26883.9062
Median Absolute Deviation (MAD): 13264
Skewness: 0.2665300704
Sum: 1229293495
Variance: 284375796.4
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
Value                 Count  Frequency (%)
1                     2      < 0.1%
6                     2      < 0.1%
10                    2      < 0.1%
370                   2      < 0.1%
379                   2      < 0.1%
390                   2      < 0.1%
392                   2      < 0.1%
398                   2      < 0.1%
417                   2      < 0.1%
2                     2      < 0.1%
Other values (45706)  45706  > 99.9%

Value  Count  Frequency (%)
1      2      < 0.1%
2      2      < 0.1%
4      1      < 0.1%
5      1      < 0.1%
6      2      < 0.1%
7      1      < 0.1%
8      1      < 0.1%
9      1      < 0.1%
10     2      < 0.1%
11     1      < 0.1%

Value  Count  Frequency (%)
57458  1      < 0.1%
57457  1      < 0.1%
57456  1      < 0.1%
57455  1      < 0.1%
57454  1      < 0.1%
57453  1      < 0.1%
57436  1      < 0.1%
57435  1      < 0.1%
57434  1      < 0.1%
57433  1      < 0.1%

nametype
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Value   Count
Valid   45651
Relict  75

Length

Max length: 6
Median length: 5
Mean length: 5.001640205
Min length: 5

Characters and Unicode

Total characters: 228705
Distinct characters: 9
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Valid
2nd row: Valid
3rd row: Valid
4th row: Valid
5th row: Valid

Common Values

Value   Count  Frequency (%)
Valid   45651  99.8%
Relict  75     0.2%

Length

[Figure: Histogram of lengths of the category]

Category Frequency Plot

[Figure: Category Frequency Plot]

Value   Count  Frequency (%)
valid   45651  99.8%
relict  75     0.2%

Most occurring characters

Value  Count  Frequency (%)
l      45726  20.0%
i      45726  20.0%
V      45651  20.0%
a      45651  20.0%
d      45651  20.0%
R      75     < 0.1%
e      75     < 0.1%
c      75     < 0.1%
t      75     < 0.1%
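
These character counts follow directly from the two category counts. The sketch below rebuilds them with `collections.Counter`, taking the reported value counts for nametype as input; it is an illustration, not pandas-profiling's own implementation.

```python
from collections import Counter

# Value counts reported for nametype.
value_counts = {"Valid": 45651, "Relict": 75}

# Each character contributes once per occurrence of the value containing it.
char_counts = Counter()
for value, n in value_counts.items():
    for ch in value:
        char_counts[ch] += n

total_chars = sum(char_counts.values())
mean_length = total_chars / sum(value_counts.values())

print(char_counts.most_common())
print(total_chars, mean_length)
```

The letters "l" and "i" occur in both values, which is why their count equals the number of observations.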

Most occurring categories

Value             Count   Frequency (%)
Lowercase Letter  182979  80.0%
Uppercase Letter  45726   20.0%

Most frequent character per category

Lowercase Letter
Value  Count  Frequency (%)
l      45726  25.0%
i      45726  25.0%
a      45651  24.9%
d      45651  24.9%
e      75     < 0.1%
c      75     < 0.1%
t      75     < 0.1%

Uppercase Letter
Value  Count  Frequency (%)
V      45651  99.8%
R      75     0.2%

Most occurring scripts

Value  Count   Frequency (%)
Latin  228705  100.0%

Most frequent character per script

Latin
Value  Count  Frequency (%)
l      45726  20.0%
i      45726  20.0%
V      45651  20.0%
a      45651  20.0%
d      45651  20.0%
R      75     < 0.1%
e      75     < 0.1%
c      75     < 0.1%
t      75     < 0.1%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  228705  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
l      45726  20.0%
i      45726  20.0%
V      45651  20.0%
a      45651  20.0%
d      45651  20.0%
R      75     < 0.1%
e      75     < 0.1%
c      75     < 0.1%
t      75     < 0.1%

recclass
Categorical

HIGH CARDINALITY

Distinct: 466
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 MiB

Value               Count
L6                  8287
H5                  7143
L5                  4797
H6                  4529
H4                  4211
Other values (461)  16759

Length

Max length: 26
Median length: 2
Mean length: 3.052530289
Min length: 1

Characters and Unicode

Total characters: 139580
Distinct characters: 62
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 145
Unique (%): 0.3%

Sample

1st row: L5
2nd row: H6
3rd row: EH4
4th row: Acapulcoite
5th row: L6

Common Values

Value               Count  Frequency (%)
L6                  8287   18.1%
H5                  7143   15.6%
L5                  4797   10.5%
H6                  4529   9.9%
H4                  4211   9.2%
LL5                 2766   6.0%
LL6                 2043   4.5%
L4                  1253   2.7%
H4/5                428    0.9%
CM2                 416    0.9%
Other values (456)  9853   21.5%

Length

[Figure: Histogram of lengths of the category]

Value               Count  Frequency (%)
l6                  8341   17.6%
h5                  7165   15.1%
l5                  4818   10.2%
h6                  4530   9.6%
h4                  4223   8.9%
ll5                 2766   5.8%
ll6                 2046   4.3%
l4                  1256   2.7%
iron                1070   2.3%
h4/5                428    0.9%
Other values (434)  10712  22.6%

Most occurring characters

Value              Count  Frequency (%)
L                  28467  20.4%
H                  18396  13.2%
5                  16419  11.8%
6                  16132  11.6%
4                  6930   5.0%
e                  3972   2.8%
i                  3834   2.7%
r                  3648   2.6%
t                  3327   2.4%
3                  3278   2.3%
Other values (52)  35177  25.2%

Most occurring categories

Value              Count  Frequency (%)
Uppercase Letter   57793  41.4%
Decimal Number     44118  31.6%
Lowercase Letter   29926  21.4%
Other Punctuation  3293   2.4%
Dash Punctuation   1835   1.3%
Space Separator    1747   1.3%
Math Symbol        320    0.2%
Open Punctuation   274    0.2%
Close Punctuation  274    0.2%

Most frequent character per category

Lowercase Letter
Value              Count  Frequency (%)
e                  3972   13.3%
i                  3834   12.8%
r                  3648   12.2%
t                  3327   11.1%
n                  2520   8.4%
o                  2458   8.2%
c                  1767   5.9%
u                  1469   4.9%
a                  1409   4.7%
l                  1016   3.4%
Other values (12)  4506   15.1%

Uppercase Letter
Value              Count  Frequency (%)
L                  28467  49.3%
H                  18396  31.8%
I                  2753   4.8%
C                  1785   3.1%
E                  1261   2.2%
A                  985    1.7%
M                  913    1.6%
B                  754    1.3%
O                  542    0.9%
V                  350    0.6%
Other values (10)  1587   2.7%

Decimal Number
Value  Count  Frequency (%)
5      16419  37.2%
6      16132  36.6%
4      6930   15.7%
3      3278   7.4%
2      646    1.5%
7      251    0.6%
8      216    0.5%
9      111    0.3%
1      100    0.2%
0      35     0.1%

Other Punctuation
Value  Count  Frequency (%)
/      1174   35.7%
.      1064   32.3%
,      1031   31.3%
?      24     0.7%

Math Symbol
Value  Count  Frequency (%)
~      319    99.7%
<      1      0.3%

Dash Punctuation
Value  Count  Frequency (%)
-      1835   100.0%

Space Separator
Value    Count  Frequency (%)
(space)  1747   100.0%

Open Punctuation
Value  Count  Frequency (%)
(      274    100.0%

Close Punctuation
Value  Count  Frequency (%)
)      274    100.0%

Most occurring scripts

ValueCountFrequency (%)
Value   Count  Frequency (%)
Latin   87719  62.8%
Common  51861  37.2%

Most frequent character per script

Latin
Value              Count  Frequency (%)
L                  28467  32.5%
H                  18396  21.0%
e                  3972   4.5%
i                  3834   4.4%
r                  3648   4.2%
t                  3327   3.8%
I                  2753   3.1%
n                  2520   2.9%
o                  2458   2.8%
C                  1785   2.0%
Other values (32)  16559  18.9%

Common
Value              Count  Frequency (%)
5                  16419  31.7%
6                  16132  31.1%
4                  6930   13.4%
3                  3278   6.3%
-                  1835   3.5%
(space)            1747   3.4%
/                  1174   2.3%
.                  1064   2.1%
,                  1031   2.0%
2                  646    1.2%
Other values (10)  1605   3.1%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  139580  100.0%

Most frequent character per block

ASCII
Value              Count  Frequency (%)
L                  28467  20.4%
H                  18396  13.2%
5                  16419  11.8%
6                  16132  11.6%
4                  6930   5.0%
e                  3972   2.8%
i                  3834   2.7%
r                  3648   2.6%
t                  3327   2.4%
3                  3278   2.3%
Other values (52)  35177  25.2%

mass (g)
Real number (ℝ≥0)

SKEWED

Distinct: 12576
Distinct (%): 27.6%
Missing: 131
Missing (%): 0.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 13278.42646
Minimum: 0
Maximum: 60000000
Zeros: 19
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 357.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1.1
Q1: 7.2
median: 32.61
Q3: 202.9
95-th percentile: 4000
Maximum: 60000000
Range: 60000000
Interquartile range (IQR): 195.7

Descriptive statistics

Standard deviation: 574926.0121
Coefficient of variation (CV): 43.2977517
Kurtosis: 6798.398388
Mean: 13278.42646
Median Absolute Deviation (MAD): 30.51
Skewness: 76.91847245
Sum: 605429854.6
Variance: 3.305399193 × 10^11
Monotonicity: Not monotonic
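
Several of these statistics can be cross-checked against each other. The sketch below recomputes the coefficient of variation and the variance from the reported standard deviation and mean, and shows a plain moment-based skewness. Note the assumption: this is the population formula g1 = m3 / m2^1.5, which may differ slightly from the estimator the report uses, and the toy list is invented purely for illustration.

```python
def coefficient_of_variation(std: float, mean: float) -> float:
    """CV is the standard deviation divided by the mean."""
    return std / mean

def skewness(values):
    """Population skewness g1 = m3 / m2**1.5 (no bias adjustment)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Values reported for mass (g).
cv = coefficient_of_variation(574926.0121, 13278.42646)
variance = 574926.0121 ** 2

# Invented toy data: one large value produces a long right tail.
toy = [1.0, 2.0, 3.0, 4.0, 100.0]

print(f"CV = {cv:.4f}, variance = {variance:.6e}")
print(f"toy skewness = {skewness(toy):.3f}")
```

A single very large observation is enough to push the skewness well above zero, which is consistent with a maximum of 60000000 g against a median of 32.61 g.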
[Figure: Histogram with fixed size bins (bins=50)]
Value                 Count  Frequency (%)
1.3                   171    0.4%
1.2                   140    0.3%
1.4                   138    0.3%
2.1                   130    0.3%
2.4                   126    0.3%
1.6                   120    0.3%
0.5                   119    0.3%
1.1                   116    0.3%
3.8                   114    0.2%
1.5                   111    0.2%
Other values (12566)  44310  96.9%
(Missing)             131    0.3%

Value  Count  Frequency (%)
0      19     < 0.1%
0.01   2      < 0.1%
0.013  1      < 0.1%
0.02   1      < 0.1%
0.03   1      < 0.1%
0.04   1      < 0.1%
0.05   1      < 0.1%
0.06   1      < 0.1%
0.07   3      < 0.1%
0.08   2      < 0.1%

Value     Count  Frequency (%)
60000000  1      < 0.1%
58200000  1      < 0.1%
50000000  1      < 0.1%
30000000  1      < 0.1%
28000000  1      < 0.1%
26000000  1      < 0.1%
24300000  1      < 0.1%
24000000  1      < 0.1%
23000000  1      < 0.1%
22000000  1      < 0.1%

fall
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Value  Count
Found  44609
Fell   1117

Length

Max length: 5
Median length: 5
Mean length: 4.975571885
Min length: 4

Characters and Unicode

Total characters: 227513
Distinct characters: 7
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Fell
2nd row: Fell
3rd row: Fell
4th row: Fell
5th row: Fell

Common Values

Value  Count  Frequency (%)
Found  44609  97.6%
Fell   1117   2.4%

Length

[Figure: Histogram of lengths of the category]

Category Frequency Plot

[Figure: Category Frequency Plot]

Value  Count  Frequency (%)
found  44609  97.6%
fell   1117   2.4%

Most occurring characters

Value  Count  Frequency (%)
F      45726  20.1%
o      44609  19.6%
u      44609  19.6%
n      44609  19.6%
d      44609  19.6%
l      2234   1.0%
e      1117   0.5%

Most occurring categories

Value             Count   Frequency (%)
Lowercase Letter  181787  79.9%
Uppercase Letter  45726   20.1%

Most frequent character per category

Lowercase Letter
Value  Count  Frequency (%)
o      44609  24.5%
u      44609  24.5%
n      44609  24.5%
d      44609  24.5%
l      2234   1.2%
e      1117   0.6%

Uppercase Letter
Value  Count  Frequency (%)
F      45726  100.0%

Most occurring scripts

Value  Count   Frequency (%)
Latin  227513  100.0%

Most frequent character per script

Latin
Value  Count  Frequency (%)
F      45726  20.1%
o      44609  19.6%
u      44609  19.6%
n      44609  19.6%
d      44609  19.6%
l      2234   1.0%
e      1117   0.5%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  227513  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
F      45726  20.1%
o      44609  19.6%
u      44609  19.6%
n      44609  19.6%
d      44609  19.6%
l      2234   1.0%
e      1117   0.5%

year
Date

Distinct: 265
Distinct (%): 0.6%
Missing: 291
Missing (%): 0.6%
Memory size: 357.4 KiB
Minimum: 1970-01-01 00:00:00
Maximum: 1970-01-01 00:00:00.000002
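
The reported range spans about two microseconds starting at the Unix epoch. One plausible explanation, which the report itself does not confirm, is that integer years were read as nanosecond offsets from 1970-01-01 instead of as calendar years. The sketch below contrasts the two readings using only the standard library; both function names are hypothetical, and 2013 is an arbitrary example year.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)

def as_epoch_nanoseconds(value: int) -> datetime:
    """Interpret an integer as nanoseconds since the Unix epoch."""
    # datetime resolves microseconds only, so the fraction is rounded.
    return EPOCH + timedelta(microseconds=value / 1000)

def as_calendar_year(value: int) -> datetime:
    """Interpret an integer as a calendar year (1 January of that year)."""
    return datetime(value, 1, 1)

print(as_epoch_nanoseconds(2013))  # lands just after the epoch
print(as_calendar_year(2013))      # lands in the intended year
```

If this is what happened, converting the column with an explicit year format before profiling would give a meaningful date range.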
[Figure: Histogram with fixed size bins (bins=50)]

reclat
Real number (ℝ)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING
ZEROS

Distinct: 12738
Distinct (%): 33.2%
Missing: 7315
Missing (%): 16.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -39.10709514
Minimum: -87.36667
Maximum: 81.16667
Zeros: 6438
Zeros (%): 14.1%
Negative: 23416
Negative (%): 51.2%
Memory size: 357.4 KiB

Quantile statistics

Minimum: -87.36667
5-th percentile: -84.35476
Q1: -76.71377
median: -71.5
Q3: 0
95-th percentile: 34.494325
Maximum: 81.16667
Range: 168.53334
Interquartile range (IQR): 76.71377

Descriptive statistics

Standard deviation: 46.38601095
Coefficient of variation (CV): -1.186127755
Kurtosis: -1.476865084
Mean: -39.10709514
Median Absolute Deviation (MAD): 12.76459
Skewness: 0.4913157316
Sum: -1502142.632
Variance: 2151.662012
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
Value                 Count  Frequency (%)
0                     6438   14.1%
-71.5                 4761   10.4%
-84                   3040   6.6%
-72                   1506   3.3%
-79.68333             1130   2.5%
-76.71667             680    1.5%
-76.18333             539    1.2%
-84.21667             263    0.6%
-86.36667             226    0.5%
-86.71667             217    0.5%
Other values (12728)  19611  42.9%
(Missing)             7315   16.0%

Value      Count  Frequency (%)
-87.36667  4      < 0.1%
-87.03333  3      < 0.1%
-86.93333  3      < 0.1%
-86.71667  217    0.5%
-86.56667  17     < 0.1%
-86.54488  1      < 0.1%
-86.5379   1      < 0.1%
-86.53734  1      < 0.1%
-86.53725  1      < 0.1%
-86.53035  1      < 0.1%

Value     Count  Frequency (%)
81.16667  1      < 0.1%
76.53333  1      < 0.1%
76.13333  1      < 0.1%
72.88333  1      < 0.1%
72.68333  1      < 0.1%
70.73333  1      < 0.1%
70        1      < 0.1%
69.1      1      < 0.1%
68        1      < 0.1%
67.88333  1      < 0.1%

reclong
Real number (ℝ)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING
ZEROS

Distinct: 14640
Distinct (%): 38.1%
Missing: 7315
Missing (%): 16.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 61.05259359
Minimum: -165.43333
Maximum: 354.47333
Zeros: 6214
Zeros (%): 13.6%
Negative: 4057
Negative (%): 8.9%
Memory size: 357.4 KiB

Quantile statistics

Minimum: -165.43333
5-th percentile: -90.427
Q1: 0
median: 35.66667
Q3: 157.16667
95-th percentile: 168
Maximum: 354.47333
Range: 519.90666
Interquartile range (IQR): 157.16667
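
Two features of these statistics may deserve a cleaning step before analysis: the maximum of 354.47333 lies outside the usual -180 to 180 degree longitude convention, and Q1 is exactly 0 because of the many zero values. The sketch below is one possible treatment, not something the report prescribes. It assumes the out-of-range value uses a 0 to 360 convention and that a (0, 0) coordinate pair is a placeholder for an unknown location; both function names are hypothetical.

```python
def wrap_longitude(lon: float) -> float:
    """Map a longitude in degrees onto the interval [-180, 180)."""
    return (lon + 180.0) % 360.0 - 180.0

def is_zero_placeholder(lat: float, lon: float) -> bool:
    """Flag the (0, 0) pair, assumed here to mean 'location unknown'."""
    return lat == 0.0 and lon == 0.0

print(wrap_longitude(354.47333))        # a small negative longitude
print(wrap_longitude(-165.43333))       # already in range, unchanged
print(is_zero_placeholder(0.0, 0.0))
```

Whether the value is a different convention or a data entry error cannot be decided from the report alone.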

Descriptive statistics

Standard deviation: 80.65525774
Coefficient of variation (CV): 1.321078319
Kurtosis: -0.7313935567
Mean: 61.05259359
Median Absolute Deviation (MAD): 39.53972
Skewness: -0.1743813291
Sum: 2345091.172
Variance: 6505.2706
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
Value                 Count  Frequency (%)
0                     6214   13.6%
35.66667              4985   10.9%
168                   3040   6.6%
26                    1506   3.3%
159.75                657    1.4%
159.66667             637    1.4%
157.16667             542    1.2%
155.75                473    1.0%
160.5                 263    0.6%
-70                   228    0.5%
Other values (14630)  19866  43.4%
(Missing)             7315   16.0%

Value       Count  Frequency (%)
-165.43333  9      < 0.1%
-165.11667  17     < 0.1%
-163.16667  1      < 0.1%
-162.55     1      < 0.1%
-157.86667  1      < 0.1%
-157.78333  1      < 0.1%
-149.5      4      < 0.1%
-148.55     2      < 0.1%
-148        3      < 0.1%
-146.26667  1      < 0.1%

Value      Count  Frequency (%)
354.47333  1      < 0.1%
178.2      1      < 0.1%
178.08333  1      < 0.1%
175.73028  1      < 0.1%
175.13333  1      < 0.1%
175        185    0.4%
174.50043  1      < 0.1%
174.4      1      < 0.1%
172.7      1      < 0.1%
172.6      1      < 0.1%

GeoLocation
Categorical

HIGH CARDINALITY
MISSING

Distinct: 17100
Distinct (%): 44.5%
Missing: 7315
Missing (%): 16.0%
Memory size: 2.9 MiB

Value                 Count
(0.0, 0.0)            6214
(-71.5, 35.66667)     4761
(-84.0, 168.0)        3040
(-72.0, 26.0)         1505
(-79.68333, 159.75)   657
Other values (17095)  22234

Length

Max length: 24
Median length: 22
Mean length: 17.30480852
Min length: 10

Characters and Unicode

Total characters: 664695
Distinct characters: 16
Distinct categories: 6
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 16363
Unique (%): 42.6%

Sample

1st row: (50.775, 6.08333)
2nd row: (56.18333, 10.23333)
3rd row: (54.21667, -113.0)
4th row: (16.88333, -99.9)
5th row: (-33.16667, -64.95)
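
Because GeoLocation is stored as text, it is profiled as a categorical variable. If numeric coordinates are needed, strings of this shape can be parsed safely with `ast.literal_eval`. This is a minimal sketch; `parse_geolocation` is a hypothetical helper, and it assumes every non-missing value follows the "(lat, lon)" pattern seen in the sample rows.

```python
import ast

def parse_geolocation(text: str):
    """Parse a string such as "(50.775, 6.08333)" into a (lat, lon) tuple of floats."""
    lat, lon = ast.literal_eval(text)
    return float(lat), float(lon)

for row in ["(50.775, 6.08333)", "(54.21667, -113.0)", "(-33.16667, -64.95)"]:
    print(parse_geolocation(row))
```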

Common Values

Value                   Count  Frequency (%)
(0.0, 0.0)              6214   13.6%
(-71.5, 35.66667)       4761   10.4%
(-84.0, 168.0)          3040   6.6%
(-72.0, 26.0)           1505   3.3%
(-79.68333, 159.75)     657    1.4%
(-76.71667, 159.66667)  637    1.4%
(-76.18333, 157.16667)  539    1.2%
(-79.68333, 155.75)     473    1.0%
(-84.21667, 160.5)      263    0.6%
(-86.36667, -70.0)      226    0.5%
Other values (17090)    20096  43.9%
(Missing)               7315   16.0%

Length

[Figure: Histogram of lengths of the category]

Value                 Count  Frequency (%)
0.0                   12652  16.5%
35.66667              4991   6.5%
71.5                  4761   6.2%
84.0                  3041   4.0%
168.0                 3040   4.0%
26.0                  1512   2.0%
72.0                  1506   2.0%
79.68333              1130   1.5%
76.71667              680    0.9%
159.75                657    0.9%
Other values (26608)  42852  55.8%

Most occurring characters

Value             Count   Frequency (%)
.                 76822   11.6%
6                 67560   10.2%
7                 52499   7.9%
0                 49033   7.4%
3                 44771   6.7%
1                 44476   6.7%
5                 42757   6.4%
(                 38411   5.8%
,                 38411   5.8%
(space)           38411   5.8%
Other values (6)  171544  25.8%

Most occurring categories

Value              Count   Frequency (%)
Decimal Number     406756  61.2%
Other Punctuation  115233  17.3%
Open Punctuation   38411   5.8%
Space Separator    38411   5.8%
Close Punctuation  38411   5.8%
Dash Punctuation   27473   4.1%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
6      67560  16.6%
7      52499  12.9%
0      49033  12.1%
3      44771  11.0%
1      44476  10.9%
5      42757  10.5%
8      32680  8.0%
2      29923  7.4%
4      23646  5.8%
9      19411  4.8%

Other Punctuation
Value  Count  Frequency (%)
.      76822  66.7%
,      38411  33.3%

Open Punctuation
Value  Count  Frequency (%)
(      38411  100.0%

Space Separator
Value    Count  Frequency (%)
(space)  38411  100.0%

Close Punctuation
Value  Count  Frequency (%)
)      38411  100.0%

Dash Punctuation
Value  Count  Frequency (%)
-      27473  100.0%

Most occurring scripts

Value   Count   Frequency (%)
Common  664695  100.0%

Most frequent character per script

Common
Value             Count   Frequency (%)
.                 76822   11.6%
6                 67560   10.2%
7                 52499   7.9%
0                 49033   7.4%
3                 44771   6.7%
1                 44476   6.7%
5                 42757   6.4%
(                 38411   5.8%
,                 38411   5.8%
(space)           38411   5.8%
Other values (6)  171544  25.8%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  664695  100.0%

Most frequent character per block

ASCII
Value             Count   Frequency (%)
.                 76822   11.6%
6                 67560   10.2%
7                 52499   7.9%
0                 49033   7.4%
3                 44771   6.7%
1                 44476   6.7%
5                 42757   6.4%
(                 38411   5.8%
,                 38411   5.8%
(space)           38411   5.8%
Other values (6)  171544  25.8%

source
Categorical

CONSTANT
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Value  Count
NASA   45726

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 182904
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NASA
2nd row: NASA
3rd row: NASA
4th row: NASA
5th row: NASA

Common Values

Value  Count  Frequency (%)
NASA   45726  100.0%

Length

[Figure: Histogram of lengths of the category]

Category Frequency Plot

[Figure: Category Frequency Plot]

Value  Count  Frequency (%)
nasa   45726  100.0%

Most occurring characters

Value  Count  Frequency (%)
A      91452  50.0%
N      45726  25.0%
S      45726  25.0%

Most occurring categories

Value             Count   Frequency (%)
Uppercase Letter  182904  100.0%

Most frequent character per category

Uppercase Letter
Value  Count  Frequency (%)
A      91452  50.0%
N      45726  25.0%
S      45726  25.0%

Most occurring scripts

Value  Count   Frequency (%)
Latin  182904  100.0%

Most frequent character per script

Latin
Value  Count  Frequency (%)
A      91452  50.0%
N      45726  25.0%
S      45726  25.0%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  182904  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
A      91452  50.0%
N      45726  25.0%
S      45726  25.0%

boolean
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 44.8 KiB

Value  Count  Frequency (%)
True   22934  50.2%
False  22792  49.8%

mixed
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 MiB

Value  Count
A      22889
1      22837

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 45726
Distinct characters: 2
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: A
3rd row: 1
4th row: A
5th row: A

Common Values

Value  Count  Frequency (%)
A      22889  50.1%
1      22837  49.9%

Length

[Figure: Histogram of lengths of the category]

Category Frequency Plot

[Figure: Category Frequency Plot]

Value  Count  Frequency (%)
a      22889  50.1%
1      22837  49.9%

Most occurring characters

Value  Count  Frequency (%)
A      22889  50.1%
1      22837  49.9%

Most occurring categories

Value             Count  Frequency (%)
Uppercase Letter  22889  50.1%
Decimal Number    22837  49.9%

Most frequent character per category

Uppercase Letter
Value  Count  Frequency (%)
A      22889  100.0%

Decimal Number
Value  Count  Frequency (%)
1      22837  100.0%

Most occurring scripts

Value   Count  Frequency (%)
Latin   22889  50.1%
Common  22837  49.9%

Most frequent character per script

Latin
Value  Count  Frequency (%)
A      22889  100.0%

Common
Value  Count  Frequency (%)
1      22837  100.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  45726  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
A      22889  50.1%
1      22837  49.9%

unhashable
Unsupported

REJECTED
UNSUPPORTED

Missing: 0
Missing (%): 0.0%
Memory size: 2.4 MiB
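
The report does not say what the unhashable column contains. A common cause of an unsupported type is a column holding Python objects such as lists or dicts, which cannot be hashed and therefore cannot be counted as distinct values. The sketch below shows one way to detect and convert such objects; both helpers are hypothetical, and the conversion assumes dictionary keys are mutually sortable.

```python
def is_hashable(obj) -> bool:
    """Return True if hash(obj) succeeds."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True

def make_hashable(obj):
    """Recursively convert lists, dicts and sets into hashable equivalents."""
    if isinstance(obj, dict):
        # Assumes keys can be sorted against each other.
        return tuple(sorted((k, make_hashable(v)) for k, v in obj.items()))
    if isinstance(obj, (list, tuple)):
        return tuple(make_hashable(v) for v in obj)
    if isinstance(obj, set):
        return frozenset(make_hashable(v) for v in obj)
    return obj

print(is_hashable([1, 2]))                 # lists are not hashable
print(is_hashable(make_hashable([1, [2, 3]])))
```

Applying a conversion like this before profiling would let the column be treated as categorical, if that matches what the data actually holds.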

reclat_city
Real number (ℝ)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 38401
Distinct (%): > 99.9%
Missing: 7315
Missing (%): 16.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -39.15354218
Minimum: -104.3171665
Maximum: 77.74901083
Zeros: 0
Zeros (%): 0.0%
Negative: 26603
Negative (%): 58.2%
Memory size: 357.4 KiB

Quantile statistics

Minimum: -104.3171665
5-th percentile: -87.87105763
Q1: -78.40775202
median: -68.97529272
Q3: 4.788644923
95-th percentile: 35.42980961
Maximum: 77.74901083
Range: 182.0661773
Interquartile range (IQR): 83.19639695

Descriptive statistics

Standard deviation: 46.68568721
Coefficient of variation (CV): -1.192374549
Kurtosis: -1.446385025
Mean: -39.15354218
Median Absolute Deviation (MAD): 17.25584321
Skewness: 0.4816035823
Sum: -1503926.709
Variance: 2179.55339
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
Value                 Count  Frequency (%)
50.51806008           2      < 0.1%
43.27957156           2      < 0.1%
52.01104434           2      < 0.1%
-32.5810219           2      < 0.1%
49.60726921           2      < 0.1%
-29.65152821          2      < 0.1%
36.5165896            2      < 0.1%
-23.28864666          2      < 0.1%
23.16596589           2      < 0.1%
52.70663547           2      < 0.1%
Other values (38391)  38391  84.0%
(Missing)             7315   16.0%

Value         Count  Frequency (%)
-104.3171665  1      < 0.1%
-102.4312375  1      < 0.1%
-102.0868253  1      < 0.1%
-101.5556373  1      < 0.1%
-101.3269284  1      < 0.1%
-101.2084341  1      < 0.1%
-101.0146935  1      < 0.1%
-100.9191264  1      < 0.1%
-100.7856947  1      < 0.1%
-100.5751117  1      < 0.1%

Value        Count  Frequency (%)
77.74901083  1      < 0.1%
72.80622023  1      < 0.1%
72.75730423  1      < 0.1%
72.42607973  1      < 0.1%
72.25809595  1      < 0.1%
71.78938297  1      < 0.1%
71.42543169  1      < 0.1%
70.89755212  1      < 0.1%
70.53373183  1      < 0.1%
70.48523932  1      < 0.1%

Interactions

[Figures: interaction plots]
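
The interaction plots and the high-correlation alerts both concern pairwise relationships between numeric variables. As a reference point, the sketch below computes a Pearson correlation coefficient in plain Python, skipping pairs with a missing value. It is a generic illustration: the report does not state which coefficients or thresholds produced its alerts, and the sample lists are invented.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences, ignoring pairs with None."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / sqrt(var_x * var_y)

# Invented example: the second list is a noisy copy of the first.
a = [-84.0, -71.5, None, 0.0, 50.8]
b = [-85.1, -70.2, 12.0, 1.3, 49.9]
print(round(pearson(a, b), 3))
```

A derived column such as reclat_city would be expected to score close to 1 against reclat under a measure like this.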