Overview

Dataset statistics

Number of variables: 15
Number of observations: 45726
Missing cells: 29682
Missing cells (%): 4.3%
Duplicate rows: 10
Duplicate rows (%): < 0.1%
Total size in memory: 24.0 MiB
Average record size in memory: 550.6 B
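
As a cross-check, the missing-cell total can be rebuilt from the per-variable "Missing" counts listed in the Variables section of this report. The sketch below uses only figures copied from the report; the dictionary layout and variable names are illustrative, not part of the profiler's output.

```python
# Per-variable missing counts copied from the Variables section of this report.
# Variables not listed here are reported with 0 missing values.
missing_per_variable = {
    "mass (g)": 131,
    "year": 291,
    "reclat": 7315,
    "reclong": 7315,
    "GeoLocation": 7315,
    "reclat_city": 7315,
}

n_variables = 15
n_observations = 45726

# Total missing cells and their share of all cells in the table.
missing_cells = sum(missing_per_variable.values())
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(missing_cells)
print(f"{missing_pct:.1f}%")
```

The sum agrees with the "Missing cells" figure, and four coordinate-derived columns account for almost all of it.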

Variable types

Categorical: 7
Numeric: 5
DateTime: 1
Boolean: 1
Unsupported: 1

Alerts

source has constant value "NASA" [Constant]
Dataset has 10 (< 0.1%) duplicate rows [Duplicates]
name has a high cardinality: 45716 distinct values [High cardinality]
recclass has a high cardinality: 466 distinct values [High cardinality]
GeoLocation has a high cardinality: 17100 distinct values [High cardinality]
reclat is highly overall correlated with reclong and 1 other field [High correlation]
reclong is highly overall correlated with reclat and 1 other field [High correlation]
reclat_city is highly overall correlated with reclat and 1 other field [High correlation]
nametype is highly imbalanced (98.2%) [Imbalance]
fall is highly imbalanced (83.4%) [Imbalance]
reclat has 7315 (16.0%) missing values [Missing]
reclong has 7315 (16.0%) missing values [Missing]
GeoLocation has 7315 (16.0%) missing values [Missing]
reclat_city has 7315 (16.0%) missing values [Missing]
mass (g) is highly skewed (γ1 = 76.91847245) [Skewed]
name is uniformly distributed [Uniform]
unhashable is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
reclat has 6438 (14.1%) zeros [Zeros]
reclong has 6214 (13.6%) zeros [Zeros]
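
The two imbalance percentages can be reproduced from the class counts reported for nametype and fall. The sketch below assumes the score is one minus the Shannon entropy of the class distribution divided by its maximum possible entropy. That formula matches both reported figures, but the report itself does not state how the profiler computes the score, and `imbalance_score` is a name chosen here for illustration.

```python
import math

def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum entropy) for a list of class counts.

    0.0 means perfectly balanced classes; values near 1.0 mean one class dominates.
    """
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)

# Class counts copied from the nametype and fall sections of this report.
nametype_score = imbalance_score([45651, 75])   # Valid, Relict
fall_score = imbalance_score([44609, 1117])     # Found, Fell

print(f"nametype: {nametype_score:.1%}")
print(f"fall: {fall_score:.1%}")
```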

Reproduction

Analysis started: 2023-01-25 14:26:29.249571
Analysis finished: 2023-01-25 14:26:35.251838
Duration: 6 seconds
Software version: pandas-profiling v0.0.dev0
Download configuration: config.json

Variables

name
Categorical

HIGH CARDINALITY  UNIFORM 

Distinct: 45716
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 MiB

Top values (count):
Aachen: 2
Abee: 2
Acapulco: 2
Achiras: 2
Adhi Kot: 2
Other values (45711): 45716

Length

Max length: 28
Median length: 25
Mean length: 17.782487
Min length: 2

Characters and Unicode

Total characters: 813122
Distinct characters: 96
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 45706
Unique (%): > 99.9%

Sample

1st row: Aachen
2nd row: Aarhus
3rd row: Abee
4th row: Acapulco
5th row: Achiras

Common Values

Value | Count | Frequency (%)
Aachen | 2 | < 0.1%
Abee | 2 | < 0.1%
Acapulco | 2 | < 0.1%
Achiras | 2 | < 0.1%
Adhi Kot | 2 | < 0.1%
Adzhi-Bogdo (stone) | 2 | < 0.1%
Agen | 2 | < 0.1%
Aguada | 2 | < 0.1%
Aguila Blanca | 2 | < 0.1%
Aarhus | 2 | < 0.1%
Other values (45706) | 45706 | > 99.9%

Length

[Figure: Histogram of lengths of the category]

Value | Count | Frequency (%)
yamato | 7269 | 5.7%
range | 6575 | 5.2%
africa | 4502 | 3.6%
northwest | 4499 | 3.5%
hills | 3995 | 3.2%
queen | 3445 | 2.7%
alexandra | 3444 | 2.7%
mountains | 3004 | 2.4%
al | 2663 | 2.1%
grove | 2496 | 2.0%
Other values (37726) | 84860 | 66.9%

Most occurring characters

Value | Count | Frequency (%)
(space) | 81032 | 10.0%
a | 72715 | 8.9%
e | 48167 | 5.9%
n | 38392 | 4.7%
0 | 34943 | 4.3%
r | 33097 | 4.1%
i | 32658 | 4.0%
l | 31873 | 3.9%
t | 30898 | 3.8%
o | 30428 | 3.7%
Other values (86) | 378919 | 46.6%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 440949 | 54.2%
Decimal Number | 205415 | 25.3%
Uppercase Letter | 84942 | 10.4%
Space Separator | 81032 | 10.0%
Close Punctuation | 295 | < 0.1%
Open Punctuation | 295 | < 0.1%
Dash Punctuation | 98 | < 0.1%
Other Punctuation | 96 | < 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
a | 72715 | 16.5%
e | 48167 | 10.9%
n | 38392 | 8.7%
r | 33097 | 7.5%
i | 32658 | 7.4%
l | 31873 | 7.2%
t | 30898 | 7.0%
o | 30428 | 6.9%
s | 20972 | 4.8%
m | 12393 | 2.8%
Other values (39) | 89356 | 20.3%

Uppercase Letter
Value | Count | Frequency (%)
A | 14120 | 16.6%
M | 11173 | 13.2%
R | 7599 | 8.9%
Y | 7327 | 8.6%
N | 5796 | 6.8%
H | 5676 | 6.7%
G | 4682 | 5.5%
L | 4630 | 5.5%
D | 3777 | 4.4%
Q | 3478 | 4.1%
Other values (21) | 16684 | 19.6%

Decimal Number
Value | Count | Frequency (%)
0 | 34943 | 17.0%
9 | 24444 | 11.9%
8 | 22179 | 10.8%
1 | 21986 | 10.7%
2 | 19839 | 9.7%
7 | 19347 | 9.4%
3 | 17379 | 8.5%
4 | 16001 | 7.8%
5 | 14812 | 7.2%
6 | 14485 | 7.1%

Other Punctuation
Value | Count | Frequency (%)
' | 67 | 69.8%
. | 29 | 30.2%

Space Separator
Value | Count | Frequency (%)
(space) | 81032 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 295 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 295 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 98 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 525891 | 64.7%
Common | 287231 | 35.3%

Most frequent character per script

Latin
Value | Count | Frequency (%)
a | 72715 | 13.8%
e | 48167 | 9.2%
n | 38392 | 7.3%
r | 33097 | 6.3%
i | 32658 | 6.2%
l | 31873 | 6.1%
t | 30898 | 5.9%
o | 30428 | 5.8%
s | 20972 | 4.0%
A | 14120 | 2.7%
Other values (70) | 172571 | 32.8%

Common
Value | Count | Frequency (%)
(space) | 81032 | 28.2%
0 | 34943 | 12.2%
9 | 24444 | 8.5%
8 | 22179 | 7.7%
1 | 21986 | 7.7%
2 | 19839 | 6.9%
7 | 19347 | 6.7%
3 | 17379 | 6.1%
4 | 16001 | 5.6%
5 | 14812 | 5.2%
Other values (6) | 15269 | 5.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 812638 | 99.9%
None | 484 | 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 81032 | 10.0%
a | 72715 | 8.9%
e | 48167 | 5.9%
n | 38392 | 4.7%
0 | 34943 | 4.3%
r | 33097 | 4.1%
i | 32658 | 4.0%
l | 31873 | 3.9%
t | 30898 | 3.8%
o | 30428 | 3.7%
Other values (58) | 378435 | 46.6%

None
Value | Count | Frequency (%)
é | 204 | 42.1%
ş | 125 | 25.8%
Ö | 63 | 13.0%
á | 11 | 2.3%
ö | 11 | 2.3%
ä | 10 | 2.1%
ó | 8 | 1.7%
ü | 8 | 1.7%
ñ | 8 | 1.7%
ã | 5 | 1.0%
Other values (18) | 31 | 6.4%

id
Real number (ℝ)

Distinct: 45716
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 26883.906
Minimum: 1
Maximum: 57458
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 357.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2388.75
Q1: 12681.25
median: 24256.5
Q3: 40653.5
95-th percentile: 54890.75
Maximum: 57458
Range: 57457
Interquartile range (IQR): 27972.25

Descriptive statistics

Standard deviation: 16863.446
Coefficient of variation (CV): 0.62726917
Kurtosis: -1.1601308
Mean: 26883.906
Median Absolute Deviation (MAD): 13264
Skewness: 0.26653007
Sum: 1.2292935 × 10^9
Variance: 2.843758 × 10^8
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]
Most frequent values
Value | Count | Frequency (%)
1 | 2 | < 0.1%
6 | 2 | < 0.1%
10 | 2 | < 0.1%
370 | 2 | < 0.1%
379 | 2 | < 0.1%
390 | 2 | < 0.1%
392 | 2 | < 0.1%
398 | 2 | < 0.1%
417 | 2 | < 0.1%
2 | 2 | < 0.1%
Other values (45706) | 45706 | > 99.9%

Smallest values
Value | Count | Frequency (%)
1 | 2 | < 0.1%
2 | 2 | < 0.1%
4 | 1 | < 0.1%
5 | 1 | < 0.1%
6 | 2 | < 0.1%
7 | 1 | < 0.1%
8 | 1 | < 0.1%
9 | 1 | < 0.1%
10 | 2 | < 0.1%
11 | 1 | < 0.1%

Largest values
Value | Count | Frequency (%)
57458 | 1 | < 0.1%
57457 | 1 | < 0.1%
57456 | 1 | < 0.1%
57455 | 1 | < 0.1%
57454 | 1 | < 0.1%
57453 | 1 | < 0.1%
57436 | 1 | < 0.1%
57435 | 1 | < 0.1%
57434 | 1 | < 0.1%
57433 | 1 | < 0.1%

nametype
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Top values (count):
Valid: 45651
Relict: 75

Length

Max length: 6
Median length: 5
Mean length: 5.0016402
Min length: 5

Characters and Unicode

Total characters: 228705
Distinct characters: 9
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Valid
2nd row: Valid
3rd row: Valid
4th row: Valid
5th row: Valid

Common Values

Value | Count | Frequency (%)
Valid | 45651 | 99.8%
Relict | 75 | 0.2%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value | Count | Frequency (%)
valid | 45651 | 99.8%
relict | 75 | 0.2%

Most occurring characters

Value | Count | Frequency (%)
l | 45726 | 20.0%
i | 45726 | 20.0%
V | 45651 | 20.0%
a | 45651 | 20.0%
d | 45651 | 20.0%
R | 75 | < 0.1%
e | 75 | < 0.1%
c | 75 | < 0.1%
t | 75 | < 0.1%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 182979 | 80.0%
Uppercase Letter | 45726 | 20.0%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
l | 45726 | 25.0%
i | 45726 | 25.0%
a | 45651 | 24.9%
d | 45651 | 24.9%
e | 75 | < 0.1%
c | 75 | < 0.1%
t | 75 | < 0.1%

Uppercase Letter
Value | Count | Frequency (%)
V | 45651 | 99.8%
R | 75 | 0.2%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 228705 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
l | 45726 | 20.0%
i | 45726 | 20.0%
V | 45651 | 20.0%
a | 45651 | 20.0%
d | 45651 | 20.0%
R | 75 | < 0.1%
e | 75 | < 0.1%
c | 75 | < 0.1%
t | 75 | < 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 228705 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
l | 45726 | 20.0%
i | 45726 | 20.0%
V | 45651 | 20.0%
a | 45651 | 20.0%
d | 45651 | 20.0%
R | 75 | < 0.1%
e | 75 | < 0.1%
c | 75 | < 0.1%
t | 75 | < 0.1%

recclass
Categorical

Distinct: 466
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 MiB

Top values (count):
L6: 8287
H5: 7143
L5: 4797
H6: 4529
H4: 4211
Other values (461): 16759

Length

Max length: 26
Median length: 2
Mean length: 3.0525303
Min length: 1

Characters and Unicode

Total characters: 139580
Distinct characters: 62
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 145
Unique (%): 0.3%

Sample

1st row: L5
2nd row: H6
3rd row: EH4
4th row: Acapulcoite
5th row: L6

Common Values

Value | Count | Frequency (%)
L6 | 8287 | 18.1%
H5 | 7143 | 15.6%
L5 | 4797 | 10.5%
H6 | 4529 | 9.9%
H4 | 4211 | 9.2%
LL5 | 2766 | 6.0%
LL6 | 2043 | 4.5%
L4 | 1253 | 2.7%
H4/5 | 428 | 0.9%
CM2 | 416 | 0.9%
Other values (456) | 9853 | 21.5%

Length

[Figure: Histogram of lengths of the category]

Value | Count | Frequency (%)
l6 | 8341 | 17.6%
h5 | 7165 | 15.1%
l5 | 4818 | 10.2%
h6 | 4530 | 9.6%
h4 | 4223 | 8.9%
ll5 | 2766 | 5.8%
ll6 | 2046 | 4.3%
l4 | 1256 | 2.7%
iron | 1070 | 2.3%
h4/5 | 428 | 0.9%
Other values (434) | 10712 | 22.6%

Most occurring characters

Value | Count | Frequency (%)
L | 28467 | 20.4%
H | 18396 | 13.2%
5 | 16419 | 11.8%
6 | 16132 | 11.6%
4 | 6930 | 5.0%
e | 3972 | 2.8%
i | 3834 | 2.7%
r | 3648 | 2.6%
t | 3327 | 2.4%
3 | 3278 | 2.3%
Other values (52) | 35177 | 25.2%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 57793 | 41.4%
Decimal Number | 44118 | 31.6%
Lowercase Letter | 29926 | 21.4%
Other Punctuation | 3293 | 2.4%
Dash Punctuation | 1835 | 1.3%
Space Separator | 1747 | 1.3%
Math Symbol | 320 | 0.2%
Open Punctuation | 274 | 0.2%
Close Punctuation | 274 | 0.2%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
e | 3972 | 13.3%
i | 3834 | 12.8%
r | 3648 | 12.2%
t | 3327 | 11.1%
n | 2520 | 8.4%
o | 2458 | 8.2%
c | 1767 | 5.9%
u | 1469 | 4.9%
a | 1409 | 4.7%
l | 1016 | 3.4%
Other values (12) | 4506 | 15.1%

Uppercase Letter
Value | Count | Frequency (%)
L | 28467 | 49.3%
H | 18396 | 31.8%
I | 2753 | 4.8%
C | 1785 | 3.1%
E | 1261 | 2.2%
A | 985 | 1.7%
M | 913 | 1.6%
B | 754 | 1.3%
O | 542 | 0.9%
V | 350 | 0.6%
Other values (10) | 1587 | 2.7%

Decimal Number
Value | Count | Frequency (%)
5 | 16419 | 37.2%
6 | 16132 | 36.6%
4 | 6930 | 15.7%
3 | 3278 | 7.4%
2 | 646 | 1.5%
7 | 251 | 0.6%
8 | 216 | 0.5%
9 | 111 | 0.3%
1 | 100 | 0.2%
0 | 35 | 0.1%

Other Punctuation
Value | Count | Frequency (%)
/ | 1174 | 35.7%
. | 1064 | 32.3%
, | 1031 | 31.3%
? | 24 | 0.7%

Math Symbol
Value | Count | Frequency (%)
~ | 319 | 99.7%
< | 1 | 0.3%

Dash Punctuation
Value | Count | Frequency (%)
- | 1835 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 1747 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 274 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 274 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 87719 | 62.8%
Common | 51861 | 37.2%

Most frequent character per script

Latin
Value | Count | Frequency (%)
L | 28467 | 32.5%
H | 18396 | 21.0%
e | 3972 | 4.5%
i | 3834 | 4.4%
r | 3648 | 4.2%
t | 3327 | 3.8%
I | 2753 | 3.1%
n | 2520 | 2.9%
o | 2458 | 2.8%
C | 1785 | 2.0%
Other values (32) | 16559 | 18.9%

Common
Value | Count | Frequency (%)
5 | 16419 | 31.7%
6 | 16132 | 31.1%
4 | 6930 | 13.4%
3 | 3278 | 6.3%
- | 1835 | 3.5%
(space) | 1747 | 3.4%
/ | 1174 | 2.3%
. | 1064 | 2.1%
, | 1031 | 2.0%
2 | 646 | 1.2%
Other values (10) | 1605 | 3.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 139580 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
L | 28467 | 20.4%
H | 18396 | 13.2%
5 | 16419 | 11.8%
6 | 16132 | 11.6%
4 | 6930 | 5.0%
e | 3972 | 2.8%
i | 3834 | 2.7%
r | 3648 | 2.6%
t | 3327 | 2.4%
3 | 3278 | 2.3%
Other values (52) | 35177 | 25.2%

mass (g)
Real number (ℝ)

Distinct: 12576
Distinct (%): 27.6%
Missing: 131
Missing (%): 0.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 13278.426
Minimum: 0
Maximum: 60000000
Zeros: 19
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 357.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1.1
Q1: 7.2
median: 32.61
Q3: 202.9
95-th percentile: 4000
Maximum: 60000000
Range: 60000000
Interquartile range (IQR): 195.7

Descriptive statistics

Standard deviation: 574926.01
Coefficient of variation (CV): 43.297752
Kurtosis: 6798.3984
Mean: 13278.426
Median Absolute Deviation (MAD): 30.51
Skewness: 76.918472
Sum: 6.0542985 × 10^8
Variance: 3.3053992 × 10^11
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]
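
Two of the derived statistics for mass (g) follow directly from the others: the coefficient of variation is the standard deviation divided by the mean, and the IQR is Q3 minus Q1. The sketch below recomputes both from the reported figures, then shows how a base-10 logarithm compresses the range. The log transform is a common way to handle a heavily right-skewed variable and is offered here only as an illustration, not as something the report applies.

```python
import math

# Figures copied from the "mass (g)" statistics in this report.
mean = 13278.426
std = 574926.01
q1, median, q3 = 7.2, 32.61, 202.9

cv = std / mean   # coefficient of variation
iqr = q3 - q1     # interquartile range

print(f"CV  = {cv:.4f}")
print(f"IQR = {iqr:.1f}")

# Illustration only: log10 of the 5th percentile, median, 95th percentile and maximum.
for grams in (1.1, median, 4000, 60_000_000):
    print(f"{grams:>12} g -> log10 = {math.log10(grams):.2f}")
```

The mean sits far above the median, which is consistent with the reported skewness. The 19 zero-mass records would need separate handling before any log transform.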
Most frequent values
Value | Count | Frequency (%)
1.3 | 171 | 0.4%
1.2 | 140 | 0.3%
1.4 | 138 | 0.3%
2.1 | 130 | 0.3%
2.4 | 126 | 0.3%
1.6 | 120 | 0.3%
0.5 | 119 | 0.3%
1.1 | 116 | 0.3%
3.8 | 114 | 0.2%
1.5 | 111 | 0.2%
Other values (12566) | 44310 | 96.9%
(Missing) | 131 | 0.3%

Smallest values
Value | Count | Frequency (%)
0 | 19 | < 0.1%
0.01 | 2 | < 0.1%
0.013 | 1 | < 0.1%
0.02 | 1 | < 0.1%
0.03 | 1 | < 0.1%
0.04 | 1 | < 0.1%
0.05 | 1 | < 0.1%
0.06 | 1 | < 0.1%
0.07 | 3 | < 0.1%
0.08 | 2 | < 0.1%

Largest values
Value | Count | Frequency (%)
60000000 | 1 | < 0.1%
58200000 | 1 | < 0.1%
50000000 | 1 | < 0.1%
30000000 | 1 | < 0.1%
28000000 | 1 | < 0.1%
26000000 | 1 | < 0.1%
24300000 | 1 | < 0.1%
24000000 | 1 | < 0.1%
23000000 | 1 | < 0.1%
22000000 | 1 | < 0.1%

fall
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Top values (count):
Found: 44609
Fell: 1117

Length

Max length: 5
Median length: 5
Mean length: 4.9755719
Min length: 4

Characters and Unicode

Total characters: 227513
Distinct characters: 7
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Fell
2nd row: Fell
3rd row: Fell
4th row: Fell
5th row: Fell

Common Values

Value | Count | Frequency (%)
Found | 44609 | 97.6%
Fell | 1117 | 2.4%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value | Count | Frequency (%)
found | 44609 | 97.6%
fell | 1117 | 2.4%

Most occurring characters

Value | Count | Frequency (%)
F | 45726 | 20.1%
o | 44609 | 19.6%
u | 44609 | 19.6%
n | 44609 | 19.6%
d | 44609 | 19.6%
l | 2234 | 1.0%
e | 1117 | 0.5%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 181787 | 79.9%
Uppercase Letter | 45726 | 20.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
o | 44609 | 24.5%
u | 44609 | 24.5%
n | 44609 | 24.5%
d | 44609 | 24.5%
l | 2234 | 1.2%
e | 1117 | 0.6%

Uppercase Letter
Value | Count | Frequency (%)
F | 45726 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 227513 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
F | 45726 | 20.1%
o | 44609 | 19.6%
u | 44609 | 19.6%
n | 44609 | 19.6%
d | 44609 | 19.6%
l | 2234 | 1.0%
e | 1117 | 0.5%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 227513 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
F | 45726 | 20.1%
o | 44609 | 19.6%
u | 44609 | 19.6%
n | 44609 | 19.6%
d | 44609 | 19.6%
l | 2234 | 1.0%
e | 1117 | 0.5%

year
Date

Distinct: 265
Distinct (%): 0.6%
Missing: 291
Missing (%): 0.6%
Memory size: 357.4 KiB
Minimum: 1970-01-01 00:00:00
Maximum: 1970-01-01 00:00:00.000002

[Figure: Histogram with fixed size bins (bins=50)]
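
The minimum and maximum for year differ by only about two microseconds. That is the pattern one would expect if integer year values had been parsed as nanosecond offsets from the Unix epoch rather than as calendar years. This is an assumption about how the column was loaded; the report does not confirm it. The sketch below contrasts the two interpretations using a hypothetical raw value and illustrative helper names.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def as_epoch_nanoseconds(value: int) -> datetime:
    """Interpret an integer as nanoseconds since the Unix epoch.

    Python datetimes have microsecond resolution, so the value is truncated.
    """
    return EPOCH + timedelta(microseconds=value // 1000)

def as_calendar_year(value: int) -> datetime:
    """Interpret an integer as a calendar year, anchored at 1 January."""
    return datetime(value, 1, 1, tzinfo=timezone.utc)

sample_year = 2010  # hypothetical raw value, not taken from the report

print(as_epoch_nanoseconds(sample_year))  # lands a couple of microseconds after the epoch
print(as_calendar_year(sample_year))      # lands in the intended year
```

If this is what happened, re-parsing the column as calendar years before profiling would give a meaningful date range.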

reclat
Real number (ℝ)

HIGH CORRELATION  MISSING  ZEROS 

Distinct: 12738
Distinct (%): 33.2%
Missing: 7315
Missing (%): 16.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -39.107095
Minimum: -87.36667
Maximum: 81.16667
Zeros: 6438
Zeros (%): 14.1%
Negative: 23416
Negative (%): 51.2%
Memory size: 357.4 KiB

Quantile statistics

Minimum: -87.36667
5-th percentile: -84.35476
Q1: -76.71377
median: -71.5
Q3: 0
95-th percentile: 34.494325
Maximum: 81.16667
Range: 168.53334
Interquartile range (IQR): 76.71377

Descriptive statistics

Standard deviation: 46.386011
Coefficient of variation (CV): -1.1861278
Kurtosis: -1.4768651
Mean: -39.107095
Median Absolute Deviation (MAD): 12.76459
Skewness: 0.49131573
Sum: -1502142.6
Variance: 2151.662
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]
Most frequent values
Value | Count | Frequency (%)
0 | 6438 | 14.1%
-71.5 | 4761 | 10.4%
-84 | 3040 | 6.6%
-72 | 1506 | 3.3%
-79.68333 | 1130 | 2.5%
-76.71667 | 680 | 1.5%
-76.18333 | 539 | 1.2%
-84.21667 | 263 | 0.6%
-86.36667 | 226 | 0.5%
-86.71667 | 217 | 0.5%
Other values (12728) | 19611 | 42.9%
(Missing) | 7315 | 16.0%

Smallest values
Value | Count | Frequency (%)
-87.36667 | 4 | < 0.1%
-87.03333 | 3 | < 0.1%
-86.93333 | 3 | < 0.1%
-86.71667 | 217 | 0.5%
-86.56667 | 17 | < 0.1%
-86.54488 | 1 | < 0.1%
-86.5379 | 1 | < 0.1%
-86.53734 | 1 | < 0.1%
-86.53725 | 1 | < 0.1%
-86.53035 | 1 | < 0.1%

Largest values
Value | Count | Frequency (%)
81.16667 | 1 | < 0.1%
76.53333 | 1 | < 0.1%
76.13333 | 1 | < 0.1%
72.88333 | 1 | < 0.1%
72.68333 | 1 | < 0.1%
70.73333 | 1 | < 0.1%
70 | 1 | < 0.1%
69.1 | 1 | < 0.1%
68 | 1 | < 0.1%
67.88333 | 1 | < 0.1%

reclong
Real number (ℝ)

HIGH CORRELATION  MISSING  ZEROS 

Distinct: 14640
Distinct (%): 38.1%
Missing: 7315
Missing (%): 16.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 61.052594
Minimum: -165.43333
Maximum: 354.47333
Zeros: 6214
Zeros (%): 13.6%
Negative: 4057
Negative (%): 8.9%
Memory size: 357.4 KiB

Quantile statistics

Minimum: -165.43333
5-th percentile: -90.427
Q1: 0
median: 35.66667
Q3: 157.16667
95-th percentile: 168
Maximum: 354.47333
Range: 519.90666
Interquartile range (IQR): 157.16667

Descriptive statistics

Standard deviation: 80.655258
Coefficient of variation (CV): 1.3210783
Kurtosis: -0.73139356
Mean: 61.052594
Median Absolute Deviation (MAD): 39.53972
Skewness: -0.17438133
Sum: 2345091.2
Variance: 6505.2706
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]
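
Some reported extremes fall outside the conventional coordinate bounds of -90 to 90 degrees for latitude and -180 to 180 degrees for longitude. The sketch below flags them using the minimum and maximum figures from this report. It assumes those conventional bounds apply and that reclat_city holds latitudes, as its name suggests. A value such as 354.47333 could also be a longitude recorded on a 0 to 360 scale, so a flag here is a prompt to inspect, not proof of an error.

```python
LAT_RANGE = (-90.0, 90.0)     # assumed conventional latitude bounds
LON_RANGE = (-180.0, 180.0)   # assumed conventional longitude bounds

# (minimum, maximum) pairs copied from the variable summaries in this report.
extremes = {
    "reclat": (-87.36667, 81.16667),
    "reclong": (-165.43333, 354.47333),
    "reclat_city": (-104.31717, 77.749011),
}
bounds = {"reclat": LAT_RANGE, "reclong": LON_RANGE, "reclat_city": LAT_RANGE}

def out_of_range(value, low, high):
    """Return True when value lies outside the closed interval [low, high]."""
    return not (low <= value <= high)

# Collect, per variable, the extremes that violate the assumed bounds.
flags = {
    name: [v for v in pair if out_of_range(v, *bounds[name])]
    for name, pair in extremes.items()
}

for name, flagged in flags.items():
    print(name, flagged)
```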
Most frequent values
Value | Count | Frequency (%)
0 | 6214 | 13.6%
35.66667 | 4985 | 10.9%
168 | 3040 | 6.6%
26 | 1506 | 3.3%
159.75 | 657 | 1.4%
159.66667 | 637 | 1.4%
157.16667 | 542 | 1.2%
155.75 | 473 | 1.0%
160.5 | 263 | 0.6%
-70 | 228 | 0.5%
Other values (14630) | 19866 | 43.4%
(Missing) | 7315 | 16.0%

Smallest values
Value | Count | Frequency (%)
-165.43333 | 9 | < 0.1%
-165.11667 | 17 | < 0.1%
-163.16667 | 1 | < 0.1%
-162.55 | 1 | < 0.1%
-157.86667 | 1 | < 0.1%
-157.78333 | 1 | < 0.1%
-149.5 | 4 | < 0.1%
-148.55 | 2 | < 0.1%
-148 | 3 | < 0.1%
-146.26667 | 1 | < 0.1%

Largest values
Value | Count | Frequency (%)
354.47333 | 1 | < 0.1%
178.2 | 1 | < 0.1%
178.08333 | 1 | < 0.1%
175.73028 | 1 | < 0.1%
175.13333 | 1 | < 0.1%
175 | 185 | 0.4%
174.50043 | 1 | < 0.1%
174.4 | 1 | < 0.1%
172.7 | 1 | < 0.1%
172.6 | 1 | < 0.1%

GeoLocation
Categorical

HIGH CARDINALITY  MISSING 

Distinct: 17100
Distinct (%): 44.5%
Missing: 7315
Missing (%): 16.0%
Memory size: 2.9 MiB

Top values (count):
(0.0, 0.0): 6214
(-71.5, 35.66667): 4761
(-84.0, 168.0): 3040
(-72.0, 26.0): 1505
(-79.68333, 159.75): 657
Other values (17095): 22234

Length

Max length: 24
Median length: 22
Mean length: 17.304809
Min length: 10

Characters and Unicode

Total characters: 664695
Distinct characters: 16
Distinct categories: 6
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 16363
Unique (%): 42.6%

Sample

1st row: (50.775, 6.08333)
2nd row: (56.18333, 10.23333)
3rd row: (54.21667, -113.0)
4th row: (16.88333, -99.9)
5th row: (-33.16667, -64.95)

Common Values

Value | Count | Frequency (%)
(0.0, 0.0) | 6214 | 13.6%
(-71.5, 35.66667) | 4761 | 10.4%
(-84.0, 168.0) | 3040 | 6.6%
(-72.0, 26.0) | 1505 | 3.3%
(-79.68333, 159.75) | 657 | 1.4%
(-76.71667, 159.66667) | 637 | 1.4%
(-76.18333, 157.16667) | 539 | 1.2%
(-79.68333, 155.75) | 473 | 1.0%
(-84.21667, 160.5) | 263 | 0.6%
(-86.36667, -70.0) | 226 | 0.5%
Other values (17090) | 20096 | 43.9%
(Missing) | 7315 | 16.0%
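
The most common GeoLocation value is (0.0, 0.0), with the same count as the zeros reported for reclong. One plausible reading is that (0.0, 0.0) stands in for an unknown location rather than a real find site; that is an assumption, not something the report states. Under that assumption, the sketch below parses the "(lat, lon)" strings shown in the sample rows and maps both missing entries and (0.0, 0.0) to `None`. The function name `parse_geolocation` is illustrative.

```python
from typing import Optional, Tuple

def parse_geolocation(text: Optional[str]) -> Optional[Tuple[float, float]]:
    """Parse a '(lat, lon)' string into floats.

    Returns None for missing entries and for (0.0, 0.0), which is treated
    here as a placeholder (an assumption, see the text above).
    """
    if text is None:
        return None
    lat_str, lon_str = text.strip().strip("()").split(",")
    lat, lon = float(lat_str), float(lon_str)
    if lat == 0.0 and lon == 0.0:
        return None
    return lat, lon

# Two sample rows from this report, plus a placeholder and a missing entry.
samples = ["(50.775, 6.08333)", "(0.0, 0.0)", None, "(-33.16667, -64.95)"]
parsed = [parse_geolocation(s) for s in samples]
print(parsed)
```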

Length

[Figure: Histogram of lengths of the category]

Value | Count | Frequency (%)
0.0 | 12652 | 16.5%
35.66667 | 4991 | 6.5%
71.5 | 4761 | 6.2%
84.0 | 3041 | 4.0%
168.0 | 3040 | 4.0%
26.0 | 1512 | 2.0%
72.0 | 1506 | 2.0%
79.68333 | 1130 | 1.5%
76.71667 | 680 | 0.9%
159.75 | 657 | 0.9%
Other values (26608) | 42852 | 55.8%

Most occurring characters

Value | Count | Frequency (%)
. | 76822 | 11.6%
6 | 67560 | 10.2%
7 | 52499 | 7.9%
0 | 49033 | 7.4%
3 | 44771 | 6.7%
1 | 44476 | 6.7%
5 | 42757 | 6.4%
( | 38411 | 5.8%
, | 38411 | 5.8%
(space) | 38411 | 5.8%
Other values (6) | 171544 | 25.8%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 406756 | 61.2%
Other Punctuation | 115233 | 17.3%
Open Punctuation | 38411 | 5.8%
Space Separator | 38411 | 5.8%
Close Punctuation | 38411 | 5.8%
Dash Punctuation | 27473 | 4.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
6 | 67560 | 16.6%
7 | 52499 | 12.9%
0 | 49033 | 12.1%
3 | 44771 | 11.0%
1 | 44476 | 10.9%
5 | 42757 | 10.5%
8 | 32680 | 8.0%
2 | 29923 | 7.4%
4 | 23646 | 5.8%
9 | 19411 | 4.8%

Other Punctuation
Value | Count | Frequency (%)
. | 76822 | 66.7%
, | 38411 | 33.3%

Open Punctuation
Value | Count | Frequency (%)
( | 38411 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 38411 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 38411 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 27473 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 664695 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
. | 76822 | 11.6%
6 | 67560 | 10.2%
7 | 52499 | 7.9%
0 | 49033 | 7.4%
3 | 44771 | 6.7%
1 | 44476 | 6.7%
5 | 42757 | 6.4%
( | 38411 | 5.8%
, | 38411 | 5.8%
(space) | 38411 | 5.8%
Other values (6) | 171544 | 25.8%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 664695 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
. | 76822 | 11.6%
6 | 67560 | 10.2%
7 | 52499 | 7.9%
0 | 49033 | 7.4%
3 | 44771 | 6.7%
1 | 44476 | 6.7%
5 | 42757 | 6.4%
( | 38411 | 5.8%
, | 38411 | 5.8%
(space) | 38411 | 5.8%
Other values (6) | 171544 | 25.8%

source
Categorical

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 MiB

Top values (count):
NASA: 45726

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 182904
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NASA
2nd row: NASA
3rd row: NASA
4th row: NASA
5th row: NASA

Common Values

Value | Count | Frequency (%)
NASA | 45726 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value | Count | Frequency (%)
nasa | 45726 | 100.0%

Most occurring characters

Value | Count | Frequency (%)
A | 91452 | 50.0%
N | 45726 | 25.0%
S | 45726 | 25.0%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 182904 | 100.0%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
A | 91452 | 50.0%
N | 45726 | 25.0%
S | 45726 | 25.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 182904 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
A | 91452 | 50.0%
N | 45726 | 25.0%
S | 45726 | 25.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 182904 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
A | 91452 | 50.0%
N | 45726 | 25.0%
S | 45726 | 25.0%

boolean
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 44.8 KiB

Value | Count | Frequency (%)
True | 22934 | 50.2%
False | 22792 | 49.8%

mixed
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 MiB

Top values (count):
A: 22889
1: 22837

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 45726
Distinct characters: 2
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: A
3rd row: 1
4th row: A
5th row: A

Common Values

Value | Count | Frequency (%)
A | 22889 | 50.1%
1 | 22837 | 49.9%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value | Count | Frequency (%)
a | 22889 | 50.1%
1 | 22837 | 49.9%

Most occurring characters

Value | Count | Frequency (%)
A | 22889 | 50.1%
1 | 22837 | 49.9%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 22889 | 50.1%
Decimal Number | 22837 | 49.9%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
A | 22889 | 100.0%

Decimal Number
Value | Count | Frequency (%)
1 | 22837 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 22889 | 50.1%
Common | 22837 | 49.9%

Most frequent character per script

Latin
Value | Count | Frequency (%)
A | 22889 | 100.0%

Common
Value | Count | Frequency (%)
1 | 22837 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 45726 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
A | 22889 | 50.1%
1 | 22837 | 49.9%

unhashable
Unsupported

REJECTED  UNSUPPORTED 

Missing: 0
Missing (%): 0.0%
Memory size: 2.4 MiB

reclat_city
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 38401
Distinct (%): > 99.9%
Missing: 7315
Missing (%): 16.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -39.153542
Minimum: -104.31717
Maximum: 77.749011
Zeros: 0
Zeros (%): 0.0%
Negative: 26603
Negative (%): 58.2%
Memory size: 357.4 KiB

Quantile statistics

Minimum: -104.31717
5-th percentile: -87.871058
Q1: -78.407752
median: -68.975293
Q3: 4.7886449
95-th percentile: 35.42981
Maximum: 77.749011
Range: 182.06618
Interquartile range (IQR): 83.196397

Descriptive statistics

Standard deviation: 46.685687
Coefficient of variation (CV): -1.1923745
Kurtosis: -1.446385
Mean: -39.153542
Median Absolute Deviation (MAD): 17.255843
Skewness: 0.48160358
Sum: -1503926.7
Variance: 2179.5534
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]
Most frequent values
Value | Count | Frequency (%)
50.51806008 | 2 | < 0.1%
43.27957156 | 2 | < 0.1%
52.01104434 | 2 | < 0.1%
-32.5810219 | 2 | < 0.1%
49.60726921 | 2 | < 0.1%
-29.65152821 | 2 | < 0.1%
36.5165896 | 2 | < 0.1%
-23.28864666 | 2 | < 0.1%
23.16596589 | 2 | < 0.1%
52.70663547 | 2 | < 0.1%
Other values (38391) | 38391 | 84.0%
(Missing) | 7315 | 16.0%

Smallest values
Value | Count | Frequency (%)
-104.3171665 | 1 | < 0.1%
-102.4312375 | 1 | < 0.1%
-102.0868253 | 1 | < 0.1%
-101.5556373 | 1 | < 0.1%
-101.3269284 | 1 | < 0.1%
-101.2084341 | 1 | < 0.1%
-101.0146935 | 1 | < 0.1%
-100.9191264 | 1 | < 0.1%
-100.7856947 | 1 | < 0.1%
-100.5751117 | 1 | < 0.1%

Largest values
Value | Count | Frequency (%)
77.74901083 | 1 | < 0.1%
72.80622023 | 1 | < 0.1%
72.75730423 | 1 | < 0.1%
72.42607973 | 1 | < 0.1%
72.25809595 | 1 | < 0.1%
71.78938297 | 1 | < 0.1%
71.42543169 | 1 | < 0.1%
70.89755212 | 1 | < 0.1%
70.53373183 | 1 | < 0.1%
70.48523932 | 1 | < 0.1%

Interactions

[Figure: interaction plots (six panels)]