Generates profile reports from a pandas DataFrame.
The pandas df.describe() function is great but a little basic for serious exploratory data analysis.
pandas_profiling extends the pandas DataFrame with df.profile_report() for quick data analysis.
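As a rough illustration of the difference, the sketch below first calls df.describe() on a tiny made-up DataFrame, then defines a helper that would build the HTML report. The column names, the helper name build_report, the report title and the output path are all assumptions for the example; the helper is only defined here, not called, because it needs pandas_profiling installed.

```python
import pandas as pd

# Tiny illustrative dataset (values are made up for the example).
df = pd.DataFrame(
    {
        "age": [22, 35, 58, None],
        "city": ["Oslo", "Rome", "Oslo", "Lima"],
    }
)

# Baseline: by default describe() summarises only the numeric columns.
summary = df.describe()
print(summary)


def build_report(frame, path="report.html"):
    """Hypothetical helper: write an HTML profile report to `path`.

    Requires pandas_profiling to be installed.
    """
    # Importing the package is what adds profile_report() to DataFrame.
    import pandas_profiling  # noqa: F401

    report = frame.profile_report(title="Example profile")
    report.to_file(path)
    return path
```

Calling build_report(df) should write report.html to the working directory, which can then be opened in a browser.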
For each column the following statistics - if relevant for the column type - are presented in an interactive HTML report:
Type inference: detect the types of columns in a dataframe.
Essentials: type, unique values, missing values
Quantile statistics like minimum value, Q1, median, Q3, maximum, range, interquartile range
Descriptive statistics like mean, mode, standard deviation, sum, median absolute deviation, coefficient of variation, kurtosis, skewness
Most frequent values
Correlations: highlighting of highly correlated variables; Spearman, Pearson and Kendall matrices
Missing values: matrix, count, heatmap and dendrogram of missing values
Duplicate rows: lists the most frequently occurring duplicate rows
Text analysis: learn about categories (Uppercase, Space), scripts (Latin, Cyrillic) and blocks (ASCII) of text data
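To make a few of the statistics listed above concrete, here is a minimal sketch that computes some of them by hand with plain pandas. It is not how pandas_profiling computes them internally; the data and the column names are invented for the example.

```python
import pandas as pd

# Made-up data: one missing value, plus one duplicated row appended below.
df = pd.DataFrame(
    {
        "height": [150.0, 160.0, 170.0, 180.0, None],
        "weight": [50.0, 60.0, 70.0, 80.0, 90.0],
    }
)
df = pd.concat([df, df.iloc[[0]]], ignore_index=True)

col = df["weight"]

# Quantile statistics: Q1, median, Q3 and the interquartile range.
q1, median, q3 = col.quantile([0.25, 0.5, 0.75])
iqr = q3 - q1

# Descriptive statistics: mean, standard deviation, coefficient of
# variation, skewness and kurtosis.
mean = col.mean()
std = col.std()
coefficient_of_variation = std / mean
skewness = col.skew()
kurtosis = col.kurt()

# Most frequent values.
top_values = col.value_counts().head(3)

# Correlation matrices (missing values are excluded pairwise).
pearson = df.corr(method="pearson")
spearman = df.corr(method="spearman")

# Missing values per column, and the number of duplicated rows.
missing_counts = df.isna().sum()
duplicate_count = int(df.duplicated().sum())

print(f"IQR={iqr}, missing={missing_counts.to_dict()}, duplicates={duplicate_count}")
```

The report presents figures of this kind for every column at once, which is the convenience over assembling them one call at a time.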