Start by loading your pandas DataFrame as you normally would, e.g. by using:
import numpy as np
import pandas as pd
from pandas_profiling import ProfileReport
df = pd.DataFrame(np.random.rand(100, 5), columns=["a", "b", "c", "d", "e"])
To generate the standard profiling report, merely run:
profile = ProfileReport(df, title="Pandas Profiling Report")
Using inside Jupyter Notebooks
There are two interfaces to consume the report inside a Jupyter notebook (see animations below): through widgets and through an embedded HTML report.
The widget interface displays the report as a set of widgets. In a Jupyter Notebook, run:
profile.to_widgets()
The HTML report can be directly embedded in a cell in a similar fashion:
profile.to_notebook_iframe()
Exporting the report to a file
To generate an HTML report file, save the ProfileReport to an object and use its to_file() method:
profile.to_file("your_report.html")
Alternatively, the report’s data can be obtained as a JSON file:
# As a JSON string
json_data = profile.to_json()
# As a file
profile.to_file("your_report.json")
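Because to_json() returns a string, the standard json module is enough to inspect it. The sketch below lists the top-level sections of a report, assuming the report serializes to a JSON object at the top level. The helper name report_sections is hypothetical, and the stand-in string only mimics the shape of a JSON object, not the actual report schema.

```python
import json


def report_sections(json_data: str) -> list:
    """Return the sorted top-level keys of a JSON report string."""
    report = json.loads(json_data)
    if not isinstance(report, dict):
        raise ValueError("expected a JSON object at the top level")
    return sorted(report)


# Stand-in input for illustration only; a real report has its own keys.
# In practice, pass the string returned by profile.to_json().
example = '{"section_b": {}, "section_a": {}}'
print(report_sections(example))
```

With a real report, the call would be report_sections(profile.to_json()).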
Command line usage
For standard formatted CSV files (which can be read directly by pandas without additional settings), the pandas_profiling executable can be used in the command line. The example below generates a report named Example Profiling Report, using a configuration file called default.yaml, in the file report.html by processing a data.csv dataset.
pandas_profiling --title "Example Profiling Report" --config_file default.yaml data.csv report.html
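The command above expects data.csv to already exist. As a minimal sketch, the snippet below writes a DataFrame to a CSV file that pandas can read back with default settings. The temporary directory and file name are placeholders chosen for this example.

```python
import os
import tempfile

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(100, 5), columns=["a", "b", "c", "d", "e"])

# Write to a temporary directory so the example leaves no files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "data.csv")
    # index=False keeps the row index out of the file, so the CSV holds
    # only the five data columns.
    df.to_csv(csv_path, index=False)
    # Reading with default settings should give back the same shape.
    round_trip = pd.read_csv(csv_path)

print(round_trip.shape)
```

Leaving out index=False would add an unnamed index column to the file, which would then show up as an extra variable in the report.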
The CLI allows defining input and output filenames, setting a custom report title, specifying a configuration file for custom behaviour, and controlling other advanced aspects of the experience. Information about all available options and arguments can be viewed with:
pandas_profiling -h
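When reports are generated from scripts, the same command can be assembled in Python. The sketch below uses only the options that appear in this document (--title, --config_file, and the -e flag for explorative mode). build_profiling_command is a hypothetical helper written for this example, not part of pandas-profiling.

```python
import shlex


def build_profiling_command(csv_path, html_path, title=None,
                            config_file=None, explorative=False):
    """Assemble the pandas_profiling CLI call as an argument list."""
    cmd = ["pandas_profiling"]
    if title is not None:
        cmd += ["--title", title]
    if config_file is not None:
        cmd += ["--config_file", config_file]
    if explorative:
        cmd.append("-e")
    # Positional arguments: input CSV first, output HTML second.
    cmd += [csv_path, html_path]
    return cmd


cmd = build_profiling_command(
    "data.csv", "report.html",
    title="Example Profiling Report", config_file="default.yaml",
)
# shlex.join quotes arguments so the printed line can be pasted into a shell.
print(shlex.join(cmd))
```

To execute the command, pass the list to subprocess.run(cmd, check=True). This requires the pandas_profiling executable to be available on the PATH.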
The contents, behaviour and appearance of the report are easily customizable. The example code below loads the explorative configuration file, which includes many features for text analysis (length distribution, word distribution and character/unicode information), files (file size, creation time) and images (dimensions, EXIF information). The exact settings used in this explorative configuration file can be compared with the default configuration file.
profile = ProfileReport(df, title="Pandas Profiling Report", explorative=True)
On the CLI utility pandas_profiling, this mode can be activated with the -e flag. Learn more about configuring pandas-profiling on the Available settings page.