Getting started

Start by loading in your pandas DataFrame, e.g. by using:
import numpy as np
import pandas as pd
from pandas_profiling import ProfileReport

df = pd.DataFrame(np.random.rand(100, 5), columns=["a", "b", "c", "d", "e"])
To generate the report, run:
profile = ProfileReport(df, title="Pandas Profiling Report")
You can configure the profile report in any way you like. The example code below loads the explorative configuration file, which includes many features for text (length distribution, word distribution and character/Unicode information), files (file size, creation time) and images (dimensions, EXIF information). If you are interested in the exact settings used, you can compare it with the default configuration file.
profile = ProfileReport(df, title="Pandas Profiling Report", explorative=True)
Learn more about configuring pandas-profiling on the Advanced Usage page.
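Since the explorative configuration targets text columns among other things, a small text-bearing DataFrame is a convenient way to try it. The sketch below is illustrative only: the sample data, the `df_text` name and the `profile_explorative` helper are hypothetical, and the only library call it relies on is the `ProfileReport(..., explorative=True)` form shown above.

```python
import pandas as pd

# Hypothetical sample data: one free-text column and one numeric column.
df_text = pd.DataFrame(
    {
        "review": ["great product", "would not buy again", "okay", "great value"],
        "rating": [5, 1, 3, 4],
    }
)


def profile_explorative(frame, title="Explorative Report"):
    """Build a report with the explorative configuration enabled."""
    # Imported inside the function so the sample data above can be
    # built and inspected even where pandas-profiling is not installed.
    from pandas_profiling import ProfileReport

    return ProfileReport(frame, title=title, explorative=True)


# Usage (requires pandas-profiling):
# profile = profile_explorative(df_text)
```

Keeping the report construction in a helper makes it easy to compare the explorative and default configurations on the same data.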
We recommend generating reports interactively in a Jupyter notebook. There are two interfaces: widgets and an HTML report.
The widgets interface is obtained by displaying the report as widgets. In the Jupyter Notebook, run:

profile.to_widgets()
The HTML report can be embedded directly in a Jupyter notebook. Run the following code:

profile.to_notebook_iframe()
Saving the report
If you want to generate an HTML report file, save the ProfileReport to an object and use the to_file() method:

profile.to_file("your_report.html")
Alternatively, you can obtain the data as JSON:

# As a string
json_data = profile.to_json()

# As a file
profile.to_file("your_report.json")
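Once the report is available as a JSON string, it can be inspected with the standard library. The following is a minimal sketch: `report_sections` is a hypothetical helper, and the `example` string is a stand-in with made-up keys, not the real report structure. In practice you would pass the string returned by `profile.to_json()`.

```python
import json


def report_sections(json_data):
    """Return the sorted top-level keys of a JSON-serialized report."""
    return sorted(json.loads(json_data).keys())


# Stand-in string for illustration only; the keys are invented and do
# not claim to match what profile.to_json() actually returns.
example = '{"table": {"n": 100}, "variables": {"a": {}}}'

print(report_sections(example))  # → ['table', 'variables']
```

Listing the top-level keys first is a cheap way to see what a report contains before digging into individual entries.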
Command line usage
For standard formatted CSV files that can be read immediately by pandas, you can use the pandas_profiling executable. Run

pandas_profiling -h

for information about options and arguments.
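The executable can also be driven from a Python script, for example in a batch job. The helpers below are a hypothetical sketch: they assume the executable takes the input CSV and the output HTML file as positional arguments and accepts a `--title` option, which you should confirm against the help output of your installed version.

```python
import subprocess


def build_profiling_command(input_csv, output_html, title=None):
    """Build the argument list for the pandas_profiling executable.

    Hypothetical helper: the positional input/output arguments and the
    --title option are assumptions to check against the help output.
    """
    command = ["pandas_profiling"]
    if title is not None:
        command += ["--title", title]
    command += [input_csv, output_html]
    return command


def run_profiling(input_csv, output_html, title=None):
    """Run the executable; raises CalledProcessError if it exits non-zero."""
    subprocess.run(build_profiling_command(input_csv, output_html, title), check=True)


print(build_profiling_command("data.csv", "report.html", title="Example"))
# → ['pandas_profiling', '--title', 'Example', 'data.csv', 'report.html']
```

Passing the arguments as a list avoids shell quoting problems with titles or paths that contain spaces.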