Other DataFrame libraries

pandas-profiling is built on pandas and numpy. Pandas supports a wide range of data formats including CSV, XLSX, SQL, JSON, HDF5, SAS, BigQuery and Stata. Read more on supported formats by Pandas.

If you have data in another framework of the Python Data ecosystem, you can use pandas-profiling by converting to a pandas DataFrame, as direct integrations are not yet supported. Large datasets might require sampling (as seen in Profiling large datasets).

PySpark to Pandas
 # Convert spark RDD to a pandas DataFrame
 df = spark_df.toPandas()
Dask to Pandas
 # Convert dask DataFrame to a pandas DataFrame
 df = df.compute()
Vaex to Pandas
 # Convert vaex DataFrame to a pandas DataFrame
 df = df.to_pandas_df()
Modin to Pandas
# Convert modin DataFrame to pandas DataFrame
df = df._to_pandas()

# Note that:
#   "This is not part of the API as pandas.DataFrame, naturally, does not posses such a method.
#   You can use the private method DataFrame._to_pandas() to do this conversion.
#   If you would like to do this through the official API you can always save the Modin DataFrame to
#   storage (csv, hdf, sql, ect) and then read it back using Pandas. This will probably be the safer
#   way when working big DataFrames, to avoid out of memory issues."
# Source: https://github.com/modin-project/modin/issues/896