Data peeping function?


Thor Whalen

The first thing I do once I import new data (as a pandas dataframe) is to .head() it, .describe() it, and then kick around a few specific stats according to what I see.

But I'm not satisfied with .describe(). Among other things, by default it ignores non-numerical columns in a mixed dataframe, and it computes the same off-the-shelf stats for every numerical column, whether or not they make sense for that column.
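To make the complaint concrete, here is a small sketch on a made-up dataframe (the column names and values are invented for illustration). Passing include="all" does bring the non-numerical columns into the summary, but it still doesn't decide for me which columns are categorical or flag anything suspicious.

```python
import pandas as pd

# Made-up example data: one numerical and one string column, each with a gap.
df = pd.DataFrame({
    "price": [1.5, 2.0, None, 4.0],
    "color": ["red", "blue", "red", None],
})

# Default call: only the numerical column is summarized.
default_summary = df.describe()
print(list(default_summary.columns))

# include="all" adds the string column (count, unique, top, freq).
full_summary = df.describe(include="all")
print(list(full_summary.columns))
```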

I've been shopping around for a "data peeping" function that would:

(1) Have a hands-off mode, where I simply type
diagnose_this(data)
and the function figures things out on its own, notifying me when in doubt. For example, it would assume that any string column with not too many unique values should be treated as categorical, and compute the appropriate statistics for it.

(2) Perform standard diagnoses and print them out. For example, (a) missing values? (b) heterogeneously formatted data? (c) columns with only one unique value? etc.

(3) Be parametrizable, if I so choose.
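Roughly what I have in mind, as a minimal sketch and not an existing library function: diagnose_this is just the name I used above, and the max_categories parameter and its default of 20 are assumptions made up for this example.

```python
import pandas as pd


def diagnose_this(data, max_categories=20):
    """Return one row of diagnostics per column of `data`.

    `max_categories` is an assumed threshold: a non-numerical column with
    at most this many distinct values is flagged as likely categorical.
    """
    rows = []
    for name in data.columns:
        col = data[name]
        non_null = col.dropna()
        n_unique = int(non_null.nunique())
        is_numeric = pd.api.types.is_numeric_dtype(col)
        rows.append({
            "column": name,
            "dtype": str(col.dtype),
            # (a) missing values
            "n_missing": int(col.isna().sum()),
            "n_unique": n_unique,
            # (b) heterogeneous data: more than one Python type in the column
            "mixed_types": bool(non_null.map(type).nunique() > 1),
            # (c) columns with only one unique value (or none at all)
            "constant": n_unique <= 1,
            # hands-off guess at categorical columns
            "likely_categorical": (not is_numeric) and 1 < n_unique <= max_categories,
        })
    return pd.DataFrame(rows).set_index("column")


# Made-up example data to exercise each check.
data = pd.DataFrame({
    "price": [1.5, 2.0, None, 4.0],
    "color": ["red", "blue", "red", None],
    "source": ["web", "web", "web", "web"],
    "code": [1, "a", 2, "b"],
})

report = diagnose_this(data)                       # hands-off mode
strict = diagnose_this(data, max_categories=2)     # parametrized
print(report)
```

The result is itself a dataframe, so it can be filtered like any other, e.g. report[report["n_missing"] > 0]. The "mixed_types" check only compares Python types, so it would not catch, say, dates written in two different string formats.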

Does anyone know of such a function?
 
