Data cleaning workouts

Fg Nu

List folk,

I am a newbie trying to get used to Python. I was wondering if anyone knows of web resources that teach good practices in data cleaning and management for statistics/analytics/machine learning, particularly using Python.

Ideally, these would be exercises of the form: here is some horrible raw data --> here is what it should look like after it has been cleaned. Guidelines about steps that should always be taken, practices that should be avoided; basically, workflow of data analysis in Python with special emphasis on the cleaning part.
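
As an illustration of that form (not taken from any existing resource), a toy exercise might look like the sketch below. It uses only the standard library; the sample data, the MISSING set, and the clean_rows helper are all hypothetical names made up for this example, and a real workflow would likely need more checks than these.

```python
import csv
import io

# Hypothetical "horrible raw data": stray whitespace, inconsistent case,
# a missing-value placeholder, and a duplicated row.
RAW = """name, age ,city
 Alice ,34, new york
BOB,n/a,Boston
 Alice ,34, new york
carol, 29 ,  CHICAGO
"""

# Assumed spellings of "missing"; extend for your own data.
MISSING = {"", "n/a", "na", "null"}


def clean_rows(text):
    """Return a list of cleaned dicts parsed from raw CSV text."""
    reader = csv.reader(io.StringIO(text))
    # Normalise header names: trim whitespace, lower-case.
    header = [h.strip().lower() for h in next(reader)]
    seen = set()
    cleaned = []
    for row in reader:
        if not row:  # skip completely empty lines
            continue
        record = dict(zip(header, (v.strip() for v in row)))
        # Consistent capitalisation for text fields.
        record["name"] = record["name"].title()
        record["city"] = record["city"].title()
        # Map missing-value placeholders to None, otherwise parse as int.
        age = record["age"].lower()
        record["age"] = None if age in MISSING else int(age)
        # Drop exact duplicates (compared after cleaning).
        key = tuple(record.items())
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(record)
    return cleaned


if __name__ == "__main__":
    for rec in clean_rows(RAW):
        print(rec)
```

Here the "after" state is three records (Alice, Bob, Carol) with trimmed, title-cased text, integer ages, and None where the age was not given.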
 
rusi

Since no one has answered, I suggest you narrow your searching from
'python' to 'scipy' (or 'numpy').
Also perhaps ipython.
And then perhaps try those specific mailing lists/fora.

Since I don't know this area well, I won't say more.
 
Fg Nu

Thanks. I will try the SciPy list. It was a bit of a Hail Mary anyway. Pretty sure elevated Python types don't actually get their hands dirty with data. ;)



----- Original Message -----
From: rusi <[email protected]>
To: (e-mail address removed)
Cc:
Sent: Thursday, August 23, 2012 11:01 PM
Subject: Re: Data cleaning workouts

 
Mark Lawrence

Elevated Python types don't get their hands dirty top posting, but I'm
certain that they would when talking data or there wouldn't be so many
debates on which data type to use :)
 
