Fundamentals of Data Science with Python
Brief Introduction
Implement powerful data science techniques with Python using NumPy, SciPy, Matplotlib, and scikit-learn
Description
Python has grown into a key language that can be used to develop solutions for a variety of data science challenges. This course will teach you the fundamentals of data science using Python and its growing collection of libraries that focus on particular elements of data science.
In this course, we will get hands-on with a variety of data science tasks. After a short primer on Python, you will start with a first task: sourcing, processing, and cleaning a dataset. Then, you will use Python to mine data from its source and analyze it with statistical and probability techniques using NumPy and pandas. You will also model data to make AI-based predictions using the SciPy, scikit-learn, and statsmodels libraries. The course also covers visualization with the Matplotlib library to display this analysis and reveal patterns in the data.
By the end of this course, you will be able to work on data science tasks in a practical way with different Python libraries and achieve your goals.
About the Author
Nicolas Rangeon is a freelance data scientist. He has spent the last two years teaching data science, emphasizing how to store, retrieve, and analyze data from any kind of database. He has developed a feel for teaching both technical skills and mathematical concepts, both of which are required to become a proficient data analyst.
After graduating with a Master's degree in Computer Science, Nicolas worked as a freelance data scientist and data engineer for several small businesses, where he deployed, managed, and mined databases to get value from their stored data.
When it comes to deploying and managing a relational database, his first choice is always PostgreSQL, due to its robustness and its ability to handle large amounts of data efficiently.
Requirements
- The course begins with a primer on Python, so you don’t have to worry if you haven’t worked with Python before.
Outline
- Primer on Python
  - Course Overview
  - Introduction to Python
  - Installing Python and Creating a First Jupyter Notebook
  - Overview of the Different Variable Types
  - Manipulating Variables with Operators
  - Writing Functions with Python
  - Conditions and Loops
  - Object-Oriented Programming with Python
  - Test your knowledge
- Python for Data Science
  - Introduction to the NumPy Array
  - Manipulating NumPy Arrays with Operators and Aggregate Functions
  - Making Your First Steps with Pandas
  - Performing Common Operations with Pandas
  - Test your knowledge
- Getting Your Dataset Ready for Processing
  - Sourcing the Data
  - Getting Familiar with Our Two Datasets
  - Loading the Datasets into Your Program
  - Getting a Global Overview of the Data
  - Finding Missing Values in Data
  - Cleaning the Data for Use
  - Test your knowledge
- How Visualizations Work
  - Using the Simple Bar Graph
  - Exploring Histogram
  - Working with Boxplots
  - Detecting Correlations with Scatter Plots
  - Extending Matplotlib's Possibilities Thanks to Seaborn
  - Dealing with Several Plots
  - Test your knowledge
- Working with Statistics and Probability
  - Finding Patterns with Descriptive Statistics
  - Finding Patterns with Python: Univariate Analysis
  - Finding Patterns with Python: Bivariate Analysis
  - Working with Probability and Distribution
  - Inferential Statistics: Testing Hypothesis with the T-Test and the Chi² Test
  - Test your knowledge
- Statistical Modelling and Fitting
  - Exploring Statistical Modelling
  - Using Data to Test a Statistical Model
  - Exploring Linear Regression
  - Analysis of Variance (ANOVA)
  - Working with Logistic Regression
  - Test your knowledge
- Explore Machine Learning
  - Getting Started with Machine Learning and AI
  - Differentiating Types of Learning
  - Training a Model with Scikit-Learn
  - Test your knowledge