PySpark Essentials for Data Scientists (Big Data + Python)


Oranek

 

Language: English

Format: MP4 | AVC 1280×720 | AAC 48KHz 2ch

Duration: 16 Hours

Size: 7.43 GB

 

Learn how to wrangle Big Data for Machine Learning using Python and MLflow on Apache Spark, taught by an industry expert!

 

This course is for data scientists (or aspiring data scientists) who want PRACTICAL training in PySpark (Python for Apache Spark) using REAL WORLD datasets and APPLICABLE coding knowledge that you’ll use every day as a data scientist! By enrolling in this course, you’ll gain access to over 100 lectures, hundreds of example problems and quizzes, and over 100,000 lines of code!

 

I’m going to cover the essentials you need to know to be an expert in PySpark by the end of this course, which I’ve designed based on my EXTENSIVE experience consulting as a data scientist for clients like the IRS, the US Department of Labor and United States Veterans Affairs.

 

I’ve structured the lectures and coding exercises for real-world application, so you can understand how PySpark is actually used on the job. We are also going to dive into custom functions that I wrote MYSELF to get you up and running in the MLlib API fast and make getting started with building machine learning models a breeze! We will also touch on MLflow, which will help us manage and track our model training and evaluation process in a custom user interface and make you even more competitive on the job market!

 

Each section will have a concept review lecture, code-along activities and structured problem sets for you to work through to help you put what you have learned into action, as well as the solutions to each problem in case you get stuck. Additionally, real-world consulting projects have been provided in every section with AUTHENTIC datasets to help you think through how to apply each of the concepts we have covered.

 

Lastly, I’ve written up some condensed review notebooks and handouts of all the course content to make it easy for you to reference later on. This will be super helpful once you land your first job programming in PySpark!

 

 

What you’ll learn

⭐ Use Python with Big Data on a distributed framework (Apache Spark)
⭐ Work with REAL datasets on realistic consulting projects
⭐ Get hands-on practice solving REAL problems with BIG DATA
⭐ Integrate a UI to monitor your model training and development process with MLflow
⭐ Theory and application of cutting-edge data science algorithms
⭐ Manipulate, Join and Aggregate DataFrames in Spark with Python
⭐ Learn how to apply Spark’s machine learning techniques on distributed DataFrames
⭐ Cross Validation & Hyperparameter Tuning
⭐ Frequent Pattern Mining Techniques
⭐ Classification & Regression Techniques
⭐ Data Wrangling for Natural Language Processing
⭐ How to write SQL Queries in Spark

 

 


 

 

NOTE: PLEASE REACT TO THIS POST AND FOLLOW ME AS WELL SO YOU STAY UPDATED ON MY POSTS
