Learn Big Data with Google's Dataflow in 8 weeks.
Big Data just got a lot easier!
In 2015 Google released Dataflow: an open source SDK and a fully managed service for building and executing sophisticated Big Data computation pipelines. The Dataflow SDK removes the burden of having to be a MapReduce expert, while the Dataflow managed service removes the need to build out a complex Big Data infrastructure. Moreover, as of February 1st, 2016, the Dataflow SDK has been officially accepted as an Apache Software Foundation incubator project and renamed Apache Beam. Pipelines written with the Apache Beam SDK can be executed on infrastructures such as Apache Flink and Apache Spark, as well as Google's Dataflow managed service. Going forward, this is likely to establish the Dataflow/Apache Beam SDK as THE standard for building and executing Big Data computations.
Be the first to become an expert Dataflow developer. Harness the power of Google-style Big Data by learning how to build sophisticated pipelines that bring insight into your app's user behavior & habits, imbue your products with real-time analytics, and let you slice & dice data for machine learning, all in 8 short weeks.
Who are we?
We're a group of experienced Software Engineers who've previously worked at a number of large software companies (including Google). We've witnessed firsthand the power of Big Data computing and seen how critical it is to the success of major software corporations. Frankly speaking, deciding not to employ Big Data in a tech business is almost as bad as deciding to go for a drive with your eyes closed. DON'T DO IT!
Here’s what we want to build
To be fair to all the tech entrepreneurs out there, in the past the costs and the tech expertise associated with Big Data computing were often prohibitive. You'd need senior engineering expertise to set up & manage your infrastructure, as well as to build & run the actual pipelines. With the advent of Google's Dataflow, these barriers largely disappear.
We intend for this course to be a practical, hands-on guide that you can use to get started with Big Data quickly and easily. We're expecting to create up to 10 hours of video content. We'll also be providing coding examples for nearly every section of our course. It will kick off with a walkthrough of all that Google's Dataflow platform has to offer. Then we'll jump right into building concrete examples that you can apply to your business today. The course will wrap up with one or two (depending on how much funding we raise) full system examples that you can use as a foundation for real products. The final product will retail for $199 USD.
How is this course different?
Google's Dataflow is very new. As far as we know, no other course on it exists on the market yet. That said, our team has years of experience with Dataflow's predecessor, FlumeJava, which has been used internally at Google since 2010. Because of this experience, we feel confident in saying that we have the best possible perspective on how and where to apply this technology, as well as how to get the most out of it.
Why learn Big Data and Google's Dataflow?
- With the power of Big Data you'll be able to build pipelines that extract vital analytics from terabytes of data. Practically speaking, you'll be able to get custom business intelligence about your product, understand how folks are engaging with your social media accounts, and much more.
- Dataflow/Apache Beam SDK is likely to become the standard way of building and executing Big Data computation pipelines. This is your chance to get ahead of the curve by learning it today.
Who is this for?
- Any developer looking to broaden their technical horizons by learning what Big Data is and how to apply it.
- Modern tech startups that generate tons of logs that harbor valuable user behavior data (that's most startups today).
What will the course cover?
Section 1 - Introduction
Section 2 - A Brief History of Big Data - MapReduce to Dataflow
Section 3 - Setup of Google Cloud Platform Account & Tools
Section 4 - Installing Eclipse
Section 5 - Java Crash Course
Section 6 - Overview of Dataflow SDK and Dataflow Managed Service
Section 7 - Batch Pipelines
Section 8 - 10 - Building & Testing Batch Pipeline Examples
Section 11 - Real-time/Streaming Pipelines
Section 12 - 14 - Building & Testing Real-time/Streaming Pipeline Examples
Section 15 - End-to-End Log Analytics Example
Section 16 - Where do you go from here?
Will this course be for you?
If you're looking to build a career in the much sought-after Big Data engineering profession, then this course is the perfect way for you to get started on a path to that goal.
If you're a small startup that's begun to acquire new users and you don't yet have plans in place for how you're going to convert your application logs into key customer insights, then this course is definitely for you!
Why are we doing a Kickstarter?
We love the community aspect of Kickstarter. Being able to connect with our students before the course is fully built is an amazing opportunity for us to make sure that we're teaching what people actually need. We want this course to show practical solutions to existing problems. Getting a dialog going with our future students is the best way of getting this done!
Risks and challenges
There are very few risks here. Your money will only start getting spent after the lesson plans are fully constructed and pre-production has been done. In the unlikely event that we cannot get through pre-production, we will NOT move ahead with shooting videos and coding examples. We'll simply refund everyone.