Learning Data Science on the Ethereum Blockchain with @omnianalytics
Develop modular online courses combining data science, blockchain analytics, and statistical programming, aimed at new Ethereum developers. Utilize case studies and open data sources for practical applications.

We want to share our joy of data science with the Ethereum ecosystem through an informative set of online courses and case studies that merge data analysis, blockchain analytics, and statistical programming.

We aspire to create a modular online course to help new blockchain developers understand the principles and best practices of data science. Using open data sources from across the Ethereum landscape (think DefiLlama, Uniswap, and Etherscan), the course will teach data munging, data visualization, exploratory data analysis, and machine learning, and will dabble in a bit of deep learning and artificial intelligence. As the course grows, we intend to create case studies on applying data science to specific DApps (think machine learning techniques for predicting markets with Numer.ai data or analyzing traits in Hashmasks).

Our ultimate vision is to spur and inspire the next generation of developers through interesting applications of data science to emerging blockchain technologies.

Course Link: https://github.com/Omni-Analytics-Group/eth-data-science-course

Course Outline

💻 Module 1 - Basic Data Structures and Munging

Motivating Example (Slides, Python Slides, Video)
Reading Files (Slides, Python Slides, Video)
Basics (Slides, Video)
Data structures (Slides, Video)

Data Sources: CryptoPunks, Crypto Art Pulse

📊 Module 2 - Statistical Graphics and Visualization

Why graph?
Visualization Principles and Practices
Plotting Basics
Building Plots Layer by Layer
Polishing Your Graphs

Data Sources: Omni Analytics Group, CryptoPunks

🤖 Module 3 - Supervised and Unsupervised Machine Learning

The ML workflow
Supervised learning for classification
Unsupervised learning for grouping
Forecasting what's next
Deep learning for sequences

Data Sources: Omni Analytics Group

💡 Module 4 - Case Studies

Clustering and segmenting Ethereum validator performance (R, Python, Video)
Visualizing slashings in the Ethereum Medalla testnet (R, Python)
Reconstructing the Crypto Sentiment Investment Curve in ggplot2
Interacting with and Analyzing Numerai Network Growth with GraphQL and ggplot2
Tornado.Cash Initial Distribution Analysis
Stable Coin Analysis
Using R Shiny to explore Numerai tournament data (Part 1, Part 2, Part 3, Part 4, Part 5, Part 6, Part 7, Part 8, Part 9)
Crypto Punks NFT Value Analysis (Slides, Video)
Hashmask Rarity Analysis (Slides, Video)
Geocoding and Mapping Cryptojobs (Slides, Video)
Cryptojobs Exploratory Data Analysis (Slides, Video)
Making Graphs fun with BadgerDAO
An Exploratory Data Analysis of PoolTogether
An Exploratory Data Analysis of Yearn.Finance
Interacting with the Covalent API via httr
Analyzing the Ethereum Name Service (ENS) (Slides)
Forecasting the Trend in Bitcoin Dominance (Slides)
An Analysis of the Uniswap Platform (Slides)
Forefront Social Token Analysis (Slides)
Gitcoin Grants Analysis (Slides)
Predicting Growth in L2 Chains (Slides)
A Statistical Dive into the Unlock Protocol (Slides, Video)
Filecoin Miner Index API Exploration (Slides)
Uniswap Airdrop Liquidity Provider Analysis (Slides)
Uniswap Governance Analysis (Slides)

Data Sources: Beaconscan, Numerai Tournament Data, Hashmasks, Crypto Punks, Crypto Jobs, PoolTogether, BadgerDAO, Yearn Finance, ENS, Uniswap, Unlock Protocol, Optimism, ForeFront, Filecoin

💜 Module 5 - Understanding and Defending Gitcoin

Quadratic Funding in the Wild: A Post-Round Analysis of Gitcoin's Fund Matching Mechanism (Slides)
Gitcoin Grants Analysis (Slides)

Data Sources: Gitcoin, Omni Analytics Group
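
For readers new to the fund matching mechanism that Module 5 analyzes, here is a minimal sketch of the textbook quadratic funding formula: a project's ideal funding is the square of the sum of the square roots of its individual contributions, and the matching pool is split in proportion to the gap between that ideal and what was actually donated. This is an illustration only, not the course's code and not a description of Gitcoin's production matching, which may add caps and fraud defenses. The function name, project names, and amounts are hypothetical.

```python
from math import sqrt


def quadratic_match(contributions, matching_pool):
    """Split a matching pool using the textbook quadratic funding formula.

    contributions: dict mapping project name -> list of individual donation amounts
    matching_pool: total amount available for matching (hypothetical figure)
    """
    raw = {}
    for project, amounts in contributions.items():
        donated = sum(amounts)
        # Ideal funding level: (sum of square roots of each contribution) squared
        ideal = sum(sqrt(a) for a in amounts) ** 2
        # The unscaled match is the gap between the ideal level and actual donations
        raw[project] = max(ideal - donated, 0.0)

    raw_total = sum(raw.values())
    if raw_total == 0:
        return {project: 0.0 for project in raw}

    # Scale the unscaled matches so they sum to the available pool
    return {project: matching_pool * r / raw_total for project, r in raw.items()}


# Hypothetical round: both projects raise 100 in total, from very different crowds
example_round = {
    "many_small_donors": [1.0] * 100,  # 100 donors giving 1 each
    "one_large_donor": [100.0],        # 1 donor giving 100
}
matches = quadratic_match(example_round, matching_pool=1000.0)
print(matches)
```

Under this formula, breadth of support matters more than the size of any single donation, which is why the project with many small donors attracts the match in this toy round.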

Lagniappe

The Omniacs' Data Science Code Snippet Tweet Book Vol 1
Kids Data Collection Worksheet

Updates

Gitcoin Grants Round 8

Like everything else in the world, 2020 flipped our course development plans upside down. Instead of building the course from the bottom up, we chose to repurpose and refactor our Medalla research into motivating case studies on how to perform data analysis on Ethereum 2.0 blockchain data. The case studies walk through, in detail, how we performed the analysis that ultimately netted us a bronze prize. We also included a tutorial on how to use R Shiny as an exploration tool for understanding the Numer.ai dApp's tournament data.

Change log

Restructured the repository for clarity
Added 3 case studies

Gitcoin Grants Round 9

For Round 9 we've doubled down on our "case study first" approach to teaching data science using projects on the Ethereum blockchain as examples. This update includes a look at NFTs, stable coins, market capitalization estimation, and an introduction to GraphQL. We expanded two of our original case studies to include Python versions, so if you are interested in learning more about that language, you can check those out in the Module 4 case study list. This update also includes our first attempt at creating video lectures for eager learners who would like to dive deeper into the concepts. We intend to use funding from this round to further expand the set of case studies we produce and improve the quality of our video content.

Change log

Major update to the course aesthetics
Module 1 updated
5 New case studies (with 2 more to be published during the active grant round)
First video lecture published to YouTube

Gitcoin Grants Round 10

This round's update had us working with our first outside contributor. @Amelia188 gave our course a proper copy edit by fixing tons of typos, correcting grammar errors, and improving the overall readability of the material. We look forward to her continued contributions and encourage others to reach out to us about opportunities to collaborate. To further expand the base content for the course, we completed the material for the second module, which focuses on statistical graphics. Other updates for this round include two NFT-related case studies and a host of new video lectures. Your support for this round will help us expand our contributor pool and further improve the quality of our content.

Change log

Tons of copy edits
4 Video lectures for Module 1 have been published
Slides for Module 2 - Statistical Graphics and Visualization have been completed
2 NFT-related case studies were created and videos produced

Gitcoin Grants Round 11

We've been busy! Over the last quarter we've been working with DAOs to help them understand their data, and this work inspired us to create two new case studies all about blockchain jobs using data from Cryptojobs. In addition to these two case studies, we've included another one on Yearn.Finance created by our newest collaborator @vintro. This update also includes 3 new videos to supplement the case studies. As usual, we really appreciate the support. Contributions this round will help us find and compensate additional course contributors.

Change log

3 video lectures
3 New case studies
Various copy edits

Gitcoin Grants Round 12

Our update for this round is on the smaller side. We have fixed a few links and added a couple of new case studies. If you can, be a little patient with us. We're going to try to come back with some big updates for the next round. Stay tuned!

Change log

2 New case studies
Various link fixes and copy edits

Gitcoin Grants Round 13

In addition to making progress on Module 3, we've doubled down on our cross-platform initiative to include more Python-coded examples. We now have the first two sections of Module 1 translated into Python. A special shout-out to @JSchoonmaker! Stay tuned because we're actually going to be updating the course throughout this round.

Change log

Added Python tutorials for Module 1
Updates to Module 3 (sit tight!)
Various link fixes and copy edits

Gitcoin Grants Round 14

We're bouncing all around for this season! The new case studies for this round touch on forecasting L2 contract deployment, exploring ENS domain names, characterizing Filecoin miners, understanding Unlock Protocol contract interactions, and analyzing Uniswap with the uniswappeR package. We'll be updating throughout the period so don't be too surprised if a few new case studies or additional content pops up out of nowhere! Also, we don't want to forget to extend kudos to @NadiaAntony for her course content contributions!

Change log

5+ new Case Studies
Various link fixes and copy edits
Minor updates to Module 3

Gitcoin Grants Round 15 and Beta Round

This round we've decided to experiment with a little bit of alternatively styled content! In the lagniappe folder you'll find a compilation "Tweet Book" of R programming tweets from our Twitter. As our first foray into children's content, we created a data collection worksheet that has young, budding data scientists describing cryptocurrency logos in spreadsheet form. It should be a short, fun exercise for anyone getting used to how data is structured. We've also included two new Uniswap-centric case studies, one of which won an award for most insightful analysis of the protocol's governance. With that, we hope you enjoy this update!

Change log

2 new Case Studies
Added our first kids data collection worksheet
Added Vol. 1 of our quick stats "Tweet book"
Various link fixes

FAQ

Do you all have experience in this stuff?

Why yes, we do! Omni Analytics Group is a team of PhD-level statistical consultants who have been teaching and solving difficult data science problems for nearly a decade. We are passionate about data science and blockchain technologies. Just check out our Twitter.

Do I need any prior experience before taking this course?

No. Our intention is to start from the beginning and build up not only your data chops, but also your statistical intuition and programming knowledge. At the end of these courses, you should be able to match a statistical technique to a blockchain data problem, write a basic script to analyze it, and confidently search online for more advanced knowledge.

What programming languages will the course focus on?

We'll initially focus on the statistical language R, but then expand to Python. As the course grows, we hope to include examples with contracts written in Solidity.

Can I request a topic?

Sure! Once we flesh out the initial course material, and if funding persists, we'd be more than happy to take suggestions on case studies or topics.

Testimonials

"This course is like a rain following a drought. It kindly walks you through the process starting from the use of R to introduction of graphs and machine learning concepts with interesting case studies. I strongly recommend it not only to researchers interested in Ethereum blockchain but also to any students or professionals that have interest in learning data analysis and science." - Will Shin (Principal Economist at Klaytn)

Impact and Accolades

March 11th, 2022 - 59 Stars - 16 Forks
June 8th, 2022 - 76 Stars - 17 Forks
Sept 6th, 2022 - 90 Stars - 21 Forks
April 16th, 2023 - 101 Stars - 25 Forks

Featured Projects

CryptoPunks, Crypto Art Pulse, Numerai, Tornado Cash, Hashmasks, Cryptojobs, BadgerDAO, Yearn.Finance, Covalent, PoolTogether, ENS, Unlock Protocol, Filecoin, Optimism, Polygon, Uniswap, Gitcoin
