atomcamp

Data Science Bootcamp Curriculum

Curriculum Highlights

Basics of Data Science

Python for AI

Machine Learning

NLP, LLMs, Computer Vision

Curriculum Overview

  • Data Literacy
  • Statistical Foundations
  • Descriptive Statistics
  • Statistical Inference and Sampling Techniques
  • Regression Analysis
  • Basic Data and Statistical Concepts
  • Data Cleaning, Preparation, and Management
  • Data Processing
  • Importing data from diverse sources
  • Creating a basic report with various visuals
  • Data analysis, manipulation and filtering in Power BI
  • Creating measures and calculated columns
  • Filtering data in a report
  • Using slicers, dynamic filtering of a report
  • Introduction to DAX
  • Using DAX to solve complex data problems
  • Visualizing cross sections of data
  • Introduction to SQL
  • Filtering, Sorting, and Aggregating Data
  • Table Joins and Case Statements
  • Advanced Query Techniques
  • Window Functions and Data Modification
  • Creating New Tables
  • Advanced Topics and Artificial Intelligence
  • Date Variables and Artificial Intelligence
  • Basics of Prompt Engineering
  • Definition and importance of prompt engineering.
  • The relationship between prompts and AI outputs.
  • Applications in Data Science and Machine Learning
  • Case studies illustrating the impact of prompt engineering in these fields.
  • Introduction to spatial analysis
  • Introduction to QGIS
  • Creating shapefiles (point, line, and polygon)
  • Learning basic Cartography
  • Learning basic vector analysis tools.
  • Creating heatmaps
  • Using Google Maps to create maps in QGIS
  • Introduction to raster files
  • Real-world examples of GIS in various industries and fields
  • Linear algebra 
  • Probability
  • Statistics
  • Introduction to Python: history, features, and advantages
  • Expressions and operators: arithmetic, assignment, comparison, and logical
  • Understanding the type() function and type inference
  • Introduction to data structures: lists, tuples, and dictionaries
  • Recap of Python basics
  • Working with arithmetic operators: addition, subtraction, multiplication, division, modulus, and exponentiation
  • Using comparison operators: equal to, not equal to, greater than, less than, etc.
  • Logical operators: and, or, and not
  • Exploring advanced data types: sets and string manipulation.
  • Evaluating expressions: operator precedence and associativity
  • Introduction to conditional statements: if, elif, and else
  • Executing code based on conditionals.
  • Understanding the flow of control in conditional statements
  • Iteration using the for loop: range(), iteration over lists, and strings.
  • Working with the while loop: syntax, conditions, and examples
  • Combining loops and conditionals
  • Using the break statement to exit loops prematurely.
  • Utilizing the continue statement to skip iterations.
  • Implementing nested loops for complex iterations
  • Introduction to functions: purpose, advantages, and best practices
  • Defining and calling user-defined functions
  • Parameters and arguments: positional, keyword, and default values
  • Return statement and function output.
  • Variable scope and lifetime
  • Function documentation and code readability
  • Understanding exceptions: errors, exceptions, and exception hierarchy
  • Handling exceptions using try-except blocks: handling specific exceptions, multiple exceptions, and else and finally clauses.
  • Raising exceptions and creating custom exception classes
  • File handling in Python: opening, reading, writing, and closing files.
  • Working with different file modes and file objects
  • Introduction to the NumPy module: features and applications
  • Working with multidimensional arrays: creation, indexing, slicing, and reshaping
  • Performing element-wise operations: arithmetic, logical, and statistical
  • Overview of the Matplotlib module: data visualization and plotting
  • Customizing plots: line properties, markers, colors, labels, and legends
  • Introduction to Kaggle platform: features and benefits
  • Leveraging Kaggle for real-life datasets: data exploration, analysis, and visualization
  • Introduction to machine learning modules on Kaggle: scikit-learn, TensorFlow, and PyTorch
  • Overview of running machine learning experiments on Kaggle
  • Resources for further learning and exploration
  • Introduction to Exploratory Data Analysis (EDA)
  • Importance of EDA in data analysis
  • Steps involved in EDA
  • Handling missing values: identification, analysis, and treatment strategies
  • Imputation techniques for missing values
  • Data consistency checks using fuzzy logic
  • Binning and discretization techniques for continuous variables
  • Outlier detection and analysis methods
  • Handling outliers: techniques for treatment or removal
  • Importance of feature selection in EDA
  • Feature selection techniques: filter methods, wrapper methods, and embedded methods
  • Data wrangling: cleaning and transforming data for analysis
  • Handling categorical variables: encoding techniques
  • Inference and hypothesis testing in EDA
  • Common statistical tests: t-test, chi-square test, ANOVA, etc.
  • Visualization techniques for EDA: histograms, box plots, scatter plots, etc. 
  • Hands-on practical session for complete EDA using a dataset
  • Evaluation metrics for classification problems: accuracy, precision, recall, F1 score, etc.
  • Introduction to Naive Bayes algorithm and its applications
  • Implementing Naive Bayes for classification tasks
  • Logistic Regression: theory, interpretation, and applications
  • Support Vector Machines (SVM): concepts, kernels, and use cases
  • Decision Trees: construction, pruning, and interpretability
  • Random Forests: ensemble learning and feature importance
  • Bagging and Boosting: techniques for improving model performance
  • Hyperparameter tuning techniques: grid search, random search, and Bayesian optimization
  • Principal Component Analysis (PCA): dimensionality reduction and feature extraction
  • Singular Value Decomposition (SVD): applications in matrix factorization and data compression
  • Introduction to clustering: unsupervised learning technique
  • Partitioning algorithms: K-means, K-medoids
  • Hierarchical clustering: agglomerative and divisive approaches
  • Density-based clustering: DBSCAN, OPTICS
  • Cluster evaluation metrics: silhouette coefficient, Davies-Bouldin index
  • Introduction to regression analysis
  • Linear regression: assumptions, interpretation, and model evaluation
  • Evaluation metrics for regression: mean squared error, R-squared, etc.
  • Other regression methods: polynomial regression, ridge regression, lasso regression
  • Concepts and characteristics of time series data
  • Time series components: trend, seasonality, and noise
  • Popular time series forecasting models: ARIMA, SARIMA, exponential smoothing
  • Implementing time series forecasting models
  • Evaluation metrics for time series forecasting: mean absolute error, mean absolute percentage error, etc.
  • Cross-validation techniques for time series data
  • Hyperparameter tuning for time series models
  • Overview of Natural Language Processing (NLP)
  • Evolution of Large Language Models (LLMs)
  • Importance and Applications of NLP and LLMs
  • Linguistic Concepts
  • Tokenization and Text Preprocessing
  • Part-of-Speech (POS) Tagging
  • Named Entity Recognition (NER)
  • Sentiment Analysis
  • Text Classification
  • Word Embeddings and Language Representations
  • The Transformer Architecture
  • Attention Mechanisms
  • GPT, BERT, and Other Key Models
  • Pretraining and Fine-Tuning Techniques
  • Evaluation Metrics and Benchmarks
  • Chatbots and Conversational AI
  • Text Summarization
  • Machine Translation
  • Content Generation and Creative Writing
  • Question Answering Systems
  • Semantic Search and Text Mining
  • Bias and Fairness
  • Privacy and Security
  • Model Interpretability and Explainability
  • Environmental Impact and Computational Requirements
  • Getting Started with NLP Libraries (spaCy, NLTK, Hugging Face Transformers)
  • Building a Simple Text Classifier
  • Fine-Tuning a Large Language Model for a Specific Task
  • Evaluating Model Performance and Error Analysis
  • Multimodal Models and Human-AI Interaction
  • Low-Resource Languages and Transfer Learning
  • Knowledge-Enhanced Language Models
  • Efficient Training and Deployment Techniques
  • Cascade and HOG classifiers to detect faces
  • Face detection using OpenCV and Dlib library
  • Detect other objects using OpenCV, such as cars, clocks, eyes, and full body of people
  • KCF and CSRT algorithms to perform object tracking
  • Convolutional neural networks and their implementation using Python and TensorFlow
  • Detect objects in images and videos using YOLO, a widely used real-time object detection algorithm
  • Recognize gestures and actions in videos using OpenCV
  • Create hallucinogenic images with Deep Dream
  • Create images that don’t exist in the real world with GANs (Generative Adversarial Networks)
  • Fundamentals of Reinforcement Learning
  • Sample-based Learning Methods
  • Prediction and Control with Function Approximation
  • Fundamentals of Diffusion Models
  • Stable Diffusion in Practice
  • Methods, Jobs and Tools of Stable Diffusion
  • GitHub Actions
  • Airflow
  • Kubernetes
  • MLflow
  • ML System Design
  • API Building (Flask/FastAPI)
  • Cloud Services (AWS/Azure)
  • WandB
  • 10+ projects in 6 months
  • International speakers and mentors for guided projects
  • Industry-level datasets and projects
  • Continuous practice with real-world case studies in data analytics
  • Skills demonstration on data cleaning, data analysis, data visualization
  • Email writing
  • Logic and critical thinking
  • Report writing
  • LinkedIn optimisation
  • Presentation and visual communication
  • Resume, CV and cover letter writing
  • Acing interviews
  • Personal branding
  • Global market understanding
  • One-on-one mentorship