Data Analyst Portfolio

Data Analyst building machine learning, GenAI, and BI projects with Python and SQL.

Focused on supervised learning, causal inference, forecasting, Power BI, and natural-language AI, with code and notebooks organized on GitHub.

Focus: Supervised learning, causal inference, forecasting, natural-language AI

Tools: Python, SQL, Power BI, PySpark, Fabric, Databricks

Aarista

Current role

Data Analyst

Currently working at Aarista / Altea Healthcare on analytics, reporting, dashboards, and data product work.

McGill University

Education

Master of Management in Analytics with training in applied analytics, experimentation, modeling, and business-facing data work.

Portfolio Projects

Selected GitHub projects.

Spotlight project

Natural Language to SQL Query Generator

A GenAI project built around translating natural-language questions into SQL, making structured data more accessible through an LLM-driven interface.

GenAI: natural-language interface

SQL: query generation

LLM: prompt-driven workflow

Problem

Reduce the friction between plain-language business questions and the SQL needed to query structured datasets.

Approach

Built a notebook-driven GenAI workflow that maps natural-language prompts to SQL generation logic within a focused application setting.

Outcome

A concrete GenAI project in the portfolio, tied directly to a real analytics use case.
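As an illustration of the kind of workflow described in the approach above, here is a minimal sketch of two pieces that usually surround the model call: building a schema-aware prompt, and checking a generated query before running it. The template wording, the `orders` table, and the helper names `build_prompt`, `is_read_only`, and `check_sql` are assumptions made for this example and are not taken from the project repository. The model call itself is left out, and a hand-written query stands in for its output.

```python
import sqlite3

# Hypothetical prompt template; the wording is an assumption, not the project's prompt.
PROMPT_TEMPLATE = """You are a SQL assistant. Using only the schema below,
write one SQLite SELECT statement that answers the question.

Schema:
{schema}

Question: {question}
SQL:"""


def build_prompt(schema: str, question: str) -> str:
    """Combine the table schema and the user's question into one prompt string."""
    return PROMPT_TEMPLATE.format(schema=schema.strip(), question=question.strip())


def is_read_only(sql: str) -> bool:
    """Conservative check: accept a single statement that starts with SELECT."""
    statement = sql.strip().rstrip(";").strip()
    if ";" in statement:
        # Rejects stacked statements (and, conservatively, any literal containing ';').
        return False
    return statement.lower().startswith("select")


def check_sql(conn: sqlite3.Connection, sql: str) -> bool:
    """Return True if the query is read-only and compiles against the schema."""
    if not is_read_only(sql):
        return False
    try:
        # EXPLAIN QUERY PLAN compiles the statement without returning its rows.
        conn.execute("EXPLAIN QUERY PLAN " + sql.strip().rstrip(";"))
    except sqlite3.Error:
        return False
    return True


# Hypothetical schema used only for this illustration.
SCHEMA = "CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL);"
conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)

prompt = build_prompt(SCHEMA, "What is the total order amount per region?")
# Stand-in for what a model might return; no LLM is called here.
candidate = "SELECT region, SUM(amount) FROM orders GROUP BY region;"
print(check_sql(conn, candidate))
```

Validating against the schema before execution is one common design choice for this kind of tool: it catches references to tables or columns that do not exist and keeps write statements out of the query path.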

GenAI experimentation

GenAI Models

Notebook-based generative AI experiments focused on model behavior, applied workflows, and prompt-driven exploration.

GenAI, Prompting, Notebook Workflows
Repository

Causal inference

Causal Inference Model for Bank Client Subscription

Built a project focused on estimating the impact of marketing campaigns on bank client term-deposit subscriptions using causal inference techniques.

Causal Inference, Marketing Analytics, Python
Repository
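To make the idea of estimating a campaign's impact more concrete, here is a small sketch of one basic causal adjustment: comparing contacted and non-contacted clients within strata of a confounder, then averaging the per-stratum differences. The page does not say which techniques the project uses, so stratification is an assumption chosen for brevity. The rows are made-up toy data, and the names `naive_difference` and `stratified_ate` are hypothetical.

```python
from collections import defaultdict


def naive_difference(rows):
    """Difference in mean outcome between treated and control, ignoring confounders."""
    treated = [outcome for _, t, outcome in rows if t == 1]
    control = [outcome for _, t, outcome in rows if t == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def stratified_ate(rows):
    """Average treatment effect adjusted for one categorical confounder.

    rows: iterable of (stratum, treated, outcome), with treated in {0, 1}.
    Each stratum's effect is weighted by the stratum's share of the sample.
    """
    groups = defaultdict(lambda: {0: [], 1: []})
    for stratum, treated, outcome in rows:
        groups[stratum][treated].append(outcome)

    weighted_sum, total = 0.0, 0
    for arms in groups.values():
        if not arms[0] or not arms[1]:
            continue  # no overlap in this stratum, so no comparison is possible
        n = len(arms[0]) + len(arms[1])
        effect = sum(arms[1]) / len(arms[1]) - sum(arms[0]) / len(arms[0])
        weighted_sum += n * effect
        total += n
    if total == 0:
        raise ValueError("no stratum contains both treated and control rows")
    return weighted_sum / total


# Toy data (invented for illustration): (age group, contacted by campaign, subscribed).
# Younger clients are contacted more often and also subscribe more often.
rows = [
    ("young", 1, 1), ("young", 1, 1), ("young", 1, 0), ("young", 1, 1),
    ("young", 0, 1), ("young", 0, 0),
    ("old", 1, 0), ("old", 1, 1),
    ("old", 0, 0), ("old", 0, 0), ("old", 0, 1), ("old", 0, 0),
]

print(naive_difference(rows), stratified_ate(rows))
```

In this toy data the unadjusted comparison is larger than the stratified estimate, because age group influences both who is contacted and who subscribes. That gap is the kind of bias causal adjustment is meant to remove.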

Supervised learning

PyTorch Multi-task Learning for House Prices and Categories

Developed a multi-task deep learning project that predicts both house prices and house categories in a single supervised learning setup.

PyTorch, Regression, Classification
Repository
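A multi-task setup like the one described above typically trains on a single objective that combines a regression loss for the price and a classification loss for the category. The sketch below shows that combined objective in plain Python so it runs without dependencies. The project itself is built with PyTorch, and the loss choices (mean squared error plus cross-entropy), the weights, and the function names here are assumptions for illustration, not the repository's code.

```python
import math


def mse(preds, targets):
    """Mean squared error for the price (regression) task."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)


def cross_entropy(logits, label):
    """Cross-entropy for one example, computed with a stable log-sum-exp."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def multitask_loss(price_preds, price_targets, class_logits, class_labels,
                   price_weight=1.0, class_weight=1.0):
    """Weighted sum of the regression loss and the mean classification loss."""
    regression = mse(price_preds, price_targets)
    classification = sum(
        cross_entropy(logits, label)
        for logits, label in zip(class_logits, class_labels)
    ) / len(class_labels)
    return price_weight * regression + class_weight * classification


# Toy batch of two houses (values invented for illustration).
loss = multitask_loss(
    price_preds=[1.0, 2.0],
    price_targets=[1.0, 4.0],
    class_logits=[[0.0, 0.0], [0.0, 0.0]],
    class_labels=[0, 1],
)
print(loss)
```

The two weights control how much each task contributes to the shared objective. In practice price targets are often rescaled first so that the regression term does not dominate the classification term.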

Capabilities

Machine learning, GenAI, BI, and data engineering.

Generative AI and agents

Applied work around LLM systems, tools, memory, orchestration, and modern GenAI workflows.

Supervised learning and NLP

Classification, regression, anomaly detection, NLP, SHAP-based explanation, and notebook-based ML workflows using practical datasets.

Causal inference and analysis

Causal modeling, bias analysis, and experimentation approaches that help separate signal from correlation.

Data engineering and BI

SQL, PySpark, Fabric, Databricks, ETL and ELT pipelines, embedded Power BI, and structured project design for usable outputs.

GitHub Project Archive

Public GitHub repositories.

GitHub repositories


How I Work

Clear problems, readable outputs, reproducible structure.

Problem first

I prefer projects with a clear question, measurable output, and a reason the result matters.

Simple communication

Models and dashboards are most useful when the choices, tradeoffs, and conclusions stay easy to follow.

Reproducible structure

I favor project setups that make it easier for someone else to review the work, rerun it, and build on it.

Get in Touch

Send a message or review the work on GitHub.

For a fast review, start with GitHub. For direct outreach, use the message form and your email app will open a prefilled draft.
