
Samantha Tan

Principal Data Scientist & AI/ML Engineer

Designing intelligent systems across finance, retail, and real estate


About Me

I am a Principal Data Scientist and AI/ML Engineer who has spent the last 8 years turning messy, real‑world problems into reliable machine learning and LLM systems.

My work focuses on applied AI — from forecasting and recommendation systems to GenAI, RAG and multi‑agent workflows — mostly in real estate, retail, QSR and financial services. I enjoy designing experiments, evaluation strategies and production pipelines that make models not only accurate, but observable, debuggable and safe to ship.

These days I split my time between leading AI/ML initiatives at CBRE and deepening my foundations through programs like Stanford's AI and reinforcement learning courses, with a particular interest in agents, RLHF/RLAIF and practical evaluation for LLM‑based systems.

Fueled by curiosity (and coffee)
Passionate about applied AI
8 Years of hands‑on AI / ML experience
Principal: AI/ML leadership & architecture
ML Platform: MLOps, data pipelines & observability
LLM Systems: Agents, RAG, evaluation & safety

Featured Projects

📆

Voice‑Driven Calendar Agent

LLM‑powered assistant that automates Google Calendar via browser

Built a FastAPI + Playwright agent that turns natural spoken commands into structured events, performs conflict checks, and drives the real Google Calendar UI to create or reject meetings.

LLM Agent, FastAPI, Playwright, Google Calendar
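
The conflict check mentioned above can be illustrated with a small interval-overlap test. This is only a sketch under stated assumptions: the function name `has_conflict`, the half-open interval convention and the sample meeting are hypothetical, not taken from the project itself.

```python
from datetime import datetime, timedelta


def has_conflict(new_start, new_end, existing):
    """Return True if [new_start, new_end) overlaps any (start, end) pair."""
    if new_end <= new_start:
        raise ValueError("event must end after it starts")
    # Two half-open intervals overlap exactly when each starts before the other ends.
    return any(new_start < end and start < new_end for start, end in existing)


# Hypothetical calendar with one meeting from 10:00 to 11:00.
day = datetime(2025, 1, 6)
booked = [(day + timedelta(hours=10), day + timedelta(hours=11))]

overlapping = has_conflict(
    day + timedelta(hours=10, minutes=30), day + timedelta(hours=11, minutes=30), booked
)
back_to_back = has_conflict(day + timedelta(hours=11), day + timedelta(hours=12), booked)
```

Treating intervals as half-open means a meeting that starts exactly when another ends is not flagged as a conflict, which is usually what a calendar user expects.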
🧬

AI Digital Twin

Personal website agent that talks as my professional twin

Created a backend + Next.js front‑end that serves an AI digital twin of myself, grounded in structured facts, LinkedIn profile, summaries, and communication style, with session memory for multi‑turn conversations.

Digital Twin, LLM Agent, Python, Next.js, AWS / Terraform
💼

Financial Lead‑Gen Agent

OpenAI Assistants + Airtable for sales automation

Developed a Flask API around an OpenAI Assistants v2 financial advisor that captures user intent, validates contact details, and writes structured leads directly into Airtable via function calling.

RAG, OpenAI Assistants, Function Calling, Flask, Airtable
🎯

Contextual Bandit Treatment Policy

Optimizing Warfarin dosing with LinUCB

Implemented and evaluated contextual bandit algorithms (LinUCB and baselines) on clinical warfarin data to learn personalized dosing policies and study exploration–exploitation trade‑offs.

Reinforcement Learning, Contextual Bandits, LinUCB, Python
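
For readers unfamiliar with LinUCB, here is a minimal pure-Python sketch of the disjoint variant: each arm keeps a ridge-regression estimate and is scored by its predicted reward plus an exploration bonus. The class and function names, the two toy one-hot contexts and the 0 / -1 reward scheme are assumptions made for illustration. None of it is the clinical data or the project code.

```python
import math


class LinUCBArm:
    """Per-arm ridge-regression state for disjoint LinUCB (illustrative)."""

    def __init__(self, dim):
        # A starts as the identity matrix, so its inverse is the identity too.
        self.A_inv = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
        self.b = [0.0] * dim

    def _A_inv_times(self, v):
        return [sum(a * x for a, x in zip(row, v)) for row in self.A_inv]

    def ucb(self, x, alpha):
        theta = self._A_inv_times(self.b)  # ridge estimate A^{-1} b
        A_inv_x = self._A_inv_times(x)
        mean = sum(t * xi for t, xi in zip(theta, x))
        width = math.sqrt(max(sum(xi * v for xi, v in zip(x, A_inv_x)), 0.0))
        return mean + alpha * width

    def update(self, x, reward):
        # Sherman-Morrison update of A^{-1} after A <- A + x x^T.
        A_inv_x = self._A_inv_times(x)
        denom = 1.0 + sum(xi * v for xi, v in zip(x, A_inv_x))
        for i in range(len(x)):
            for j in range(len(x)):
                self.A_inv[i][j] -= A_inv_x[i] * A_inv_x[j] / denom
        self.b = [bi + reward * xi for bi, xi in zip(self.b, x)]


def choose_arm(arms, x, alpha=1.0):
    scores = [arm.ucb(x, alpha) for arm in arms]
    return max(range(len(arms)), key=scores.__getitem__)


# Toy problem (hypothetical): two contexts, each with a different best arm.
contexts = [[1.0, 0.0], [0.0, 1.0]]
best_arm = [0, 1]
arms = [LinUCBArm(dim=2) for _ in range(2)]
for t in range(40):
    k = t % 2
    chosen = choose_arm(arms, contexts[k], alpha=1.0)
    reward = 0.0 if chosen == best_arm[k] else -1.0
    arms[chosen].update(contexts[k], reward)

# With alpha = 0 the policy is purely greedy on the learned estimates.
greedy_choices = [choose_arm(arms, x, alpha=0.0) for x in contexts]
```

The `alpha` parameter controls the exploration-exploitation trade-off: larger values favour arms whose estimates are still uncertain.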
🕹️

Deep Q‑Learning for Atari

Nature DQN and linear agents on Pong

Built tabular, linear and deep Q‑learning agents for Pong‑v5, including exploration and learning‑rate schedules, frame preprocessing, and analysis of Atari training curves with TensorBoard.

DQN, Deep RL, Atari, Python
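
An exploration schedule of the kind mentioned above is often a linear decay of epsilon combined with epsilon-greedy action selection. The sketch below assumes that form. The class name, the start and end values and the step count are hypothetical, not the settings used in the project.

```python
import random


class LinearSchedule:
    """Linearly interpolate from `start` to `end` over `n_steps`, then hold."""

    def __init__(self, start, end, n_steps):
        self.start, self.end, self.n_steps = start, end, n_steps

    def value(self, t):
        frac = min(max(t, 0), self.n_steps) / self.n_steps
        return self.start + frac * (self.end - self.start)


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


# Hypothetical settings: decay epsilon from 1.0 to 0.1 over 1000 steps.
schedule = LinearSchedule(start=1.0, end=0.1, n_steps=1000)
eps_midway = schedule.value(500)
greedy_action = epsilon_greedy([0.1, 0.9, 0.3], epsilon=0.0)
```

The same schedule class can drive a learning-rate decay by passing different start and end values.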
🤝

RL from Human Feedback on Hopper

Reward modeling, DPO and RLHF in continuous control

Trained a reward model from preference data and used it to train PPO agents (RLHF) on the Hopper‑v4 environment, alongside a DPO variant, comparing returns under the learned reward against the original environment returns and analyzing training stability.

RLHF, DPO, PPO, Gym, Python
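
Reward models are commonly fitted to preference data with a Bradley-Terry style loss: the negative log-probability that the preferred trajectory segment has the higher total predicted reward. The sketch below assumes that formulation, which the description above does not confirm, and the function name and sample numbers are hypothetical.

```python
import math


def preference_loss(rewards_preferred, rewards_rejected):
    """-log sigmoid(R_preferred - R_rejected), with R the summed predicted reward."""
    diff = sum(rewards_preferred) - sum(rewards_rejected)
    # -log(sigmoid(diff)) equals softplus(-diff); this form avoids overflow.
    return max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))


# Hypothetical per-step reward predictions for two trajectory segments.
agrees = preference_loss([1.0, 0.8, 0.9], [0.2, 0.1, 0.3])
disagrees = preference_loss([0.2, 0.1, 0.3], [1.0, 0.8, 0.9])
undecided = preference_loss([0.5, 0.5], [0.5, 0.5])
```

The loss is small when the model already ranks the preferred segment higher and grows roughly linearly when it ranks the pair the wrong way round.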
🏃‍♀️

Policy Gradient Control

Baseline vs no‑baseline policy gradient on classic control

Implemented vanilla policy gradient with and without baselines across CartPole, InvertedPendulum and HalfCheetah, analyzing variance reduction, learning curves and stability across random seeds.

Policy Gradient, Baselines, Gym, Python
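
The baseline idea can be shown in a few lines: compute discounted returns-to-go, then subtract a baseline so the policy-gradient weights are centred. This sketch uses a simple mean-return baseline as a stand-in, which is an assumption. A learned value function is another common choice, and the text above does not say which was used.

```python
def discounted_returns(rewards, gamma):
    """Return-to-go G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns, running = [], 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def advantages_with_mean_baseline(returns):
    """Subtract the mean return so the weights on grad log pi are centred."""
    baseline = sum(returns) / len(returns)
    return [g - baseline for g in returns]


# Hypothetical three-step episode with reward 1 at every step.
returns = discounted_returns([1.0, 1.0, 1.0], gamma=0.5)
advantages = advantages_with_mean_baseline(returns)
```

Subtracting a baseline that does not depend on the action leaves the expected gradient unchanged while typically reducing its variance.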
🔤

Knowledge‑Rich GPT Mini

Implementing a GPT‑style Transformer for factual QA

Implemented a mini GPT model with causal self‑attention, explored cross‑attention and positional encodings, and evaluated how well pretrained Transformers access factual knowledge for question answering over Wikipedia‑style text.

Transformers, GPT, PyTorch
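
Causal self-attention means that position t may attend only to positions 0..t. The single-head, pure-Python sketch below illustrates that masking rule with scaled dot-product attention. It is not the PyTorch implementation from the project, and the function name and toy vectors are hypothetical.

```python
import math


def causal_attention(queries, keys, values):
    """Single-head scaled dot-product attention with a causal mask."""
    d = len(queries[0])
    outputs = []
    for t, q in enumerate(queries):
        # Only positions 0..t are visible to position t.
        scores = [
            sum(qi * ki for qi, ki in zip(q, keys[s])) / math.sqrt(d)
            for s in range(t + 1)
        ]
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]  # stable softmax
        total = sum(weights)
        outputs.append([
            sum(w / total * values[s][j] for s, w in enumerate(weights))
            for j in range(len(values[0]))
        ])
    return outputs


# Hypothetical 3-token sequence with 2-dimensional vectors.
q = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
k = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
v = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
out = causal_attention(q, k, v)

# Changing the last token must not affect earlier positions.
v_changed = [[1.0, 2.0], [3.0, 4.0], [-9.0, 9.0]]
out_changed = causal_attention(q, k, v_changed)
```

Because the first position can only see itself, its output is exactly its own value vector.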
🌐

Neural Machine Translation

BiLSTM + Luong attention for character‑level NMT

Implemented a bidirectional LSTM encoder–decoder with global Luong attention for character‑level English–Chinese translation, including beam search decoding and careful handling of padding and masking.

NMT, Attention, PyTorch
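
Beam search keeps the best few partial translations at each step instead of committing to the single most likely next token. The sketch below runs it against a hand-written toy distribution. The `step_fn` interface, the token names and the probabilities are all hypothetical stand-ins for a real decoder, and no length normalisation is applied.

```python
import math


def beam_search(step_fn, start, end, beam_size, max_len):
    """Return the (tokens, log_prob) hypothesis with the highest score."""
    beams = [([start], 0.0)]
    finished = []
    for _ in range(max_len):
        candidates = [
            (tokens + [tok], score + logp)
            for tokens, score in beams
            for tok, logp in step_fn(tokens).items()
        ]
        if not candidates:
            break
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for tokens, score in candidates[:beam_size]:
            # Hypotheses that emit the end token leave the active beam.
            (finished if tokens[-1] == end else beams).append((tokens, score))
        if not beams:
            break
    finished.extend(beams)  # keep hypotheses cut off at max_len
    return max(finished, key=lambda c: c[1])


# Toy next-token distribution keyed by the last token (hypothetical).
TABLE = {
    "<s>": {"a": 0.6, "b": 0.4},
    "a": {"</s>": 0.55, "c": 0.45},
    "b": {"</s>": 0.9, "c": 0.1},
    "c": {"</s>": 1.0},
}


def toy_step(tokens):
    return {tok: math.log(p) for tok, p in TABLE[tokens[-1]].items()}


greedy_tokens, _ = beam_search(toy_step, "<s>", "</s>", beam_size=1, max_len=5)
beam_tokens, beam_score = beam_search(toy_step, "<s>", "</s>", beam_size=2, max_len=5)
```

In this toy case a beam of width 1 commits to the locally best first token and ends with a lower-probability sequence than a beam of width 2.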

Experience Timeline

A quick overview of my journey across data, AI, and analytics.

Jun 2024 – Present
Principal Data Scientist (AI/ML Engineer)
CBRE
Leading AI/ML initiatives including multi‑agent systems and RAG solutions on cloud infrastructure, partnering with senior stakeholders to drive product and analytics strategy.
Oct 2023 – Jun 2024
Data Scientist / Tech Lead
CBRE
Owned end‑to‑end ML projects from data engineering and modeling to deployment, while providing technical leadership and mentorship to the team.
2022 – 2023
Data Scientist — Project Lead
Loblaw Companies Limited
Led data science projects in retail and e‑commerce, building models and experimentation workflows to optimize customer experience and operations.
Feb 2022 – Oct 2022
Manager, Machine Learning Engineer
Restaurant Brands International
Designed and deployed ML solutions using PyTorch and Apache Spark to support marketing, pricing, and operations for global QSR brands.
2020 – 2022
Senior Data Analyst
iA Financial Group
Built tree‑based models and unsupervised learning workflows to support risk, underwriting, and customer analytics for financial products.
2019 – 2020
Research Assistant — Machine Learning and NLP
The AI Hub, Durham College
Conducted applied research in NLP and ML, contributing to academic and industry projects and supporting publication‑quality experiments.
Certificate
Artificial Intelligence Professional Program
Stanford University
Advanced courses in NLP with Deep Learning, Reinforcement Learning, and Deep Generative Models, covering LLMs, RLHF, DPO, VAE and modern generative techniques.
Certificate
Artificial Intelligence
University of Toronto
Focused on reinforcement learning, intelligent agents, deep learning, and retrieval‑augmented generation (RAG) with large language models.
Certificate
Data Science
University of Waterloo
Training in big data, statistics, machine learning, deep learning, and Bayesian inference for applied analytics.
B.Eng
Bachelor of Engineering
Universidad Tecnológica de México (GPA 3.7/4.0)
Engineering foundation with strong quantitative and problem‑solving skills.
MBA
Master of Business Administration
Universidad Tecnológica de México (GPA 4.0/4.0)
Business and management training that complements data‑driven decision‑making and product strategy.
Graduate Certificates
AI & Data Analytics
Durham College — President's Honour Roll
Artificial Intelligence Analysis, Design and Implementation, and Data Analytics for Business Decision Making.
Technical
Key Strengths
AI & Software
End‑to‑end AI/ML ownership: from framing problems and designing data pipelines to training and evaluating ML & LLM systems, and shipping cloud‑native services with strong observability and reliability.
Future
What I'm Exploring Next
AI Research Interests
Generative agents, RLHF/RLAIF and evaluation methods that make LLM systems safer, more controllable and measurable — especially combining multi‑agent workflows with rigorous monitoring in production.

Skills & Tools

In Loving Memory

This space is dedicated to Weta, our Holland Lop boy, my best little companion and a true member of our family.

Weta stayed with us for 11 years and 7 months, until the morning of November 19, 2025. He was incredibly smart and deeply empathetic — he seemed to understand multiple languages, could sense our emotions, and responded with a kind of abstract understanding that felt far beyond what you expect from a pet.

He was the first real little actor in our home: playful, humorous, always finding ways to be just naughty enough to make everyone laugh. Weta loved being the centre of attention and basked in that love, always in the spotlight of our family. He left carrying that mixture of sadness and pride at being so loved and appreciated.

🐾 Our bright, curious little boy who always knew how to pull our focus back to him.
Thank you, Weta, for growing up with us and filling our home with light. You are always part of our family.
Weta, our beloved Holland Lop rabbit

Let's Connect

For collaboration opportunities, speaking, or just to say hi, feel free to reach out via email or LinkedIn below.

Get In Touch

Location: Toronto, Canada
LinkedIn: samantha-tam7
GitHub: samtam0714
LeetCode: yy7tan
Opportunities: Selective AI / ML leadership roles, consulting projects and meaningful collaboration opportunities