Principal Data Scientist

Sebastián
García Vilches

I build ML systems that move financial metrics. I currently lead data science at MACH, Chile's largest digital bank, on work ranging from credit scoring to LLM-powered products. I'm also pursuing an MBA at UC to pair deep technical expertise with business strategy.

Santiago, Chile

About

Building ML systems that matter

I'm a data scientist with 7+ years of experience building and deploying machine learning systems in fintech and banking. I specialize in the full ML lifecycle — from problem framing and feature engineering to production deployment and monitoring.

At MACH, Chile's largest digital bank, I lead technical direction for the data science team, translating business strategy into ML systems that reach millions of users.

I studied Civil Industrial Engineering at Pontificia Universidad Católica de Chile (top 10% of class), and I'm currently pursuing an MBA at UC to complement my technical depth with strategic leadership skills.

Years in Fintech

50+ Models in Production

Companies

Experience

Where I've worked

7+ years building data products at the intersection of machine learning and financial services.

Technical lead for the data science team at Chile's largest digital bank. End-to-end ownership from research to production across recommenders, NLP, computer vision, and credit scoring systems.

▸ Set technical direction for the data science org: modeling standards, feature engineering best practices, and production ML guidelines across multiple squads
▸ Partnered with product, risk, and engineering leaders to prioritize high-impact ML initiatives, aligning technical roadmaps with business strategy
▸ Built a personalization recommendation model that lifted user engagement by 130% in incremental clicks
▸ Designed an LLM-based text classification model for the customer service chatbot, raising the satisfaction score from 10% to 60%
▸ Developed a Computer Vision identity validation model that replaced a costly external provider and reduced operational expenses by ~10%
▸ Led credit scoring system development, enabling access to financial services for over 150,000 customers
▸ Led a behavioral model that enabled an 80% expansion in credit exposure for top-tier customers
▸ Architected MACH's data lake migration to Apache Iceberg, reducing SQL query costs by 80% with full data versioning
▸ Designed and institutionalized an org-wide MLOps monitoring framework for early detection of feature drift and model degradation

Python, AWS SageMaker, MLflow, HuggingFace, Apache Iceberg, Computer Vision, LLM, Credit Scoring, Spark, Airflow

Impact

Impact by the numbers

Key results from production ML systems — measured, deployed, and monitored at scale.

MACH

Engagement Uplift

Personalization recommendation model for mobile app home shortcuts.

MACH

Chatbot Satisfaction

LLM-based text classification transformed the customer service experience.

MACH

SQL Cost Reduction

Data lake migration to Apache Iceberg with full versioning and scalability.

MACH

Credit Exposure Expansion

Behavioral model enabling significantly higher credit limits for top-tier clients.

MACH

Operational Cost Savings

Computer Vision identity validation replacing a costly third-party provider.

Santander

Report Automation

C-level financial report fully automated with near-zero calculation errors.

Skills

Tools of the trade

The technologies I use to build, deploy, and monitor ML systems at scale.

Languages

Python, SQL, R, Git

ML & AI

Scikit-Learn, HuggingFace, MLflow, Computer Vision, LLM / NLP, Recommendation Systems, Credit Risk Modeling, A/B Testing

Data Engineering

Apache Airflow, Apache Spark, Apache Iceberg, Feature Stores, ETL Pipelines

Cloud (AWS)

SageMaker, Glue, Lambda, Athena, S3, QuickSight

MLOps

Model Monitoring, Drift Detection, Model Governance, ETL Testing, Production ML

Projects

Side projects

Personal R&D — exploring the intersection of audio ML, LLM APIs, and developer tools.

Real-time Audio Transcription Pipeline

Active

A personal productivity tool that captures voice input, filters silence, transcribes speech in real time, and passes the transcript to Claude for analysis — all with a push-to-talk interface.

Pipeline

Microphone (PyAudio capture) → VAD (Silero silence filter) → Transcribe (Cohere multilingual) → Analyze (Claude API)

▸ Real-time voice activity detection (VAD) using Silero to eliminate silence and reduce transcription costs
▸ Spanish-first transcription via Cohere's multilingual API
▸ LLM analysis layer (Claude) for summarization, data extraction, and compliance review
▸ Push-to-talk GUI designed as a voice-first interface for Claude Code

Python, Silero VAD, Cohere, Claude API, PyAudio
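
As a rough illustration of the flow described above (filter silence first, then transcribe, then analyze), here is a minimal sketch rather than the project's actual code. It makes several assumptions: a simple RMS energy gate stands in for Silero VAD, frames are plain lists of 16-bit PCM samples instead of a live PyAudio stream, and the transcription and analysis steps are passed in as callables, so no Cohere or Claude API call is shown. The names `rms`, `drop_silence`, `run_pipeline`, and the `threshold` value are all hypothetical.

```python
import math
from typing import Callable, Iterable, List, Sequence

Frame = Sequence[int]  # one chunk of 16-bit PCM samples (assumed format)


def rms(frame: Frame) -> float:
    """Root-mean-square amplitude of a frame (0.0 for an empty frame)."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def drop_silence(frames: Iterable[Frame], threshold: float = 500.0) -> List[Frame]:
    """Keep only frames whose RMS energy reaches the threshold.

    This energy gate is a placeholder for a neural VAD such as Silero;
    the threshold value is an arbitrary assumption.
    """
    return [f for f in frames if rms(f) >= threshold]


def run_pipeline(
    frames: Iterable[Frame],
    transcribe: Callable[[List[Frame]], str],
    analyze: Callable[[str], str],
    threshold: float = 500.0,
) -> str:
    """Filter silence, transcribe what remains, then analyze the transcript."""
    voiced = drop_silence(frames, threshold)
    if not voiced:
        return ""  # nothing to transcribe, so neither callable is invoked
    transcript = transcribe(voiced)
    return analyze(transcript)


# Demo with fake stages in place of the real speech-to-text and LLM calls.
silent = [0] * 160
speech = [1000, -1000] * 80
frames = [silent, speech, silent, speech]


def fake_transcribe(voiced: List[Frame]) -> str:
    return f"{len(voiced)} voiced frames"


def fake_analyze(text: str) -> str:
    return "summary: " + text


result = run_pipeline(frames, fake_transcribe, fake_analyze)
print(result)  # summary: 2 voiced frames
```

Gating on voice activity before the transcription step is what keeps silent audio from being sent to a paid API, which is the cost saving mentioned in the feature list.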

Contact

Let's connect

Open to new opportunities, collaborations, or just a good conversation about ML systems.

Reach me directly

garciav.sebastian@gmail.com

linkedin.com/in/sebastian-garcia-vilches

Santiago, Chile

I typically respond within 24-48 hours. For urgent matters, LinkedIn is the fastest way to reach me.

Principal Data Scientist

Data Engineer

Data Scientist

Mathematical Modeling Engineer