Inform. Inspire. Empower.

Jonathan Reyes - Innovation Engineering Leader

Senior Principal Engineer

Establishing systems of innovation
Empowering human talent
Constantly learning...

Jonathan Reyes, Senior Principal Engineer and Innovation Engineering Leader specializing in AI platforms, microservices, and cloud architecture
Location
Denver, CO
Company
Dispatch

About

Senior Principal Engineer, leading transformative infrastructure and platform initiatives that enable rapid, reliable software delivery at scale.

With a foundation in innovation and entrepreneurship since 2005, I specialize in building resilient, impactful systems that bridge the gap between cutting-edge technology and business value.

Core Expertise

  • Innovation
  • Cross-functional Empowerment
  • Platform Engineering
  • Kubernetes & Cloud
  • Event Architecture
  • Developer Experience
  • Microservices

Certifications

  • AWS Solutions Architect
  • AWS ML Specialty
  • Certified ScrumMaster

Languages

  • English Native
  • Arabic Professional
20+
Years Experience
Learning
Innovation

Experience

Most Recent Technical Initiatives

01

Enterprise AI Platform Architecture & Governance Program

Context

Company needed scalable AI capabilities across 10+ teams while managing compliance, costs, security, and provider lock-in risks as AI adoption accelerated.

Action

Architected enterprise AI platform using Genkit framework with standardized agent lifecycle management. Deployed centralized AI Hub gateway handling 70K+ daily interactions with intelligent routing and fallbacks to various Bedrock models. Implemented comprehensive observability pipeline, guardrails for PII/toxicity, and policy enforcement. Worked with leadership to establish a governance board, ethics committee, and working groups for ongoing cross-functional knowledge sharing.
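The routing-with-fallbacks idea can be sketched in a few lines of Go. This is a minimal illustration under stated assumptions, not the AI Hub's actual code: the `Provider` type, `RouteWithFallback`, and both sentinel errors are hypothetical names, and real model calls are replaced by plain functions.

```go
package main

import (
	"errors"
	"fmt"
)

// Sentinel errors used by this sketch (hypothetical names).
var (
	ErrNoProviders = errors.New("no providers configured")
	ErrUnavailable = errors.New("provider unavailable")
)

// Provider is a hypothetical stand-in for one model backend behind a gateway.
type Provider struct {
	Name string
	Call func(prompt string) (string, error)
}

// RouteWithFallback tries providers in priority order and returns the first
// successful reply together with the name of the provider that served it.
// If every provider fails, the individual errors are joined and returned.
func RouteWithFallback(providers []Provider, prompt string) (reply, servedBy string, err error) {
	if len(providers) == 0 {
		return "", "", ErrNoProviders
	}
	var errs []error
	for _, p := range providers {
		out, callErr := p.Call(prompt)
		if callErr == nil {
			return out, p.Name, nil
		}
		errs = append(errs, fmt.Errorf("%s: %w", p.Name, callErr))
	}
	return "", "", errors.Join(errs...)
}

func main() {
	providers := []Provider{
		// The primary always fails here to demonstrate the fallback path.
		{Name: "primary", Call: func(string) (string, error) { return "", ErrUnavailable }},
		{Name: "secondary", Call: func(p string) (string, error) { return "echo: " + p, nil }},
	}
	reply, servedBy, err := RouteWithFallback(providers, "hello")
	fmt.Println(reply, servedBy, err)
}
```

A production gateway would likely add timeouts, per-provider health tracking, and cost-aware ordering; the ordered list keeps the core behavior easy to see.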

Result

Delivered 30x faster AI iteration. Observability surfaced prompt injection and PII exposure in existing initiatives for immediate remediation. Internal teams have built 100+ efficiency/productivity workflows with approved models and guardrails.

Technologies

Genkit, AWS Bedrock, OpenAI, Anthropic, AI Gateway
02

Company-Wide AI Innovation Program

Context

Business teams were blocked on engineering for AI and automation, creating bottlenecks in discovery and POC creation, while non-technical teams were severely overloaded with manual but automatable tasks consuming 20-40% of their week. Only 5% of the company, mostly in engineering, was leveraging AI capabilities despite competitive pressure.

Action

Designed and launched comprehensive AI enablement program reaching 130+ employees across 15+ technical and non-technical teams. Deployed N8N for no-code AI automation with several custom nodes for internal enablement. Integrated with centralized AI Hub for governance and cost control. Applied diffusion of innovation model for training department champions and spreading knowledge. Created self-serve documentation, videos, playbooks, and conducted 20+ training sessions. Worked with senior leadership to create and communicate the vision.

Result

Increased AI adoption from 5% to 90% of company within 3 months. Reduced AI implementation timelines from 2 months to 1 day. Initial rollout immediately escalated prompt injection and bias evals, giving us clear insights into our systems. Product teams were immediately able to capitalize on this infrastructure for natural-language initiatives. Legal team has confidence in our governance strategy for detection and escalation of issues. Non-technical teams shipped 30+ AI workflows independently. Company is projected to save $4M through efficiency gains from the initial automation workflows. Created repeatable model now used for other technology rollouts.

Technologies

N8N, AI Hub, Low-Code, Training Programs
03

Omen: Code Analysis CLI for AI-Assisted Development

Context

Tech debt backlogs that would take months to clear with a team of 30 devs, but no way to prioritize what actually mattered. PRs slipped through fast reviews in complex areas. Developers new to parts of the codebase didn't understand the nuances. Every change looked the same on the surface, but some touched high-churn, high-complexity code where bugs hide. AI tools could see the code, but not the weight behind it - that knot in your stomach when you open a god object knowing one wrong move could break everything.

Action

Built an open-source code analysis CLI in Go that surfaces the 'omens' - complexity hotspots, defect-prone files, architectural coupling, and technical debt. Started from PMAT/PAIML but rebuilt it for fast installs via Homebrew, cross-platform support, and seamless LLM integration. Added MCP server integration so Claude, Cursor, and Codex can see where the landmines are before writing code. Built quality gates via githooks that force AI to be more careful when touching risky areas. Supports 12+ languages through tree-sitter parsing.
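One common way to surface hotspots is to combine change frequency with complexity. The sketch below assumes a simple churn times complexity score; this is an illustrative heuristic, not necessarily how Omen computes its rankings. The `FileStats` type, the function names, and the sample paths and numbers are all made up for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// FileStats holds two illustrative signals per file: Churn is the number of
// commits touching the file, Complexity is a cyclomatic-style score.
type FileStats struct {
	Path       string
	Churn      int
	Complexity int
}

// HotspotScore multiplies churn by complexity, so a file must be both
// frequently changed and hard to reason about to rank highly.
func HotspotScore(f FileStats) int { return f.Churn * f.Complexity }

// RankHotspots returns a copy of files sorted by descending hotspot score.
// The input slice is left untouched.
func RankHotspots(files []FileStats) []FileStats {
	ranked := append([]FileStats(nil), files...)
	sort.SliceStable(ranked, func(i, j int) bool {
		return HotspotScore(ranked[i]) > HotspotScore(ranked[j])
	})
	return ranked
}

func main() {
	// Sample data invented for the example.
	files := []FileStats{
		{Path: "lib/util.rb", Churn: 2, Complexity: 35},
		{Path: "app/models/order.rb", Churn: 40, Complexity: 30},
		{Path: "app/helpers/home.rb", Churn: 50, Complexity: 3},
	}
	for _, f := range RankHotspots(files) {
		fmt.Printf("%-22s score=%d\n", f.Path, HotspotScore(f))
	}
}
```

Under this heuristic, a complex file that nobody touches ranks below a moderately complex file that changes every week, which matches the "high-churn, high-complexity" framing above.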

Result

Refactored a large Rails API monolith - complexity dropped from 30 to under 12 across the codebase in 2 hours, zero regressions. Upgraded an Angular app from version 3 to 13 in 2 hours with significant code improvements and no regressions. Historical PR analysis proved the defect prediction works - flagging commits that would have caused production issues. Teams can now track codebase health trends over time and enforce quality standards beyond just test coverage.

Technologies

Go, Tree-sitter, MCP Server, Git Analysis
04

AI-Powered Innovation & Ideation Platform

Context

Company suffering from 45-day define & design phase cycles in SDLC, creating significant bottlenecks in feature development. Ideas from across the organization couldn't reach the intake board, would take too long to vet, or couldn't be represented well because the submitter didn't have time to flesh out an idea. This limited innovation to a small group and missed valuable insights from customer-facing teams. Traditional define & design process required extensive documentation and multiple stakeholder meetings before ideas could even be considered.

Action

Architected and launched company-wide innovation platform based on Amazon's PRFAQ (Press Release/FAQ) methodology. Built AI-powered assistants to help employees articulate ideas regardless of technical expertise, handling formatting, language refinement, and stakeholder-specific framing. Implemented intelligent idea routing to relevant teams and automated initial feasibility assessments. Platform can integrate with downstream systems via MCP servers, API endpoints, and vector databases for seamless handoff to development teams and/or agents.

Result

Opened idea submission to the entire company. Raised the quality of submissions, since each one was refined with AI to be understandable in different contexts. Reduced idea definition & design time from 45 days to 1 hour (99.9% improvement). A centralized idea repository now exists that can also assist with future ideation.

Technologies

MCP Servers, Vector Databases, AI Assistants, API Integration, PRFAQ Methodology
05

Monolith to Microservices Transformation (2-Year Platform Evolution)

Context

Monolithic architecture causing 45-60 minute deploy times, blocking multiple teams who were colliding on each other's code changes. Test suites taking hours to run. Production elasticity issues where traffic spikes weren't detected quickly enough by HPAs, causing availability interruptions and customer impact. Teams did not have any microservice experience. Teams unable to work independently, creating bottlenecks across the organization.

Action

Led 2-year microservices adoption strategy coordinating across data, product, engineering, QA, and DevOps teams. Established Kubernetes platform with automated infrastructure provisioning, deployment pipelines, node group strategies, security protocols, and a service mesh for internal traffic controls. Deployed GraphQL federation to create frontend-backend contracts enabling decoupled development. Implemented event-driven architecture for data streaming between services. Introduced architectural quanta concepts and trained teams on domain/data decomposition, fracture planes, transactional sagas, and service sizing strategies. Built developer CLI tooling for service scaffolding and operations. Established service ownership paradigms with uptime responsibilities. Migrated to a schema registry of Protocol Buffers for backwards-compatible service contracts in gRPC and through Kafka.
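Of the patterns listed, the transactional saga is the easiest to show in miniature. The following is a minimal in-process sketch, assuming each step carries its own compensation; the `Step` type, `RunSaga`, and the step names are hypothetical, and a real saga across services would coordinate over events or an orchestrator rather than local function calls.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrStepFailed is a hypothetical sentinel used to simulate a failing step.
var ErrStepFailed = errors.New("step failed")

// Step pairs a forward action with the compensation that undoes it.
type Step struct {
	Name       string
	Do         func() error
	Compensate func()
}

// RunSaga executes steps in order. If one fails, it runs the compensations of
// the already-completed steps in reverse order and returns the wrapped error.
func RunSaga(steps []Step) error {
	for i, s := range steps {
		if err := s.Do(); err != nil {
			for j := i - 1; j >= 0; j-- {
				steps[j].Compensate()
			}
			return fmt.Errorf("saga aborted at %q: %w", s.Name, err)
		}
	}
	return nil
}

func main() {
	var trail []string
	// step builds a Step that records what happened in trail.
	step := func(name string, fail bool) Step {
		return Step{
			Name: name,
			Do: func() error {
				if fail {
					return ErrStepFailed
				}
				trail = append(trail, "do:"+name)
				return nil
			},
			Compensate: func() { trail = append(trail, "undo:"+name) },
		}
	}
	err := RunSaga([]Step{step("reserve", false), step("charge", false), step("notify", true)})
	fmt.Println(err)
	fmt.Println(trail)
}
```

Compensations run in reverse so that later steps, which may depend on earlier ones, are undone first.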

Result

Reduced deployment times from 45-60 minutes to 2-4 minutes (95% improvement). Successfully decomposed monolith into 40+ production microservices over 2 years. Eliminated team blocking with independent service deployments. Improved system elasticity with granular scaling per service. Achieved 99.9% uptime through service-level ownership and monitoring. Enabled parallel development across all teams. Reduced the blast radius of the application. Introduced type safety with Go and distroless container security to our infrastructure.

Technologies

Kubernetes, GraphQL Federation, Event-Driven Architecture, Protocol Buffers, Microservices
06

Data Explorer Slackbot

Context

Getting business insights required knowing SQL, understanding complex data schemas, and having the time to write queries. With 120+ people across the company needing answers - driver experience teams checking market SLAs, executives looking at customer trends - everyone had questions but few could answer them. The data team was swamped with tickets, and one-off questions got ignored because they couldn't be prioritized over product work already in the backlog. Wait times stretched into weeks, so product decisions got made without the numbers. In a culture that required data-driven evidence for every decision, this bottleneck was slowing everything down. By the time data came back, sunk cost fallacy had already kicked in.

Action

I built a Slackbot that opened up data access to everyone in the company. When AWS Knowledgebases turned out to be too limited and buggy, I built custom MCP servers in Go - a choice I stand by for its type safety and performance. The system pulls from our data warehouse plus real-time clickstream and usage data, all running through Genkit for local development, evals, and A/B testing. I worked through the full spectrum of LLM challenges: guardrails and gateways, infinite loops, long response times, tool failures, parsing issues, graceful degradation, retries, MCP connectivity and session management, context engineering, persistent conversations, and prompting techniques like ReAct. I designed it to be modular so other internal automation and API gateways could use the same infrastructure.

Result

Questions that took weeks now get answered in seconds. The bot handles hundreds of conversations a day, generates charts, and goes beyond the initial ask - exploring interesting paths that surface deeper insights and prompting follow-up questions to help people who wouldn't already be thinking about what to ask next. The whole company now asks questions directly while the data team maintains evals and monitors for drift. They shifted from ticket triage to high-value work like machine learning - things they'd much rather be doing. It took longer to build the right way, but the result is faster iterations, automated tooling, and continuous improvement.

Technologies

Go, Genkit, AWS Bedrock, LiteLLM, MCP, Kubernetes, Slack
07

Cloud-Native Platform Transformation

Context

Company operating with manual deployments across 3 environments, experiencing 4+ hour recovery times, 1 hour build times, and 30% deployment failure rate impacting 60+ engineers.

Action

Led migration and creation of 40+ services to Kubernetes orchestration with Helm charts and ArgoCD GitOps. Architected multi-cluster strategy with Istio service mesh for traffic management. Implemented comprehensive observability stack with Prometheus/OTEL metrics, Jaeger distributed tracing, and standardized health checking. Enabled self-service deployments with ArgoCD SSO/RBAC.

Result

Reduced deployment failures from 30% to <2% and MTTR from 4 hours to <10 minutes. Decreased build time by 60%. Saved $500K annually through improved resource utilization.

Technologies

Kubernetes, Helm, ArgoCD, Istio, Jaeger, Prometheus, OpenTelemetry, Kiali
08

Developer Productivity Platform Tooling (Go CLI with Plugin Architecture)

Context

70+ developers losing 5+ hours/week to inconsistent tooling, slow debugging cycles, and 7-day onboarding process hampering growth, productivity, and significantly increasing cognitive load.

Action

Architected and built extensible Go-based DevEx CLI with plugin SDK supporting 50+ commands. Implemented workstation bootstrapping and setup, multi-environment cloud management, and code artifact tooling. Created versioned release pipeline via S3 with auto-update mechanism. Established plugin marketplace with templates enabling teams to contribute domain-specific tools.
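A plugin-style CLI usually comes down to a small command contract plus a registry that dispatches on the first argument. The sketch below illustrates that shape under stated assumptions: `Command`, `Registry`, `FuncCommand`, and the error names are hypothetical and are not the actual plugin SDK, which may load plugins out of process.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical sentinel errors for this sketch.
var (
	ErrUnknownCommand   = errors.New("unknown command")
	ErrDuplicateCommand = errors.New("command already registered")
)

// Command is the contract a plugin implements in this sketch.
type Command interface {
	Name() string
	Run(args []string) (string, error)
}

// Registry maps command names to plugin-provided implementations.
type Registry struct {
	commands map[string]Command
}

// NewRegistry returns an empty registry ready for use.
func NewRegistry() *Registry {
	return &Registry{commands: map[string]Command{}}
}

// Register adds a command and rejects name collisions between plugins.
func (r *Registry) Register(c Command) error {
	if _, exists := r.commands[c.Name()]; exists {
		return fmt.Errorf("%w: %s", ErrDuplicateCommand, c.Name())
	}
	r.commands[c.Name()] = c
	return nil
}

// Dispatch looks up argv[0] and passes the remaining arguments to it.
func (r *Registry) Dispatch(argv []string) (string, error) {
	if len(argv) == 0 {
		return "", ErrUnknownCommand
	}
	c, ok := r.commands[argv[0]]
	if !ok {
		return "", fmt.Errorf("%w: %s", ErrUnknownCommand, argv[0])
	}
	return c.Run(argv[1:])
}

// FuncCommand adapts a plain function into a Command.
type FuncCommand struct {
	CmdName string
	Fn      func(args []string) (string, error)
}

func (f FuncCommand) Name() string                      { return f.CmdName }
func (f FuncCommand) Run(args []string) (string, error) { return f.Fn(args) }

func main() {
	r := NewRegistry()
	err := r.Register(FuncCommand{CmdName: "version", Fn: func([]string) (string, error) {
		return "devex 0.0.0 (example)", nil
	}})
	if err != nil {
		fmt.Println("register:", err)
		return
	}
	fmt.Println(r.Dispatch([]string{"version"}))
}
```

Keeping the contract this small is what lets other teams contribute commands without touching the core binary.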

Result

Reduced onboarding from 7 days to 10 minutes. Saved $500k+ by standardizing workstation tooling, access, and upgrade process. Achieved 100% adoption across engineering with over 80% using it daily. Plugin ecosystem grew to 20+ team-contributed extensions.

Technologies

Go, Plugin Architecture, S3, CLI Tools
09

Unified Identity & Authentication Platform

Context

Monolithic authentication preventing accelerated microservice adoption. Deprecated authentication methods needed to be removed. Fragmented authentication across web and mobile causing security vulnerabilities and preventing enterprise SSO deals worth $5M+ annually.

Action

Built centralized OAuth2 identity service handling all OAuth, Password, OTP, SSO / SCIM workloads. Integrated Ory Hydra for token management and WorkOS for enterprise SSO/SCIM. Implemented secure session management with encrypted tokens. Added i18n support for global expansion. Orchestrated zero-downtime migration using feature flags.
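A flag-gated migration typically moves a stable percentage of users to the new path and widens it over time. The sketch below shows one way to do that with a deterministic hash; it is an assumption about the general technique, not the flagging system actually used. `Bucket`, `Enabled`, and the flag name `new-auth` are hypothetical.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Bucket maps a (flag, user) pair to a stable bucket in [0, 100).
// The same inputs always land in the same bucket, so a user does not
// flip between the old and new auth paths from one request to the next.
func Bucket(flag, userID string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(flag + ":" + userID))
	return h.Sum32() % 100
}

// Enabled reports whether the user falls inside the rollout percentage.
// Raising percent only ever adds users; nobody already enabled is removed.
func Enabled(flag, userID string, percent uint32) bool {
	return Bucket(flag, userID) < percent
}

func main() {
	// Hypothetical flag name and user IDs for illustration.
	for _, user := range []string{"user-1", "user-2", "user-3"} {
		fmt.Printf("%s bucket=%d enabled@25%%=%v\n",
			user, Bucket("new-auth", user), Enabled("new-auth", user, 25))
	}
}
```

Setting the percentage back to 0 acts as an immediate rollback, which is what makes this approach suitable for zero-downtime cutovers.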

Result

Unlocked $5M in enterprise deals requiring SSO. Greatly improved our security posture with the latest authentication methods and best practices. Enabled auto-provisioning and onboarding of enterprise organizations. Enabled single sign-on across all company properties. Decreased auth implementation time for new services from weeks to hours.

Technologies

OAuth2, Hydra, WorkOS, SSO, i18n
10

Unified Observability & Analytics Platform

Context

12 different tracking and monitoring tools scattered across teams. Logs in Datadog, infrastructure metrics in AWS, traces in Grafana/Tempo, application metrics in Grafana/Prometheus, plus overlapping analytics and session recording tools per team. Pages were bloated with redundant scripts, we were double-paying for the same data, and during incidents no one had a complete picture because visibility was fragmented across a dozen dashboards.

Action

Consolidated the entire observability stack down to two platforms: PostHog for user analytics with flexible dashboards and webhook integrations, and Coralogix for infrastructure visibility covering traces, metrics, logs, security monitoring, and OpenTelemetry instrumentation across both application and AI workloads.

Result

Reduced tooling costs by 20% and cut inter-AZ replication spend by 60% by switching from collect-process-duplicate to collect-and-send. Eliminated significant engineering hours previously spent maintaining self-hosted monitoring infrastructure. Product POC automation dropped from 2-3 weeks to 1 hour through PostHog's integration capabilities. Teams now have unified visibility instead of siloed dashboards.

Technologies

PostHog, Coralogix, OpenTelemetry
11

Implementation of a Product Operating System

Context

Product and growth teams making decisions with limited data, running untracked experiments, unable to track success metrics for features being rolled out, and taking 6+ weeks to validate hypotheses, losing competitive advantage. Siloed teams (sales, marketing, product) working from different datasets.

Action

Implemented comprehensive Product OS integrating analytics, experimentation, A/B testing, and session recording across web and mobile. Deployed PostHog for product analytics with custom dashboards and cohort exports. Trained others on self-service experimentation platform with feature flags and A/B tests. Integrated data pipelines and webhooks with custom workflows.

Result

Reduced hypothesis validation from 4 weeks to 1 day. Improved feature auditability and visibility through a centralized dashboard and audit trail. Enabled cross-functional collaboration between marketing, product, and sales.

Technologies

PostHog, Feature Flags, Analytics, Product Management, A/B Testing
12

On-Demand Environment Platform

Context

Shared staging environment causing daily conflicts between 15+ teams. Data consistency problems and feature flag collisions blocked releases, created several-day testing bottlenecks for critical features, and decreased confidence in work being rolled out.

Action

Architected self-service platform for ephemeral EKS environments provisioned via CI/CD. Managed costs by adopting spot-instance and sleeping-environment strategies. Implemented dynamic DNS, ingress routing, and ArgoCD application filtering. Built webhook broadcaster for external event mirroring across environments. Added intelligent resource management with auto-teardown and cost controls.
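The sleep and auto-teardown behavior can be expressed as a small idle-time policy. This is a sketch under assumptions: the `Policy` type, the `Decide` function, and the 2-hour and 72-hour thresholds in `DefaultPolicy` are invented for illustration and are not the platform's actual values.

```go
package main

import (
	"fmt"
	"time"
)

// Action is what the controller should do with an environment.
type Action string

const (
	Keep     Action = "keep"
	Sleep    Action = "sleep"
	Teardown Action = "teardown"
)

// Policy holds the idle thresholds. TeardownAfter is expected to be
// greater than or equal to SleepAfter.
type Policy struct {
	SleepAfter    time.Duration
	TeardownAfter time.Duration
}

// DefaultPolicy uses made-up thresholds for the example.
var DefaultPolicy = Policy{SleepAfter: 2 * time.Hour, TeardownAfter: 72 * time.Hour}

// Decide maps how long an environment has been idle to an action:
// scale it down once it has been idle for SleepAfter, and delete it
// once it has been idle for TeardownAfter.
func Decide(p Policy, idle time.Duration) Action {
	switch {
	case idle >= p.TeardownAfter:
		return Teardown
	case idle >= p.SleepAfter:
		return Sleep
	default:
		return Keep
	}
}

func main() {
	for _, idle := range []time.Duration{30 * time.Minute, 3 * time.Hour, 96 * time.Hour} {
		fmt.Printf("idle %v -> %s\n", idle, Decide(DefaultPolicy, idle))
	}
}
```

Separating the decision from the side effects (scaling node groups, deleting namespaces) keeps the cost policy easy to test and tune.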

Result

Any engineer can run a single-line command to create an isolated environment with production-like data in 2 minutes. Eliminated environment conflicts, saving 30+ hours weekly per team. Reduced feature testing time from 3 days to 2 hours. Enabled parallel testing of 20+ features simultaneously.

Technologies

EKS, CI/CD, ArgoCD, Webhooks, CLI
13

Database Platform Modernization (Neon Branching + Zero-Downtime Migration)

Context

Numerous incidents regarding elasticity of primary db cluster. 10x overprovisioning to account for burst traffic. Inability to properly test features due to the lack of production-like data resulting in $50k+ level incidents every few months.

Action

Led the vetting and adoption of Neon's branching database technology for instant production-like environments. Architected streaming RDS↔Neon replication for zero-downtime migration. Optimized connection pooling, reducing overhead by 60%. Implemented automated backup strategy with point-in-time recovery. Developed an internal Kubernetes operator that lets teams manage database branches with CRDs in their microservices. Led the adoption of VPC endpoints (VPCE) to reduce ingress/egress costs and enhance security.

Result

Reduced database experiment time from days to seconds. Feature testability with production-like data now exists across all environments. Over 40 clusters have point-in-time recovery and instant environment creation.

Technologies

Neon, PostgreSQL, RDS, Database Replication

Contact

Location
Denver Metropolitan Area
Colorado, USA
Current Position
Senior Principal Engineer
@ Dispatch

© 2025 Jonathan Reyes.

Wanting to make a positive impact on the world.