Alex Peczon

Hello, I'm Alex

Enjoyer of Software Development, Data Science and Simulations

Side Projects I like

Dive deep into my projects — click to explore the depths of each creation.

NextSteamGame
Live

nextsteamgame.com

Gameplay‑first game recommender using ETL + vector similarity. 20k+ titles, user‑weighted tags.

Go · FastAPI · HTMX · SQLite3
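
The card above does not say how the similarity is computed. As a minimal sketch of the idea, assuming cosine similarity over per-game tag vectors scaled by user-chosen weights, it might look like this. The function names, tag names, and sample catalog are all hypothetical, not taken from the NextSteamGame code.

```python
import math


def weighted_cosine(a, b, weights):
    """Cosine similarity of two tag vectors after scaling each tag by a user weight."""
    dot = norm_a = norm_b = 0.0
    for tag in set(a) | set(b):
        w = weights.get(tag, 1.0)  # tags the user did not weight stay neutral
        x = a.get(tag, 0.0) * w
        y = b.get(tag, 0.0) * w
        dot += x * y
        norm_a += x * x
        norm_b += y * y
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / math.sqrt(norm_a * norm_b)


def recommend(target, catalog, weights, k=3):
    """Return the k catalog titles whose tags are most similar to the target's."""
    scored = [(weighted_cosine(target, tags, weights), title)
              for title, tags in catalog.items()]
    scored.sort(reverse=True)
    return [title for _, title in scored[:k]]


# Hypothetical catalog: title -> {tag: strength}
catalog = {
    "Game A": {"roguelike": 1.0, "deckbuilder": 0.8},
    "Game B": {"roguelike": 0.9, "platformer": 0.7},
    "Game C": {"city-builder": 1.0, "strategy": 0.6},
}
target = {"roguelike": 1.0, "deckbuilder": 0.5}
user_weights = {"deckbuilder": 2.0}  # this user cares most about deckbuilding

print(recommend(target, catalog, user_weights, k=2))  # ['Game A', 'Game B']
```

Raising a tag's weight pulls games that share that tag closer to the target, which is one simple way to make recommendations follow gameplay preferences.
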
Antidote Intelligence
Open Source

Open source tool to help make LLMs more secure by detecting data poisoning. Read the blog post →

OpenAI GPT-4 · 6-Agent Pipeline · ML Security
Dreamville
In progress

Gamified Canvas LMS tracker. Earn coins by completing tasks; urgency scored via regression.

Godot · Go · Canvas API
Maldemic Simulator

Real‑time SIRD with Markov chains + grant‑funded NN work; 3D globe viz in Godot.

Godot · NumPy · SciPy
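
For readers unfamiliar with SIRD models, here is a minimal sketch of a discrete-time Markov chain version, assuming each person moves independently between Susceptible, Infected, Recovered, and Dead on every tick. The rates `beta`, `gamma`, and `mu` and the population sizes are illustrative values, not the simulator's real parameters, and the sketch uses only the standard library rather than NumPy/SciPy.

```python
import random


def step(counts, beta, gamma, mu, rng):
    """Advance the (S, I, R, D) counts by one tick of the Markov chain."""
    s, i, r, d = counts
    alive = s + i + r
    # Per-person infection probability grows with the infected share.
    p_infect = beta * i / alive if alive else 0.0
    new_infected = sum(rng.random() < p_infect for _ in range(s))
    new_recovered = new_dead = 0
    for _ in range(i):
        u = rng.random()
        if u < gamma:            # I -> R
            new_recovered += 1
        elif u < gamma + mu:     # I -> D
            new_dead += 1
    return (s - new_infected,
            i + new_infected - new_recovered - new_dead,
            r + new_recovered,
            d + new_dead)


def simulate(counts, steps, beta=0.3, gamma=0.1, mu=0.01, seed=0):
    """Run the chain for a number of ticks and return every intermediate state."""
    rng = random.Random(seed)
    history = [counts]
    for _ in range(steps):
        counts = step(counts, beta, gamma, mu, rng)
        history.append(counts)
    return history


history = simulate((990, 10, 0, 0), steps=50)
print(history[-1])  # final (S, I, R, D) counts
```

Because every transition only moves people between compartments, the total population stays constant, which makes a handy sanity check for any variant of the model.
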
USF Search Engine Crawler

High-performance concurrent crawler with 300 extract workers + 300 DB workers. Batch processing (150 docs/batch) with 9000-job queue buffer for lightning-fast website indexing.

Go · SQLite
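
The worker layout described above can be sketched as two thread pools joined by bounded queues. The real crawler is written in Go; this Python version is only an illustration, with threads standing in for goroutines and `queue.Queue` for buffered channels. The names `crawl`, `extract`, and `write_batch` are hypothetical, and the defaults simply mirror the numbers quoted in the card.

```python
import queue
import threading

_STOP = object()  # sentinel telling a worker to shut down


def extract_worker(jobs, docs, extract):
    """Pull URLs off the job queue and push extracted documents downstream."""
    while True:
        url = jobs.get()
        if url is _STOP:
            break
        docs.put(extract(url))


def db_worker(docs, batch_size, write_batch):
    """Collect documents into batches and hand each full batch to the writer."""
    batch = []
    while True:
        doc = docs.get()
        if doc is _STOP:
            break
        batch.append(doc)
        if len(batch) >= batch_size:
            write_batch(batch)
            batch = []
    if batch:  # flush the final partial batch
        write_batch(batch)


def crawl(urls, extract, write_batch,
          n_extract=300, n_db=300, batch_size=150, buffer_size=9000):
    jobs = queue.Queue(maxsize=buffer_size)  # bounded buffer gives backpressure
    docs = queue.Queue(maxsize=buffer_size)
    extractors = [threading.Thread(target=extract_worker, args=(jobs, docs, extract))
                  for _ in range(n_extract)]
    writers = [threading.Thread(target=db_worker, args=(docs, batch_size, write_batch))
               for _ in range(n_db)]
    for t in extractors + writers:
        t.start()
    for url in urls:
        jobs.put(url)
    for _ in extractors:  # one stop signal per extract worker
        jobs.put(_STOP)
    for t in extractors:
        t.join()
    for _ in writers:     # extractors are done, so writers can drain and stop
        docs.put(_STOP)
    for t in writers:
        t.join()


# Small demo with stand-in functions instead of real HTTP and SQLite calls.
written = []
written_lock = threading.Lock()


def fake_write(batch):
    with written_lock:
        written.append(list(batch))


crawl([f"page-{n}" for n in range(20)], str.upper, fake_write,
      n_extract=4, n_db=2, batch_size=5)
print(sum(len(b) for b in written))  # 20
```

Writing in batches means one database transaction covers many documents, and the bounded queues keep a fast extractor pool from outrunning the writers.
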
Hyper Rosen

3D galaxy toy with Perlin fields and vector‑field enemies in Godot.

Godot · Perlin
NutriFinder App

Dietary search over restaurant menus; Flask API + React client.

React · Flask
Old Man Climbs

Mini jam game built in a weekend for UC Merced GDC.

Godot · Game Jam
Spiral Visualizer

Queue‑driven spiral plotting with Matplotlib.

Python · Matplotlib

Experiences

My voyage through professional waters, charting courses through real‑world challenges.

Alaris Security

Junior Fullstack Engineer
Aug 2025 – Present
San Francisco
  • Designed a scalable data pipeline with Prefect + Airflow, processing 500K+ security events daily and improving analytics reliability by 60%.
  • Resolved critical UI bugs in a React-based security platform, reducing system downtime by 90% and improving user experience for 1000+ enterprise clients.
  • Authored comprehensive platform-wide data flow documentation, enabling leadership to evaluate scaling solutions and reducing onboarding time for new engineers by 50%.
Prefect · Airflow · React · Data Pipelines · Security Platform

Future Tilt (Ecommerce Marketing Agency)

Software Developer
Aug 2025 – Present
San Francisco
  • Built an AI template builder that auto-generates email boilerplates, cutting design prep time by 40%.
  • Developed BigQuery dashboards analyzing 10M+ daily records, improving brand forecasting.
  • Maintained an AWS Lambda + BigQuery alerting pipeline, reducing manual checks by 80%.
  • Automated Trello board updates via Lambda, streamlining campaign tracking for 15+ clients.
  • Deployed ECS Dockerized Lambda for real-time revenue alerts, cutting lag from 24h to <1h.
AI Templates · AWS Lambda · BigQuery · ECS · Docker · Trello API

Future Tilt (Ecommerce Marketing Agency)

Software Development Intern
Jul 2025 – Aug 2025
San Francisco
  • Built microservices with AWS Lambda and BigQuery to take the busywork out of eCommerce and improve sales forecasting.
  • Automated processes in the CRM and set up the BigQuery integration.
AWS · Trello API · Automation

USF MAGIC Lab

NLP Research Assistant
Mar 2025 – Present
San Francisco
  • Built an ETL pipeline (BeautifulSoup) scraping 20k+ news articles for sentiment analysis.
  • Developed spaCy + NetworkX models to map sentiment and reveal bias trends.
Python · spaCy · NetworkX · ETL

iD Tech Camps (Stanford)

Machine Learning Instructor
Jun 2024 – Aug 2024
Stanford, California
  • Taught high school students Python, neural networks, and key tools like NumPy, Pandas, Keras, and ChatGPT through project‑based learning.
  • Rebuilt check‑in/out system using Seaborn heatmaps to optimize traffic flow, improving efficiency by 40%; the system was adopted by 2 other iD Tech camps.
Python · PyTorch · Keras · NumPy · Pandas · Seaborn · Google Workspace · Classroom Instruction · Organization Skills

USF Strategic Enrollment Management

Predictive Analytics / Web Intern
Jul 2024 – Jul 2025
San Francisco
  • Created predictive models that improved enrollment forecasting accuracy by 15%.
  • Developed semantic search + Pandas system, reducing record reconciliation time by 50%.
  • Automated website updates with Python + Jinja2, cutting update time from hours to minutes.
SQL · Pandas · OpenAI · Jinja2

UC Merced SATAL

Data Research Analyst
Jun 2023 – Sep 2024
California
  • Designed and conducted statistical analysis on 50+ classroom feedback surveys for educational improvement.
  • Built ML classification system using LLMs + TensorFlow, achieving 99% accuracy in response categorization.
  • Processed complex XML-based Qualtrics survey data and created automated reporting systems for faculty.
  • Applied NLP techniques to extract insights from open-ended survey responses.
PyTorch · LLM · Flask

Candle Stories

Production Assistant
Apr 2025 – Aug 2025 · Completed
San Francisco
  • Supported on‑set operations and equipment handling across documentary shoots.
Production · Logistics

Acme Builders Incorporated

Data Analyst
May 2021 – Dec 2024
Oakland
  • Started as a construction laborer, then moved into business logistics.
  • Built business data systems in Python using NumPy and Pandas to clean and organize records across departments.
  • Organized, updated, and archived company records to support accurate data management.
Data Maintenance · Business Data Management · Google Workspace · Data Control

Engineering Blog

Latest insights from AI security and data science research.

Demoing Data Poisoning Detection at Continue DX

Latest Post • December 2024 • ML Security Research Update

Recently had the opportunity to demo Antidote Intelligence at Continue DX, showcasing our content-aware data poisoning detection system to industry professionals. The response was incredibly encouraging, with several attendees expressing interest in the practical applications for enterprise ML pipelines.

Current Outreach Efforts

I'm actively reaching out to vector embedding marketing vendors and MIT researchers to explore collaboration opportunities around data quality assessment. The core methodology we've developed—using AI agents for hypothesis generation and systematic validation—has broader applications beyond just poisoning detection.

Key Technical Innovations

  • 6-agent pipeline for comprehensive content analysis
  • Statistical validation using sample size significance algorithms
  • Content-aware filtering that goes beyond metadata
  • Real-time detection for high-stakes ML applications

Industry Impact Potential

The conversations at Continue DX reinforced something important: data quality is the silent crisis in ML. While everyone focuses on model architecture and training techniques, corrupted training data can undermine even the most sophisticated systems. This is especially critical in financial services, healthcare, and autonomous systems where the stakes are highest.

Looking forward to sharing more updates as these partnerships develop. If you're working on data quality challenges or RAG system validation, I'd love to connect and discuss potential applications.

ML Security · Data Quality · Research Collaboration · Industry Demo