Machine Learning Engineer

Building smaller,
smarter models.

I specialize in fine-tuning and deploying Small Language Models — making AI efficient, private, and production-ready. Currently at wAI Advanced Industries, Islamabad.

View Projects ↓ GitHub
About

A bit about me

I'm a Machine Learning Engineer based in Islamabad, Pakistan, with a deep interest in making AI smaller, more efficient, and deployable at the edge. I believe the future of AI isn't just in scaling up — it's in making powerful models accessible everywhere.

My work centers on fine-tuning Small Language Models (270M–3B parameters), optimizing them through quantization, and deploying them for real-world enterprise use. I build full pipelines — from data preparation and training to evaluation and production serving.

Beyond my core ML work, I develop agentic AI systems using LangChain and LangGraph, build robust backends with FastAPI, and work with RAG systems to create intelligent, context-aware applications.

Education
BS Information Technology
The Islamia University of Bahawalpur · 2021 – 2025
Current Role
ML Engineer
wAI Advanced Industries · Nov 2025 – Present
Focus Areas
SLMs · Quantization · Edge AI
Making AI efficient, private, and production-ready
Experience

Where I've worked

Building production ML systems and fine-tuning pipelines.
Nov 2025 – Present
Machine Learning Engineer
wAI Advanced Industries · Islamabad
Fine-tuning Small Language Models for enterprise products. Building automated evaluation pipelines, deploying models to production servers, and engineering FastAPI backends that orchestrate ML workflows end-to-end. Focused on reducing inference latency and memory footprint.
Jul 2025 – Sep 2025
AI Engineer — Intern
Nexus Technologies
Worked on applied AI projects, gaining hands-on experience with model training workflows, data pipeline design, and foundational ML engineering practices.
Projects

Selected work

Real-world ML systems I've built and shipped.
🧠
Alara SLM Training
Multi-Agent Swarm Enablement
Fine-tuned SLMs (270M–3B) for the Alara Multi-Agent Swarm architecture. Trained task-specific models for sentiment analysis, structured output, and reasoning, quantized to 4-bit for efficient deployment.
PyTorch Unsloth Quantization Multi-Agent
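To illustrate the core idea behind the 4-bit quantization used here (the actual training relied on Unsloth/bitsandbytes; this is a minimal absmax-int4 sketch, not the production code):

```python
def quantize_4bit(weights, block_size=64):
    """Absmax 4-bit quantization: map each block of floats to signed ints in [-7, 7]."""
    quantized, scales = [], []
    for start in range(0, len(weights), block_size):
        block = weights[start:start + block_size]
        scale = max(abs(w) for w in block) or 1.0  # per-block scaling factor
        scales.append(scale)
        quantized.append([round(w / scale * 7) for w in block])
    return quantized, scales

def dequantize_4bit(quantized, scales):
    """Recover approximate float weights from 4-bit codes and per-block scales."""
    return [q / 7 * s for block, s in zip(quantized, scales) for q in block]
```

Schemes like NF4 replace the uniform grid with a non-uniform codebook, but the storage win is the same: one 4-bit code per weight plus one scale per block.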
🌍
Multilingual Sentiment Engine
Gemma 270M · 4 Languages · 89% Accuracy
Built a multilingual sentiment model covering English, Urdu, Arabic, and Roman Urdu. Achieved 4× inference speedup through quantization while maintaining accuracy.
Gemma Unsloth Multilingual NLP
📋
Structured Output LLM
Schema-Valid JSON Generation
Fine-tuned Gemma 270M to reliably produce schema-valid JSON outputs for automation workflows. Designed end-to-end dataset transformation and validation pipelines.
Gemma JSON Schema Data Pipelines
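The validation side of this project can be sketched with a simple check of model output against a required schema (the schema below is hypothetical, for illustration only):

```python
import json

# Hypothetical schema for illustration: required keys and their expected types.
SCHEMA = {"intent": str, "priority": int, "entities": list}

def validate_output(raw: str) -> dict:
    """Parse model output and check it against the schema; raise on any violation."""
    data = json.loads(raw)  # raises json.JSONDecodeError on malformed JSON
    for key, expected in SCHEMA.items():
        if key not in data:
            raise ValueError(f"missing key: {key}")
        if not isinstance(data[key], expected):
            raise ValueError(f"wrong type for {key}: expected {expected.__name__}")
    return data
```

Rejecting invalid generations at this gate is what makes the fine-tuned model safe to wire into automation workflows.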
⚙️
Automated Fine-Tuning System
FastAPI Backend · Full Pipeline
Engineered a FastAPI service that ingests JSONL datasets, triggers configurable fine-tuning pipelines, and generates evaluation reports with version tracking.
FastAPI JSONL MLOps
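The JSONL ingestion step can be sketched as follows (field names like "prompt"/"response" are assumptions here, not the service's actual record format):

```python
import json

def load_jsonl(lines, required=("prompt", "response")):
    """Parse an iterable of JSONL lines into training records, skipping bad rows.

    Returns (records, errors) so the pipeline can report ingestion quality.
    """
    records, errors = [], []
    for lineno, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        try:
            row = json.loads(line)
        except json.JSONDecodeError as exc:
            errors.append((lineno, str(exc)))
            continue
        if not all(key in row for key in required):
            errors.append((lineno, f"missing required keys: {required}"))
            continue
        records.append(row)
    return records, errors
```

Collecting per-line errors instead of failing fast lets the evaluation report show exactly which dataset rows were dropped and why.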
🔗
Reasoning-Enhanced Model
Llama 3.2 3B · Chain-of-Thought
Improved logical inference through Chain-of-Thought fine-tuning on Llama 3.2 3B. Deployed via Llama.cpp for structured reasoning evaluation.
Llama 3.2 CoT Llama.cpp
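Chain-of-Thought fine-tuning supervises the model on intermediate reasoning, not just the final answer. A minimal sketch of formatting one such training example (the template is hypothetical, not the project's actual prompt format):

```python
# Hypothetical CoT training template, for illustration only.
COT_TEMPLATE = (
    "Question: {question}\n"
    "Let's think step by step.\n"
    "{reasoning}\n"
    "Answer: {answer}"
)

def format_cot_example(question, reasoning_steps, answer):
    """Join intermediate reasoning steps into one supervised target string."""
    reasoning = "\n".join(f"{i}. {step}" for i, step in enumerate(reasoning_steps, 1))
    return COT_TEMPLATE.format(question=question, reasoning=reasoning, answer=answer)
```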
🔍
Research Assistant System
LangChain + LangGraph + arXiv
Developed a multi-step retrieval assistant that synthesizes information from arXiv and other research sources using contextual memory and LangGraph orchestration.
LangChain LangGraph RAG
Skills

Tech stack

Tools and technologies I work with daily.
⚙️
ML & LLM Engineering
PyTorch scikit-learn Unsloth LLaMA-Factory Gemma Qwen Llama 4-bit Quantization Chain-of-Thought Multi-task Learning
🛠️
Backend & Pipelines
Python FastAPI REST APIs JSONL Pipelines Gradio LangChain LangGraph
🚀
Deployment & DevOps
GGUF Conversion Ollama Llama.cpp Hugging Face Spaces Docker AWS EC2 Git Ubuntu Linux
📊
Data & Analysis
Pandas NumPy Matplotlib Seaborn Multilingual Datasets
🤖
AI Systems
RAG Systems MCP Agentic Frameworks Multi-Agent Swarms Structured Outputs
🎯
Specializations
Small Language Models Edge AI Model Optimization On-device Inference Privacy-focused AI
Blog

Writing & thoughts

Notes on ML engineering, small models, and building in public.
Coming Soon
Why Small Language Models Are the Future of Enterprise AI
Exploring why efficiency, privacy, and on-device deployment matter more than parameter counts for real-world applications.
Read →
Coming Soon
A Practical Guide to 4-bit Quantization
How to shrink models by 4× without destroying accuracy — the tools, tradeoffs, and techniques I use daily.
Read →
Coming Soon
Building Automated Fine-Tuning Pipelines with FastAPI
From JSONL ingestion to evaluation reports — how to build a self-serve fine-tuning system for your team.
Read →
Contact

Let's connect

Open to collaborations, freelance work, and interesting conversations about AI.