James Z. Zhang

Building production-oriented AI systems with GraphRAG, distributed services, and practical engineering workflows

I build AI systems that go beyond model demos: retrieval pipelines, graph-based reasoning, cloud-native services, deployment workflows, and the supporting system design behind them.

My current work focuses on GraphRAG, AI system design, cloud deployment on Google Cloud Platform, and practical AI-assisted engineering workflows for real-world development.

Current Focus

Three areas of AI systems engineering I’m currently deepening through projects, system design, and implementation work.

End-to-End RAG System Design
Microservices · Cloud Deployment · RAGOps

Designing RAG systems as modular, production-oriented architectures — separating responsibilities across services, defining clear API boundaries, and deploying on cloud infrastructure with CI/CD, observability, and operational resilience in mind.
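As a minimal sketch of the service-boundary idea (all names here, such as `Retriever` and `RetrievalRequest`, are illustrative, not taken from an actual project), the API layer can depend on a typed contract rather than on any particular store:

```python
from dataclasses import dataclass
from typing import Protocol

# Hypothetical request/response types forming the API boundary
# between the retrieval service and its callers.
@dataclass(frozen=True)
class RetrievalRequest:
    query: str
    top_k: int = 5

@dataclass(frozen=True)
class RetrievalResult:
    chunk_id: str
    text: str
    score: float

class Retriever(Protocol):
    """Contract the API layer depends on; any backing store can implement it."""
    def retrieve(self, request: RetrievalRequest) -> list[RetrievalResult]: ...

class InMemoryRetriever:
    """Toy implementation standing in for a vector or graph store service."""
    def __init__(self, corpus: dict[str, str]) -> None:
        self._corpus = corpus

    def retrieve(self, request: RetrievalRequest) -> list[RetrievalResult]:
        # Naive keyword-overlap scoring, a placeholder for real ranking.
        q_terms = set(request.query.lower().split())
        scored = [
            RetrievalResult(cid, text, len(q_terms & set(text.lower().split())))
            for cid, text in self._corpus.items()
        ]
        scored.sort(key=lambda r: r.score, reverse=True)
        return scored[: request.top_k]
```

Because callers only see the `Retriever` protocol, the backing implementation can be swapped or deployed as a separate service without touching the API layer.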

AI Workflow Orchestration & Evaluation
Multi-Agent · Automation · Evaluation

Designing agentic workflows where specialized components coordinate across task boundaries, supported by evaluation pipelines, retrieval benchmarking, and structured memory or context management to keep quality measurable and improvable.
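A toy version of that pattern (the stage names and the `accept` gate are hypothetical, chosen only for illustration) chains specialized components and ends with an evaluation check so quality stays measurable:

```python
from typing import Callable

# Each stage is a small, specialized callable that transforms shared state;
# an evaluator gate decides whether the final output is acceptable.
Stage = Callable[[dict], dict]

def run_workflow(stages: list[Stage], state: dict,
                 accept: Callable[[dict], bool]) -> dict:
    for stage in stages:
        state = stage(state)
    state["accepted"] = accept(state)
    return state

# Example stages: retrieve context, then draft an answer from it.
def retrieve(state: dict) -> dict:
    state["context"] = f"facts about {state['question']}"
    return state

def draft(state: dict) -> dict:
    state["answer"] = f"Based on {state['context']}: ..."
    return state
```

In a real system each stage might be a separate agent or service, and `accept` would be a retrieval benchmark or correctness check rather than a one-line predicate.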

AI Guardrails & Control
Safety · Tool Control · Quality Gates

Defining trust boundaries inside AI systems — controlling tool access, validating information before it crosses service boundaries, and adding human or policy-based gates where autonomous actions carry higher risk.
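A minimal sketch of such a trust boundary (the tool names and sets are invented for illustration) combines an allowlist with an explicit approval flag for higher-risk actions:

```python
# Hypothetical tool-access gate: low-risk tools are allowlisted, higher-risk
# tools require a human sign-off, and everything else is denied by default.
ALLOWED_TOOLS = {"search_docs", "read_ticket"}
NEEDS_APPROVAL = {"send_email", "delete_record"}

def authorize(tool: str, human_approved: bool = False) -> bool:
    if tool in ALLOWED_TOOLS:
        return True
    if tool in NEEDS_APPROVAL:
        return human_approved  # policy gate: explicit sign-off required
    return False  # default-deny anything unknown
```

The default-deny branch is the important part: a tool the system has never seen should fail closed, not open.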

Featured Repositories

A focused set of repositories that reflect my current AI engineering direction, from flagship system building to workflow and architecture documentation.

🧠
graph-rag-finance-assistant

My flagship AI engineering project: a GraphRAG-based financial assistant that combines SEC data ingestion, graph construction, retrieval, and cloud service deployment.

Python · FastAPI · Neo4j · GCP
🧰
operating-system-toolset

Practical shell configurations, workflow helpers, and small operating-system-level tools for daily productivity across macOS, Windows, and Linux.

Zsh · Git · Utilities · Workflow

My Current View of the ML/AI Engineering Workflow

ML/AI engineering is a systems discipline spanning data preparation, retrieval and storage design, model interaction, backend services, evaluation, deployment, and iteration. This section reflects my current understanding of how those layers connect in practice.

1️⃣ Data & Knowledge Preparation
  • Collect, clean, and normalize source data
  • Parse structured and unstructured documents
  • Chunk, enrich, and organize knowledge for retrieval
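The chunking step above can be sketched as a fixed-size sliding window with overlap (the size and overlap defaults are illustrative, not tuned values):

```python
def chunk_text(text: str, size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into fixed-size chunks; overlap preserves context at edges."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start : start + size])
        start += size - overlap
    return chunks
```

Real pipelines usually chunk on semantic boundaries (sections, paragraphs, sentences) rather than raw character counts, but the overlap idea carries over.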
2️⃣ Storage & Retrieval Design
  • Choose database and retrieval strategy
  • Store vector, graph, or structured data appropriately
  • Design query flow for relevance, cost, and speed
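For the vector side of that retrieval strategy, the core ranking primitive is cosine similarity over stored embeddings. A stdlib-only sketch (the toy `store` layout is an assumption, standing in for a real vector database):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query: list[float], store: dict[str, list[float]], k: int) -> list[str]:
    """Rank stored vectors by similarity to the query and keep the top k."""
    return sorted(store, key=lambda doc_id: cosine(query, store[doc_id]),
                  reverse=True)[:k]
```

Production stores replace the linear scan with approximate nearest-neighbor indexes, which is where the relevance/cost/speed trade-off shows up.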
3️⃣ Model Gateway & Prompting
  • Define model access layer and provider strategy
  • Write structured prompts and system instructions
  • Constrain outputs for reliability and downstream use
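One way to sketch the gateway idea (the `ModelProvider` protocol and `EchoProvider` stub are hypothetical, not a real SDK): the rest of the system sees a single narrow interface, and prompts keep instructions and context clearly delimited.

```python
from typing import Protocol

class ModelProvider(Protocol):
    """Minimal gateway contract: the rest of the system sees only this."""
    def complete(self, system: str, user: str) -> str: ...

class EchoProvider:
    """Stub provider for tests; a real one would call a hosted model API."""
    def complete(self, system: str, user: str) -> str:
        return f"[{system}] {user}"

def answer(provider: ModelProvider, question: str, context: str) -> str:
    # Structured prompt: fixed instructions plus clearly delimited context,
    # so outputs stay grounded and easy to handle downstream.
    system = "Answer only from the provided context. Say 'unknown' otherwise."
    user = f"Context:\n{context}\n\nQuestion: {question}"
    return provider.complete(system, user)
```

Swapping providers then means writing one new class, not rewriting every call site.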
4️⃣ Service Architecture
  • Separate search, build, API, and model-facing responsibilities
  • Design service boundaries and communication paths
  • Keep scalability and operational clarity in mind
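An in-process toy of that separation (service and route names are invented for illustration): each service owns one concern, and the gateway only routes, holding no search or build logic itself.

```python
# Hypothetical separation of responsibilities: build and search are distinct
# services, and the API gateway is the only communication path between callers
# and those services.
def build_service(payload: dict) -> dict:
    return {"status": "indexed", "docs": len(payload["docs"])}

def search_service(payload: dict) -> dict:
    return {"status": "ok", "hits": [payload["query"]]}

ROUTES = {"build": build_service, "search": search_service}

def api_gateway(route: str, payload: dict) -> dict:
    """Route requests to the owning service; reject anything unrecognized."""
    if route not in ROUTES:
        return {"status": "error", "reason": "unknown route"}
    return ROUTES[route](payload)
```

In a deployed system the dictionary lookup becomes network routing, but the boundary discipline is the same.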
5️⃣ Evaluation & Debugging
  • Test retrieval quality, correctness, latency, and failure modes
  • Diagnose system issues across data, code, config, and infrastructure
  • Use evidence-based iteration instead of guesswork
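A concrete example of making retrieval quality measurable is recall@k, which many retrieval benchmarks build on:

```python
def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of relevant items that appear in the top-k retrieved results."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)
```

Tracked over a fixed query set, a metric like this turns "the retriever feels worse" into an observable regression.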
6️⃣ Deployment & Operations
  • Deploy services with repeatable workflows
  • Manage secrets, runtime configs, and cloud dependencies
  • Balance cost, capability, and operational simplicity
7️⃣ Iteration & System Growth
  • Improve architecture as constraints become clearer
  • Add reliability, caching, and scaling only when justified
  • Let real workloads shape the next design decisions