James Z. Zhang

Building production-oriented AI systems with GraphRAG, distributed services, and practical engineering workflows

I build AI systems that go beyond model demos: retrieval pipelines, graph-based reasoning, cloud-native services, deployment workflows, and the supporting system design behind them.

My current work focuses on GraphRAG, AI system design, cloud deployment on Google Cloud Platform, and practical AI-assisted engineering workflows for real-world development.

Current Focus

Three areas of AI systems engineering I’m currently deepening through projects, system design, and implementation work.

End-to-End RAG System Design
Microservices · Cloud Deployment · RAGOps

Designing RAG systems as modular, production-oriented architectures — separating responsibilities across services, defining clear API boundaries, and deploying on cloud infrastructure with CI/CD, observability, and operational resilience in mind.
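As a minimal sketch of the service-boundary idea (all names here, such as `Retriever` and `RetrievalRequest`, are illustrative, not taken from an actual project), the API layer can depend on a typed contract rather than on any particular store:

```python
from dataclasses import dataclass
from typing import Protocol

# Hypothetical request/response types forming the API boundary
# between the retrieval service and its callers.
@dataclass(frozen=True)
class RetrievalRequest:
    query: str
    top_k: int = 5

@dataclass(frozen=True)
class RetrievalResult:
    chunk_id: str
    text: str
    score: float

class Retriever(Protocol):
    """Contract the API layer depends on; any backing store can implement it."""
    def retrieve(self, request: RetrievalRequest) -> list[RetrievalResult]: ...

class InMemoryRetriever:
    """Toy implementation standing in for a vector or graph store service."""
    def __init__(self, corpus: dict[str, str]) -> None:
        self._corpus = corpus

    def retrieve(self, request: RetrievalRequest) -> list[RetrievalResult]:
        # Naive keyword-overlap scoring, a placeholder for real ranking.
        q_terms = set(request.query.lower().split())
        scored = [
            RetrievalResult(cid, text, len(q_terms & set(text.lower().split())))
            for cid, text in self._corpus.items()
        ]
        scored.sort(key=lambda r: r.score, reverse=True)
        return scored[: request.top_k]
```

Because callers only see the `Retriever` protocol, the backing implementation can be swapped or deployed as a separate service without touching the API layer.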

AI Workflow Orchestration & Evaluation
Multi-Agent · Automation · Evaluation

Designing agentic workflows where specialized components coordinate across task boundaries, supported by evaluation pipelines, retrieval benchmarking, and structured memory or context management to keep quality measurable and improvable.
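A toy version of that pattern (the stage names and the `accept` gate are hypothetical, chosen only for illustration) chains specialized components and ends with an evaluation check so quality stays measurable:

```python
from typing import Callable

# Each stage is a small, specialized callable that transforms shared state;
# an evaluator gate decides whether the final output is acceptable.
Stage = Callable[[dict], dict]

def run_workflow(stages: list[Stage], state: dict,
                 accept: Callable[[dict], bool]) -> dict:
    for stage in stages:
        state = stage(state)
    state["accepted"] = accept(state)
    return state

# Example stages: retrieve context, then draft an answer from it.
def retrieve(state: dict) -> dict:
    state["context"] = f"facts about {state['question']}"
    return state

def draft(state: dict) -> dict:
    state["answer"] = f"Based on {state['context']}: ..."
    return state
```

In a real system each stage might be a separate agent or service, and `accept` would be a retrieval benchmark or correctness check rather than a one-line predicate.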

AI Guardrails & Control
Safety · Tool Control · Quality Gates

Defining trust boundaries inside AI systems — controlling tool access, validating information before it crosses service boundaries, and adding human or policy-based gates where autonomous actions carry higher risk.
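A minimal sketch of such a trust boundary (the tool names and sets are invented for illustration) combines an allowlist with an explicit approval flag for higher-risk actions:

```python
# Hypothetical tool-access gate: low-risk tools are allowlisted, higher-risk
# tools require a human sign-off, and everything else is denied by default.
ALLOWED_TOOLS = {"search_docs", "read_ticket"}
NEEDS_APPROVAL = {"send_email", "delete_record"}

def authorize(tool: str, human_approved: bool = False) -> bool:
    if tool in ALLOWED_TOOLS:
        return True
    if tool in NEEDS_APPROVAL:
        return human_approved  # policy gate: explicit sign-off required
    return False  # default-deny anything unknown
```

The default-deny branch is the important part: a tool the system has never seen should fail closed, not open.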

Featured Repositories

A focused set of repositories that reflect my current AI engineering direction, from flagship system building to workflow and architecture documentation.

🧠
graph-rag-finance-assistant

My flagship AI engineering project: a GraphRAG-based financial assistant that combines SEC data ingestion, graph construction, retrieval, and cloud service deployment.

Python · FastAPI · Neo4j · GCP
🧰
operating-system-toolset

Practical shell configurations, workflow helpers, and small operating-system-level tools for daily productivity across macOS, Windows, and Linux.

Zsh · Git · Utilities · Workflow

My Current View of the ML/AI Engineering Workflow

ML/AI engineering is a systems discipline spanning data preparation, retrieval and storage design, model interaction, backend services, evaluation, deployment, and iteration. This section reflects my current understanding of how those layers connect in practice.

1️⃣ Data & Knowledge Preparation
  • Collect, clean, and normalize source data
  • Parse structured and unstructured documents
  • Chunk, enrich, and organize knowledge for retrieval
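The chunking step above can be sketched as a fixed-size sliding window with overlap (the size and overlap defaults are illustrative, not tuned values):

```python
def chunk_text(text: str, size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into fixed-size chunks; overlap preserves context at edges."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start : start + size])
        start += size - overlap
    return chunks
```

Real pipelines usually chunk on semantic boundaries (sections, paragraphs, sentences) rather than raw character counts, but the overlap idea carries over.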
2️⃣ Storage & Retrieval Design
  • Choose database and retrieval strategy
  • Store vector, graph, or structured data appropriately
  • Design query flow for relevance, cost, and speed
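For the vector side of that retrieval strategy, the core ranking primitive is cosine similarity over stored embeddings. A stdlib-only sketch (the toy `store` layout is an assumption, standing in for a real vector database):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query: list[float], store: dict[str, list[float]], k: int) -> list[str]:
    """Rank stored vectors by similarity to the query and keep the top k."""
    return sorted(store, key=lambda doc_id: cosine(query, store[doc_id]),
                  reverse=True)[:k]
```

Production stores replace the linear scan with approximate nearest-neighbor indexes, which is where the relevance/cost/speed trade-off shows up.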
3️⃣ Model Gateway & Prompting
  • Define model access layer and provider strategy
  • Write structured prompts and system instructions
  • Constrain outputs for reliability and downstream use
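One way to sketch the gateway idea (the `ModelProvider` protocol and `EchoProvider` stub are hypothetical, not a real SDK): the rest of the system sees a single narrow interface, and prompts keep instructions and context clearly delimited.

```python
from typing import Protocol

class ModelProvider(Protocol):
    """Minimal gateway contract: the rest of the system sees only this."""
    def complete(self, system: str, user: str) -> str: ...

class EchoProvider:
    """Stub provider for tests; a real one would call a hosted model API."""
    def complete(self, system: str, user: str) -> str:
        return f"[{system}] {user}"

def answer(provider: ModelProvider, question: str, context: str) -> str:
    # Structured prompt: fixed instructions plus clearly delimited context,
    # so outputs stay grounded and easy to handle downstream.
    system = "Answer only from the provided context. Say 'unknown' otherwise."
    user = f"Context:\n{context}\n\nQuestion: {question}"
    return provider.complete(system, user)
```

Swapping providers then means writing one new class, not rewriting every call site.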
4️⃣ Service Architecture
  • Separate search, build, API, and model-facing responsibilities
  • Design service boundaries and communication paths
  • Keep scalability and operational clarity in mind
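An in-process toy of that separation (service and route names are invented for illustration): each service owns one concern, and the gateway only routes, holding no search or build logic itself.

```python
# Hypothetical separation of responsibilities: build and search are distinct
# services, and the API gateway is the only communication path between callers
# and those services.
def build_service(payload: dict) -> dict:
    return {"status": "indexed", "docs": len(payload["docs"])}

def search_service(payload: dict) -> dict:
    return {"status": "ok", "hits": [payload["query"]]}

ROUTES = {"build": build_service, "search": search_service}

def api_gateway(route: str, payload: dict) -> dict:
    """Route requests to the owning service; reject anything unrecognized."""
    if route not in ROUTES:
        return {"status": "error", "reason": "unknown route"}
    return ROUTES[route](payload)
```

In a deployed system the dictionary lookup becomes network routing, but the boundary discipline is the same.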
5️⃣ Evaluation & Debugging
  • Test retrieval quality, correctness, latency, and failure modes
  • Diagnose system issues across data, code, config, and infrastructure
  • Use evidence-based iteration instead of guesswork
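A concrete example of making retrieval quality measurable is recall@k, which many retrieval benchmarks build on:

```python
def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of relevant items that appear in the top-k retrieved results."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)
```

Tracked over a fixed query set, a metric like this turns "the retriever feels worse" into an observable regression.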
6️⃣ Deployment & Operations
  • Deploy services with repeatable workflows
  • Manage secrets, runtime configs, and cloud dependencies
  • Balance cost, capability, and operational simplicity
7️⃣ Iteration & System Growth
  • Improve architecture as constraints become clearer
  • Add reliability, caching, and scaling only when justified
  • Let real workloads shape the next design decisions