open to ML / SWE / LLM roles · Sydney, Australia

anyesh

Anish Shrestha

Machine learning & software engineer. Building LLM infrastructure.

focus
ML engineering · SWE · LLM systems · Rust · Python · agent infrastructure
open to
ML / SWE / LLM-engineering roles · Sydney AU and remote
reach
sir.anishshrestha@gmail.com

projects

  1. EVOKE eviction demo on Qwen 2.5: a fact planted at turn 1 survives 40 evictions and 13 recoveries and is recalled at turn 14
    fig. 01 · 14-turn session under a 1024-token budget; the fact planted at turn 1 is recalled at turn 14.

    EVOKE

    Long-running agent sessions outgrow the physical KV cache within a few turns. EVOKE evicts low-relevance blocks under budget pressure and recovers them recompute-free through a custom save/restore primitive in a forked llama.cpp.

    evictions survived
    40
    recoveries
    13
    vs re-prefilling
    20–32× faster
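
    The evict-then-recover loop described above can be sketched in a few lines. This is a toy model under stated assumptions, not EVOKE's code: `Block`, `BudgetedCache`, and the per-block `relevance` score are hypothetical names, the "saved" store stands in for the save/restore primitive, and no real KV tensors or llama.cpp calls are involved.

```python
from dataclasses import dataclass


@dataclass
class Block:
    block_id: int
    tokens: int
    relevance: float  # hypothetical score: higher means "keep resident"


class BudgetedCache:
    """Toy token-budgeted cache: evict lowest relevance, recover from a saved store."""

    def __init__(self, budget_tokens):
        self.budget = budget_tokens
        self.resident = {}  # block_id -> Block currently inside the budget
        self.saved = {}     # evicted blocks, kept so recovery needs no recompute
        self.evictions = 0
        self.recoveries = 0

    def used(self):
        return sum(b.tokens for b in self.resident.values())

    def _make_room(self, needed):
        # Evict the least relevant resident block until the new block fits.
        while self.resident and self.used() + needed > self.budget:
            victim = min(self.resident.values(), key=lambda b: b.relevance)
            self.saved[victim.block_id] = self.resident.pop(victim.block_id)
            self.evictions += 1

    def add(self, block):
        self._make_room(block.tokens)
        self.resident[block.block_id] = block

    def recall(self, block_id):
        if block_id in self.resident:
            return self.resident[block_id]
        block = self.saved.pop(block_id)  # KeyError if the block was never stored
        self._make_room(block.tokens)
        self.resident[block_id] = block
        self.recoveries += 1
        return block


cache = BudgetedCache(budget_tokens=1024)
cache.add(Block(0, 256, relevance=0.1))      # the "planted fact"
for i in range(1, 5):
    cache.add(Block(i, 256, relevance=0.5))  # later turns push the fact out
fact = cache.recall(0)                       # recovered from the saved store
```

    In this toy run the planted block is evicted once under budget pressure and comes back on recall, at the cost of evicting another block to stay within the 1024-token budget.
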
  2. incremental inserts vs from-scratch
    100–236× faster
    propagation through function DAGs
    ~135 ns/node
    collection operators
    9
    fig. 02 · incr benchmarks, measured with criterion.

    incr

    A Rust library that tracks dependencies between computations automatically and reruns only what a change actually affects. One engine, two surface crates: single-threaded with zero atomic-fence cost, or Send + Sync with lock-free reads. Same API, one-line swap.
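
    The idea of automatic dependency tracking can be illustrated with a small single-threaded sketch. It is written in Python for brevity and is not incr's API: `Runtime`, `Cell`, and `Derived` are hypothetical names, and the sketch never prunes stale dependency edges, so it may over-invalidate where a real engine would not.

```python
class Runtime:
    def __init__(self):
        self.stack = []  # derived values currently being computed


class Cell:
    """An input value. Reads made during a computation are recorded automatically."""

    def __init__(self, rt, value):
        self.rt, self.value, self.dependents = rt, value, set()

    def get(self):
        if self.rt.stack:
            self.dependents.add(self.rt.stack[-1])
        return self.value

    def update(self, value):
        if value != self.value:
            self.value = value
            for d in list(self.dependents):
                d.invalidate()


class Derived:
    """A computed value that reruns only after something it read has changed."""

    def __init__(self, rt, fn):
        self.rt, self.fn = rt, fn
        self.dependents = set()
        self.dirty, self.value, self.runs = True, None, 0

    def invalidate(self):
        if not self.dirty:
            self.dirty = True
            for d in list(self.dependents):
                d.invalidate()

    def get(self):
        if self.rt.stack:
            self.dependents.add(self.rt.stack[-1])
        if self.dirty:
            self.rt.stack.append(self)
            try:
                self.value = self.fn()
            finally:
                self.rt.stack.pop()
            self.dirty = False
            self.runs += 1
        return self.value


rt = Runtime()
a, b, c = Cell(rt, 1), Cell(rt, 2), Cell(rt, 10)
sum_ab = Derived(rt, lambda: a.get() + b.get())
double_c = Derived(rt, lambda: c.get() * 2)
total = Derived(rt, lambda: sum_ab.get() + double_c.get())

first = total.get()   # computes all three derived values
a.update(5)           # dirties sum_ab and total, but not double_c
second = total.get()  # reruns only sum_ab and total
```

    After the update, `double_c` is served from its cached value because nothing it read has changed.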

  3. Claude Code · Cursor → knowledge graph (KuzuDB · BGE-small) → MCP → any agent

    reinforces · contradicts · supersedes

    fig. 03 · a single knowledge graph ingesting every tool's history, served back over MCP.

    second-brain

    Claude Code and Cursor each keep conversation history locked in their own format. second-brain ingests them all into one graph-backed store, embeds everything for semantic search, and serves it over MCP so any agent can recall what was discussed, decided, and built.

  4. key = blake3(
        command  + cwd
      + hash(src/lib.rs) + hash(src/main.rs) + hash(Cargo.lock) + ...
      + env(RUSTFLAGS, CARGO_TARGET_DIR)
    )
    fig. 04 · a hit is a proof (inputs identical), not a gamble: 23.3% tool-time saved on real agent runs, 99.97% LLM latency saved on hits.

    verdant

    Like make: if the inputs haven't changed, the output can't have changed. Verdant keys agent tool calls and LLM completions on actual content and returns exact bytes from prior executions.
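
    A content-addressed key in the spirit of fig. 04 might look like the sketch below. It is an illustration under assumptions, not Verdant's implementation: `cache_key` and its parameters are hypothetical, and `hashlib.blake2b` from the Python standard library stands in for blake3, which is not in the standard library.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def _feed(h, *parts):
    # Length-prefix every field so ("ab", "c") and ("a", "bc") hash differently.
    for p in parts:
        h.update(len(p).to_bytes(8, "big"))
        h.update(p)


def cache_key(command, cwd, input_files, env_names, env=None):
    """Hash the command, cwd, input file contents, and selected env vars."""
    env = os.environ if env is None else env
    h = hashlib.blake2b(digest_size=32)  # stand-in for blake3
    _feed(h, b"command", command.encode())
    _feed(h, b"cwd", str(cwd).encode())
    for path in sorted(input_files):
        content_hash = hashlib.blake2b(Path(path).read_bytes(), digest_size=32).digest()
        _feed(h, b"file", str(path).encode(), content_hash)
    for name in sorted(env_names):
        _feed(h, b"env", name.encode(), env.get(name, "").encode())
    return h.hexdigest()


# Usage: identical inputs give an identical key; editing a source file changes it.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "lib.rs"
    src.write_text("fn main() {}")
    env = {"RUSTFLAGS": ""}
    k1 = cache_key("cargo build", tmp, [src], ["RUSTFLAGS"], env)
    k2 = cache_key("cargo build", tmp, [src], ["RUSTFLAGS"], env)
    src.write_text('fn main() { println!("hi"); }')
    k3 = cache_key("cargo build", tmp, [src], ["RUSTFLAGS"], env)
```

    Here `k1` and `k2` match because every hashed input is byte-identical, while `k3` differs because the file content changed. A cache keyed this way can only be trusted to the extent that every input affecting the output is included in the key.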

  5. snap → AI tags → daily outfits

    self-hosted · any OpenAI-compatible API · OIDC

    fig. 05 · works great on a Raspberry Pi 5.

    wardrowbe

    Self-hosted wardrobe management with AI-powered outfit recommendations: photos in, AI tagging, daily suggestions by weather and occasion. Your data stays on your hardware.

all projects, indexed

name               kind         started  status       stack
EVOKE              oss          2025     in-progress  Python · Rust · C++ · llama.cpp fork · CUDA · Qwen
incr               oss          2023     shipped      Rust · Cargo · proptest · criterion
second-brain       oss          2025     active       Rust · KuzuDB · BGE-small · MCP
verdant            oss          2025     active       Rust · blake3 · MCP
cognitive-cache    oss          2024     shipped      Python · scikit-learn · networkx · Hypothesis
wardrowbe          oss          2024     shipped      Next.js · TypeScript · FastAPI · Python · PostgreSQL · Redis · Docker · Ollama
memories-for-llms  oss          2025     in-progress  Python · SQLite · QLoRA · unsloth · Qwen
wardrowbe.com      proprietary  2024     active       Next.js · TypeScript · FastAPI · Python · PostgreSQL · iOS · Android · Stable Diffusion
Certus             oss          2024     shipped      Python · Qwen 2.5 Coder · QLoRA · Hypothesis · unsloth
skillprobe         oss          2025     active       Python · Claude Code · Cursor
eon                side         2024     active       Rust · Python · NEAT · Godot · WebSocket
art_gan            side         2020     archived     Python · GAN
dreamery           proprietary  2023     shipped      SvelteKit · ComfyUI · RunPod · DeepFace · Stripe · PostgreSQL · microservices

work history

  1. 2022-05 – now

    Senior Software Engineer

    AlayaCare · active

    Backend and product engineering on a cloud-based SaaS for home care. Architected the Support at Home reconciliation pipeline, built a third-party-API testing harness, and now lead AI coding-agent adoption as the Australian engineering team's AI champion.

    • Support at Home (2025–2026) — architected and built the entire reconciliation engine and pipeline for Services Australia's new Support at Home funding model (the successor to Home Care Packages).
    • Testing harness (2025–2026) — Services Australia doesn't provide sandbox or test endpoints, so I architected and built our own: a mock / monkey-patching layer that suppresses third-party API calls in-app and lets us run the rest of the system end-to-end against fixtures.
    • AI champion, Australian engineering team (2026) — proposed and built an end-to-end Cypress test-generation framework on top of Cursor's hooks, skills, and commands, then drove its adoption across the team. Given a PRD or Jira ticket, the LLM explores the web app through browser tools, finds useful selectors, maintains memories and navigation indexes, and writes Cypress specs in nearly one shot.
    • Engineered HCP invoicing, claiming, smart reconciliation, and budget management. 30% reduction in customer support tickets.
    • Shipped the auto travel time feature in the scheduling and payroll workflow. 25% efficiency gain for clients.
    • Pioneered a core product delivery team driving technical design and strategic value for Australian clients. 100% compliance with Australian standards.
    Python · TypeScript · React · Django · PostgreSQL · Celery · Redis · GCP · Cursor · Cypress
  2. 2020-02 – 2021-11

    Machine Learning Engineer

    Fusemachines

    ML engineer across three Fusemachines clients: TIME Magazine, Hospital for Special Surgery, and Fuse AI. Pipelines, OLAP consolidation, viral-content prediction, implant supply chain optimization, and ML curriculum.

    • TIME Magazine — streamlined data pipelines for brand audits, insights, and strategic recommendations. Consolidated scattered data into a single OLAP system.
    • TIME Magazine — collaborated with the Data VP and a Ph.D. on a viral-content prediction ML pipeline. 90% accuracy.
    • TIME Magazine — automated reporting systems. 75% reduction in report generation time.
    • Hospital for Special Surgery — ML-driven implant supply chain optimization. Analyzed 15,000+ cases, predicted patient implant needs at over 90% accuracy.
    • Fuse AI — restructured and rebuilt the best ML and deep learning academic courses taught in Fuse Classroom to tens of thousands of students worldwide. Implemented core ML and NLP algorithms at the maths level to explain mechanics.
    Python · PyTorch · TensorFlow · Sklearn · GCP · Vertex AI · Stable Diffusion

credentials

education

  • Bachelor of Science in Computer Science
    Lord Buddha Education Foundation (APU) · Kathmandu, Nepal
    2016 – 2019

speaking

  • 2024-06
    Google I/O Extended
    "What does the future of AI hold for us"
    Panelist · NSW Teachers Federation Conference Centre, Sydney
  • 2024-05
    Google Developer Student Clubs — Google Labs
    "Leveraging open-source text-to-image generative AI for practical applications"
    Speaker · Google HQ, Sydney
  • 2024-04
    Google Developer Group — Google Cloud
    "Dreamery: generative-AI on GCP. Distributed services, serverless GPU, 90% cost reduction"
    Speaker · Google Developer Group, Sydney
  • 2020-11
    Fuse AI Training
    "End-to-end ML pipeline development and deployment workshop"
    AI Instructor · Fusemachines