Tags: AI, EU R&D, IoT, Docker, Mobile

HEDGE ExpertAI - AI App Discovery Assistant

Built a HEDGE-IoT App Store discovery assistant with six Dockerized microservices, hybrid vector/BM25/SAREF retrieval, Qdrant, Redis/Celery ingestion, Ollama/Qwen explanations, an embeddable widget, and a companion Flutter mobile client.

Role: AI Systems Architect & Full-Stack Developer
Period: 2026
Funding: HEDGE-IoT Open Call - Topic 15

Services: 6 + 3 infra
Indexed apps: 75
Precision@2: 71.7%

Problem

IoT app stores can become hard to navigate once the catalog grows across energy, buildings, health, manufacturing, environment, water, agriculture, and smart city domains. Keyword search alone misses intent, while generic chat answers are hard to trust unless they are grounded in real catalog metadata.

HEDGE ExpertAI solves that by turning the HEDGE-IoT App Store into a conversational discovery experience: users describe what they need, the system retrieves relevant apps, and the assistant explains recommendations using catalog evidence.

My Role & Responsibilities

  • Designed and implemented the AI discovery and recommendation architecture for HEDGE-IoT Open Call Topic 15
  • Built a six-service Docker Compose system: gateway, chat intent, expert recommendation, discovery ranking, metadata ingestion, and mock App Store API
  • Implemented hybrid app search with vector similarity, BM25-style keyword scoring, and SAREF ontology boosts
  • Integrated Qdrant for vector search, Redis/Celery for async ingestion, and Ollama/Qwen for local LLM explanations
  • Built the React + TypeScript validation console and production embeddable widget for App Store integration
  • Added evaluation scripts, test queries, OpenAPI exports, service documentation, and deployment guides
  • Built a separate Flutter mobile client that consumes the same HEDGE gateway contract for catalog browsing, chat recommendations, feedback, saved apps, and settings

Product Scope

The project is split across two repositories:

  • HEDGE-ExpertAI: the backend, web validation console, embeddable widget, ranking services, ingestion jobs, evaluation framework, and deployment stack
  • HEDGE-Mobile: a standalone Flutter client that keeps mobile release work separate while consuming the gateway endpoints from the backend

The mobile client is intentionally thin. It calls GET /api/v1/catalog/apps, POST /api/v1/chat, and POST /api/v1/feedback through the gateway instead of duplicating backend business logic.
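
As a rough illustration of how thin a gateway client can be, the sketch below builds requests for the catalog and chat endpoints using only the Python standard library. The base URL, the `message` field in the chat payload, and the helper names are assumptions made for this example; the actual request schema is defined by the project's OpenAPI export.

```python
import json
from urllib import request

# Assumed base URL for a local gateway; the real host and port may differ.
GATEWAY_URL = "http://localhost:8080"


def build_catalog_request(base_url: str = GATEWAY_URL) -> request.Request:
    """Build the GET request for the catalog listing endpoint."""
    return request.Request(url=f"{base_url}/api/v1/catalog/apps", method="GET")


def build_chat_request(message: str, base_url: str = GATEWAY_URL) -> request.Request:
    """Build the POST request for the chat endpoint.

    The {"message": ...} payload shape is hypothetical.
    """
    body = json.dumps({"message": message}).encode("utf-8")
    return request.Request(
        url=f"{base_url}/api/v1/chat",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_chat_request("energy monitoring for office buildings")
    print(req.get_method(), req.full_url)
    # To send it against a running gateway:
    # with request.urlopen(req) as resp:
    #     print(json.load(resp))
```

Any client that only builds and sends requests like these holds no ranking or recommendation logic of its own, which is the property the mobile client relies on.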

Architecture

User
  -> Embeddable Widget / React Console / Flutter Mobile
  -> Gateway
  -> Chat Intent
  -> Expert Recommend
  -> Discovery Ranking
  -> Qdrant vector index

Metadata Ingest
  -> App Store API / Mock API
  -> Redis + Celery
  -> Discovery Ranking index update

The ranking layer combines three signals:

  • Vector similarity: semantic matching with 384-dimensional embeddings
  • Keyword relevance: BM25-inspired scoring with domain-aware stopword filtering
  • SAREF alignment: ontology class inference for energy, building, environment, water, agriculture, city, health, and manufacturing
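
One way the three signals could be blended is sketched below. The weights, the stopword list, the keyword-to-domain map, and all function names are illustrative assumptions, not the project's actual values. The keyword score uses only BM25-style term-frequency saturation, without IDF or length normalization, and the toy vectors stand in for the 384-dimensional embeddings.

```python
import math
from collections import Counter

# Hypothetical domain-aware stopwords: generic words that carry little intent.
STOPWORDS = {"a", "an", "the", "for", "of", "and", "app", "iot"}

# Hypothetical keyword map used to infer SAREF domains from free text.
SAREF_KEYWORDS = {
    "energy": {"energy", "power", "electricity"},
    "water": {"water", "leak", "irrigation"},
    "building": {"building", "hvac", "office"},
}


def tokenize(text: str) -> list[str]:
    """Lowercase, split on whitespace, and drop stopwords."""
    return [t for t in text.lower().split() if t not in STOPWORDS]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def keyword_score(query: str, doc: str, k1: float = 1.2) -> float:
    """BM25-inspired term-frequency saturation, normalized to [0, 1)."""
    tf = Counter(tokenize(doc))
    terms = tokenize(query)
    if not terms:
        return 0.0
    saturated = sum(tf[t] * (k1 + 1) / (tf[t] + k1) for t in terms)
    return saturated / (len(terms) * (k1 + 1))


def infer_domains(text: str) -> set[str]:
    """Return the SAREF domains whose keywords appear in the text."""
    tokens = set(tokenize(text))
    return {domain for domain, words in SAREF_KEYWORDS.items() if words & tokens}


def hybrid_score(vector_sim: float, keyword: float, saref_match: bool,
                 w_vec: float = 0.6, w_kw: float = 0.3,
                 saref_boost: float = 0.1) -> float:
    """Weighted sum of the two retrieval signals plus a flat ontology boost."""
    return w_vec * vector_sim + w_kw * keyword + (saref_boost if saref_match else 0.0)


if __name__ == "__main__":
    query = "monitor electricity usage"
    doc = "Smart energy monitoring dashboard for electricity meters"
    match = bool(infer_domains(query) & infer_domains(doc))
    score = hybrid_score(cosine([0.2, 0.9, 0.1], [0.3, 0.8, 0.2]),
                         keyword_score(query, doc), match)
    print(round(score, 3))
```

A flat additive boost keeps the ontology signal from dominating the score: it can reorder close candidates but cannot lift an app that is weak on both retrieval signals.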

The recommendation layer then asks the local LLM to explain the ranked results. A consistency check catches explanations that contradict the ranking order and falls back to deterministic text when needed.
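
A minimal sketch of such a check follows, assuming a simple rule: the explanation must mention the top-ranked app, and any apps it mentions must appear in ranking order. The rule, the function names, and the app names are hypothetical; the real check may use different criteria.

```python
def explanation_is_consistent(explanation: str, ranked_names: list[str]) -> bool:
    """Return True if the text mentions the top app and keeps the ranking order."""
    text = explanation.lower()
    # Position of the first mention of each app, or -1 if it is not mentioned.
    positions = [text.find(name.lower()) for name in ranked_names]
    if not positions or positions[0] == -1:
        return False
    mentioned = [p for p in positions if p != -1]
    return mentioned == sorted(mentioned)


def fallback_text(ranked_names: list[str]) -> str:
    """Deterministic explanation built directly from the ranking."""
    lines = [f"{i}. {name}" for i, name in enumerate(ranked_names, start=1)]
    return "Recommended apps, best match first:\n" + "\n".join(lines)


def explain(llm_text: str, ranked_names: list[str]) -> str:
    """Use the LLM text only when it agrees with the ranking."""
    if explanation_is_consistent(llm_text, ranked_names):
        return llm_text
    return fallback_text(ranked_names)


if __name__ == "__main__":
    ranked = ["GridWatch", "ThermoSense"]  # hypothetical app names
    print(explain("ThermoSense is the best fit; GridWatch follows.", ranked))
```

In the usage example the LLM text promotes the second-ranked app first, so the deterministic fallback is returned instead.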

Tech Stack

  • Backend: Python 3.11, FastAPI, Uvicorn, shared Pydantic models
  • Search: Qdrant, all-MiniLM-L6-v2 embeddings, vector retrieval, BM25-style keyword ranking, SAREF ontology boosts
  • LLM: Ollama with Qwen3.5:2b for recommendation explanations and streaming responses
  • Async jobs: Redis, Celery, metadata ingestion, scheduled catalog freshness workflows
  • Frontend: React, TypeScript, Vite, Tailwind, Framer Motion, production widget assets
  • Mobile: Flutter, gateway API client, Discover/Browse/Details/Saved/Settings screens
  • Infrastructure: Docker Compose, Nginx gateway, health checks, Makefile automation, OpenAPI exports
  • Quality: unit and integration tests, evaluation harness, documented architecture/API/deployment guides

Key Features Delivered

Conversational App Discovery

Users ask for an IoT capability in natural language, such as energy monitoring or predictive maintenance. The system classifies intent, searches the indexed app catalog, and returns ranked recommendations with explanations.

Hybrid Retrieval and SAREF Ranking

The discovery service blends semantic search, keyword relevance, and SAREF ontology class signals. This makes it better suited to IoT domains than a plain vector search or generic chatbot.

Embeddable Widget and Validation Console

The project includes a production widget for App Store integration and a React validation console for internal development, testing, and evaluation.

Mobile Companion App

HEDGE-Mobile adds a Flutter interface for the same backend. It includes a chat-first Discover screen, Browse screen with local filtering, app detail pages with Ask AI handoff, saved app shortlisting, and configurable gateway settings.

Results & Impact

  • Evaluated a 75-app catalog across 69 natural-language queries covering all 8 SAREF domains
  • Achieved 71.7% Precision@2, 90.9% Recall@5, and 0.991 MRR in the documented evaluation run
  • Measured 0.05 s median and 0.08 s P95 hybrid-search latency, well under the 5 s search-path target
  • Delivered a Dockerized stack with six application services plus Ollama, Qdrant, and Redis
  • Kept web, widget, backend, and mobile boundaries clean through a gateway-first API contract
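
For reference, the retrieval metrics reported above can be computed per query as in the sketch below and then averaged over the query set. These are the standard textbook definitions; the project's evaluation scripts may handle edge cases differently, for example queries with fewer than k relevant apps. The app identifiers are placeholders.

```python
def precision_at_k(ranked: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the top-k results that are relevant."""
    return sum(1 for app in ranked[:k] if app in relevant) / k


def recall_at_k(ranked: list[str], relevant: set[str], k: int) -> float:
    """Fraction of all relevant apps that appear in the top-k results."""
    if not relevant:
        return 0.0
    return sum(1 for app in ranked[:k] if app in relevant) / len(relevant)


def reciprocal_rank(ranked: list[str], relevant: set[str]) -> float:
    """1 / rank of the first relevant result, or 0.0 if none is found."""
    for rank, app in enumerate(ranked, start=1):
        if app in relevant:
            return 1.0 / rank
    return 0.0


def mean(values: list[float]) -> float:
    """Average over queries; MRR is the mean of the reciprocal ranks."""
    return sum(values) / len(values) if values else 0.0


if __name__ == "__main__":
    # Two placeholder queries: (ranked results, set of relevant apps).
    runs = [
        (["a", "b", "c", "d", "e"], {"a", "c"}),
        (["x", "a", "y", "z", "w"], {"a"}),
    ]
    print("P@2:", mean([precision_at_k(r, rel, 2) for r, rel in runs]))
    print("R@5:", mean([recall_at_k(r, rel, 5) for r, rel in runs]))
    print("MRR:", mean([reciprocal_rank(r, rel) for r, rel in runs]))
```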

Challenges & Lessons Learned

  • Recommendation trust: LLM explanations have to stay aligned with ranking output, so the system checks narrative consistency and uses deterministic fallback text when needed
  • IoT search relevance: SAREF ontology signals improve ranking for domain-specific queries that would otherwise be too broad or ambiguous
  • Mobile/backend coupling: separating HEDGE-Mobile from the backend made the mobile app easier to evolve without importing backend logic
  • Deployment constraints: local LLM inference makes end-to-end chat latency depend heavily on server resources, so the evaluation separates hybrid-search latency from LLM-inclusive chat latency

How AI/Agents Were Used

AI is both the product surface and part of the implementation workflow. The delivered system uses semantic retrieval plus local LLM generation for grounded recommendations, while development used agentic coding workflows to scaffold services, tests, documentation, and integration logic that I reviewed and hardened.