Architecture

The ITSM Automation platform is a multi-tenant, AI-driven IT Service Management system built for enterprise-scale operations. This document describes the full system architecture spanning 130+ NestJS modules, 430+ database tables, and a multi-provider AI gateway -- all deployed on Kubernetes with HPE AI Essentials.

This architecture reference reflects the production topology. All services listed here are persistent, distributed, and survive restarts. There are no in-memory substitutes or mock implementations in the production path.

Tech stack

The platform is composed of purpose-built layers, each selected for enterprise reliability and horizontal scalability. The backend alone contains 92 controllers exposing 500+ endpoints, while the frontend delivers 200+ routes through 90+ React components.

Backend

Runtime (NestJS + TypeScript) -- 130+ modules, 92 controllers, 500+ REST endpoints with full OpenAPI documentation. Built on NestJS for dependency injection, modular architecture, and decorator-driven route handling.
ORM (Prisma) -- Type-safe database access with generated client, migrations, and introspection against PostgreSQL 16. Schema covers 430+ tables and 91 enums.
Validation (class-validator + class-transformer) -- DTO-level input validation on every endpoint with automatic transformation and whitelist enforcement.

Frontend

Framework (Next.js 15 + React 19) -- 200+ routes with server-side rendering, React Server Components, and streaming. App Router architecture with nested layouts.
Styling (TailwindCSS) -- 90+ components built with utility-first CSS, design tokens, and dark mode support across all tenant themes.

Infrastructure

Database (PostgreSQL 16 + pgvector) -- 430+ tables, 91 enums, row-level security for tenant isolation. UUIDv7 primary keys for time-sortable identifiers.
Cache and Queue (Redis + BullMQ) -- Three Redis instances (cache, queue, locks). 12 BullMQ queues for background processing. Redlock for distributed locking.
Workflows (Temporal.io) -- 5 task queues for durable workflow orchestration: ticket lifecycle, SLA enforcement, approval chains, scheduled reports, and bulk operations.
AI (Multi-provider LLM Gateway) -- OpenAI, vLLM (HPE 120B GPT), and Ollama with automatic fallback. Qdrant vector database for RAG and semantic search.
Deployment (Kubernetes + Helm + Istio) -- HPE AI Essentials for on-premises GPU inference. Prometheus, Grafana, and Jaeger for observability.

Quick verification

# Count NestJS modules
find backend/src -name "*.module.ts" | wc -l
# => 130+

# Count controllers
find backend/src -name "*.controller.ts" | wc -l
# => 92

Service topology

The platform runs as a set of independently deployable services, each with dedicated health checks and monitoring endpoints. All inter-service communication uses either gRPC (Temporal workers) or HTTP with circuit breakers.

Core services

API Server (port 3000) -- NestJS application serving all REST endpoints. Health check at /health returns service status, database connectivity, Redis availability, and Temporal connection state.
Frontend (port 3001) -- Next.js 15 application with SSR. Serves the tenant-aware dashboard, ticket management UI, knowledge base portal, and admin console.
Worker (port 9464) -- BullMQ processor and Temporal activity worker. Exposes Prometheus metrics at /metrics for queue depth, processing latency, and failure rates across all 12 queues.

Data services

PostgreSQL (port 5432) -- Primary datastore with 430+ tables, pgvector extension for embedding storage, and row-level security policies for multi-tenant isolation.
Redis (ports 6379-6381) -- Three Redis instances for application caching (6379), BullMQ job queues (6380), and Redlock distributed locking (6381). Separate instances ensure lock contention does not affect queue throughput.
Qdrant (port 6333) -- Vector database for semantic search, RAG document retrieval, and similar-ticket lookup. Stores embeddings generated by the AI gateway.
Temporal (port 7233) -- Workflow orchestration engine managing 5 task queues for long-running processes: ticket lifecycle, SLA escalation, approval chains, scheduled reports, and bulk operations.

Health checks

curl -sf http://localhost:3000/health | jq
# {
#   "status": "healthy",
#   "database": "connected",
#   "redis": "connected",
#   "temporal": "connected",
#   "uptime": 86400
# }
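
The sample payload above can be read as an aggregation over dependency probes. The following is a minimal sketch under stated assumptions: the `buildHealthReport` function, the `DependencyState` type, and the "unhealthy" and "disconnected" values are hypothetical and only mirror the field names in the sample response.

```typescript
// Field names follow the sample /health response; the "unhealthy" and
// "disconnected" values are assumptions made for this sketch.
type DependencyState = "connected" | "disconnected";

interface HealthReport {
  status: "healthy" | "unhealthy";
  database: DependencyState;
  redis: DependencyState;
  temporal: DependencyState;
  uptime: number; // seconds since process start
}

// Reports "healthy" only when every dependency probe is connected.
function buildHealthReport(
  deps: Pick<HealthReport, "database" | "redis" | "temporal">,
  uptimeSeconds: number,
): HealthReport {
  const allUp = Object.values(deps).every((state) => state === "connected");
  return { status: allUp ? "healthy" : "unhealthy", ...deps, uptime: uptimeSeconds };
}
```

A readiness probe could then treat any status other than "healthy" as a failure.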

Observability services

Prometheus (port 9090) -- Metrics collection and alerting. Scrapes all service endpoints on 15-second intervals with retention configured for 30 days.
Grafana (port 3005) -- Dashboards for SLA compliance, ticket throughput, AI model latency, queue health, and per-tenant usage. Pre-built dashboards ship with the Helm chart.
Jaeger (port 16686) -- Distributed tracing across API, worker, Temporal, and AI gateway calls. Every ticket triage request generates a full trace from ingestion through AI classification to database persistence.

Service map

{
  "services": {
    "api":        { "port": 3000, "health": "/health" },
    "frontend":   { "port": 3001, "health": "/" },
    "worker":     { "port": 9464, "health": "/metrics" },
    "temporal":   { "port": 7233 },
    "postgresql": { "port": 5432 },
    "redis":      { "port": 6379 },
    "qdrant":     { "port": 6333 },
    "prometheus": { "port": 9090 },
    "grafana":    { "port": 3005 },
    "jaeger":     { "port": 16686 }
  }
}

Architecture layers

The system follows a strict five-layer architecture. Each layer has a well-defined responsibility boundary and communicates only with its immediate neighbors.

Layer 1: Presentation

The presentation layer is stateless by design. All session state lives in JWTs and server-side caches.

Dashboard (Next.js App Router) -- Real-time ticket overview, SLA timers, team workload visualization, and AI-suggested actions. 45+ routes.
Tenant Portal (Multi-theme SSR) -- Organization-branded self-service portal for ticket submission, knowledge base, and resolution tracking.
Admin Console (Role-gated routes) -- System configuration, user management, SLA policy editor, workflow designer, and AI model tuning. 30+ routes.

Frontend structure

{
  "routes": {
    "dashboard": "45+ pages",
    "admin": "30+ pages",
    "portal": "25+ pages"
  },
  "stores": 12,
  "components": "90+"
}

Layer 2: API Gateway

Single entry point for all client requests with authentication, authorization, rate limiting, and validation.

Authentication (JWT RS256) -- Stateless JWT with 15-minute access tokens and 7-day refresh tokens. Per-device tracking and revocation via Redis.
Authorization (RBAC + ABAC) -- 5 role levels: end_user, agent, team_lead, admin, org_owner. Permissions evaluated per-request via NestJS guards.
Rate Limiting (Redis token bucket) -- Per-tenant, per-endpoint rate limiting. Configurable burst and sustained rates per subscription tier.
Validation (class-validator DTOs) -- Every endpoint enforces input validation through DTO decorators with structured error responses.
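
The five role levels listed under Authorization lend themselves to a simple rank comparison. The sketch below is illustrative only: treating the roles as a strict low-to-high hierarchy is an assumption, and `hasRoleAtLeast` and `canEditTicket` are hypothetical helpers, not the platform's guard implementation.

```typescript
// Role names come from the Authorization entry; the ordering as a strict
// hierarchy is an assumption for this sketch.
const ROLE_ORDER = ["end_user", "agent", "team_lead", "admin", "org_owner"] as const;
type Role = (typeof ROLE_ORDER)[number];

// Returns true when `actual` is at least as privileged as `required`.
function hasRoleAtLeast(actual: Role, required: Role): boolean {
  return ROLE_ORDER.indexOf(actual) >= ROLE_ORDER.indexOf(required);
}

// Hypothetical attribute check layered on the role check: the caller must
// belong to the ticket's organization and be an agent or higher.
function canEditTicket(
  user: { role: Role; orgId: string },
  ticket: { orgId: string },
): boolean {
  return user.orgId === ticket.orgId && hasRoleAtLeast(user.role, "agent");
}
```

In a NestJS application, a check like this would typically live inside a guard's `canActivate` method.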

Rate limit headers

curl -si http://localhost:3000/api/v1/tickets \
  -H "Authorization: Bearer {jwt_token}" \
  | grep -i "x-ratelimit"
# X-RateLimit-Limit: 1000
# X-RateLimit-Remaining: 997
# X-RateLimit-Reset: 1708819200
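
The headers above reflect a token bucket: a burst capacity that refills at a sustained rate. Here is a minimal in-memory sketch of that technique. The production limiter is described as Redis-backed, so the `TokenBucket` class, its fields, and the single-process state are assumptions for illustration only.

```typescript
// In-memory token bucket. The class and field names are hypothetical;
// a Redis-backed limiter would keep the same two values per tenant/endpoint.
class TokenBucket {
  private readonly capacity: number;        // burst size
  private readonly refillPerSecond: number; // sustained rate
  private tokens: number;
  private lastRefillMs: number;

  constructor(capacity: number, refillPerSecond: number, nowMs: number = Date.now()) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.tokens = capacity;
    this.lastRefillMs = nowMs;
  }

  // Refills based on elapsed time, then tries to take one token.
  // A false result is where a gateway would typically answer HTTP 429.
  tryConsume(nowMs: number = Date.now()): boolean {
    const elapsedSeconds = (nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefillMs = nowMs;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }

  // Whole tokens left, comparable to X-RateLimit-Remaining.
  remaining(): number {
    return Math.floor(this.tokens);
  }
}
```

Passing the clock in as a parameter keeps the sketch deterministic under test; a shared limiter would need the refill and consume steps to run atomically.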

Layer 3: AI Orchestration

Provider-agnostic AI gateway routing inference to OpenAI, vLLM, or Ollama with automatic fallback. Every call is traced, logged, and validated through guardrails.

Intent Classification (LLM inference) -- Classifies tickets into service categories using the tenant service catalogue. Supports multiple providers with automatic fallback.
Priority Prediction (LLM inference) -- Predicts P1-P5 priority based on description, affected service, historical patterns, and SLA impact. Confidence scores attached.
Entity Extraction (LLM inference) -- Extracts structured entities: affected systems, error codes, user identifiers, Polish PII (PESEL, NIP, IBAN).
RAG Search (Qdrant + LLM) -- Retrieval-augmented generation over knowledge base and resolution history. Semantic similarity ranking.
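
The semantic similarity ranking behind RAG Search can be sketched with plain cosine similarity. This is an in-memory stand-in, not the Qdrant client: `StoredEmbedding`, `cosineSimilarity`, and `searchSimilar` are hypothetical names, and the tenant filter is included because the platform scopes vector queries per organization.

```typescript
interface StoredEmbedding {
  id: string;
  orgId: string;
  vector: number[];
}

// Cosine similarity: dot product divided by the product of vector norms.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Filters to the caller's organization first, then ranks by similarity.
function searchSimilar(
  store: StoredEmbedding[],
  orgId: string,
  query: number[],
  topK: number,
): { id: string; score: number }[] {
  return store
    .filter((doc) => doc.orgId === orgId)
    .map((doc) => ({ id: doc.id, score: cosineSimilarity(doc.vector, query) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}
```

A vector database performs the same ranking with an approximate index instead of a full scan.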

AI triage request

curl -sf http://localhost:3000/api/v1/triage \
  -X POST \
  -H "Authorization: Bearer {jwt_token}" \
  -H "Content-Type: application/json" \
  -d '{
    "title": "Cannot access production database",
    "description": "Connection timeout on pg-prod-01 since 14:30 UTC. API pods returning 503.",
    "reporter_email": "oncall@acme.com"
  }' | jq
# {
#   "ticket_id": "TKT-20260224-0042",
#   "classification": {
#     "category": "Infrastructure",
#     "subcategory": "Database",
#     "confidence": 0.94
#   },
#   "priority": { "level": "P1", "confidence": 0.97 },
#   "entities": {
#     "affected_system": "pg-prod-01",
#     "error_type": "connection_timeout"
#   },
#   "model_used": "hpe-gpt-120b",
#   "trace_id": "abc123def456"
# }

Layer 4: Business Logic

Core ITSM domain: ticket lifecycle, SLA enforcement, approval workflows, and escalation policies. Long-running processes orchestrated through Temporal.

Ticket Lifecycle (Temporal workflow) -- Full state machine from creation through triage, assignment, resolution, and closure. Each transition audited.
SLA Engine (BullMQ + Temporal) -- Monitors response and resolution time targets per priority and tenant SLA policy. Auto-escalation on breach.
Approval Workflows (Temporal workflow) -- Multi-level approval chains for change requests. Sequential, parallel, and quorum-based patterns with timeout escalation.
Notification Engine (BullMQ) -- 6 channels: email (SMTP), SMS (Twilio), push (FCM), Slack, Teams, webhooks. Dedicated queues with retry.
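
The SLA Engine's breach detection boils down to comparing elapsed time against a per-priority target. The sketch below shows that calculation under loud assumptions: the target minutes are invented placeholders (real values come from each tenant's SLA policy), `evaluateSla` is a hypothetical helper, and business hours and holiday calendars are ignored.

```typescript
type Priority = "P1" | "P2" | "P3" | "P4" | "P5";

// Placeholder resolution targets in minutes. These numbers are assumptions,
// not the platform's defaults.
const RESOLUTION_TARGET_MINUTES: Record<Priority, number> = {
  P1: 60,
  P2: 240,
  P3: 480,
  P4: 1440,
  P5: 4320,
};

interface SlaStatus {
  deadline: Date;
  breached: boolean;
  remainingMinutes: number;
}

// Wall-clock evaluation only: no business-hours or holiday handling.
function evaluateSla(createdAt: Date, priority: Priority, now: Date): SlaStatus {
  const targetMs = RESOLUTION_TARGET_MINUTES[priority] * 60000;
  const deadline = new Date(createdAt.getTime() + targetMs);
  const remainingMs = deadline.getTime() - now.getTime();
  return {
    deadline,
    breached: remainingMs < 0,
    remainingMinutes: Math.max(0, Math.floor(remainingMs / 60000)),
  };
}
```

A monitor job could run this check periodically and trigger escalation when `breached` flips to true.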

Queue and workflow configuration

{
  "queues": [
    { "name": "ticket-triage", "concurrency": 5 },
    { "name": "sla-monitor", "concurrency": 3 },
    { "name": "notification-email", "concurrency": 10 },
    { "name": "notification-slack", "concurrency": 5 },
    { "name": "notification-teams", "concurrency": 5 },
    { "name": "approval-processor", "concurrency": 3 },
    { "name": "report-generator", "concurrency": 2 },
    { "name": "bulk-operations", "concurrency": 2 },
    { "name": "knowledge-indexer", "concurrency": 3 },
    { "name": "audit-logger", "concurrency": 5 },
    { "name": "webhook-dispatch", "concurrency": 5 },
    { "name": "ai-embedding", "concurrency": 4 }
  ]
}

Layer 5: Data Layer

Persistent storage, caching, and vector search. Each technology serves a distinct purpose with no overlap.

PostgreSQL 16 (Primary datastore) -- 430+ tables with pgvector. All transactional data. Row-level security enforces tenant isolation at the database level.
Redis (Cache + Queue + Locks) -- Three instances: caching (6379), BullMQ queues (6380), Redlock distributed locking (6381).
Qdrant (Vector database) -- Embedding storage for semantic search. Collections partitioned by tenant. Configurable dimensions and distance metrics.
Temporal (Workflow state) -- Persists workflow execution state, activity results, and timer schedules. Workflow progress survives worker restarts; activities may be retried on failure, so their side effects should be idempotent rather than assumed to run exactly once.

Database details

-- Table count (base tables only; information_schema.tables also lists views)
SELECT count(*)
FROM information_schema.tables
WHERE table_schema = 'public'
  AND table_type = 'BASE TABLE';
-- => 430+

-- Enum count
SELECT count(*)
FROM pg_type
WHERE typtype = 'e';
-- => 91

-- Row-level security policies
SELECT count(*)
FROM pg_policies;
-- => 200+

AI orchestration

The AI orchestration layer is provider-agnostic, routing requests through a unified gateway that supports multiple LLM backends. Every inference call is tracked with the model provider, token usage, latency, and confidence score. If the primary provider is unavailable, the gateway automatically falls back to the next configured provider.
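
The fallback behavior described above can be sketched as an ordered loop over providers. This is a minimal illustration, not the gateway's actual code: the `LlmProvider` interface, `GatewayResult` shape, and `completeWithFallback` function are hypothetical, and token usage and confidence tracking are left out.

```typescript
// Hypothetical provider abstraction: each backend exposes one async call.
interface LlmProvider {
  name: string;
  complete(prompt: string): Promise<string>;
}

interface GatewayResult {
  provider: string;
  text: string;
  latencyMs: number;
}

// Tries providers in the given order and returns the first success.
// Every failure is recorded so the final error explains what was attempted.
async function completeWithFallback(
  providers: LlmProvider[],
  prompt: string,
): Promise<GatewayResult> {
  const failures: string[] = [];
  for (const provider of providers) {
    const startedAt = Date.now();
    try {
      const text = await provider.complete(prompt);
      return { provider: provider.name, text, latencyMs: Date.now() - startedAt };
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      failures.push(`${provider.name}: ${reason}`);
    }
  }
  throw new Error(`all providers failed (${failures.join("; ")})`);
}
```

Keeping the provider name on the result makes it straightforward to record which backend served each request.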

Provider configuration

OpenAI (Primary provider) -- GPT-4 class models for high-accuracy classification, entity extraction, and knowledge synthesis. Used as the default provider for enterprise-tier tenants.
vLLM (Self-hosted inference) -- Runs HPE 120B GPT on-premises via HPE AI Essentials. Provides data-sovereign inference for tenants with compliance requirements that prohibit external API calls.
Ollama (Development / fallback) -- Local model inference for development environments and as a last-resort fallback. Supports Llama, Mistral, and other open-weight models.
Qdrant (Vector search) -- Embedding storage and retrieval for RAG pipelines. Collections are partitioned by tenant with configurable embedding dimensions and distance metrics.

Guardrails

Every AI response passes through validation before reaching the caller. The guardrail pipeline checks for hallucinated entity references, toxic content, confidence thresholds, and schema compliance. Responses that fail validation are rejected and the request is retried with an adjusted prompt or routed to a human agent.
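
The accept, retry, or escalate decision described above can be sketched as a single function. The names `TriageResponse`, `GuardrailDecision`, and `evaluateGuardrails` are hypothetical, the thresholds mirror the `min_confidence` and `max_retries_on_low_confidence` settings in the provider configuration, and the toxic-content check is omitted because it needs a classifier.

```typescript
interface TriageResponse {
  category?: string;
  priority?: string;
  confidence?: number;
}

type GuardrailDecision = "accept" | "retry" | "escalate_to_human";

// Assumed to match the guardrails section of the gateway configuration.
const MIN_CONFIDENCE = 0.6;
const MAX_RETRIES = 2;
const VALID_PRIORITIES = ["P1", "P2", "P3", "P4", "P5"];

// `retriesSoFar` counts retries already spent on this request (0 on first try).
function evaluateGuardrails(
  response: TriageResponse,
  retriesSoFar: number,
  knownCategories: string[],
): GuardrailDecision {
  const { category, priority, confidence } = response;
  const passes =
    typeof category === "string" &&
    typeof priority === "string" &&
    typeof confidence === "number" &&
    VALID_PRIORITIES.includes(priority) && // schema compliance
    knownCategories.includes(category) &&  // no invented category
    confidence >= MIN_CONFIDENCE;          // confidence threshold
  if (passes) return "accept";
  return retriesSoFar < MAX_RETRIES ? "retry" : "escalate_to_human";
}
```

Checking the category against the tenant's catalogue is one cheap way to catch a hallucinated reference before it reaches the ticket record.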

AI provider configuration

ai:
  gateway:
    default_provider: openai
    fallback_order:
      - openai
      - vllm
      - ollama
    timeout_ms: 30000
    retry:
      max_attempts: 3
      backoff_ms: 1000

  providers:
    openai:
      base_url: https://api.openai.com/v1
      model: gpt-4o
      max_tokens: 4096

    vllm:
      base_url: http://vllm.internal:8000/v1
      model: hpe-gpt-120b
      max_tokens: 4096

    ollama:
      base_url: http://ollama.internal:11434/v1
      model: llama3.1:70b
      max_tokens: 4096

  guardrails:
    min_confidence: 0.6
    max_retries_on_low_confidence: 2
    content_filter: enabled
    schema_validation: strict
    hallucination_check: enabled

Data layer

Database schema highlights

The PostgreSQL schema is organized into logical domains, each owning a set of tables, enums, and relationships. The schema uses UUIDv7 primary keys for globally unique, time-sortable identifiers.

Ticketing domain (~80 tables) -- Tickets, comments, attachments, tags, custom fields, watchers, linked tickets, time tracking, and audit history.
Identity domain (~40 tables) -- Users, organizations, teams, roles, permissions, sessions, API keys, and OAuth connections.
SLA domain (~25 tables) -- SLA policies, targets, breach records, escalation rules, business hours, and holiday calendars.
Knowledge domain (~30 tables) -- Articles, categories, versions, feedback ratings, view analytics, and embedding references.
Workflow domain (~20 tables) -- Approval chains, approval steps, workflow templates, trigger conditions, and execution logs.
Configuration domain (~35 tables) -- Tenant settings, feature flags, notification templates, integration configs, and webhook registrations.
Audit domain (~15 tables) -- Change logs, login history, API access logs, data export records, and compliance snapshots.

Prisma schema examples

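model Ticket {
  // uuid(7) generates time-sortable UUIDv7 values (requires Prisma 5.18 or later);
  // gen_random_uuid() would produce random UUIDv4 values instead.
  id             String         @id @default(uuid(7)) @db.Uuid
  // @db.Uuid keeps the column type compatible with the ::uuid cast in the RLS policy
  orgId          String         @map("org_id") @db.Uuid
  number         Int
  title          String
  description    String
  status         TicketStatus   @default(OPEN)
  priority       Priority       @default(P3)
  category       String?
  subcategory    String?
  assigneeId     String?        @map("assignee_id") @db.Uuid
  teamId         String?        @map("team_id") @db.Uuid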
  slaBreached    Boolean        @default(false) @map("sla_breached")
  aiConfidence   Float?         @map("ai_confidence")
  modelUsed      String?        @map("model_used")
  traceId        String?        @map("trace_id")
  createdAt      DateTime       @default(now()) @map("created_at")
  updatedAt      DateTime       @updatedAt @map("updated_at")
  resolvedAt     DateTime?      @map("resolved_at")

  organization   Organization   @relation(fields: [orgId], references: [id])
  assignee       User?          @relation(fields: [assigneeId], references: [id])
  team           Team?          @relation(fields: [teamId], references: [id])
  comments       Comment[]
  attachments    Attachment[]
  tags           TicketTag[]
  slaRecords     SlaRecord[]

  @@unique([orgId, number])
  @@index([orgId, status])
  @@index([orgId, priority])
  @@index([assigneeId])
  @@map("tickets")
}
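
The schema highlights mention UUIDv7 primary keys because they sort by creation time. A minimal sketch of the RFC 9562 version 7 layout shows why: the first 48 bits are a millisecond timestamp, followed by the version, the variant, and random bits. The `uuidv7` helper is a hypothetical illustration, not the generator the platform uses.

```typescript
import { randomBytes } from "node:crypto";

// Builds a UUIDv7: 48-bit big-endian Unix timestamp in milliseconds,
// 4-bit version (7), 2-bit variant (binary 10), remaining bits random.
function uuidv7(nowMs: number = Date.now()): string {
  const bytes = randomBytes(16);

  // Write the timestamp into bytes 0..5, most significant byte first.
  let ts = nowMs;
  for (let i = 5; i >= 0; i--) {
    bytes[i] = ts % 256;
    ts = Math.floor(ts / 256);
  }

  bytes[6] = (bytes[6] & 0x0f) | 0x70; // version 7 in the high nibble
  bytes[8] = (bytes[8] & 0x3f) | 0x80; // variant bits 10xxxxxx

  const hex = bytes.toString("hex");
  return [
    hex.slice(0, 8),
    hex.slice(8, 12),
    hex.slice(12, 16),
    hex.slice(16, 20),
    hex.slice(20),
  ].join("-");
}
```

Because the timestamp leads the value, identifiers created in later milliseconds compare greater as strings, which keeps B-tree index inserts close to append-only.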

Multi-tenancy

Full multi-tenant isolation at every layer. No tenant can access another tenant's data through any API surface.

Database (Row-level security) -- PostgreSQL RLS policies filter every query by org_id, including raw SQL issued through the application role. Superusers and roles with BYPASSRLS are always exempt, and the table owner is exempt unless FORCE ROW LEVEL SECURITY is set, so the application should connect as a dedicated non-owner role.
Cache (Key-prefix isolation) -- All Redis keys prefixed with organization identifier. Cache invalidation scoped per tenant.
Vector Search (Filtered collections) -- Qdrant queries include mandatory org_id filter. Embeddings never returned across tenant boundaries.
API (JWT org_id claim) -- Every request carries an org_id claim in the JWT. Gateway validates claim against requested resource.
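
Two of the isolation mechanisms above, key-prefix isolation and the org_id claim check, are small enough to sketch directly. The key layout `org:<orgId>:...` and the helper names `tenantKey`, `tenantKeyPattern`, and `assertSameTenant` are assumptions for illustration; the platform's actual key scheme is not documented here.

```typescript
// Hypothetical key layout: "org:<orgId>:<segment>:<segment>...".
// Rejecting ":" in the org id prevents one tenant's id from forging another's prefix.
function tenantKey(orgId: string, ...segments: string[]): string {
  if (!orgId || orgId.includes(":")) throw new Error("invalid org id");
  return ["org", orgId, ...segments].join(":");
}

// Glob pattern covering one tenant's keys, usable with Redis SCAN ... MATCH
// for tenant-scoped invalidation.
function tenantKeyPattern(orgId: string): string {
  return `${tenantKey(orgId)}:*`;
}

interface JwtClaims {
  sub: string;
  org_id: string;
}

// Throws when the token's org_id does not match the resource's organization.
function assertSameTenant(claims: JwtClaims, resourceOrgId: string): void {
  if (claims.org_id !== resourceOrgId) {
    throw new Error("cross-tenant access denied");
  }
}
```

Building every cache key through one helper makes it harder for a code path to forget the tenant prefix.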

Subscription tiers

free (tier) -- 5 agents, 100 tickets/month, community support, basic triage (Ollama).
starter (tier) -- 25 agents, 1,000 tickets/month, email support, AI triage with standard models.
professional (tier) -- 100 agents, 10,000 tickets/month, priority support, full AI suite, custom workflows.
enterprise (tier) -- Unlimited agents and tickets, dedicated support, on-prem AI inference (vLLM/HPE), custom integrations, SSO/SAML, audit logs.
unlimited (tier) -- Everything in enterprise plus dedicated infrastructure, custom model fine-tuning, 24/7 premium support, and SLA guarantees with financial penalties.
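
The agent and ticket limits above map naturally onto a typed lookup table. The sketch below uses the documented numbers; the `TIER_LIMITS` constant and the `canCreateTicket` and `canAddAgent` helpers are hypothetical, and `Infinity` stands in for "unlimited".

```typescript
type Tier = "free" | "starter" | "professional" | "enterprise" | "unlimited";

interface TierLimits {
  maxAgents: number;
  maxTicketsPerMonth: number;
}

// Limits taken from the tier descriptions; Infinity models "unlimited".
const TIER_LIMITS: Record<Tier, TierLimits> = {
  free: { maxAgents: 5, maxTicketsPerMonth: 100 },
  starter: { maxAgents: 25, maxTicketsPerMonth: 1000 },
  professional: { maxAgents: 100, maxTicketsPerMonth: 10000 },
  enterprise: { maxAgents: Infinity, maxTicketsPerMonth: Infinity },
  unlimited: { maxAgents: Infinity, maxTicketsPerMonth: Infinity },
};

// True while the tenant is still under its monthly ticket quota.
function canCreateTicket(tier: Tier, ticketsThisMonth: number): boolean {
  return ticketsThisMonth < TIER_LIMITS[tier].maxTicketsPerMonth;
}

// True while the tenant can seat one more agent.
function canAddAgent(tier: Tier, currentAgents: number): boolean {
  return currentAgents < TIER_LIMITS[tier].maxAgents;
}
```

Typing the table as `Record<Tier, TierLimits>` makes the compiler flag any tier that is added without limits.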

Multi-tenancy enforcement

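-- Enable RLS on tickets table
ALTER TABLE tickets ENABLE ROW LEVEL SECURITY;
-- Apply the policies to the table owner as well
ALTER TABLE tickets FORCE ROW LEVEL SECURITY;

-- Tenant isolation policy
CREATE POLICY tenant_isolation ON tickets
  USING (org_id = current_setting('app.current_org_id')::uuid);

-- Verify isolation (the value must be a valid UUID because the policy casts it;
-- the id below is only an example)
SET app.current_org_id = '018f3c2a-7b1e-7c4d-9a5f-2b6e8d1c0a47';
SELECT count(*) FROM tickets;
-- => Counts only tickets whose org_id matches the setting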
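
On the application side, the setting read by the policy has to be applied before each tenant's queries run. The sketch below shows one way to do that per transaction. The `SqlClient` interface and `withTenant` helper are hypothetical, and the `$1` placeholder assumes a node-postgres style driver; `set_config(name, value, true)` is the PostgreSQL function that scopes a setting to the current transaction.

```typescript
// Minimal client abstraction so the sketch stays driver-agnostic (hypothetical).
interface SqlClient {
  query(text: string, params?: unknown[]): Promise<unknown>;
}

const UUID_PATTERN = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Runs `work` inside a transaction whose app.current_org_id is set for RLS.
async function withTenant<T>(
  client: SqlClient,
  orgId: string,
  work: (c: SqlClient) => Promise<T>,
): Promise<T> {
  // The policy casts the setting to uuid, so reject anything else up front.
  if (!UUID_PATTERN.test(orgId)) throw new Error("org id must be a UUID");
  await client.query("BEGIN");
  try {
    // The third argument (true) makes the setting transaction-local, like SET LOCAL.
    await client.query("SELECT set_config('app.current_org_id', $1, true)", [orgId]);
    const result = await work(client);
    await client.query("COMMIT");
    return result;
  } catch (err) {
    await client.query("ROLLBACK");
    throw err;
  }
}
```

With a connection pool, `client` must be a single checked-out connection so that all statements share the transaction; a transaction-local setting also avoids leaking one tenant's id to the next request that reuses the connection.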

Deployment

The platform deploys on Kubernetes with Helm charts, Istio service mesh, and HPE AI Essentials for on-premises AI inference.

Kubernetes (Orchestration) -- All services as deployments with HPA. API scales 3-20 replicas based on CPU and request latency.
Helm (Package management) -- Single chart packages all services. Values files for dev, staging, and production environments.
Istio (Service mesh) -- Mutual TLS, traffic routing for canary deployments, and circuit breakers for external dependencies.
HPE AI Essentials (AI infrastructure) -- On-premises GPU cluster running vLLM with HPE 120B GPT model and tensor parallelism.
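
The 3-20 replica range for the API follows the standard Horizontal Pod Autoscaler rule: scale the current replica count by the ratio of observed to target metric, take the largest proposal across metrics, and clamp to the configured bounds. The sketch below reproduces that arithmetic. The `desiredReplicas` helper and `MetricReading` type are hypothetical, and the HPA's tolerance band and stabilization window are left out.

```typescript
interface MetricReading {
  current: number; // observed value, e.g. average CPU percent or p95 latency in ms
  target: number;  // configured target for the same metric
}

// desired = ceil(currentReplicas * current / target) per metric,
// then the maximum across metrics, clamped to [min, max].
function desiredReplicas(
  currentReplicas: number,
  metrics: MetricReading[],
  min: number = 3,
  max: number = 20,
): number {
  if (metrics.length === 0) return Math.min(max, Math.max(min, currentReplicas));
  const proposals = metrics.map((m) =>
    Math.ceil(currentReplicas * (m.current / m.target)),
  );
  const desired = Math.max(...proposals);
  return Math.min(max, Math.max(min, desired));
}
```

For example, 4 replicas at 90% CPU against a 60% target yields a proposal of 6 replicas; the target values themselves are deployment-specific and not stated in this document.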

Deployment commands

# Deploy to production
helm upgrade --install itsm ./helm/itsm \
  --namespace itsm-production \
  --values helm/values.production.yaml \
  --set image.tag=v2.4.0 \
  --wait --timeout 600s

# Verify rollout
kubectl -n itsm-production rollout status \
  deployment/itsm-api
# => deployment "itsm-api" successfully rolled out

What's next?

Now that you understand the architecture, here are some recommended next steps to explore the platform in detail:

Authentication -- Learn how JWT auth and API keys work
Quickstart -- Set up your first API client and make requests
Errors -- Understand error codes and troubleshooting
Webhooks -- Integrate real-time events into your systems

Get started with the Quickstart guide