A real-time microservice observability platform that ingests Istio telemetry data from Prometheus, builds a service dependency graph in Neo4j, and provides a RESTful API with comprehensive metrics and graph analytics.
This service continuously monitors your Istio service mesh, collecting performance metrics and infrastructure data to provide:
- Service Dependency Graph: Visual representation of microservice interactions
- Real-time Metrics: RPS, error rates, latency percentiles (P50, P95, P99)
- Infrastructure Monitoring: Node and Pod resource utilization
- Centrality Scores: PageRank and Betweenness metrics to identify critical services
- Historical Data: Time-series tracking of service metrics
- RESTful API: Complete observability data access with Swagger documentation
┌─────────────┐      ┌────────────────┐     ┌──────────────┐
│    Istio    │─────▶│   Prometheus   │─────│  Kubernetes  │
│ Service Mesh│      │                │     │     API      │
└─────────────┘      └────────────────┘     └──────────────┘
                             │                     │
                             ▼                     ▼
                       ┌─────────────────────────────────┐
                       │      Service Graph Engine       │
                       │        - Metrics Ingestion      │
                       │        - Graph Analytics        │
                       │        - API Server             │
                       └─────────────────────────────────┘
                                        │
                                        ▼
                                ┌──────────────┐
                                │    Neo4j     │
                                │   Graph DB   │
                                └──────────────┘
- Service-to-Service Metrics: Request rates, error rates, latency distributions
- Infrastructure Metrics: CPU and RAM usage for nodes and pods
- Availability Tracking: Service health and pod counts
- Pod Placement: Node affinity and pod distribution
- PageRank: Identify most important services by incoming traffic
- Betweenness Centrality: Find critical services in communication paths
- Historical Analysis: Track metric evolution over time
- /metrics/snapshot - Latest metrics for all services and edges
- /graph - Full service dependency graph with scores
- /graph/node/:name - Individual service details
- /graph/edges - All service-to-service connections
- /graph/history - Historical metric trends
- /infrastructure - Kubernetes infrastructure metrics
- /api-docs - Interactive Swagger documentation
- Node.js 16+ and npm
- Neo4j 6.0+ (AuraDB or self-hosted)
- Kubernetes cluster with Istio installed
- Prometheus with Istio metrics configured
- Kubernetes API access (for infrastructure metrics)
git clone <repository-url>
cd service-graph-engine
npm install
Create a .env file in the root directory:
# Prometheus Configuration
PROMETHEUS_URL=http://localhost:9090
# Neo4j Configuration
NEO4J_URI=neo4j+s://your-instance.databases.neo4j.io
NEO4J_USER=neo4j
NEO4J_PASSWORD=your-password
NEO4J_DATABASE=neo4j
# Kubernetes API Configuration
KUBERNETES_API_URL=http://your-k8s-api:8001
# Application Configuration
PORT=3000
POLL_INTERVAL_Ms=30000 # Metrics polling interval (30 seconds)
SCORE_CALCULATION_INTERVAL_Ms=120000 # Score calculation interval (2 minutes)
node index.js
The service will:
- Initialize Neo4j schema
- Check Graph Data Science (GDS) availability
- Perform initial sync and score calculation
- Start scheduled polling
- Launch API server on port 3000
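As a rough illustration of how the interval settings from .env might be consumed, the sketch below parses them and falls back to the example values shown above when a value is missing or malformed. The function names `parsePositiveInt` and `loadConfig` and the returned field names are hypothetical; the actual src/config.js is not reproduced in this README and may differ.

```javascript
// Hypothetical config loader; names and structure are assumptions,
// not the project's actual src/config.js.
function parsePositiveInt(raw, fallback) {
  // Drop any trailing "# comment" in case the .env parser keeps it in the value.
  const cleaned = String(raw ?? '').split('#')[0].trim();
  const value = Number.parseInt(cleaned, 10);
  return Number.isFinite(value) && value > 0 ? value : fallback;
}

function loadConfig(env = process.env) {
  return {
    prometheusUrl: env.PROMETHEUS_URL || 'http://localhost:9090',
    port: parsePositiveInt(env.PORT, 3000),
    pollIntervalMs: parsePositiveInt(env.POLL_INTERVAL_Ms, 30000),
    scoreIntervalMs: parsePositiveInt(env.SCORE_CALCULATION_INTERVAL_Ms, 120000),
  };
}

// Example: an explicit poll interval; everything else falls back to defaults.
console.log(loadConfig({ POLL_INTERVAL_Ms: '5000 # five seconds' }));
```

Falling back to a default keeps a malformed value from turning into a zero or NaN timer interval.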
The service uses a 1-minute window for metric aggregation. To modify this, edit src/config.js:
prometheus: {
queryWindow: '1m', // Change to '5m', '15m', etc.
}
Adjust polling frequency in .env:
# Poll Prometheus every 30 seconds
POLL_INTERVAL_Ms=30000
# Calculate graph scores every 2 minutes
SCORE_CALCULATION_INTERVAL_Ms=120000
curl http://localhost:3000/metrics/snapshot
Response:
{
"timestamp": "2026-01-10T12:00:00Z",
"window": "1m",
"services": [
{
"name": "frontend",
"namespace": "default",
"rps": 12.3,
"errorRate": 0.01,
"p95": 120.5
}
],
"edges": [
{
"from": "frontend",
"to": "backend",
"namespace": "default",
"rps": 5.2,
"errorRate": 0.00,
"p95": 80.1
}
]
}
curl http://localhost:3000/graph
Response includes nodes with centrality scores and edges with metrics.
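A client can post-process the /metrics/snapshot response to surface problem edges. The sketch below is one way to do that, assuming the response shape shown above and that p95 is reported in milliseconds. The function name `flagEdges` and both thresholds are illustrative only.

```javascript
// Returns "from -> to" labels for edges whose error rate or p95 latency
// exceeds the given limits. Thresholds are hypothetical examples.
function flagEdges(snapshot, maxErrorRate, maxP95Ms) {
  return snapshot.edges
    .filter((edge) => edge.errorRate > maxErrorRate || edge.p95 > maxP95Ms)
    .map((edge) => `${edge.from} -> ${edge.to}`);
}

// Sample data copied from the example response above.
const sampleSnapshot = {
  timestamp: '2026-01-10T12:00:00Z',
  window: '1m',
  services: [
    { name: 'frontend', namespace: 'default', rps: 12.3, errorRate: 0.01, p95: 120.5 },
  ],
  edges: [
    { from: 'frontend', to: 'backend', namespace: 'default', rps: 5.2, errorRate: 0.0, p95: 80.1 },
  ],
};

// Flag edges with more than 5% errors or a p95 above 50 ms.
console.log(flagEdges(sampleSnapshot, 0.05, 50));
```

In a real client, the snapshot object would come from parsing the JSON body returned by the endpoint instead of a hardcoded sample.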
Open your browser to:
http://localhost:3000/api-docs
Deploy sample microservices to generate test traffic:
cd k8s-example
# Enable Istio injection
kubectl label namespace default istio-injection=enabled --overwrite
# Deploy services
kubectl apply -f backend.yaml
kubectl apply -f frontend.yaml
# Verify deployment
kubectl get pods -n default
After 1-2 minutes, the service graph should show frontend → backend connections.
See k8s-example/README.md for details.
service-graph-engine/
├── index.js              # Main entry point
├── package.json          # Dependencies and metadata
├── src/
│   ├── config.js         # Configuration management
│   ├── server.js         # Express API server with Swagger
│   ├── prometheus.js     # Prometheus metric queries
│   ├── kubernetes.js     # Kubernetes API integration
│   ├── neo4j.js          # Neo4j graph operations
│   ├── scores.js         # Neo4j GDS score calculation (deprecated)
│   ├── scores_local.js   # Client-side score calculation
│   └── swagger.js        # Swagger/OpenAPI configuration
└── k8s-example/
    ├── backend.yaml      # Example backend service
    ├── frontend.yaml     # Example frontend service
    └── README.md         # Example deployment guide
Queries Istio and container metrics from Prometheus:
- istio_requests_total - Request rates
- istio_request_duration_milliseconds_bucket - Latency percentiles
- container_cpu_usage_seconds_total - CPU usage
- container_memory_working_set_bytes - Memory usage
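To show how these metrics could translate into PromQL, here is a minimal sketch that builds request-rate, error-rate, and P95 queries for a given window. The grouping labels (`source_workload`, `destination_workload`, `destination_workload_namespace`), the `reporter="destination"` filter, and the helper names are assumptions based on Istio's standard metric labels; the project's real queries live in src/prometheus.js and are not shown here.

```javascript
// Hypothetical query builder; the label choices are assumptions.
function buildIstioQueries(window = '1m') {
  const by = 'source_workload, destination_workload, destination_workload_namespace';
  const sel = 'reporter="destination"';
  return {
    // Requests per second for each service-to-service edge.
    rps: `sum(rate(istio_requests_total{${sel}}[${window}])) by (${by})`,
    // Requests per second that ended in a 5xx response.
    errorRps: `sum(rate(istio_requests_total{${sel}, response_code=~"5.."}[${window}])) by (${by})`,
    // 95th percentile latency estimated from histogram buckets.
    p95: `histogram_quantile(0.95, sum(rate(istio_request_duration_milliseconds_bucket{${sel}}[${window}])) by (le, ${by}))`,
  };
}

// Builds an instant-query URL for the Prometheus HTTP API.
function queryUrl(baseUrl, promql) {
  return `${baseUrl}/api/v1/query?query=${encodeURIComponent(promql)}`;
}

const queries = buildIstioQueries('5m');
console.log(queries.rps);
console.log(queryUrl('http://localhost:9090', queries.p95));
```

Dividing `errorRps` by `rps` per edge would give an error rate comparable to the `errorRate` field in the snapshot response.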
Manages Neo4j operations:
- CALLS_NOW: Latest service-to-service metrics
- CALLS_HISTORY: Historical metric snapshots
- RUNS_ON: Pod-to-Node placement
- HOSTS: Node-to-Pod relationships
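As an illustration of how a CALLS_NOW relationship could be kept current, the sketch below builds a parameterized Cypher upsert from one edge of the snapshot. The `Service` node label, the property names, and the function name `buildCallsNowUpsert` are assumptions made for this example; the actual schema in src/neo4j.js may differ.

```javascript
// Hypothetical upsert builder; node label and property names are assumptions.
function buildCallsNowUpsert(edge, timestamp) {
  const query = [
    'MERGE (a:Service {name: $from, namespace: $namespace})',
    'MERGE (b:Service {name: $to, namespace: $namespace})',
    'MERGE (a)-[r:CALLS_NOW]->(b)',
    'SET r.rps = $rps, r.errorRate = $errorRate, r.p95 = $p95, r.updatedAt = $timestamp',
  ].join('\n');
  // Parameters are passed separately so values are never spliced into the query text.
  const params = { ...edge, timestamp };
  return { query, params };
}

const upsert = buildCallsNowUpsert(
  { from: 'frontend', to: 'backend', namespace: 'default', rps: 5.2, errorRate: 0.0, p95: 80.1 },
  '2026-01-10T12:00:00Z'
);
console.log(upsert.query);
console.log(upsert.params);
```

MERGE creates the nodes and relationship only when they are missing, so repeated sync cycles overwrite the latest metrics rather than adding duplicate edges. The resulting pair could be handed to a Neo4j driver session as `session.run(query, params)`.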
Calculates graph centrality metrics using Graphology:
- PageRank: Importance based on incoming connections
- Betweenness: Criticality in communication paths
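To make the PageRank idea concrete, here is a minimal, dependency-free power-iteration sketch. It is not the Graphology implementation the service uses; the damping factor of 0.85 and the iteration count are conventional values chosen for illustration, and the service names are sample data.

```javascript
// Minimal PageRank by power iteration over a list of [from, to] edges.
// Illustrative only; the service itself relies on Graphology.
function pageRank(edges, { damping = 0.85, iterations = 50 } = {}) {
  const nodes = [...new Set(edges.flatMap(([from, to]) => [from, to]))];
  const outLinks = new Map(nodes.map((n) => [n, []]));
  for (const [from, to] of edges) outLinks.get(from).push(to);

  let rank = new Map(nodes.map((n) => [n, 1 / nodes.length]));
  for (let i = 0; i < iterations; i += 1) {
    // Every node starts each round with the "random jump" share.
    const next = new Map(nodes.map((n) => [n, (1 - damping) / nodes.length]));
    let danglingMass = 0;
    for (const node of nodes) {
      const targets = outLinks.get(node);
      if (targets.length === 0) {
        // Nodes with no outgoing calls spread their rank evenly below.
        danglingMass += rank.get(node);
        continue;
      }
      const share = (damping * rank.get(node)) / targets.length;
      for (const target of targets) next.set(target, next.get(target) + share);
    }
    const danglingShare = (damping * danglingMass) / nodes.length;
    for (const n of nodes) next.set(n, next.get(n) + danglingShare);
    rank = next;
  }
  return rank;
}

const ranks = pageRank([
  ['frontend', 'backend'],
  ['frontend', 'cart'],
  ['cart', 'backend'],
]);
console.log([...ranks.entries()].sort((a, b) => b[1] - a[1]));
```

In this sample graph, backend receives calls from both other services and therefore ends up with the highest score, which matches the "importance based on incoming connections" reading above.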
Fetches from Kubernetes API:
- Node CPU and memory usage
- Pod resource consumption
- Pod placement and status
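If usage values arrive as Kubernetes quantity strings (for example "250m" for CPU or "128Mi" for memory), they need to be normalized before they can be stored or compared. Whether src/kubernetes.js does this is not stated in this README; the helper names below are hypothetical and the parsers cover only the common suffixes, not exponent notation.

```javascript
// Converts a CPU quantity such as "250m" or "156340563n" to cores.
function parseCpuToCores(quantity) {
  const match = /^(\d+(?:\.\d+)?)(n|u|m)?$/.exec(quantity);
  if (!match) throw new Error(`Unsupported CPU quantity: ${quantity}`);
  const divisors = { n: 1e9, u: 1e6, m: 1e3 };
  return Number(match[1]) / (match[2] ? divisors[match[2]] : 1);
}

// Converts a memory quantity such as "128Mi" or "1G" to bytes.
function parseMemoryToBytes(quantity) {
  const match = /^(\d+(?:\.\d+)?)(Ki|Mi|Gi|Ti|k|M|G|T)?$/.exec(quantity);
  if (!match) throw new Error(`Unsupported memory quantity: ${quantity}`);
  const multipliers = {
    Ki: 1024, Mi: 1024 ** 2, Gi: 1024 ** 3, Ti: 1024 ** 4, // binary suffixes
    k: 1e3, M: 1e6, G: 1e9, T: 1e12,                       // decimal suffixes
  };
  return Number(match[1]) * (match[2] ? multipliers[match[2]] : 1);
}

console.log(parseCpuToCores('250m'));
console.log(parseMemoryToBytes('128Mi'));
```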
# With auto-restart on file changes
npm install -g nodemon
nodemon index.js
The service provides detailed logging:
- Sync cycle execution
- Metrics fetched count
- Neo4j graph updates
- Score calculation results
- API server status
Check if metrics are available:
# Test Prometheus connectivity
curl "http://localhost:9090/api/v1/query?query=up"
# Check Istio metrics
curl "http://localhost:9090/api/v1/query?query=istio_requests_total"
The config.js file currently contains hardcoded credentials. For production:
- Remove all hardcoded credentials
- Use environment variables exclusively
- Never commit .env files to version control
- Add .env to .gitignore:
echo ".env" >> .gitignore
- Use secret management tools (e.g., Kubernetes Secrets, HashiCorp Vault)
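One way to enforce the "environment variables exclusively" rule is to fail fast at startup when a credential is missing, rather than falling back to a hardcoded value. The sketch below assumes the variable names from the .env example above; the helper `requireEnv` is hypothetical and not part of the project.

```javascript
// Returns the requested variables, or throws listing every one that is unset or blank.
function requireEnv(env, names) {
  const missing = names.filter((name) => String(env[name] ?? '').trim() === '');
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(', ')}`);
  }
  return Object.fromEntries(names.map((name) => [name, env[name]]));
}

// Example with a stand-in environment object; a real service would pass process.env.
const exampleEnv = {
  NEO4J_URI: 'neo4j+s://your-instance.databases.neo4j.io',
  NEO4J_USER: 'neo4j',
  NEO4J_PASSWORD: 'your-password',
};
const credentials = requireEnv(exampleEnv, ['NEO4J_URI', 'NEO4J_USER', 'NEO4J_PASSWORD']);
console.log(Object.keys(credentials));
```

Reporting all missing names in one error avoids a fix-one-restart-repeat loop during deployment.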
- Memory: ~100-200 MB baseline
- CPU: Minimal (~1-5% during polling)
- Network: Depends on cluster size and polling frequency
- Tested with 50+ services
- Graph queries optimized with Neo4j indexes
- Client-side score calculation scales to hundreds of nodes
- Increase polling intervals for larger clusters
- Use Neo4j AuraDB for a managed, scalable database
- Filter out noisy namespaces in Prometheus queries
- Adjust query window based on traffic patterns
- Verify Istio sidecars are injected:
kubectl get pods -n default -o wide
# Should show 2/2 containers (app + istio-proxy)
- Check Prometheus has Istio metrics:
curl "http://prometheus:9090/api/v1/query?query=istio_requests_total"
- Verify namespace labels in Prometheus output
- Test database connectivity:
# Update credentials in command
cypher-shell -a neo4j+s://your-instance.databases.neo4j.io \
  -u neo4j -p your-password "MATCH (n) RETURN count(n);"
- Check firewall rules allow Neo4j ports (7687 for Bolt)
If you see 500 errors in API responses:
- Check Neo4j logs for query failures
- Verify database disk space
- Review application logs for stack traces
- Fork the repository
- Create a feature branch: git checkout -b feature-name
- Commit changes: git commit -am 'Add feature'
- Push to branch: git push origin feature-name
- Submit a Pull Request
ISC License
- Istio - Service mesh platform
- Prometheus - Monitoring and alerting
- Neo4j - Graph database
- Graphology - Graph data structure library
- Express - Web framework
- Swagger - API documentation
For issues and questions:
- Create an issue in the repository
- Check existing issues for similar problems
- Review Prometheus and Neo4j logs for detailed error messages