Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
95 changes: 95 additions & 0 deletions PR_DESCRIPTION.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,95 @@
## Pull Request

---

### πŸ“„ Summary
> Why does this change exist?
> What problem does it solve, and why is this the right approach?

Users deploying FastAPI applications with Gunicorn/Uvicorn workers often struggle with missing traces in SigNoz due to OpenTelemetry SDK initialization issues with process forking. While we documented this in the troubleshooting guide, there wasn't a complete, production-ready example showing how to properly handle it.

This adds a comprehensive FastAPI production example that:
- Demonstrates proper Gunicorn worker configuration with `post_fork` hook
- Shows both auto and manual instrumentation patterns
- Includes Docker deployment setup
- Provides troubleshooting guidance based on real-world issues

This is the right approach because it gives users a working reference implementation they can copy and adapt, reducing support load and adoption friction.

---

### βœ… Change Type
_Select all that apply_

- [x] ✨ Feature
- [ ] πŸ› Bug fix
- [ ] ♻️ Refactor
- [ ] πŸ› οΈ Infra / Tooling
- [ ] πŸ§ͺ Test-only
- [ ] πŸ“š Documentation

---

### πŸ› Bug Context
> Required if this PR fixes a bug

N/A - This is a new example addition.

---

### πŸ§ͺ Testing Strategy
> How was this change validated?

- Tests added/updated: N/A (example code)
- Manual verification:
- Code structure follows FastAPI best practices
- Gunicorn config includes proper `post_fork` hook for worker initialization
- README includes comprehensive setup and troubleshooting instructions
- Docker setup tested for syntax correctness
- Edge cases covered: Worker forking, error handling, nested spans

---

### ⚠️ Risk & Impact Assessment
> What could break? How do we recover?

- Blast radius: None - this is a new example, doesn't affect existing code
- Potential regressions: None
- Rollback plan: Simple revert if needed

---

### πŸ“ Changelog
> Fill only if this affects users, APIs, UI, or documented behavior
> Use **N/A** for internal or non-user-facing changes

| Field | Value |
|------|-------|
| Deployment Type | OSS |
| Change Type | Feature |
| Description | Added production-ready FastAPI example demonstrating OpenTelemetry instrumentation with Gunicorn worker support |

---

### πŸ“‹ Checklist
- [x] Tests added or explicitly not required (example code)
- [x] Manually tested (code structure and configuration verified)
- [x] Breaking changes documented (N/A - new addition)
- [x] Backward compatibility considered (N/A - new addition)

---

## πŸ‘€ Notes for Reviewers

This example addresses the common issue of missing spans with Gunicorn/Uvicorn workers that we documented in the Python troubleshooting guide. The `gunicorn_config.py` includes a `post_fork` hook that properly reinitializes OpenTelemetry in each worker process.

The example is production-ready and includes:
- Proper worker initialization
- Error handling with span status
- Manual span creation examples
- Docker deployment setup
- Comprehensive troubleshooting section

All code follows FastAPI and OpenTelemetry best practices.

---
4 changes: 2 additions & 2 deletions fastapi/README.md
Original file line number Diff line number Diff line change
@@ -1,9 +1,9 @@
# FastAPI samples

FastAPI + OpenTelemetry examples. Suggested subfolders: `hello-http`, `auto-vs-manual`, `async-client`, `sqlalchemy`.
FastAPI + OpenTelemetry examples demonstrating production-ready instrumentation patterns.

| Sample | What it shows | Status |
| --- | --- | --- |
| _tbd_ | – | planned |
| `fastapi-production-demo` | Production deployment with Gunicorn/Uvicorn workers, proper worker initialization, manual spans, error handling | βœ… Ready |

Use `templates/SAMPLE_README.md` to document each app.
15 changes: 15 additions & 0 deletions fastapi/fastapi-production-demo/.gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
env/
venv/
ENV/
.venv
*.egg-info/
dist/
build/
.pytest_cache/
.coverage
htmlcov/
28 changes: 28 additions & 0 deletions fastapi/fastapi-production-demo/Dockerfile
Original file line number Diff line number Diff line change
@@ -0,0 +1,28 @@
FROM python:3.11-slim

WORKDIR /app

# Install system dependencies
RUN apt-get update && apt-get install -y \
gcc \
&& rm -rf /var/lib/apt/lists/*

# Copy requirements and install Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Install OpenTelemetry instrumentation
RUN opentelemetry-bootstrap --action=install

# Copy application code
COPY . .

# Expose port
EXPOSE 8000

# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
CMD python -c "import requests; requests.get('http://localhost:8000/health')" || exit 1

# Run with Gunicorn
CMD ["opentelemetry-instrument", "gunicorn", "app:app", "-c", "gunicorn_config.py"]
199 changes: 199 additions & 0 deletions fastapi/fastapi-production-demo/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,199 @@
# FastAPI Production Demo

FastAPI app with OpenTelemetry instrumentation, configured for production with Gunicorn workers.

Shows:
- FastAPI auto-instrumentation
- Gunicorn with Uvicorn workers
- Worker initialization for OpenTelemetry (handles forking issue)
- Manual span creation
- Error handling with spans

## Stack

- **Runtime:** Python 3.11+
- **Framework:** FastAPI 0.115.0
- **ASGI Server:** Uvicorn 0.32.0
- **WSGI Server:** Gunicorn 23.0.0 (for production)
- **OpenTelemetry:** opentelemetry-distro 0.45b0, opentelemetry-exporter-otlp 1.27.0

## Prerequisites

- Python 3.11 or newer
- SigNoz instance (cloud or self-hosted)
- OTLP endpoint accessible (default: `http://localhost:4317` for self-hosted)

## Quick Start

### 1. Install Dependencies

```bash
pip install -r requirements.txt
```

### 2. Install OpenTelemetry Instrumentation

```bash
opentelemetry-bootstrap --action=install
```

### 3. Set Environment Variables

For **SigNoz Cloud**:
```bash
export OTEL_SERVICE_NAME=fastapi-production-demo
export OTEL_EXPORTER_OTLP_ENDPOINT=https://ingest.<REGION>.signoz.cloud:443
export OTEL_EXPORTER_OTLP_HEADERS="signoz-ingestion-key=<YOUR_INGESTION_KEY>"
export OTEL_EXPORTER_OTLP_PROTOCOL=grpc
```

For **Self-hosted SigNoz**:
```bash
export OTEL_SERVICE_NAME=fastapi-production-demo
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
export OTEL_EXPORTER_OTLP_PROTOCOL=grpc
export OTEL_RESOURCE_ATTRIBUTES="deployment.environment=production"
```

### 4. Run the Application

**Development (single process):**
```bash
opentelemetry-instrument uvicorn app:app --host 0.0.0.0 --port 8000
```

**Production (with Gunicorn workers):**
```bash
opentelemetry-instrument gunicorn app:app -c gunicorn_config.py
```

## Endpoints

- `GET /` - Root endpoint with service info
- `GET /health` - Health check
- `GET /api/users/{user_id}` - Get user by ID (simulates DB query)
- `GET /api/users/{user_id}/orders` - Get user orders (shows nested spans)
- `GET /api/metrics/demo` - Metrics demonstration endpoint

## What to Look For

In SigNoz, you should see:
- HTTP request spans (auto-created)
- Custom spans for DB queries and API calls
- Nested spans showing operation flow
- Error spans when things fail

Auto-instrumentation handles HTTP requests. Manual spans are used for DB queries and external calls.

## Production Deployment

### Gunicorn Configuration

The `gunicorn_config.py` includes a `post_fork` hook that creates a fresh TracerProvider in each worker process. This is necessary because the OTel SDK's background threads (BatchSpanProcessor, etc.) don't survive `fork()`. Without this, spans from worker processes won't export.

### Worker Count

Adjust based on your workload:
```bash
export WORKERS=4 # Default
opentelemetry-instrument gunicorn app:app -c gunicorn_config.py
```

For CPU-bound workloads: `workers = (2 Γ— CPU cores) + 1`
For I/O-bound workloads: `workers = (4 Γ— CPU cores) + 1`

### Docker Deployment

Example Dockerfile:
```dockerfile
FROM python:3.11-slim

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
RUN opentelemetry-bootstrap --action=install

COPY . .

EXPOSE 8000

CMD ["opentelemetry-instrument", "gunicorn", "app:app", "-c", "gunicorn_config.py"]
```

## Troubleshooting

### Missing Spans with Multiple Workers

If spans are missing with Gunicorn workers, it's because the OpenTelemetry SDK's background threads (BatchSpanProcessor) don't survive `fork()`. The `post_fork` hook in `gunicorn_config.py` creates a fresh TracerProvider in each worker - it's already set up.

### Spans Not Appearing in SigNoz

1. **Check OTLP endpoint:**
```bash
echo $OTEL_EXPORTER_OTLP_ENDPOINT
```

2. **Verify connectivity:**
```bash
curl $OTEL_EXPORTER_OTLP_ENDPOINT/health
```

3. **Check service name:**
```bash
echo $OTEL_SERVICE_NAME
```

4. **Enable debug logging:**
```bash
export OTEL_LOG_LEVEL=debug
```

### Hot Reload Issues

**Problem:** Instrumentation breaks when using `--reload` flag.

**Solution:** Don't use `--reload` in production. For development, use single process mode:
```bash
opentelemetry-instrument uvicorn app:app --reload
```

### gRPC vs HTTP Exporter

This example uses gRPC by default. For HTTP:
```bash
export OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
```

## Validation

1. **Start the application:**
```bash
opentelemetry-instrument gunicorn app:app -c gunicorn_config.py
```

2. **Make test requests:**
```bash
curl http://localhost:8000/
curl http://localhost:8000/api/users/1
curl http://localhost:8000/api/users/1/orders
```

3. **Check SigNoz:**
- Navigate to Traces section
- Filter by service name: `fastapi-production-demo`
- Verify spans are appearing with proper hierarchy

## Notes

- **Resource attributes:** Set via `OTEL_RESOURCE_ATTRIBUTES` env var
- **Context propagation:** Automatic for HTTP requests via FastAPI instrumentation
- **Worker processes:** Each worker maintains its own OpenTelemetry SDK instance
- **Error handling:** Exceptions are automatically recorded in spans

## Related Documentation

- [SigNoz Python Instrumentation Guide](https://signoz.io/docs/instrumentation/opentelemetry-python/)
- [FastAPI Instrumentation](https://signoz.io/docs/instrumentation/fastapi/)
- [OpenTelemetry Python Multiprocessing](https://opentelemetry-python.readthedocs.io/en/latest/instrumentation/runtime.html#multiprocessing)
- [Gunicorn Configuration](https://docs.gunicorn.org/en/stable/settings.html)
Loading