
Troubleshooting Guide

Encountering issues with the Aicser Platform? This guide helps you diagnose and resolve common problems, from installation failures to performance bottlenecks.

🔍 Quick Diagnosis

Health Check Commands

# Check system status
docker-compose ps
docker-compose logs --tail=50

# Check service health
curl -f http://localhost:3005/health
curl -f http://localhost:8000/health
# PostgreSQL does not speak HTTP, so use pg_isready instead of curl
docker-compose exec postgres pg_isready -U aicser_user

# Check resource usage
docker stats
df -h
free -h

Common Error Patterns

  • Connection Refused: Service not running or wrong port
  • Timeout Errors: Network issues or service overload
  • Authentication Failed: Invalid credentials or expired tokens
  • Permission Denied: File system or database access issues
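As a rough illustration of how the four patterns above can be told apart, the sketch below maps a single log line to one of them. The function name `classify_error` and the matched phrases are assumptions chosen for this example, not part of the Aicser tooling; adjust them to the wording your services actually log.

```shell
#!/usr/bin/env bash
# classify_error: map one log line to one of the four patterns listed above.
# The matched phrases are illustrative assumptions, not an official list.
classify_error() {
  local line
  # Lowercase the input so that matching is case-insensitive.
  line=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$line" in
    *"connection refused"*)    echo "connection-refused" ;;
    *"timed out"*|*"timeout"*) echo "timeout" ;;
    *"authentication failed"*) echo "auth-failed" ;;
    *"permission denied"*)     echo "permission-denied" ;;
    *)                         echo "unknown" ;;
  esac
}
```

For example, `classify_error "FATAL: password authentication failed"` prints `auth-failed`, which points to the credential checks in the database section rather than to networking.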

🚨 Common Issues & Solutions

1. Installation & Setup Issues

Docker Compose Won't Start

Symptoms: docker-compose up fails with connection errors

Solutions:

# Check Docker service
sudo systemctl status docker
sudo systemctl start docker

# Verify ports are available
netstat -tulpn | grep :3005
netstat -tulpn | grep :8000

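# Clear Docker cache (WARNING: removes all unused images, containers, and networks;
# "docker volume prune" also deletes unused volumes together with their data)
docker system prune -a
docker volume prune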

Port Already in Use

Symptoms: Error: Port 3005 is already in use

Solutions:

# Find process using port
lsof -i :3005
sudo fuser -k 3005/tcp

# Or change the host port in docker-compose.yml
ports:
  - "3006:3005" # Host port 3006 maps to container port 3005
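The `ports:` entry above uses Compose's short `HOST:CONTAINER` syntax, which is easy to read backwards. The helper below is a small sketch (the name `split_port_mapping` is made up for this example) that prints both sides of a two-part mapping; it does not handle the three-part `IP:HOST:CONTAINER` form.

```shell
#!/usr/bin/env bash
# split_port_mapping: print the host and container ports of a two-part
# Compose mapping such as "3006:3005" (HOST:CONTAINER).
split_port_mapping() {
  local mapping=$1
  local host=${mapping%%:*}       # text before the first colon
  local container=${mapping##*:}  # text after the last colon
  echo "host=$host container=$container"
}
```

With the mapping `3006:3005`, the application still listens on 3005 inside the container, while the browser and `curl` on the host must use port 3006.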

2. Database Connection Issues

PostgreSQL Connection Failed

Symptoms: FATAL: password authentication failed

Solutions:

# Check database logs
docker-compose logs postgres

# Reset database password (SQL passed to psql with -c)
docker-compose exec postgres psql -U postgres -c "ALTER USER aicser_user PASSWORD 'new_password';"

# Verify environment variables
echo $POSTGRES_PASSWORD
echo $POSTGRES_USER
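Echoing `$POSTGRES_PASSWORD` prints the secret to the terminal and to shell history captures. A hedged alternative is to check only whether the variables are set. The sketch below assumes bash (it relies on indirect expansion), and `require_vars` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# require_vars: report which of the named variables are unset or empty,
# without printing their values.
require_vars() {
  local name missing=0
  for name in "$@"; do
    # ${!name} is bash indirect expansion: the value of the variable named by $name.
    if [ -z "${!name:-}" ]; then
      echo "missing: $name"
      missing=1
    fi
  done
  return "$missing"
}
```

Calling `require_vars POSTGRES_USER POSTGRES_PASSWORD` prints one `missing:` line per unset variable and returns a non-zero status if any are missing, so it can also guard a startup script.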

Database Migration Errors

Symptoms: Alembic migration failed

Solutions:

# Check migration status
docker-compose exec auth alembic current
docker-compose exec auth alembic history

# Reset migrations (WARNING: Data loss)
docker-compose exec auth alembic downgrade base
docker-compose exec auth alembic upgrade head

# Check database schema
docker-compose exec postgres psql -U aicser_user -d aicser_db -c "\dt"
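Because the migration reset above discards data, it is worth taking a dump first. The following is a minimal sketch that assumes the `postgres` service, `aicser_user`, and `aicser_db` names used in this guide; `backup_filename` and `backup_db` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# backup_filename: build a timestamped dump name for a database.
# The optional second argument overrides the timestamp.
backup_filename() {
  local db=$1 stamp=${2:-$(date +%Y%m%dT%H%M%S)}
  echo "${db}_${stamp}.sql"
}

# backup_db: dump the database to a file on the host and print the file name.
# -T disables TTY allocation so the redirect receives clean output.
backup_db() {
  local db=${1:-aicser_db} user=${2:-aicser_user} out
  out=$(backup_filename "$db")
  docker-compose exec -T postgres pg_dump -U "$user" "$db" > "$out" && echo "$out"
}
```

Run `backup_db` before `alembic downgrade base`, and confirm that the resulting file is not empty before continuing.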

3. AI Service Issues

OpenAI API Errors

Symptoms: OpenAI API error: Invalid API key

Solutions:

# Verify API key
echo $OPENAI_API_KEY

# Test API connection
curl -H "Authorization: Bearer $OPENAI_API_KEY" \
https://api.openai.com/v1/models

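# Check usage (rate-limit status is reported in the x-ratelimit-* response
# headers of API calls, not by this endpoint)
curl -H "Authorization: Bearer $OPENAI_API_KEY" \
https://api.openai.com/v1/usage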
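When testing the API connection, the HTTP status code is usually enough to tell an invalid key from a rate-limit problem. The sketch below translates a status code into a likely cause; `explain_status` is a hypothetical helper, and the mapping covers only the common cases.

```shell
#!/usr/bin/env bash
# explain_status: translate an HTTP status code from the API into a likely cause.
explain_status() {
  case "$1" in
    200) echo "ok: key accepted" ;;
    401) echo "invalid or missing API key" ;;
    429) echo "rate limit or quota exceeded" ;;
    5??) echo "server-side error, retry later" ;;
    *)   echo "unexpected status: $1" ;;
  esac
}
```

One way to obtain the code is `curl -s -o /dev/null -w '%{http_code}' -H "Authorization: Bearer $OPENAI_API_KEY" https://api.openai.com/v1/models`, whose output can be passed straight to `explain_status`.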

Model Loading Failures

Symptoms: Failed to load AI model

Solutions:

# Check model cache
ls -la /tmp/aicser_models/

# Clear model cache
rm -rf /tmp/aicser_models/*

# Check available memory
free -h

# Verify model files
docker-compose exec ai-analytics ls -la /app/models/

4. Frontend Issues

React App Won't Load

Symptoms: White screen or JavaScript errors

Solutions:

# Check frontend logs
docker-compose logs client

# Verify build files
ls -la packages/client/client/build/

# Rebuild frontend
docker-compose exec client npm run build

# Check browser console for errors
# Press F12 in browser

API Endpoint Errors

Symptoms: 404 Not Found or 500 Internal Server Error

Solutions:

# Check API routes
curl -X GET http://localhost:8000/api/v1/health

# Verify service discovery
docker-compose exec client curl http://auth:8000/health

# Check NGINX configuration
docker-compose exec nginx nginx -t

5. Performance Issues

Slow Query Response

Symptoms: Queries taking > 5 seconds

Solutions:

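# Check database performance (requires the pg_stat_statements extension;
# on PostgreSQL 13+ the column is named mean_exec_time instead of mean_time)
docker-compose exec postgres psql -U aicser_user -d aicser_db -c "
SELECT query, mean_time, calls
FROM pg_stat_statements
ORDER BY mean_time DESC
LIMIT 10;"

# Check cache hit rate (compare keyspace_hits with keyspace_misses)
docker-compose exec redis redis-cli info stats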

# Monitor resource usage
docker stats --no-stream

High Memory Usage

Symptoms: Out of memory errors or slow performance

Solutions:

# Check memory usage
free -h
docker stats --no-stream

# Restart memory-intensive services
docker-compose restart ai-analytics
docker-compose restart postgres

# Adjust memory limits in docker-compose.yml
services:
  ai-analytics:
    deploy:
      resources:
        limits:
          memory: 4G
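To decide which service needs a restart or a higher limit, the `docker stats` output can be filtered by memory percentage. This is a sketch only: `over_mem_threshold` is a hypothetical helper, and the 80% default is an arbitrary assumption.

```shell
#!/usr/bin/env bash
# over_mem_threshold: read "name percent%" lines on stdin and print the
# containers whose memory percentage exceeds the threshold (default 80).
over_mem_threshold() {
  awk -v limit="${1:-80}" '{
    pct = $2
    sub(/%$/, "", pct)               # strip the trailing percent sign
    if (pct + 0 > limit + 0) print $1
  }'
}
```

Usage: `docker stats --no-stream --format '{{.Name}} {{.MemPerc}}' | over_mem_threshold 75` lists every container above 75% of its memory limit.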

🛠️ Debugging Techniques

1. Log Analysis

Enable Debug Logging

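# Set debug level (the variables must be referenced in docker-compose.yml
# to reach the containers)
export LOG_LEVEL=DEBUG
export DEBUG=true

# Recreate services so they pick up the new environment
# ("docker-compose restart" does not apply environment changes)
docker-compose up -d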

# Follow logs in real-time
docker-compose logs -f --tail=100

Log Search & Filtering

# Search for errors
docker-compose logs | grep -i error

# Search for specific service
docker-compose logs auth | grep -i "authentication"

# Search with context (5 lines before and after, case-insensitive)
docker-compose logs | grep -i -C 5 "error"
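When many services log at once, a per-service error count shows where to look first. The sketch below assumes the usual Compose log layout, in which each line starts with the container name followed by `|`; `errors_by_service` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# errors_by_service: read Compose log lines ("name | message") on stdin and
# print "count name" for lines containing "error", highest count first.
errors_by_service() {
  awk -F'|' 'NF > 1 && tolower($0) ~ /error/ {
    name = $1
    gsub(/[ \t]+/, "", name)   # drop the padding around the service prefix
    count[name]++
  }
  END { for (name in count) print count[name], name }' | sort -rn
}
```

Usage: `docker-compose logs --no-color --tail=1000 | errors_by_service`. The `--no-color` flag keeps ANSI escape codes out of the service names.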

2. Network Debugging

Check Network Connectivity

# Test inter-service communication
docker-compose exec auth ping postgres
docker-compose exec client ping auth

# Check DNS resolution
docker-compose exec auth nslookup postgres
docker-compose exec auth nslookup redis

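# Verify port accessibility (service names such as "postgres" resolve only
# inside the Compose network; from the host, use localhost and the published port)
telnet localhost 5432
telnet localhost 6379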

Network Configuration

# Check Docker network
docker network ls
docker network inspect aicser-world_default

# Inspect container networking
docker inspect <container_id> | grep -A 20 "NetworkSettings"

3. Database Debugging

PostgreSQL Debugging

# Connect to database
docker-compose exec postgres psql -U aicser_user -d aicser_db

-- Check active connections
SELECT * FROM pg_stat_activity;

-- Check query performance (the column is total_exec_time on PostgreSQL 13+)
SELECT * FROM pg_stat_statements ORDER BY total_time DESC LIMIT 10;

-- Check table sizes
SELECT schemaname, tablename, pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename))
FROM pg_tables
ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC;

Redis Debugging

# Connect to Redis
docker-compose exec redis redis-cli

# Check memory usage
INFO memory

# Check key statistics
INFO keyspace

# Monitor commands in real-time (adds noticeable overhead; avoid on busy production instances)
MONITOR

📊 Monitoring & Alerting

1. Health Checks

Service Health Endpoints

# Health check URLs (HTTP services only)
http://localhost:3005/health # Frontend
http://localhost:8000/health # Auth API
http://localhost:8001/health # AI Analytics
# PostgreSQL (5432) and Redis (6379) do not serve HTTP; check them with
# pg_isready and redis-cli ping instead

# Health check script
#!/bin/bash
services=("3005" "8000" "8001")
for port in "${services[@]}"; do
  if curl -f "http://localhost:$port/health" >/dev/null 2>&1; then
    echo "✅ Port $port: Healthy"
  else
    echo "❌ Port $port: Unhealthy"
  fi
done
# Data stores are checked with their own clients
docker-compose exec -T postgres pg_isready -U aicser_user
docker-compose exec -T redis redis-cli ping
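The health check script above makes a single attempt per port, so a service that is still starting is reported as unhealthy. A small retry wrapper, sketched below under the assumption that a few seconds of waiting is acceptable, avoids that false alarm; `retry` is a hypothetical helper, not an Aicser command.

```shell
#!/usr/bin/env bash
# retry: run a command up to ATTEMPTS times, sleeping DELAY seconds between tries.
# Usage: retry ATTEMPTS DELAY command [args...]
retry() {
  local attempts=$1 delay=$2 i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0                          # success: stop retrying
    i=$((i + 1))
    [ "$i" -le "$attempts" ] && sleep "$delay"
  done
  return 1                                    # every attempt failed
}
```

Usage: `retry 10 3 curl -fsS http://localhost:8000/health` polls the Auth API for up to about 30 seconds before giving up.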

Automated Monitoring

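# Prometheus health check configuration
# (assumes /health returns Prometheus text format; for plain HTTP probes,
# use the blackbox exporter instead)
scrape_configs:
  - job_name: 'aicser-health'
    static_configs:
      - targets: ['localhost:3005', 'localhost:8000', 'localhost:8001']
    metrics_path: /health
    scrape_interval: 30s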

2. Performance Monitoring

Key Metrics to Monitor

  • Response Time: API endpoint latency
  • Error Rate: Failed requests percentage
  • Throughput: Requests per second
  • Resource Usage: CPU, memory, disk utilization
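As a quick illustration of the error-rate metric above, the sketch below computes the percentage of failed requests from a list of HTTP status codes. It assumes that only 5xx responses count as failures, which is a choice made for this example; `error_rate` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# error_rate: read one HTTP status code per line on stdin and print the
# percentage of 5xx responses with one decimal place.
error_rate() {
  awk '{ total++; if ($1 >= 500 && $1 < 600) errors++ }
       END {
         if (total == 0) print "0.0"
         else printf "%.1f\n", 100 * errors / total
       }'
}
```

For instance, `printf '200\n200\n500\n503\n' | error_rate` prints `50.0`. In practice the status codes would come from access logs or from a metrics system rather than from a hand-written list.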

Alerting Rules

# Alerting rules (Prometheus rule-file format)
groups:
  - name: aicser-alerts
    rules:
      - alert: HighResponseTime
        expr: histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m])) > 2
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High response time detected"

🆘 Getting Help

1. Self-Service Resources

Documentation

Community Support

2. Professional Support

Enterprise Support

  • Email: support@aicser.com
  • Phone: +1-800-AICSER-HELP
  • Slack: Enterprise customers only
  • Response Time: 4 hours (business hours)

Support Tiers

  • Community: GitHub issues and community forum
  • Standard: Email support with 24-hour response
  • Premium: Phone, email, and Slack with 4-hour response
  • Enterprise: Dedicated support engineer

3. Escalation Process

When to Escalate

  1. Critical Issues: System down, data loss, security breach
  2. Performance Issues: Response time > 10 seconds
  3. Integration Problems: API failures affecting business operations
  4. Compliance Issues: Security or regulatory concerns

Escalation Contacts

📝 Issue Reporting

1. Bug Report Template

## Issue Description
Brief description of the problem

## Steps to Reproduce
1. Step 1
2. Step 2
3. Step 3

## Expected Behavior
What should happen

## Actual Behavior
What actually happens

## Environment
- OS: [e.g., Ubuntu 20.04]
- Docker: [e.g., 20.10.0]
- Aicser Version: [e.g., 1.0.0]

## Logs
Relevant log output

## Additional Context
Any other information that might be helpful

2. Feature Request Template

## Feature Description
What feature would you like to see

## Use Case
How would this feature help you

## Proposed Solution
Your suggested implementation

## Alternatives Considered
Other approaches you've considered

## Additional Context
Any other relevant information

Still Need Help?