Application Servers
Your application's backend code and Docker containers run on sherpa.sh's application servers. These servers sit behind our CDN and load balancer within our Kubernetes cluster in your selected region, providing a robust execution environment for your applications.
How It Works
When you deploy your application, sherpa.sh runs your backend code or Docker containers on our application servers. These servers handle all the compute-intensive work while our CDN and load balancer manage traffic distribution and static content delivery.
Architecture Flow:
User request → CDN (static files) or Load Balancer (dynamic content)
Load Balancer → Application Server instances
Application Servers → Your backend code/containers
Response flows back through the same path
For detailed architecture information, see our Architecture Overview page.
Default Resource Configuration
Every application is deployed as a swarm of Docker containers behind a load balancer. Each individual container has a maximum resource allocation.
Resource Allocation
# Per instance limits
CPU: 1 core maximum
Memory: 1GB maximum
Auto-scaling Behavior
The number of container replicas created depends on your selected plan.
# Hobby and Starter Plan Settings
Minimum instances: 1
Maximum instances: 5
CPU scaling threshold: 80%
How scaling works: When your backend code uses more than 80% CPU across instances, new application servers automatically spin up to handle the load.
If you need more instance replicas or different CPU thresholds, reach out to support at discord.sherpa.sh.
Infrastructure Types
Shared Application Servers (Default)
What you get: Your containers/code run on shared infrastructure alongside other applications.
Benefits:
Zero configuration: Deploy immediately
Automatic load distribution: Traffic spreads across instances
Cost-effective: Shared infrastructure costs
Built-in monitoring: Performance metrics included
Limitations:
Ephemeral storage: No persistent file writes
Shared resources: CPU/memory shared with other applications
Standard resource limits: Fixed allocation per instance
Best for: Stateless APIs, web applications, microservices
Dedicated Application Servers
What you get: Exclusive physical servers running only your application containers.
Available Configurations:
Compute: 2-96 CPU cores, 4-256GB memory
Storage: 80GB-300TB persistent disk
Network: 1-10Gbps dedicated bandwidth
Transfer: 20TB monthly included
Benefits:
Guaranteed performance: No resource contention
Persistent storage: File system writes supported
Custom sizing: Tailored to your workload
Isolation: Enhanced security and performance
Best for: Databases, file processing, high-traffic applications
Regional Deployment
Your application servers run in the region you select during deployment. View our available regions.
Benefits of regional deployment:
Reduced latency: Servers closer to your users
Data compliance: Meet regional data requirements
Improved performance: Faster database connections
Managing Application Servers
Viewing Server Status
Navigate to your application dashboard
Go to Resources > Application Server
Monitor instance count, CPU usage, and memory consumption
Requesting Dedicated Servers
Visit discord.sherpa.sh
Create support ticket with requirements:
Expected traffic volume
Resource requirements (CPU/memory)
Storage needs
Performance requirements
Best Practices
Stateless Design
Design your application to work seamlessly across multiple server instances by avoiding in-memory state storage.
Good: External State Management
// pages/api/users/[id].js
import { getUser } from '../../../lib/database';

export default async function handler(req, res) {
  const { id } = req.query;

  // Fetch from external database, not server memory
  const user = await getUser(id);

  if (!user) {
    return res.status(404).json({ error: 'User not found' });
  }

  res.status(200).json(user);
}
Avoid: In-Memory State
// Don't do this - data lost during scaling
let userCache = {}; // Lost when new instances start

export default async function handler(req, res) {
  const { id } = req.query;

  if (!userCache[id]) {
    userCache[id] = await getUser(id); // Won't persist across instances
  }

  res.status(200).json(userCache[id]);
}
Health Check Implementation
Sherpa.sh automatically checks your application health by requesting the root URL (/). Ensure this endpoint returns a valid response.
Required Health Check Setup
// pages/index.js or app/page.js (App Router)
export default function Home() {
  return (
    <div>
      <h1>Application Status: Healthy</h1>
      <p>Server is running normally</p>
    </div>
  );
}

// For API-only applications, Next.js API handlers must live under
// pages/api/, so serve the health response from an API route and
// rewrite "/" to it in next.config.js
// pages/api/health.js
export default function handler(req, res) {
  res.status(200).json({
    status: 'healthy',
    timestamp: new Date().toISOString(),
    version: process.env.npm_package_version || '1.0.0'
  });
}
Advanced Health Check with Dependencies
// pages/api/health.js (rewrite "/" to this route for API-only apps)
import { checkDatabase } from '../../lib/database';
import { checkExternalAPI } from '../../lib/external-services';

export default async function handler(req, res) {
  try {
    // Check critical dependencies
    await checkDatabase();
    await checkExternalAPI();

    res.status(200).json({
      status: 'healthy',
      checks: {
        database: 'connected',
        external_api: 'responding'
      },
      timestamp: new Date().toISOString()
    });
  } catch (error) {
    res.status(503).json({
      status: 'unhealthy',
      error: error.message,
      timestamp: new Date().toISOString()
    });
  }
}
Efficient Resource Usage
Optimize your application for auto-scaling by implementing efficient async patterns and resource management.
Database Connection Management
// lib/database.js
import { Pool } from 'pg';

// Use connection pooling for database efficiency
const pool = new Pool({
  connectionString: process.env.DATABASE_URL,
  max: 20, // Maximum connections
  idleTimeoutMillis: 30000,
  connectionTimeoutMillis: 2000,
});

export async function queryDatabase(query, params) {
  const client = await pool.connect();
  try {
    const result = await client.query(query, params);
    return result.rows;
  } finally {
    client.release(); // Always release connection
  }
}
Async API Route Optimization
// pages/api/products/search.js
export default async function handler(req, res) {
  const { query, category } = req.query;

  try {
    // Run parallel requests for better performance
    const [products, categories, recommendations] = await Promise.all([
      searchProducts(query),
      getCategories(category),
      getRecommendations(query)
    ]);

    res.status(200).json({
      products,
      categories,
      recommendations
    });
  } catch (error) {
    console.error('Search error:', error);
    res.status(500).json({ error: 'Search failed' });
  }
}
Monitoring and Optimization
Performance Metrics
CPU utilization: Monitor for scaling triggers
Memory usage: Track memory leaks and optimization opportunities
Response time: Measure application performance
Instance count: Understand scaling patterns
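Alongside the dashboard graphs, you can expose per-instance numbers from the Node.js process itself. This hypothetical helper (not a sherpa.sh API) could back a small `pages/api/metrics.js` route; it is written as a standalone script here for easy testing:

```javascript
// Hypothetical per-instance metrics snapshot using only built-in
// process APIs; rssMB is the number to watch against the 1GB limit.
function collectMetrics() {
  const mem = process.memoryUsage();
  return {
    rssMB: Math.round(mem.rss / 1024 / 1024),           // total resident memory
    heapUsedMB: Math.round(mem.heapUsed / 1024 / 1024), // JS heap in use (watch for steady growth = leak)
    uptimeSeconds: Math.round(process.uptime()),        // per-instance uptime; resets when scaling replaces instances
  };
}

console.log(collectMetrics());
```

A steadily climbing `heapUsedMB` across requests is the classic memory-leak signature worth alerting on.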
Optimization Strategies
Database optimization: Use connection pooling and query optimization
Caching: Implement Redis or in-memory caching
Async processing: Use background tasks for heavy operations
Resource monitoring: Set up alerts for resource usage
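For the async-processing strategy, the core idea is to acknowledge the request immediately and run the heavy work outside the request/response cycle. The sketch below uses an in-process queue purely for illustration; in production you would use a shared, persistent queue (e.g. BullMQ backed by Redis) so jobs survive instance scaling and restarts:

```javascript
// Illustrative in-process job queue (assumption: acceptable to lose
// jobs on restart - use a Redis-backed queue in production).
const jobs = [];

function enqueue(job) {
  jobs.push(job);
  setImmediate(processNext); // run after the current request finishes
}

function processNext() {
  const job = jobs.shift();
  if (job) job(); // heavy operation: image resize, report generation, etc.
}

// Usage inside an API route: respond 202 Accepted right away,
// then do the heavy work in the background.
function handler(req, res) {
  enqueue(() => {
    // ...expensive processing here...
  });
  res.status(202).json({ queued: true });
}
```

Returning `202 Accepted` keeps response times low and avoids tying up an instance (and triggering CPU-based scaling) on work the client does not need to wait for.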
Next Steps
Review architecture: Read our Architecture Overview for complete system understanding
Set up monitoring: Configure application performance monitoring
Implement caching: Add Redis or similar caching layer
Plan scaling: Consider dedicated servers for growing applications
Optimize performance: Profile your application for bottlenecks
Need help optimizing your application server setup? Our support team provides detailed performance analysis and recommendations for your specific use case.