How to Fix Java Container Issues on Azure: JVM & Memory

Intermediate | 14 min read | Updated April 20, 2026

Why This Is Happening

I've seen this exact scenario play out on dozens of Azure deployments: a Java application runs perfectly on a developer laptop or a beefy on-premises server, and then the moment it lands inside a Docker container on Azure Kubernetes Service or Azure Container Apps, things go sideways. Pods restart randomly. Response times spike. Memory usage climbs until the container gets killed with an OOM (Out of Memory) error. And the error messages? Completely unhelpful. Something like Killed in the container logs, or just a silent pod restart with exit code 137.

Here's the thing: this isn't a bug in your code. It's almost always a mismatch between how the JVM thinks about memory and CPU and what the container actually has available. When you fix Java container out-of-memory errors on Azure, you're really fixing a configuration problem that the JVM's default behavior makes worse.

The JVM was designed to run on bare metal or full VMs. It looks at the host system to decide how to configure itself: how big the heap should be, how many garbage collector threads to spin up, which GC algorithm to use. In a containerized environment, that's a problem. Your container might have a 2 GB memory limit, but the JVM is looking at the host node with 64 GB of RAM and thinking it can claim 16 GB for the heap. When it tries, the container's OOM killer steps in and terminates the process.

CPU quotas make things even trickier. If your Kubernetes pod has a CPU limit of 500m (half a core), but the JVM counts all 32 cores on the node as "available processors," you'll end up with far too many GC threads competing for a tiny slice of CPU time. That's not just wasteful; it actively hurts throughput and causes unpredictable pauses.
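To make the thread-count problem concrete, here is a minimal sketch of how the processor count turns into a default number of parallel GC worker threads. The formula is the commonly cited HotSpot heuristic and is an assumption here, not something this guide's sources state; the class and method names are illustrative. Confirm the real value for your JDK with -XX:+PrintFlagsFinal (look for ParallelGCThreads).

```java
// Sketch: how the "available processors" count feeds the default number of
// parallel GC worker threads. The formula is the commonly cited HotSpot
// heuristic (an assumption); verify on your JDK with -XX:+PrintFlagsFinal.
class GcThreadEstimate {

    // Up to 8 CPUs: one GC thread per CPU. Beyond 8: add 5/8 of the extra CPUs.
    static int defaultParallelGcThreads(int cpus) {
        if (cpus <= 8) {
            return Math.max(1, cpus);
        }
        return 8 + (cpus - 8) * 5 / 8;
    }

    public static void main(String[] args) {
        int seen = Runtime.getRuntime().availableProcessors();
        System.out.println("availableProcessors  = " + seen);
        System.out.println("estimated GC threads = " + defaultParallelGcThreads(seen));
        System.out.println("if the JVM saw 32 host cores: " + defaultParallelGcThreads(32));
    }
}
```

Under this heuristic a JVM that counts 32 host cores would start 23 GC threads, all of which then share whatever CPU time the container's limit allows.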

This affects everyone: developers migrating existing Spring Boot apps to Azure Container Apps, teams containerizing Java microservices for AKS, and architects moving on-premises Tomcat workloads to Azure App Service. The JVM container configuration problems described here apply to OpenJDK 11 and later, including Microsoft Build of OpenJDK, Azul Zulu, Eclipse Temurin, and Oracle OpenJDK.

The fix isn't complicated once you understand the root cause. You need to explicitly tell the JVM how much memory it can use, which garbage collector to run, and match those settings to the actual resources your container has been granted.

The Quick Fix: Try This First

If your Java container is crashing or running slowly on Azure and you want a single fix to try right now, this is it. For most general-purpose Java microservices (Spring Boot APIs, REST services, background workers), the following JVM startup flags will resolve the majority of container memory and performance issues in one shot:

-XX:+UseParallelGC -XX:MaxRAMPercentage=75

Add those two flags to your container's JVM startup parameters, redeploy, and watch the crash rate drop. Here's exactly what each flag does and why it matters for Java container performance on Azure:

-XX:+UseParallelGC explicitly selects the Parallel garbage collector. Without this, the JVM uses its own heuristics to pick a GC. In a container with limited RAM, those heuristics often pick SerialGC (single-threaded, high latency) or G1GC (more overhead than you need for a small heap). Parallel GC hits the sweet spot for microservices: it's multi-threaded, works well on heaps under 4 GB, and has minimal overhead compared to G1GC or ZGC.

-XX:MaxRAMPercentage=75 tells the JVM to cap its heap at 75% of the container's available memory. This is the key fix for JVM heap size container configuration problems. If your container has 4 GB of RAM, the JVM heap tops out at 3 GB, leaving 1 GB for the JVM's own internal structures, thread stacks, class metadata, off-heap memory, and the OS. Set this too high (say 95%) and you're right back to OOM crashes. Set it too low and you're leaving performance on the table.

If you're running on Kubernetes, your deployment YAML should look something like this:

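env:
  # JAVA_OPTS only takes effect if the image's entrypoint passes it to `java`;
  # the JVM itself reads JAVA_TOOL_OPTIONS without any wrapper script.
  - name: JAVA_OPTS
    value: "-XX:+UseParallelGC -XX:MaxRAMPercentage=75"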
resources:
  requests:
    memory: "4Gi"
    cpu: "2"
  limits:
    memory: "4Gi"
    cpu: "2"

Note that requests and limits should match. The JVM sizes itself from the limit, but the scheduler only guarantees the request, so a pod that uses more memory than it requested can be evicted when the node comes under memory pressure.

Pro Tip
Always set both -XX:MaxRAMPercentage and -XX:InitialRAMPercentage to the same value (75) when running in a container environment where memory is guaranteed. This tells the JVM not to bother shrinking and growing the heap dynamically: it can start at full size immediately, which cuts down on GC overhead during application warm-up and avoids a common source of slow cold-start performance on Azure Container Apps and Azure Spring Apps.

Step 1: Audit Your Container Memory Allocation

Before touching a single JVM flag, you need to know what memory your container actually has. This sounds obvious, but I've seen teams spend hours tuning JVM parameters only to realize their container was misconfigured with 512 MB when the app needed 4 GB.

On Kubernetes (AKS), check your current resource limits:

kubectl describe pod <your-pod-name> -n <namespace>

Look for the Limits section under your container. On Azure Container Apps, open the Azure Portal, navigate to your Container App → Containers → Edit & deploy → select your container image → check CPU and Memory.

Now ask yourself: is this enough for your workload? If you're not sure, Microsoft's official guidance is to start with 4 GB of container memory. That's a good default for most Java container memory allocation decisions, especially for Spring Boot applications. If your application builds large object graphs, processes lots of data in-memory, or caches heavily, you'll need more.

A quick way to check what the JVM actually sees when it starts is to add these diagnostic flags temporarily:

-XX:+PrintFlagsFinal -Xlog:gc*

These will dump GC configuration and memory settings at startup. Look for MaxHeapSize in the output. If it's anywhere near your container's total memory limit, you have a misconfiguration and an OOM crash is just a matter of time.
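The same comparison can be made from inside the running JVM. The sketch below prints the heap ceiling the JVM reports and, where it can find one, the container's memory limit. The two cgroup file paths are the usual cgroup v2 and v1 locations and are assumptions about your base image; the class name is illustrative. Note that Runtime.maxMemory() can report slightly less than MaxHeapSize with some collectors, so treat the percentage as approximate.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch: compare the JVM's heap ceiling with the container memory limit.
class HeapVsLimit {

    // Usual cgroup v2 and v1 locations of the memory limit (assumed paths).
    static final String[] LIMIT_FILES = {
        "/sys/fs/cgroup/memory.max",
        "/sys/fs/cgroup/memory/memory.limit_in_bytes"
    };

    // Returns the limit in bytes, or -1 if no file is readable or the value is "max".
    static long readContainerLimit() {
        for (String file : LIMIT_FILES) {
            try {
                String raw = Files.readString(Path.of(file)).trim();
                if (!raw.equals("max")) {
                    return Long.parseLong(raw);
                }
            } catch (IOException | NumberFormatException e) {
                // File missing or unreadable: try the next candidate.
            }
        }
        return -1;
    }

    // Heap ceiling as a percentage of the container limit.
    static double heapShare(long maxHeapBytes, long limitBytes) {
        return 100.0 * maxHeapBytes / limitBytes;
    }

    public static void main(String[] args) {
        long maxHeap = Runtime.getRuntime().maxMemory();
        long limit = readContainerLimit();
        System.out.printf("JVM max heap: %d MiB%n", maxHeap / (1024 * 1024));
        if (limit > 0) {
            System.out.printf("Container limit: %d MiB (heap is %.0f%% of it)%n",
                    limit / (1024 * 1024), heapShare(maxHeap, limit));
        } else {
            System.out.println("No container memory limit found");
        }
    }
}
```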

For existing applications migrating from on-premises: start with the same memory the app currently uses on its VM or bare-metal host. If the old server had 8 GB allocated to this Java process, start your container at 8 GB. Don't assume smaller is fine just because "containers are lightweight."

When this step is done correctly, you should know exactly how much container memory you have and roughly how much heap the JVM is currently trying to claim.

Step 2: Set JVM Heap Size Using the Right Flags

This is where most of the confusion lives. There are multiple ways to set JVM heap size in containers, and picking the wrong flag, or misreading what a flag actually does, will cause crashes even when your numbers look right on paper.

Here's the breakdown of the flags you'll actually use:

# Fixed maximum heap: use when you know exactly how much you want
-Xmx4g

# Maximum heap as a percentage of container memory: use this in containers (preferred)
-XX:MaxRAMPercentage=75

# Set initial heap to match max (prevents heap resizing at startup)
-XX:InitialRAMPercentage=75

# Initial (and minimum) heap as an absolute value, an alternative to -XX:InitialRAMPercentage
-Xms3g

The one flag I need to specifically call out: -XX:MinRAMPercentage. Despite what the name suggests, this flag does not set the minimum heap size for your container. It sets the default maximum RAM percentage for systems with 256 MB or less of available memory. If you use this flag thinking it controls minimum heap on a normal container, you'll be confused why nothing changes. Use -XX:InitialRAMPercentage for minimum/initial heap instead.

The golden rule for JVM heap size container configuration: never set max heap equal to container memory. The JVM needs memory beyond the heap, for thread stacks, class metadata (Metaspace), JIT compilation buffers, GC internal data, and native libraries. Set the heap to 100% of container memory and you will hit container OOM errors. The container's OOM killer doesn't care about JVM internals; it just sees memory usage exceeding the limit and kills the process.

75% is the recommended allocation. On a 4 GB container that means 3 GB for heap and 1 GB breathing room for everything else. On an 8 GB container, 6 GB for heap. Use -XX:MaxRAMPercentage=75 rather than hardcoded -Xmx values whenever possible. It scales automatically if you change container size later, which saves a configuration headache down the road.
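The arithmetic behind the 75% rule can be sketched in a few lines. The class and method names are illustrative, and the container sizes are just the examples from the text plus one smaller case.

```java
// Sketch: heap ceiling and non-heap headroom for a given container size.
class HeapBudget {

    // Heap ceiling in MiB for a container size and a MaxRAMPercentage value.
    static long heapMiB(long containerMiB, double maxRamPercentage) {
        return (long) (containerMiB * maxRamPercentage / 100.0);
    }

    public static void main(String[] args) {
        long[] containerSizesMiB = {2048, 4096, 8192};
        for (long size : containerSizesMiB) {
            long heap = heapMiB(size, 75.0);
            System.out.printf("container %5d MiB -> heap %5d MiB, headroom %5d MiB%n",
                    size, heap, size - heap);
        }
    }
}
```

For a 4096 MiB container this gives a 3072 MiB heap and 1024 MiB of headroom, matching the 3 GB and 1 GB split described above.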

You'll know this step worked when your pod no longer shows exit code 137 (OOM kill) in the event log and JVM startup logs show a MaxHeapSize that's approximately 75% of your container limit.

Step 3: Choose the Right Garbage Collector for Your Workload

Picking the wrong GC for Java applications in containers is one of the most common, and most impactful, configuration mistakes I see on Azure deployments. The JVM will pick a GC for you if you don't specify one, but its choice is based on heuristics that don't always translate well to containerized environments.

Here's a plain-English breakdown of which GC to use and when:

SerialGC: single-threaded, meant for single-core or small-heap situations (under 4 GB). High pause times. Not great for any production workload that handles concurrent requests. You'd only choose this deliberately for a very constrained sidecar container or a batch job with no latency requirements.

ParallelGC: multi-threaded, great for heaps under 4 GB, minimal overhead. This is the recommended starting point for most Java microservices on Azure. If your app is a REST API, a Spring Boot service, or a standard microservice, start here.

G1GC: designed for heaps of 4 GB and above. More overhead than ParallelGC but better at keeping pause times predictable. If your heap needs to be 4 GB or larger, G1GC is a good choice. Needs at least 2 vCPU cores to function well.

ZGC: available in JDK 17 and later. Sub-millisecond pause times, designed for very large heaps. Best for latency-sensitive workloads like APIs with strict SLAs. More CPU overhead than G1GC. Needs 2+ cores.

ShenandoahGC: available in JDK 11 and later. Pause times under 10 ms, handles medium to large heaps well. Good for request-response and database-heavy workloads where ZGC's JDK 17 requirement is a blocker.

For most teams reading this guide: start with ParallelGC. The flag is:

-XX:+UseParallelGC

If you need G1GC:

-XX:+UseG1GC

For ZGC on JDK 17+:

-XX:+UseZGC

Two cores are required for any GC besides SerialGC. On Kubernetes, make sure your CPU limit (resources.limits.cpu) is at least 2000m before choosing ParallelGC, G1GC, ZGC, or ShenandoahGC. Running a multi-threaded GC on fewer than two vCPU cores is a recipe for CPU throttling and degraded throughput.
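The selection rules above can be written down as a small decision function. This is only a sketch of this guide's rules of thumb, not an official algorithm: the thresholds come from the text, while the class, method, and parameter names are made up for illustration.

```java
// Sketch: the guide's GC rules of thumb as a decision function.
class GcChooser {

    // Returns the JVM flag suggested by the rules of thumb in this guide.
    static String chooseGcFlag(double vcpus, double heapGiB,
                               boolean latencySensitive, int jdkMajor) {
        if (vcpus < 2) {
            return "-XX:+UseSerialGC";       // multi-threaded collectors want 2+ cores
        }
        if (latencySensitive && jdkMajor >= 17) {
            return "-XX:+UseZGC";            // lowest pauses, JDK 17+ per this guide
        }
        if (latencySensitive && jdkMajor >= 11) {
            return "-XX:+UseShenandoahGC";   // low pauses where ZGC is not an option
        }
        // Throughput-oriented default: ParallelGC below 4 GiB of heap, G1GC at or above.
        return heapGiB < 4 ? "-XX:+UseParallelGC" : "-XX:+UseG1GC";
    }

    public static void main(String[] args) {
        System.out.println(chooseGcFlag(2, 3, false, 17));   // typical microservice
        System.out.println(chooseGcFlag(4, 8, false, 17));   // larger heap
        System.out.println(chooseGcFlag(0.5, 1, false, 17)); // constrained sidecar
    }
}
```

Treat the result as a starting point to validate against GC logs, since not every JDK distribution ships every collector.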

You'll know this step worked when GC pause times stabilize in your application logs and you stop seeing unexpected latency spikes correlated with GC events.
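To confirm which collector the JVM actually ended up with, you can ask the standard management beans. The sketch below lists each collector bean with its collection count and accumulated time. The mapping from bean names to collector families reflects the names HotSpot is known to report and is an assumption for other JVMs; the class name is illustrative.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Sketch: report which garbage collector is active and how much it has run.
class ActiveGcReport {

    // Maps HotSpot collector bean names to a collector family (assumed naming).
    static String collectorFamily(String beanName) {
        if (beanName.startsWith("PS ")) return "ParallelGC";
        if (beanName.startsWith("G1 ")) return "G1GC";
        if (beanName.startsWith("ZGC")) return "ZGC";
        if (beanName.startsWith("Shenandoah")) return "ShenandoahGC";
        if (beanName.equals("Copy") || beanName.equals("MarkSweepCompact")) return "SerialGC";
        return "unknown";
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s (%s): %d collections, %d ms total%n",
                    gc.getName(), collectorFamily(gc.getName()),
                    gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
```

If the family printed here is not the one you asked for, the flag is probably not reaching the JVM, or the JVM fell back because of the resources it detected.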

Step 4: Configure CPU Cores and Set Replica Count

Memory gets most of the attention in Java container performance tuning, but CPU configuration is just as important, and just as frequently misconfigured. The JVM uses the number of "available processors" to make decisions about GC thread counts, JIT compiler parallelism, and even the default GC selection. Get this wrong and you'll see CPU throttling, degraded throughput, and erratic behavior.

The official recommendation is to start with 2 vCPU cores. On Kubernetes, that looks like this:

resources:
  requests:
    cpu: "2"
  limits:
    cpu: "2"

On Azure Container Apps, go to your container's resource settings and set CPU to 2.0 cores. On Azure Spring Apps, this is set at the app level under Scale and replicas.

One thing to be aware of: if your container has a CPU quota applied (common in shared AKS clusters), the JVM might still see all the node's CPUs when it counts "available processors." Newer JVM versions (OpenJDK 11+) are container-aware and will use the CPU quota to calculate available processors, but this depends on the JVM being run with container support enabled. Verify your JDK version is 11 or later and that the JVM isn't being launched with -XX:-UseContainerSupport (that flag disables container awareness and causes the JVM to see host CPUs instead of the container quota).

To check container support status, run this command inside the container:

java -XX:+PrintFlagsFinal -version | grep UseContainerSupport

It should show true. If it doesn't, remove any flag that disables it.
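If you would rather check from inside the application, for example in a startup log line, a sketch along these lines should work on HotSpot-based JDKs. It relies on the com.sun.management.HotSpotDiagnosticMXBean API, which is JDK-specific rather than part of the Java SE specification, and the flag only exists on Linux builds, so the code falls back to "unavailable" elsewhere. The class name is illustrative.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Sketch: read UseContainerSupport and the processor count from inside the JVM.
class ContainerSupportCheck {

    // Returns "true" or "false" for UseContainerSupport, or "unavailable" when
    // the JVM does not expose the flag (non-HotSpot JVMs, non-Linux builds).
    static String containerSupport() {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            return bean.getVMOption("UseContainerSupport").getValue();
        } catch (RuntimeException e) {
            return "unavailable";
        }
    }

    public static void main(String[] args) {
        System.out.println("UseContainerSupport = " + containerSupport());
        System.out.println("availableProcessors = "
                + Runtime.getRuntime().availableProcessors());
    }
}
```

With container support enabled, the processor count should reflect the container's CPU limit rather than the node's core count.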

For replicas: Microsoft recommends starting with 2 replicas in any container orchestration environment: AKS, OpenShift, Azure Spring Apps, Azure Container Apps, or Azure App Service. A single replica is a single point of failure. Two gives you redundancy and lets you roll updates without downtime. The full recommended starting configuration is: 2 vCPU cores, 4 GB container memory, 75% heap, ParallelGC, 2 replicas.

After setting this, watch your pod CPU metrics in Azure Monitor for a few minutes of load. If you see CPU consistently throttled (cAdvisor exposes this to Prometheus as container_cpu_cfs_throttled_seconds_total), increase your CPU limit before increasing replica count.

Step 5: Establish a Baseline with Azure Application Insights

Configuration is half the battle. The other half is knowing whether your changes actually made a difference, and whether the resources you've allocated are too much, too little, or just right. An over-provisioned container wastes money. An under-provisioned one degrades performance. Finding the right balance requires a baseline.

Azure Application Insights is the recommended tool for establishing a baseline for containerized Java applications. It gives you visibility into heap usage over time, GC activity, CPU consumption, and request latency: all the signals you need to tune your configuration with data instead of guesswork.

To set up Application Insights for a Java container, use the OpenTelemetry-based auto-instrumentation agent. Download the agent JAR from the Azure portal or Maven Central and add it to your container startup:

-javaagent:/path/to/applicationinsights-agent.jar

Set the connection string as an environment variable in your container:

env:
  - name: APPLICATIONINSIGHTS_CONNECTION_STRING
    value: "InstrumentationKey=your-key-here;IngestionEndpoint=https://..."

Once connected, navigate to Azure Portal → Application Insights → Performance to see request latency and throughput. For JVM-specific metrics, go to Metrics → Add metric and look for metrics under the azure.applicationinsights namespace. You'll find JVM heap used, GC count, GC time, and more.

What you're looking for in the baseline: does heap usage grow unboundedly over time (memory leak) or does it stabilize? Are GC pause times consistent or are there occasional long pauses that correlate with latency spikes? Is CPU usage near the limit (suggesting you need more cores) or near zero (suggesting you're over-provisioned)?

Run your application under realistic load, not just idle, for at least 15–30 minutes before drawing conclusions. JVM JIT compilation and class loading front-load CPU usage at startup, so early metrics won't represent steady-state behavior. Once you have steady-state data, you have a real baseline to compare against when you make future configuration changes.

You'll know this step is working when you can see live heap, CPU, and GC metrics in the Application Insights dashboard and your container is no longer restarting unexpectedly.

Advanced Troubleshooting

If the steps above didn't fully resolve your Java container issues on Azure, or if you're dealing with a more complex deployment (multi-container pods, enterprise AKS clusters with strict quotas, or legacy apps with unusual memory patterns), here's where to dig deeper.

Diagnosing Exit Code 137 (Container OOM Kill)

Exit code 137 means the process received SIGKILL (128 + 9), which in a container almost always comes from the Linux OOM killer. It is not a JVM crash but a hard OS-level termination. This happens when total container memory usage (heap + non-heap + OS overhead) exceeds the container's memory limit. Even if your JVM heap is correctly bounded, non-heap memory (Metaspace, thread stacks, direct byte buffers, native libraries) can push total usage over the limit.

Cap Metaspace growth with:

-XX:MaxMetaspaceSize=256m

Without a limit, Metaspace grows unboundedly on OpenJDK. For most applications, 256m is sufficient. Watch the value in Application Insights or GC logs and adjust upward if you see OutOfMemoryError: Metaspace in the JVM logs.
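To see how much Metaspace and other non-heap memory the JVM is using right now, the standard memory management beans are enough for a first look. In this sketch the pool name "Metaspace" is what HotSpot reports and is an assumption for other JVMs; the class name is illustrative. The non-heap figure covers JVM-managed pools such as Metaspace and the code cache, and does not include thread stacks or native library allocations.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

// Sketch: print heap, non-heap, and Metaspace usage from inside the JVM.
class NonHeapReport {

    static long toMiB(long bytes) {
        return bytes / (1024 * 1024);
    }

    // Bytes currently used by the named memory pool, or -1 if it does not exist.
    static long poolUsed(String poolName) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (pool.getName().equals(poolName) && usage != null) {
                return usage.getUsed();
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        System.out.printf("heap used:     %d MiB%n",
                toMiB(memory.getHeapMemoryUsage().getUsed()));
        System.out.printf("non-heap used: %d MiB%n",
                toMiB(memory.getNonHeapMemoryUsage().getUsed()));
        long metaspace = poolUsed("Metaspace");
        System.out.println(metaspace >= 0
                ? "Metaspace used: " + toMiB(metaspace) + " MiB"
                : "Metaspace pool not reported by this JVM");
    }
}
```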

Analyzing GC Logs for Performance Problems

Enable GC logging to get detailed visibility into garbage collection behavior:

-Xlog:gc*:file=/var/log/app/gc.log:time,uptime:filecount=5,filesize=20m

This writes rotating GC logs to a file. Look for GC pause (G1 Humongous Allocation) entries in G1GC logs. These indicate objects larger than 50% of a G1 region being allocated, which causes stop-the-world pauses. If you see these, either switch to a larger region size (-XX:G1HeapRegionSize=32m) or reconsider whether G1GC is the right collector for your workload.
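The humongous threshold is simple arithmetic: half the region size. A minimal sketch, with illustrative names and an example 3 MiB allocation, shows why a larger region size makes the same object ordinary again.

```java
// Sketch: when does G1 treat an allocation as humongous?
class HumongousCheck {

    // An object is humongous when it is larger than half a G1 region.
    static boolean isHumongous(long objectBytes, long regionBytes) {
        return objectBytes > regionBytes / 2;
    }

    public static void main(String[] args) {
        long objectBytes = 3L * 1024 * 1024;               // e.g. a 3 MiB byte[]
        for (long regionMiB : new long[] {4, 32}) {
            long regionBytes = regionMiB * 1024 * 1024;
            System.out.printf("region %2d MiB -> 3 MiB object humongous? %b%n",
                    regionMiB, isHumongous(objectBytes, regionBytes));
        }
    }
}
```

With 4 MiB regions the 3 MiB object exceeds the 2 MiB threshold, while with 32 MiB regions it is well under the 16 MiB threshold.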

Kubernetes-Specific: CPU Throttling and Quota Mismatch

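On AKS, the kernel throttles a container whenever it tries to use more CPU than its limit allows. (Kubernetes rejects a CPU limit set below the request, so the limit is always the ceiling.) A namespace ResourceQuota or LimitRange can also leave your pods with less CPU than you intended. Throttling causes the JVM's GC threads to run slower than expected, which can make GC pauses appear much longer than they should be. Compare current CPU usage against the limit with the command below; sustained usage at the limit is a strong hint of throttling: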

kubectl top pod <pod-name> -n <namespace>

And verify ResourceQuota with:

kubectl describe resourcequota -n <namespace>
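kubectl top reports usage rather than throttling itself. For a direct reading, the kernel keeps throttling counters in the container's cgroup cpu.stat file. The sketch below parses that file and reports the share of scheduler periods in which the container was throttled. The path is the usual cgroup v2 location and is an assumption about your node image (cgroup v1 keeps the file under a different directory); the class name is illustrative.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

// Sketch: compute the share of CFS periods in which the container was throttled.
class CpuThrottleCheck {

    // Parses "key value" lines, the format of the cgroup cpu.stat file.
    static Map<String, Long> parseCpuStat(String content) {
        Map<String, Long> stats = new HashMap<>();
        for (String line : content.split("\n")) {
            String[] parts = line.trim().split("\\s+");
            if (parts.length != 2) {
                continue;                      // skip blank or malformed lines
            }
            try {
                stats.put(parts[0], Long.parseLong(parts[1]));
            } catch (NumberFormatException e) {
                // Ignore non-numeric values.
            }
        }
        return stats;
    }

    // Percentage of periods that were throttled; 0 if no periods are recorded.
    static double throttledPercent(Map<String, Long> stats) {
        long periods = stats.getOrDefault("nr_periods", 0L);
        long throttled = stats.getOrDefault("nr_throttled", 0L);
        return periods == 0 ? 0.0 : 100.0 * throttled / periods;
    }

    public static void main(String[] args) {
        Path cpuStat = Path.of("/sys/fs/cgroup/cpu.stat");   // assumed cgroup v2 path
        try {
            Map<String, Long> stats = parseCpuStat(Files.readString(cpuStat));
            System.out.printf("throttled in %.1f%% of periods%n", throttledPercent(stats));
        } catch (IOException e) {
            System.out.println("Could not read " + cpuStat + ": " + e.getMessage());
        }
    }
}
```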

Azure Spring Apps and Azure Container Apps: Platform-Level Settings

Both Azure Spring Apps and Azure Container Apps have their own resource configuration UIs that override or interact with what you set in your JVM flags. In Azure Spring Apps, JVM options are set under Apps → [App Name] → Configuration → General settings → JVM options. Make sure platform-level memory settings don't conflict with your -XX:MaxRAMPercentage flag. In Azure Container Apps, scale rules add and remove replicas, and each replica gets the per-container CPU and memory you configured, so verify that allocation is adequate for a single JVM instance.

Existing On-Premises Apps: Don't Downsize Blindly

If you're migrating a Java application from an on-premises server to Azure containers, resist the temptation to cut resources because containers "feel lighter." Start with the same vCPU count and memory the application used on-premises. If the old server ran your app with 4 cores and 16 GB, start your container at 4 vCPUs and 16 GB. Establish a baseline first, then right-size down based on actual observed usage. Cutting resources before you have data is the most common reason I see on-premises Java migrations fail immediately after containerization.

When to Call Microsoft Support
If you've followed all the steps in this guide, your JVM flags are correct, your GC is configured properly, and you're still seeing unexplained container crashes or Java heap corruption errors, it's time to escalate. Microsoft Support can help with Azure-platform-level issues like AKS node memory pressure, Azure Container Apps platform bugs, or Application Insights instrumentation problems that you can't resolve through configuration alone. Open a support ticket at Microsoft Support and bring your GC logs, Application Insights data, and a description of the JVM flags you're using. The more data you bring, the faster they can help.

Prevention & Best Practices

The best Java container performance problem is one you never have. Once you've stabilized your current deployment, here's how to stay out of trouble going forward.

Standardize your JVM flags across teams. The single biggest source of container misconfiguration I see in enterprise environments is inconsistency: different teams using different flags, some with no flags at all, relying on JVM defaults that aren't designed for containers. Establish a standard baseline configuration (-XX:+UseParallelGC -XX:MaxRAMPercentage=75 -XX:InitialRAMPercentage=75) and put it in your base Docker image or a shared Helm chart values file. Make deviation from it require a conscious, documented decision.

Always set both requests and limits to the same value on Kubernetes. Burstable pods (where limits exceed requests) are unpredictable under cluster memory pressure. The node might evict your pod to reclaim resources even when your app is behaving perfectly normally. For Java applications in production on AKS, match requests and limits on both CPU and memory.

Test under realistic load before going to production. A container that passes smoke tests at idle can still be drastically under-resourced once real traffic hits. Run load tests with a tool like Azure Load Testing, target at least 80% of expected peak traffic, and watch heap usage, GC pause frequency, and CPU utilization in Application Insights before you declare the configuration production-ready.

Pin your JDK version and don't rely on the OS default. Container base images sometimes update JDKs in minor releases that change default ergonomics. Pin to a specific JDK version tag in your Dockerfile (e.g., eclipse-temurin:21.0.3, not eclipse-temurin:21) and treat JDK updates as a deliberate, tested change, not a side effect of rebuilding your image.

Monitor Metaspace, not just heap. Most Java container monitoring setups track heap utilization, which is important. But Metaspace, the memory area for class metadata, is separate from heap and has no default limit. In applications that do heavy reflection, proxy generation (Spring, Hibernate), or dynamic class loading, Metaspace can grow to hundreds of megabytes and push total container memory usage over the limit. Always set -XX:MaxMetaspaceSize explicitly and track it in your monitoring.

Quick Wins
  • Add -XX:+UseParallelGC -XX:MaxRAMPercentage=75 -XX:InitialRAMPercentage=75 to every Java container as a starting configuration, not just the ones currently causing problems.
  • Set -XX:MaxMetaspaceSize=256m to prevent unbounded Metaspace growth in Spring-based applications.
  • Use a liveness probe with a generous initial delay (initialDelaySeconds: 60) so Kubernetes doesn't kill your pod during JVM JIT warmup, which can spike CPU usage significantly.
  • Enable Azure Monitor alerts on container restart count: if a pod restarts more than twice in 10 minutes, you want to know before your users do.

Frequently Asked Questions

Why does my Java container keep getting OOM killed even though I set -Xmx correctly?

The most common reason is that -Xmx controls only the heap, not total JVM memory usage. The JVM also needs memory for Metaspace (class metadata), thread stacks (~1 MB per thread), JIT compilation buffers, GC internal structures, and any off-heap allocations your code makes with ByteBuffer.allocateDirect(). If you set -Xmx equal to your container's memory limit, all that non-heap memory pushes total usage over the limit and the container's OOM killer fires. The fix is to cap the heap at 75% of container memory using -XX:MaxRAMPercentage=75, which leaves room for the JVM's overhead. Also add -XX:MaxMetaspaceSize=256m to prevent unbounded Metaspace growth from silently consuming what you thought was free memory.

What is the difference between -XX:MaxRAMPercentage and -XX:MinRAMPercentage?

This one trips up almost everyone. Despite the names, -XX:MinRAMPercentage does not set a minimum heap size. It sets the default maximum RAM percentage for systems with 256 MB or less of available memory. On a normal container with 1 GB or more of RAM, this flag has no effect on heap sizing. To set a minimum or initial heap size as a percentage of available memory, use -XX:InitialRAMPercentage. To set the maximum heap as a percentage, use -XX:MaxRAMPercentage. These flags are available on OpenJDK 11 and later, including all major distributions used with Azure.

Should I use G1GC or ParallelGC for my Java microservice on Azure Container Apps?

Start with ParallelGC for most microservices. It's multi-threaded, has minimal overhead, and performs well on heaps under 4 GB, which covers the majority of containerized Java workloads. G1GC makes more sense when your heap needs to be 4 GB or larger, since G1GC is specifically optimized for larger heaps and provides more predictable pause times at that scale. Both require at least 2 vCPU cores to work properly; running either on a single core or with a very low CPU limit will cause CPU throttling that makes pauses appear longer than they actually are. For latency-sensitive workloads where pause time is measured in single-digit milliseconds, consider ZGC (JDK 17+) or ShenandoahGC (JDK 11+).

How much memory should I give my Java container on Azure if I have no idea where to start?

Start with 4 GB. That's the official Microsoft recommendation for new Java container deployments with no existing baseline data. From there, pair it with 2 vCPU cores, set your JVM heap to 75% with -XX:MaxRAMPercentage=75, use ParallelGC, and deploy 2 replicas. This configuration gives you a stable starting point that works for most general-purpose Java microservices. Then connect Azure Application Insights, run load tests, and watch the heap and CPU metrics over time. If heap usage stays consistently below 50% of max, you can try reducing container memory. If it frequently approaches 90%, increase memory or look for a memory leak.
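The 50% and 90% rules of thumb above can be applied to a series of heap samples exported from your monitoring. In this sketch, "frequently approaches 90%" is interpreted as at least one sample in ten being at or above 90% of max heap, which is an assumption made for illustration, as are the class and method names.

```java
// Sketch: turn heap usage samples into a sizing hint using the guide's thresholds.
class SizingVerdict {

    // heapUsedMiB: steady-state samples of used heap; maxHeapMiB: configured max heap.
    static String verdict(long[] heapUsedMiB, long maxHeapMiB) {
        if (heapUsedMiB.length == 0) {
            return "no data";
        }
        int nearLimit = 0;
        long peak = 0;
        for (long used : heapUsedMiB) {
            peak = Math.max(peak, used);
            if (used * 10 >= maxHeapMiB * 9) {   // sample at or above 90% of max heap
                nearLimit++;
            }
        }
        if (nearLimit * 10 >= heapUsedMiB.length) {   // assumed meaning of "frequently"
            return "increase memory or check for a leak";
        }
        if (peak * 2 < maxHeapMiB) {                  // always below 50% of max heap
            return "try reducing container memory";
        }
        return "keep current size";
    }

    public static void main(String[] args) {
        long maxHeapMiB = 3072;   // 75% of a 4 GiB container
        System.out.println(verdict(new long[] {1000, 1200, 1100}, maxHeapMiB));
        System.out.println(verdict(new long[] {2800, 2900, 1500}, maxHeapMiB));
        System.out.println(verdict(new long[] {1600, 2000, 1800}, maxHeapMiB));
    }
}
```

Feed it samples taken after warm-up, since startup behavior does not represent steady state.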

My Java app worked fine on-premises but crashes immediately in a container. What changed?

Almost certainly the JVM is misconfiguring itself based on the host machine's resources instead of the container's limits. On a bare-metal or VM deployment, the JVM can see all available RAM and CPUs and sets its defaults accordingly, often claiming 25% of system RAM for the heap, which on a 32 GB server is 8 GB. When that same JVM starts inside a container with a 2 GB memory limit but no explicit flags, it still tries to claim 8 GB, the container OOM killer fires, and the process dies. The fix is to set explicit JVM flags (-XX:MaxRAMPercentage=75) that bound the heap to what the container actually has. For existing on-premises apps, also start by matching the container's CPU and memory to what the app used on-premises, then right-size down once you have baseline data.

What is the next step after configuring my Java container's JVM settings?

After getting your JVM flags right, the next step is establishing a baseline: measuring what your application actually does under load so you can make informed decisions about resourcing and future tuning. Microsoft recommends using Azure Application Insights with the OpenTelemetry-based Java auto-instrumentation agent for this. Add the agent JAR to your container startup with -javaagent:/path/to/applicationinsights-agent.jar and set your connection string as an environment variable. Run your app under realistic load for 15–30 minutes and watch heap usage, GC metrics, and CPU utilization. The data you collect becomes your baseline; any future configuration change can be compared against it to confirm whether it helped, hurt, or had no effect.

Sai Kiran Pandrala
Our team includes certified Microsoft engineers, Azure architects, and system administrators with 10+ years of enterprise IT experience. Every guide is written from hands-on troubleshooting, not guesswork. We test every fix before publishing.