Vanilla Java is a series of posts advocating pure Java first, then purpose-built utilities on top of the JDK, and third-party dependencies only as a last resort.
Be lean: why binary footprint matters
Starting a greenfield Java project? Whether it is a web application or a RESTful service, perhaps deployed as a microservice, the initial steps usually involve:
- firing up your IDE of choice or a terminal
- stepping through a Maven / Gradle / J2EE setup wizard or a framework-specific tool, such as Initializr for a Spring Boot application
- running the resulting jar in case of an embedded server, or deploying the war into a container (Apache Tomcat or Eclipse Jetty), potentially an application server (Wildfly or Glassfish)
A minimal Spring Boot + Maven project generated by Initializr binds to port 8080 and listens for HTTP requests, responding to each with a 404 (there are no predefined endpoints):
# uninteresting fragments such as mvn scripts and transitive build artifacts were removed
spring-boot-sample $ tree
├── pom.xml
├── src
│   ├── main
│   │   ├── java
│   │   │   └── eu
│   │   │       └── freshmen
│   │   │           └── tomcat
│   │   │               ├── ServletInitializer.java
│   │   │               └── TomcatApplication.java
│   │   └── resources
│   │       ├── application.properties
│   │       ├── static
│   │       └── templates
│   └── test
│       └── java
│           └── eu
│               └── freshmen
│                   └── tomcat
│                       └── TomcatApplicationTests.java
└── target
    └── tomcat-0.0.1-SNAPSHOT.war
15 directories, 6 files
spring-boot-sample $ du -sh .
18M
apache-tomcat-10.1.5 $ du -sh .
18M
That totals 36MB for a starter application deployed into a standalone Apache Tomcat. Although this number includes a JSON parser, logging libraries, etc., the stub is only going to grow as more features are added. It is therefore crucial to minimize its size right from the get-go.
In today's world of horizontally scaled microservices, a medium-sized business commonly deploys hundreds of application instances. Any unnecessary bloat should be considered simply unacceptable. My main reasons:
- Slower builds, meaning lower productivity.
- Slower deployments due to more data being copied, spending more time in transport and increasing network congestion.
- Larger disk space overhead on each host that needs to hold a copy of the final image. This is true for both the container and the non-container world.
- Larger memory footprint, since the JVM needs to load and hold the extra libraries in memory, thus reducing the useful part of RAM. This is even more apparent in co-hosted environments, where many instances run on the same physical host.
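The memory point can be observed from inside a running JVM. The following is a minimal sketch, not a replacement for measuring resident memory with an external tool: the class name FootprintProbe is hypothetical, and the heap and non-heap figures reported by the standard java.lang.management beans are only a rough proxy for what the operating system reports as resident memory.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

// Hypothetical helper (not part of the original sample apps): reports how many
// classes the JVM has loaded and how much heap + non-heap memory it uses.
public class FootprintProbe {

    // Number of classes currently loaded by this JVM.
    static int loadedClassCount() {
        return ManagementFactory.getClassLoadingMXBean().getLoadedClassCount();
    }

    // Heap plus non-heap (metaspace, code cache, ...) memory currently in use, in bytes.
    static long usedBytes() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        return memory.getHeapMemoryUsage().getUsed()
                + memory.getNonHeapMemoryUsage().getUsed();
    }

    public static void main(String[] args) {
        System.out.println("Loaded classes: " + loadedClassCount());
        System.out.println("Used heap + non-heap: " + usedBytes() / 1024 + " KiB");
    }
}
```

Calling the two methods at the end of application start-up, once with and once without a given library on the classpath, gives a first impression of what that library costs before any request is served.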
Have you met com.sun.net.httpserver.HttpServer?
Yes, com.sun.net.httpserver is a perfectly accessible package: since Java 9 it is exported by the jdk.httpserver module. In contrast, sun.* packages are considered internal.
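To make this concrete, here is a minimal sketch of a server built on that package. It is not the code of the sample application: the class name MinimalServer and the start helper are hypothetical. It relies on the JDK behavior that a request matching no registered context is answered with a 404.

```java
import com.sun.net.httpserver.HttpServer;

import java.io.IOException;
import java.net.InetSocketAddress;

// Hypothetical minimal server. In a modular build, module-info.java
// would additionally need: requires jdk.httpserver;
public class MinimalServer {

    // Binds to the given port (0 picks a free one) and starts serving.
    static HttpServer start(int port) throws IOException {
        // A backlog of 0 lets the JDK choose its default.
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        // No createContext(...) call: with no context registered,
        // every request is answered with a 404.
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = start(8080);
        System.out.println("Listening on port " + server.getAddress().getPort());
    }
}
```

Endpoints are added by calling createContext with a path and an HttpHandler before start; the sketch leaves them out on purpose to stay comparable with an empty starter application.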
I only learned about the JDK's built-in HttpServer in 2013, while interviewing for a gaming company in Stockholm. Since then I have successfully used HttpServer to deploy lean services serving thousands of TPS in production. Here is an equivalent starter application built with vanilla Java and HttpServer. Similarly, it has no default endpoints and simply responds with a 404 to any incoming HTTP request:
vanilla-java-serdi $ tree
├── pom.xml
├── src
│   ├── main
│   │   ├── java
│   │   │   ├── eu
│   │   │   │   └── freshmen
│   │   │   │       └── srdi
│   │   │   │           ├── Application.java
│   │   │   │           └── sender
│   │   │   │               ├── Logger.java
│   │   │   │               ├── Properties.java
│   │   │   │               └── Server.java
│   │   │   └── module-info.java
│   │   └── resources
│   │       └── eu
│   │           └── freshmen
│   │               └── srdi
│   │                   └── application.properties
│   └── test
│       └── java
│           └── eu
│               └── freshmen
│                   └── srdi
│                       └── ApplicationTest.java
└── target
    └── sender-receiver-dependency-injection-1.0-SNAPSHOT.jar
17 directories, 9 files
# no third-party dependencies
vanilla-java-serdi $ du -sh .
140K
# add GSON for JSON parsing and logging via logback + slf4j
vanilla-java-serdi $ du -sh .
1.3M .
140KB without any third-party dependencies: that is roughly 260x smaller than the 36MB Spring Boot + Apache Tomcat sample app! To make the comparison fair, add logging via logback + slf4j and JSON parsing via GSON. This brings the vanilla sample app to 1.3MB, a less dramatic but still significant difference of roughly 28x.
Naive benchmark
The benchmark uses ab (ApacheBench, Version 2.3 <$Revision: 1901567 $>). Each application first gets 3 warm-up rounds of 10 thousand HTTP requests with concurrency 10, followed by the measured run of 1 million HTTP requests with concurrency 20. While serving a 404 page is not very useful in real life, the benchmark shows a baseline mean TPS and the resident memory consumption as measured by top. Both applications (downloadable as .zip from attachments) ran on Gentoo Linux, Kernel 5.15.80-gentoo-x86_64, on an Intel(R) Core(TM) i7-8650U CPU with 16GB RAM.
Both applications were left on their default settings, running on JDK 17 with default options.
# send 1 million HTTP requests with concurrency 20: Spring Boot + Tomcat starter app
$ ab -n 1000000 -c 20 http://localhost:8080/
...
Server Software:
Server Hostname: localhost
Server Port: 8080
Document Path: /
Document Length: 89 bytes
Concurrency Level: 20
Time taken for tests: 57.752 seconds
Complete requests: 1000000
Failed requests: 0
Non-2xx responses: 1000000
Total transferred: 283000000 bytes
HTML transferred: 89000000 bytes
Requests per second: 17315.32 [#/sec] (mean)
Time per request: 1.155 [ms] (mean)
Time per request: 0.058 [ms] (mean, across all concurrent requests)
Transfer rate: 4785.39 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.1 0 6
Processing: 0 1 0.7 1 29
Waiting: 0 1 0.6 1 29
Total: 0 1 0.6 1 29
Percentage of the requests served within a certain time (ms)
50% 1
66% 1
75% 1
80% 1
90% 2
95% 2
98% 3
99% 4
100% 29 (longest request)
# take a single sample by using top mid-run
$ top -p 18840 -n 1
Tasks: 1 total, 0 running, 1 sleeping, 0 stopped, 0 zombie
%Cpu(s): 39.7 us, 27.6 sy, 0.0 ni, 24.1 id, 0.0 wa, 0.0 hi, 8.6 si, 0.0 st
MiB Mem : 15885.3 total, 774.6 free, 9342.1 used, 5768.6 buff/cache
MiB Swap: 16384.0 total, 16032.6 free, 351.4 used. 4233.6 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
18840 vega 20 0 7849728 388052 27868 S 180.0 2.4 0:55.65 java
# send 1 million HTTP requests with concurrency 20: vanilla Java
$ ab -n 1000000 -c 20 http://localhost:8080/
...
Server Software:
Server Hostname: localhost
Server Port: 8080
Document Path: /
Document Length: 50 bytes
Concurrency Level: 20
Time taken for tests: 40.778 seconds
Complete requests: 1000000
Failed requests: 0
Non-2xx responses: 1000000
Total transferred: 121000000 bytes
HTML transferred: 50000000 bytes
Requests per second: 24522.88 [#/sec] (mean)
Time per request: 0.816 [ms] (mean)
Time per request: 0.041 [ms] (mean, across all concurrent requests)
Transfer rate: 2897.72 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.2 0 6
Processing: 0 0 0.3 0 12
Waiting: 0 0 0.3 0 11
Total: 0 1 0.5 1 12
Percentage of the requests served within a certain time (ms)
50% 1
66% 1
75% 1
80% 1
90% 1
95% 1
98% 2
99% 2
100% 12 (longest request)
# take a single sample by using top mid-run
$ top -p 30062 -n 1
Tasks: 1 total, 0 running, 1 sleeping, 0 stopped, 0 zombie
%Cpu(s): 14.5 us, 27.4 sy, 0.0 ni, 48.4 id, 0.0 wa, 0.0 hi, 9.7 si, 0.0 st
MiB Mem : 15885.3 total, 921.5 free, 9159.0 used, 5804.7 buff/cache
MiB Swap: 16384.0 total, 16032.6 free, 351.4 used. 4377.2 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
30062 vega 20 0 7331512 222160 27600 S 73.3 1.4 0:30.33 java
Table 2: Naive benchmark highlights
| Starter application name | Spring Boot + Apache Tomcat | Vanilla Java |
|---|---|---|
| CPU% (lower is better) | 180 | 73.3 |
| Resident memory MB (lower is better) | 388 | 222 |
| TPS average (higher is better) | 17315 | 24523 |
| Percentile where median latency doubles (higher is better) | 90 | 98 |
| 100th percentile multiple of median latency (lower is better) | 29 | 12 |
Vanilla Java is both faster and less resource hungry than the Spring Boot + Apache Tomcat starter app. Some of the performance difference may be explained by the larger payload of the 404 page on Spring Boot + Apache Tomcat.
Conclusion
In the resource-conscious world of lean services, think vanilla Java first. Your reward will be a pleasant 140KB application stub that is light on CPU, RAM and disk space.
Beware of frameworks that promise to boost your productivity. The promise rarely holds and their resource cost is too high. Only opt for a Servlet API container or a full-blown J2EE application server if you truly must (e.g. when you are already stuck with a framework that requires it). Prefer an embedded container over a standalone one.
Measure everything - disk space, RAM consumption, CPU consumption, throughput and latency (at least 50th, 90th and 99th percentile).
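For latency, the percentiles are easy to compute yourself once you have raw samples. The following is a minimal sketch using the nearest-rank method; the class name LatencyPercentiles and the sample values are made up for illustration, and ab may compute its own percentile table slightly differently.

```java
import java.util.Arrays;

// Hypothetical helper: nearest-rank percentiles over latency samples.
public class LatencyPercentiles {

    // Returns the smallest sample such that at least p percent of all
    // samples are less than or equal to it. The input must be sorted ascending.
    static long percentile(long[] sortedAscending, int p) {
        if (sortedAscending.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (p < 1 || p > 100) {
            throw new IllegalArgumentException("p must be between 1 and 100");
        }
        // Integer ceiling of p * n / 100, avoiding floating point rounding.
        int rank = (int) (((long) p * sortedAscending.length + 99) / 100);
        return sortedAscending[rank - 1];
    }

    public static void main(String[] args) {
        // Made-up latency samples in microseconds, for illustration only.
        long[] latenciesMicros = {900, 1100, 800, 29000, 1000, 1200, 950, 2100, 1050, 3900};
        Arrays.sort(latenciesMicros);
        for (int p : new int[] {50, 90, 99}) {
            System.out.println("p" + p + " = " + percentile(latenciesMicros, p) + " us");
        }
    }
}
```

Reporting the 99th percentile next to the median shows why a mean alone is not enough: a single slow outlier barely moves the median but dominates the tail.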
Sources
- com.sun.net.httpserver API doc
- Stack Overflow: Simple HTTP server in Java only using Java SE API
- Stack Overflow: Why is the Java 11 base Docker image so large?
- Spring Initializr reference guide
- Spring Boot quick start
- ab - Apache HTTP server benchmarking tool