```
$ # Stop the container
$ catalina stop
Using CATALINA_BASE: /usr/local/Cellar/tomcat/7.0.50/libexec
...
$ # Remove old installation
$ rm -rf tomcat/webapps/myproject*
$
$ # Copy artifact
$ cp target/myproject-1.0-SNAPSHOT.war tomcat/webapps
$
$ # Restart container
$ catalina start
Using CATALINA_BASE: /usr/local/Cellar/tomcat/7.0.50/libexec
....
INFO: Server startup in 3040 ms
```

This is how you typically deploy and start a Java program, the Java Enterprise way. The details vary with your choice of application container, but the general complication remains.
```
$ catalina stop
Using CATALINA_BASE: /usr/local/Cellar/tomcat/7.0.50/libexec
...
INFO: Stopping service Catalina
INFO  - Application - [wicket.myproject] destroy: Wicket co
feb 18, 2014 6:52:24 PM org.apache.coyote.AbstractProtocol stop
INFO: Stopping ProtocolHandler ["http-bio-8080"]
feb 18, 2014 6:52:24 PM org.apache.coyote.AbstractProtocol stop
INFO: Stopping ProtocolHandler ["ajp-bio-8009"]
feb 18, 2014 6:52:24 PM org.apache.coyote.AbstractProtocol destroy
INFO: Destroying ProtocolHandler ["http-bio-8080"]
feb 18, 2014 6:52:24 PM org.apache.coyote.AbstractProtocol destroy
INFO: Destroying ProtocolHandler ["ajp-bio-8009"]
$
```

If you use an application container, you have to negotiate with the container how to stop the application. With Tomcat, you have to send a command over an unsecured TCP connection to a control port. The catalina utility shown above does just that. Then you hope that Tomcat really quits.
```
$ java -jar foo.jar --config-file foo.properties &
[1] 97807
$
$ vi foo.properties
$ kill -HUP 97807
$ INFO: Reloading config from /x/y/z/foo.properties
```

In the Unix world, when you start an application you may tell it where to find its configuration file, or you may rely on a conventional path where the file is expected to be. When you want to change the configuration, you edit the file and then send a "hang-up" signal (SIGHUP) to the process. By convention, the application knows that it should then re-read its configuration file.
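Java has no standard API for Unix signals, but the JDK-internal `sun.misc.Signal` class can be used to the same effect. Here is a minimal sketch, assuming a `Config` class, an `Application` class and a property file path that are all placeholders of mine, not part of any real program:

```java
import sun.misc.Signal;

public class Main {
    public static void main(String[] args) throws Exception {
        // Config and Application are placeholder application classes;
        // the property file path is illustrative.
        Config config = Config.load(args.length > 0 ? args[0] : "foo.properties");

        // Re-read the configuration whenever the process receives SIGHUP.
        Signal.handle(new Signal("HUP"), signal -> {
            System.out.println("INFO: Reloading config");
            config.reload();
        });

        new Application(config).run();
    }
}
```

Because `sun.misc.Signal` is an internal API, a production program might instead watch the file for changes; the point is only that the Unix convention is easy to honour from Java.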
```
$ # Recompile and re-package
$ mvn package
[INFO] Scanning for projects...
...
[INFO] Total time: 4.619s
$ # Re-deploy
$ catalina stop
....
$ rm -rf tomcat/webapps/myproject*
$ cp target/myproject-1.0-SNAPSHOT.war tomcat/webapps
$ catalina start
....
INFO: Server startup in 3040 ms
```

In the Java Enterprise world, you change configuration by editing one of the many configuration files inside the application source tree; then you recompile and repackage the application; then you stop it, redeploy it, and restart it. I'm not joking.
A team that does all 12 practices of Extreme Programming will release very few programming errors. All the same, we should assume that there will always be programming errors, and therefore design the application architecture so that the damage a programming error can do is limited.
The container hides the O.S.

The original idea of Java was to hide the O.S. from application code: everything the running app needs from its environment is provided by the Java Virtual Machine. So far, so good. Then the Java Enterprise standard says that starting, stopping, and running the application should also happen inside an "application container". You avoid depending on variations in the O.S., but you now depend on variations in the application container. Shielding the app from the O.S. makes sense when you want to distribute a desktop application to a variety of operating systems, but there is no value in shielding a custom server application from the O.S. On the server there is no need to accommodate O.S. variations: you choose the O.S. and you usually don't have to change it. It is to your advantage to exploit the O.S. for all the services it can provide. You are not going to swap Linux for Windows anyway, are you? :o) Apart from all this, what I find completely unacceptable is the idea that different applications should run inside the same container, which means inside the same O.S. process. What can go wrong?
What can go wrong is that a programming error in one app can bring down all the other apps. An application can start consuming all the CPU, all the memory, all the file descriptors, or some other resource, and there is NO WAY the application server can contain the damage to a single app, because the various apps run as threads of the same O.S. process.
The only way to contain damage to a single app is to have each app run in a separate O.S. process. The O.S. process is a very powerful tool: the O.S. guarantees isolation. You can limit resource consumption per process in an extremely robust way with standard Unix mechanisms such as nice and rlimit (exposed in the shell as ulimit).
One thread per request

Still, the JEE way suggests that we run our application in one large, monolithic JVM process, with every client request served in a separate thread. What can go wrong?
Again, there is the possibility of programming errors: one thread can start consuming the whole capacity of some resource and bring down the entire application.
The Unix solution is to partition the application into many instances, each running in a separate process. A programming error in one instance then reduces the capacity of the application, but cannot shut down the service completely. Another advantage of many JVM processes is that each process is smaller. A full garbage collection stops the Java process completely, and this is unavoidable; but if you reduce the size of the JVM process you can bring the duration of a full GC from several seconds down to hundreds of milliseconds. By periodically restarting the instances you can even make it unlikely that a full GC will ever happen.
crashes, it’s restarted 17 The Apache Httpd preforking architecture is a good example of a robust architecture. There is a highly secure supervisor process that monitors many worker processes, where application code is run. Whenever a worker crashes, it’s immediately restarted, with little or no downtime perceived by the clients. For added robustness, an Apache worker will kill itself after serving 10.000 requests or so. This makes it unlikely that a resource leak will cause instability.
```ruby
# Ruby
loop do
  Process.waitpid                  # block until any child terminates
  if $?.exited? || $?.signaled?    # clean exit or crash: same treatment
    start_new_instance
  end
end
```

```c
/* C (needs <sys/wait.h> and <errno.h>) */
for (;;) {
    pid_t pid = waitpid(-1, NULL, 0);   /* block until any child terminates */
    if (pid == -1) {
        if (errno == ECHILD) break;     /* no children left to supervise */
        continue;                       /* e.g. interrupted by a signal */
    }
    start_new_instance();               /* clean exit or crash: restart it */
}
```

The core of the monitor process is a very simple loop. The waitpid Unix system call returns whenever one child process exits, whether intentionally or as the result of a crash. By making crashes and clean exits work the same way, we simplify the application architecture.
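The same idea can be sketched in plain Java, which has no waitpid but can supervise child processes through ProcessBuilder. A minimal sketch, where the worker command `java -jar worker.jar` is a placeholder of mine:

```java
import java.io.IOException;

public class Monitor {
    public static void main(String[] args) throws IOException, InterruptedException {
        while (true) {
            // Start (or restart) one worker instance; the command is a placeholder.
            Process worker = new ProcessBuilder("java", "-jar", "worker.jar")
                    .inheritIO()
                    .start();

            // Block until the worker terminates, cleanly or by crash,
            // then fall through and start a fresh instance.
            int exitCode = worker.waitFor();
            System.err.println("worker exited with status " + exitCode + ", restarting");
        }
    }
}
```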
Escaping the tyranny of the container

```java
public class Main {
    public static void main(String[] args) throws Exception {
        ReusableJettyApp app = new ReusableJettyApp(
            new MyAppServlet(new AppStorage()));
        app.start(8080, "src/main/webapp");
    }
}
```

It's very easy to let go of the container. Jetty, for instance, can be used as an embedded web server. ReusableJettyApp is a 100-line class that captures the way I like to use Jetty, and MyAppServlet calls my application code. This main can be started from the command line like any ordinary Unix program. (Sample implementation here: https://github.com/xpmatteo/scopa)
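A class of that kind needs very little code. Purely as an illustration, here is a sketch against the Jetty 9 embedded API; the details are a sketch of mine and will differ from the actual ReusableJettyApp in the repository:

```java
import javax.servlet.http.HttpServlet;
import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.servlet.ServletContextHandler;
import org.eclipse.jetty.servlet.ServletHolder;

// Sketch of an embedded-Jetty wrapper in the spirit of ReusableJettyApp.
public class ReusableJettyApp {
    private final HttpServlet servlet;
    private Server server;

    public ReusableJettyApp(HttpServlet servlet) {
        this.servlet = servlet;
    }

    public void start(int port, String resourceBase) throws Exception {
        server = new Server(port);
        ServletContextHandler context = new ServletContextHandler();
        context.setResourceBase(resourceBase);                 // base directory for the webapp's resources
        context.addServlet(new ServletHolder(servlet), "/*");  // route all requests to the app servlet
        server.setHandler(context);
        server.start();
    }

    public void stop() throws Exception {
        server.stop();
    }
}
```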
There are other options. First of all, remember that DB servers always cache results in RAM: make sure your DB servers have enough RAM to hold all the data the application currently uses. Another option is to cache web pages in a fast HTTP proxy such as Varnish, which speeds up the app with no added complication. Yet another option is to keep caches in dedicated external servers running Memcached.
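Talking to Memcached from Java takes only a few lines with a client library. A minimal sketch, assuming the spymemcached client and a server on localhost; the host, port, key and expiry are illustrative values:

```java
import java.net.InetSocketAddress;
import net.spy.memcached.MemcachedClient;

public class CacheExample {
    public static void main(String[] args) throws Exception {
        // Connect to a Memcached server; host and port are illustrative.
        MemcachedClient cache = new MemcachedClient(new InetSocketAddress("localhost", 11211));

        // Cache a rendered page fragment for 10 minutes (600 seconds).
        cache.set("page:/products/42", 600, "<html>...rendered page...</html>");

        // On the next request, try the cache before hitting the database.
        Object page = cache.get("page:/products/42");
        System.out.println(page != null ? "cache hit" : "cache miss");

        cache.shutdown();
    }
}
```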
Too often, programmers save lots of data in web sessions, and the container often keeps those sessions inside the JVM process. This makes the JVM bigger, forces you to configure session affinity ("sticky sessions") in the load balancer, and at the same time makes it impossible to shut down one server without killing the sessions being served on that server.
```java
public interface Manager {
    public Container getContainer();
    public void setContainer(Container container);
    public boolean getDistributable();
    public void setDistributable(boolean distributable);
    public String getInfo();
    public int getMaxInactiveInterval();
    public void setMaxInactiveInterval(int interval);
    public int getSessionIdLength();
    public void setSessionIdLength(int idLength);
    public long getSessionCounter();
    public void setSessionCounter(long sessionCounter);
    public int getMaxActive();
    public void setMaxActive(int maxActive);
    public int getActiveSessions();
    public long getExpiredSessions();
    public void setExpiredSessions(long expiredSessions);
    public int getRejectedSessions();
    public int getSessionMaxAliveTime();
    public void setSessionMaxAliveTime(int sessionMaxAliveTime);
    public int getSessionAverageAliveTime();
    public int getSessionCreateRate();
    public int getSessionExpireRate();
    public void add(Session session);
    public void addPropertyChangeListener(PropertyChangeListener listener);
    public void changeSessionId(Session session);
    public Session createEmptySession();
    public Session createSession(String sessionId);
    public Session findSession(String id) throws IOException;
    public Session[] findSessions();
    public void load() throws ClassNotFoundException, IOException;
    public void remove(Session session);
    public void remove(Session session, boolean update);
    public void removePropertyChangeListener(PropertyChangeListener listener);
    public void unload() throws IOException;
    public void backgroundProcess();
}
```

You might think of solving these problems by implementing a custom session manager. This is the interface that Tomcat would like you to implement.
Programming like it's 1975

One reason for that complicated interface is that Tomcat likes to distinguish between sessions that are "active", that is "in RAM", and sessions that are "inactive", or "on disk". The problem with this idea is that since virtual memory became commonplace in the late seventies, there is no difference between "RAM memory" and "disk memory". Operating systems offer just one kind of memory, and that is virtual memory: the O.S. decides which pieces of data are resident in RAM and which are not, and it does that very efficiently.
The big #fail of Tomcat's session management is that it implies a passivation loop like the sketch below: the intent is to save to slow memory the sessions that have been unused for a long time. What really happens is different. Suppose a session has indeed been unused for a long time. Chances are the O.S. has already purged the RAM pages holding that session and saved them to disk. Now we access the session to ask how long ago it was last used. The virtual memory system sees that we are touching a page that is not resident and raises a page fault; the O.S. repairs it by allocating a fresh memory page and loading its contents back from disk. The application then resumes and discovers that, indeed, the session has been idle for too long. So now we ask the session manager to save it to disk!
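The original slide showed pseudocode for this; as a stand-in, here is a hedged Java-flavoured sketch of the passivation logic just described. The names `activeSessions`, `maxIdleMillis` and `passivateToDisk` are mine, not Tomcat's:

```java
// Illustrative sketch of the passivation idea, not Tomcat's actual code.
void passivateIdleSessions(Collection<Session> activeSessions, long maxIdleMillis) {
    for (Session session : activeSessions) {
        // Just reading the last-accessed time may touch a page that the O.S.
        // had already swapped out, forcing it back into RAM via a page fault...
        long idleMillis = System.currentTimeMillis() - session.getLastAccessedTime();
        if (idleMillis > maxIdleMillis) {
            // ...only so that we can write the very same data back to disk ourselves.
            passivateToDisk(session);
        }
    }
}
```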
```java
interface Manager {
    Session createSession();
    Session findSession(String sessionId);
}
```

Now, one of the reasons Tomcat does that complicated dance is that it tries to avoid using too much JVM memory for sessions. But really, it should be far simpler than that: a session manager only needs to implement these two methods. For a more extended analysis of the flaws in Tomcat's session management, see my article: http://matteo.vaccari.name/blog/archives/650 The expression "1975 programming" comes from the Varnish "Architect Notes", a valuable article: https://www.varnish-cache.org/trac/wiki/ArchitectNotes
Session storage

A reasonable session manager would store sessions in the database. This makes it very easy to share sessions between all servers, and recently-used sessions are automatically cached in the DB server's RAM. Of course, you wouldn't save lots of data in the web session, would you? That would be very bad REST karma! A good rule of thumb is that a session should hold only the logged-in user id and nothing else; everything else should be stored as application state in the DB, or as conversation state in the hypertext. Remember HATEOAS! I don't mean to say that Tomcat specifically is crap: for all its limitations, Tomcat is a robust server and will handle a very heavy load. It's just not very sophisticated.
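A minimal sketch of such a database-backed manager, implementing the two-method interface above. The `sessions` table, its columns, and the `Session` constructor used here are assumptions for illustration, not code from any real project:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.UUID;
import javax.sql.DataSource;

// Sketch only: each session is a row in a hypothetical sessions(id, user_id)
// table, so any server behind the load balancer can find it.
class DatabaseSessionManager implements Manager {
    private final DataSource dataSource;

    DatabaseSessionManager(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    public Session createSession() {
        String id = UUID.randomUUID().toString();
        try (Connection connection = dataSource.getConnection();
             PreparedStatement statement = connection.prepareStatement(
                     "insert into sessions (id, user_id) values (?, null)")) {
            statement.setString(1, id);
            statement.executeUpdate();
        } catch (SQLException e) {
            throw new RuntimeException(e);
        }
        return new Session(id, null);
    }

    public Session findSession(String sessionId) {
        try (Connection connection = dataSource.getConnection();
             PreparedStatement statement = connection.prepareStatement(
                     "select user_id from sessions where id = ?")) {
            statement.setString(1, sessionId);
            try (ResultSet rows = statement.executeQuery()) {
                return rows.next() ? new Session(sessionId, rows.getString("user_id")) : null;
            }
        } catch (SQLException e) {
            throw new RuntimeException(e);
        }
    }
}
```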
Monolithitis Gravis

We've seen that the "application container" idea has a number of problems. The biggest problem, in my opinion, is that it promotes a monolithic style of programming. Monolithic is bad! Good design gives you many boxes, each performing one service independently. I once saw an industrial control application where many separate concerns were addressed in a single Java web application. That architecture is fragile: a memory leak in the SCADA part should not impact real-time device controls. The correct architecture would allocate a separate O.S. process to each of these concerns. It's not that the designers of this system were naive; they were very experienced Java developers. But the Java "container" culture led them to throw many applications into one container.
Do one thing and do it well. New (non-functional) requirements are met by adding new boxes, not by modifying existing boxes.

These are two fundamental principles of good software design, reframed here at the system level. For "box" you may read "server" or "process".