The UNIX Way vs. the Java Enterprise Edition Way

My presentation at the Incontro Devops Italia 2014

How to write robust applications in Java and other languages, avoiding the pitfalls of JEE and similar "enterprisey" ideas.

With explicit annotations to explain what the slides mean.

Matteo Vaccari

February 21, 2014

Transcript

  1. Matteo Vaccari
    http://matteo.vaccari.name/
    [email protected]
    @xpmatteo
    Agile Coach Camp
    Italy 2012
    The UNIX Way
    Vs the Java Enterprise Edition Way
    1

  2. Warning!
    Opinionated developer
    Interactions welcome
    2
    The goal of this talk is to share my observations about what works and what doesn’t work in
    application architecture.

  3. Starting, stopping,
    changing configuration
    Appetizer
    3

  4. $ # Starting a process the UNIX way
    $ java -jar foo.jar
    4
    This is how you’d start a Java program the Unix way.

  5. $ # Starting a process the JEE way
    $
    $ # Stop the container
    $ catalina stop
    Using CATALINA_BASE: /usr/local/Cellar/tomcat/7.0.50/libexec
    ...
    $ # Remove old installation
    $ rm -rf tomcat/webapps/myproject*
    $
    $ # Copy artifact
    $ cp target/myproject-1.0-SNAPSHOT.war tomcat/webapps
    $
    $ # Restart container
    $ catalina start
    Using CATALINA_BASE: /usr/local/Cellar/tomcat/7.0.50/libexec
    ....
    INFO: Server startup in 3040 ms
    5
    This is how you typically start a Java program the Java Enterprise way. Details will vary by your
    choice of application container but the general complication remains.

  6. $ # Stopping a process the UNIX way
    $ java -jar foo.jar &
    [1] 97402
    $
    $ kill 97402
    $
    [1]+ Exit 143 java -jar foo.jar
    $
    6
    I want to be able to stop an application by sending it a TERM signal.
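On the JVM, a TERM signal starts an orderly shutdown, and any thread registered with `Runtime.addShutdownHook` runs before the process exits. The following is a minimal sketch of how an application might use that to stop cleanly; `GracefulApp` and its methods are hypothetical names, not code from the talk.

```java
// Hypothetical example class; a sketch, not the talk's implementation.
class GracefulApp {
    private volatile boolean running = true;

    // Cleanup work goes here: stop accepting requests, flush buffers, close files.
    void shutdown() {
        running = false;
    }

    boolean isRunning() {
        return running;
    }

    // Ask the JVM to call shutdown() when the process receives TERM (a plain `kill`).
    Thread installShutdownHook() {
        Thread hook = new Thread(this::shutdown, "shutdown-hook");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    public static void main(String[] args) throws InterruptedException {
        GracefulApp app = new GracefulApp();
        app.installShutdownHook();
        System.out.println("Running as pid " + ProcessHandle.current().pid());
        while (app.isRunning()) {
            Thread.sleep(1000); // stand-in for real work
        }
    }
}
```

Started with `java GracefulApp.java &`, the process should stop on `kill <pid>` after running the hook, with no container to negotiate with.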

  7. $ # Stopping a process the JEE way
    $ catalina stop
    Using CATALINA_BASE: /usr/local/Cellar/tomcat/7.0.50/libexec
    ...
    INFO: Stopping service Catalina
    INFO - Application - [wicket.myproject] destroy: Wicket co
    feb 18, 2014 6:52:24 PM org.apache.coyote.AbstractProtocol stop
    INFO: Stopping ProtocolHandler ["http-bio-8080"]
    feb 18, 2014 6:52:24 PM org.apache.coyote.AbstractProtocol stop
    INFO: Stopping ProtocolHandler ["ajp-bio-8009"]
    feb 18, 2014 6:52:24 PM org.apache.coyote.AbstractProtocol destroy
    INFO: Destroying ProtocolHandler ["http-bio-8080"]
    feb 18, 2014 6:52:24 PM org.apache.coyote.AbstractProtocol destroy
    INFO: Destroying ProtocolHandler ["ajp-bio-8009"]
    $
    7
    If you use an application container, you have to negotiate with the container how to stop the
    application. With Tomcat, you need to send a command through an unsecured TCP
    connection to a control port. The above utility “catalina” does just that. Then you hope that
    Tomcat really quits.

  8. $ # Changing configuration the UNIX way
    $ java -jar foo.jar --config-file foo.properties &
    [1] 97807
    $
    $ vi foo.properties
    $ kill -HUP 97807
    $INFO: Reloading config from /x/y/z/foo.properties
    8
    In the Unix world, when you start an application you may tell it where to find its configuration
    file, or you may rely on a conventional path where the file is to be found. When you want to
    change the configuration, you edit the file and then send a “hang-up” signal to the process.
    The application, by convention, knows that it should then re-read the configuration file.
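A Java process can follow the same convention. The sketch below re-reads a properties file when it receives HUP. It assumes a Unix-like system and relies on `sun.misc.Signal`, an unsupported internal JDK class that may produce compiler warnings; `ReloadableConfig` is a hypothetical name.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;
import sun.misc.Signal; // internal JDK API, used here only for illustration

// Hypothetical helper: holds the current settings and reloads them on demand.
class ReloadableConfig {
    private final Path file;
    private volatile Properties current = new Properties();

    ReloadableConfig(Path file) throws IOException {
        this.file = file;
        reload();
    }

    // Re-read the file, then swap the new settings in with a single assignment.
    final void reload() throws IOException {
        Properties fresh = new Properties();
        try (InputStream in = Files.newInputStream(file)) {
            fresh.load(in);
        }
        current = fresh;
    }

    String get(String key) {
        return current.getProperty(key);
    }

    // Reload whenever the process receives HUP (not available on Windows).
    void reloadOnHangup() {
        Signal.handle(new Signal("HUP"), signal -> {
            try {
                reload();
                System.err.println("INFO: Reloading config from " + file.toAbsolutePath());
            } catch (IOException e) {
                System.err.println("WARN: keeping old config: " + e.getMessage());
            }
        });
    }
}
```

If the new file cannot be read, the old settings stay in place, so a bad edit does not take the process down.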

  9. $ # Changing configuration the JEE way
    $ vi src/main/webapp/WEB-INF/web.xml
    $ # Recompile and re-package
    $ mvn package
    [INFO] Scanning for projects...
    ...
    [INFO] Total time: 4.619s
    $ # Re-deploy
    $ catalina stop
    ....
    $ rm -rf tomcat/webapps/myproject*
    $ cp target/myproject-1.0-SNAPSHOT.war tomcat/webapps
    $ catalina start
    ....
    INFO: Server startup in 3040 ms
    9
    In the Java Enterprise world, you change configuration by editing one of the many
    configuration files inside the application source tree; then you recompile and repackage the
    application; then you stop the application and redeploy it and restart it. I’m not joking.

  10. Limit damage from
    programming errors
    First course
    10
    A team that follows all 12 practices of Extreme Programming will release very few
    programming errors. All the same, we should assume that there will always be programming
    errors. Therefore we should design application architecture so that the damage from
    programming errors is limited.

  11. JVM
    Tomcat
    Linux
    App 1
    App 2
    App 3
    ...See any problems?
    The container hides the O.S.
    11
    The original idea in Java was to hide the O.S. from application code. Everything the running
    app needs from the environment is provided by the Java Virtual Machine. So far, so good.
    Then the Java Enterprise standard says that starting, stopping and running applications should
    also be done inside an “application container”. You are then able to avoid dependencies on
    variations in O.S., but you now depend on variations in application container.
    The idea of shielding the app from the O.S. makes sense when you want to distribute a
    desktop application to a variety of O.S. But there is no value in shielding a custom server
    application from the O.S. On the server there is no need to have variations in O.S.; you choose
    the O.S. and you don’t usually have to change it. It is to your advantage to exploit the O.S.
    for all the services it can provide. You are not going to change Linux for Windows anyway,
    are you? :o)
    Apart from all this, what I find completely unacceptable is the idea that different applications
    should be running inside the same container, which means inside the same O.S. process.
    What can go wrong?

  12. JVM
    Tomcat
    Linux
    App 2
    App 3
    App 1
    Ooops.
    12
    What can go wrong is that a programming error in one app can bring down all the other apps.
    An application can start consuming all CPU, all memory, all file descriptors or some other
    resources. And there’s NO WAY the application server can contain the damage to a single
    app. This is because the various apps run in different threads inside the same O.S. process.

  13. JVM
    Tomcat
    Linux
    App
    One app per process
    13
    The only way to contain damage to a single app is to have each app run in a separate O.S.
    process. The O.S. process is a very powerful tool: the O.S. guarantees isolation. You can
    limit resource consumption per process in an extremely robust way by using standard Unix
    tools like “nice” and “rlimit”.

  14. [Diagram: JVM, Tomcat, Linux; a single App serving many concurrent Requests]
    ...What can go wrong?
    One thread per request
    14
    But still, the JEE way suggests that we should run our application in one large, monolithic JVM
    process. All client requests are served in separate threads. What can go wrong?

  15. [Diagram: JVM, Tomcat, Linux; a single App serving many concurrent Requests]
    Ooops.
    15
    Again, there is the possibility of programming errors. One thread could start consuming the
    full capacity of some resource and bring down the whole application.

  16. Load
    balancer
    Linux
    App
    App
    App
    Limit damage from
    programming errors
    16
    The Unix solution is to partition the application into many instances, each of which runs in a
    separate process. A programming error in one instance will then reduce the capacity of the
    application, but will not be able to shut down the service completely.
    Another advantage of many JVM processes is that each process is smaller. A full garbage
    collection will stop the Java process completely, and this is unavoidable. But if you reduce
    the size of the JVM process you can bring the duration of a full GC from several seconds down
    to hundreds of milliseconds. By periodically restarting the instances you can even make it
    unlikely that a full GC will ever happen.

  17. Load
    balancer
    Linux
    App
    App
    App
    Monitor
    When an instance
    crashes, it’s restarted
    17
    The Apache Httpd preforking architecture is a good example of a robust architecture. There
    is a highly secure supervisor process that monitors many worker processes, where
    application code is run. Whenever a worker crashes, it’s immediately restarted, with little or
    no downtime perceived by the clients. For added robustness, an Apache worker will kill itself
    after serving 10,000 requests or so. This makes it unlikely that a resource leak will cause
    instability.
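The self-retiring worker idea carries over to a Java instance running under a supervisor. Below is a minimal sketch, assuming the instance calls `requestServed()` after each request; `RequestBudget` is a hypothetical helper, and the retire action is injected so that production code can pass `() -> System.exit(0)`.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper: lets a worker retire itself after a fixed number of requests,
// relying on a supervisor process to start a fresh worker in its place.
class RequestBudget {
    private final long maxRequests;
    private final Runnable retire;
    private final AtomicLong served = new AtomicLong();

    RequestBudget(long maxRequests, Runnable retire) {
        this.maxRequests = maxRequests;
        this.retire = retire;
    }

    // Call once after each request has been fully served.
    void requestServed() {
        // Only the call that reaches the limit triggers retirement, so it runs once.
        if (served.incrementAndGet() == maxRequests) {
            retire.run();
        }
    }

    long served() {
        return served.get();
    }
}
```

For example, `new RequestBudget(10_000, () -> System.exit(0))` would make an instance exit after its ten-thousandth request, so a slow leak never has time to grow.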

  18. Crash or stop, it makes
    no difference
    # Ruby
    loop do
      Process.wait        # returns when any child exits, crash or clean stop
      start_new_instance
    end
    // C
    while (waitpid(-1, NULL, 0) > 0) {
      /* a child exited, by crash or by clean stop: replace it */
      start_new_instance();
    }
    18
    The core of the monitor process is a very simple loop. The waitpid Unix system call will return
    whenever one child process exits, be it intentional or the result of a crash. By making
    crashes and clean exits work the same way, we can simplify application architecture.
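The same loop can be sketched in Java, where `Process.waitFor` plays the role of `waitpid` for a single child. This is an illustration under assumptions, not the talk's code: `Supervisor` is a hypothetical class, it watches one child rather than many, and the `maxStarts` limit exists only so the loop can terminate in a demo.

```java
import java.io.IOException;
import java.util.List;

// Hypothetical supervisor: keeps one child process alive by restarting it whenever it exits.
class Supervisor {
    private final List<String> command;
    private int starts = 0;

    Supervisor(List<String> command) {
        this.command = command;
    }

    int starts() {
        return starts;
    }

    // Start the child, wait for it to exit (crash or clean stop, it makes no difference),
    // then start it again, up to maxStarts times in total.
    void run(int maxStarts) throws IOException, InterruptedException {
        while (starts < maxStarts) {
            Process child = new ProcessBuilder(command).inheritIO().start();
            starts++;
            int status = child.waitFor(); // blocks until the child exits
            System.err.println("child exited with status " + status);
        }
    }

    public static void main(String[] args) throws Exception {
        // Example: java Supervisor.java java -jar foo.jar
        new Supervisor(List.of(args)).run(Integer.MAX_VALUE);
    }
}
```

The supervisor never inspects why the child exited; it only reports the status and starts a replacement.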

  19. public class Main {
        public static void main(String[] args) {
          ReusableJettyApp app =
            new ReusableJettyApp(
              new MyAppServlet(new AppStorage()));
          app.start(8080, "src/main/webapp");
        }
      }
    Escaping the tyranny of the
    container
    19
    It’s very easy to let go of the container. For instance Jetty can be used as an embedded web
    server. The ReusableJettyApp is a 100-line class that defines the way I like to use Jetty. The
    MyAppServlet calls my application code. This main can be started from the command line like
    any ordinary Unix program. (Sample implementation here:
    https://github.com/xpmatteo/scopa)
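To show the shape of a container-free `main` without any dependency, here is a sketch that swaps Jetty for the JDK's built-in `com.sun.net.httpserver.HttpServer`. It is a stand-in for illustration, not the `ReusableJettyApp` from the talk; `EmbeddedMain` and the response text are made up.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the talk's Jetty-based Main, using only the JDK.
class EmbeddedMain {
    // Start an HTTP server on the given port (0 picks any free port) and return it.
    static HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/", exchange -> {
            byte[] body = "Hello from a plain process\n".getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        start(8080); // then try: curl http://localhost:8080/
    }
}
```

The web server is just a library the program calls, so the program starts and stops like any other Unix process.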

  20. Sessions, Caches and
    Post-1975 Programming
    Second course
    20

  21. JVM
    Tomcat
    Linux
    App
    Back to
    one app per process
    21

  22. Wtf?
    JVM
    Tomcat
    Linux
    App
    CACHE
    App
    22
    Many programmers add caches by reflex, with no performance measurement. Excess caching
    will obscure application logic and increase the size of the JVM.

  23. Keep your cache
    out of my app!
    Varnish
    App
    App
    App
    Load
    balancer
    Memcached
    DB
    buffers
    + cache
    23
    There are other options. First of all, remember that DB servers are always caching results in
    RAM. Always make sure that your DB servers have enough RAM to hold all the data that the
    application currently uses. Another option is to cache web pages in a fast HTTP proxy, like
    Varnish. This will speed up the app with no increased complication. Another option is to keep
    caches in external dedicated servers running Memcached.

  24. JVM
    Tomcat
    Linux
    App
    Again
    one app per process
    24
    What else?

  25. JVM
    Tomcat
    Linux
    App
    SESSIONS
    App
    Dude, wtf?
    25
    Sometimes programmers save lots of data in web sessions. And the container often keeps
    those sessions in the JVM process. This makes the JVM bigger, and makes it necessary to add
    server-session affinity in the load balancer, at the same time making it impossible to shut
    down one server without killing the sessions that are being served on that server.

  26. What the Tomcat session
    management interface looks like:
    public interface Manager {
      public Container getContainer();
      public void setContainer(Container container);
      public boolean getDistributable();
      public void setDistributable(boolean distributable);
      public String getInfo();
      public int getMaxInactiveInterval();
      public void setMaxInactiveInterval(int interval);
      public int getSessionIdLength();
      public void setSessionIdLength(int idLength);
      public long getSessionCounter();
      public void setSessionCounter(long sessionCounter);
      public int getMaxActive();
      public void setMaxActive(int maxActive);
      public int getActiveSessions();
      public long getExpiredSessions();
      public void setExpiredSessions(long expiredSessions);
      public int getRejectedSessions();
      public int getSessionMaxAliveTime();
      public void setSessionMaxAliveTime(int sessionMaxAliveTime);
      public int getSessionAverageAliveTime();
      public int getSessionCreateRate();
      public int getSessionExpireRate();
      public void add(Session session);
      public void addPropertyChangeListener(PropertyChangeListener listener);
      public void changeSessionId(Session session);
      public Session createEmptySession();
      public Session createSession(String sessionId);
      public Session findSession(String id) throws IOException;
      public Session[] findSessions();
      public void load() throws ClassNotFoundException, IOException;
      public void remove(Session session);
      public void remove(Session session, boolean update);
      public void removePropertyChangeListener(PropertyChangeListener listener);
      public void unload() throws IOException;
      public void backgroundProcess();
    }
    26
    You might think of solving these problems by implementing a custom session manager. This is
    the interface that Tomcat would like you to implement.

  27. JVM
    Tomcat
    Linux
    App
    Sessions
    "in RAM"
    App
    Sessions
    "on disk"
    Programming like it’s 1975
    27
    One reason for that complicated interface is that Tomcat likes to make a distinction between
    sessions that are “active”, that is, “in RAM”, and those that are “inactive”, or “on disk”. The
    problem with this idea is that since virtual memory became commonplace in the late
    seventies, there is no difference between “RAM memory” and “disk memory”. Operating
    Systems offer just one kind of memory, and that is Virtual Memory. The O.S. decides which
    pieces of data are resident in RAM and which are not, and does that very efficiently.

  28. #FAIL
    if (session.timeSinceLastUse() > timeout) {
      manager.persist(session);
    }
    28
    The big #fail of Tomcat’s session management is that it implies that it does something like
    the pseudocode above. The intent is to save to slow memory the sessions that have been
    unused for a long time. What really happens is different. Suppose that the session was
    indeed unused for a long time. Chances are that the O.S. purged the RAM that contained that
    session and saved it on disk. Now we access the session to ask it how long it was since it was
    last used. The VM sees that we try to access memory that is not resident, and generates a
    page fault. The O.S. repairs the page fault by allocating a fresh memory page to the session
    and loading its content from disk. Then the application resumes execution, and we see that
    indeed the session was idle for too long. Now we ask the session manager to save it to disk!

  29. How the Tomcat session
    management interface should be
    public interface Manager {
      Session createSession();
      Session findSession(String sessionId);
    }
    29
    Now, one of the reasons why Tomcat does all that complicated dance is that it tries to avoid
    using too much JVM memory for sessions. But really, it should be way simpler than that. A
    session manager should only need to implement these two methods.
    For a more extended analysis of the flaws in Tomcat’s session management, see my article
    http://matteo.vaccari.name/blog/archives/650
    The expression “1975 programming” comes from the Varnish “Architect Notes”, a valuable article:
    https://www.varnish-cache.org/trac/wiki/ArchitectNotes.

  30. Say bye-bye to
    server-session affinity
    App
    App
    App
    Load
    balancer
    Session
    storage
    30
    One reasonable session manager would store sessions in the database. This makes it very
    easy to share sessions between all servers. Recently-used sessions would be automatically
    cached in the DB server’s RAM cache.
    Of course, you wouldn’t save lots of data in the web session, would you? That would be very
    bad REST karma! A good rule of thumb is that a session should only hold the logged-in user
    id and nothing else. Everything else should be stored as application state in the DB, or as
    conversation state in the hypertext. Remember HATEOAS!
    I don’t mean to say that Tomcat specifically is crap. For all its limitations, Tomcat is a robust
    server and will serve a very heavy load. It’s just not very sophisticated.
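A two-method session manager backed by shared storage could look roughly like the sketch below. All names are hypothetical, the session holds only a user id as suggested above, and the in-memory map merely stands in for a database table shared by all app instances.

```java
import java.util.Map;
import java.util.Optional;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical storage abstraction; a real one would read and write a DB table.
interface SessionStore {
    void put(String sessionId, String userId);
    Optional<String> get(String sessionId);
}

// Stand-in for the shared database, for demonstration only.
class InMemoryStore implements SessionStore {
    private final Map<String, String> rows = new ConcurrentHashMap<>();

    public void put(String sessionId, String userId) {
        rows.put(sessionId, userId);
    }

    public Optional<String> get(String sessionId) {
        return Optional.ofNullable(rows.get(sessionId));
    }
}

// The two operations a session manager needs: create and find.
class SessionManager {
    private final SessionStore store;

    SessionManager(SessionStore store) {
        this.store = store;
    }

    // Create a session holding only the logged-in user id, and return the session id.
    String createSession(String userId) {
        String sessionId = UUID.randomUUID().toString();
        store.put(sessionId, userId);
        return sessionId;
    }

    // Any app instance that shares the store can resolve the session.
    Optional<String> findSession(String sessionId) {
        return store.get(sessionId);
    }
}
```

Because no instance keeps session state in its own heap, the load balancer can send each request to any instance, and an instance can be stopped without losing sessions.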

  31. SRP and OCP
    Dessert
    31
    The guy in the picture is Robert Martin, aka Uncle Bob. I invite you to watch his valuable
    videos at cleancoders.com.

  32. JVM
    Tomcat
    Linux
    Web frontend Batch
    Backoffice
    Real-time
    device control
    SCADA
    Monolithitis Gravis
    32
    We’ve seen that the “application container” idea has a number of problems. The biggest
    problem in my opinion is that it promotes a monolithic style of programming. Monolithic is
    bad! Good programming is when you have many boxes, each performing one service
    independently. I once saw an industrial control application where many separate concerns
    were addressed in a single Java web application. This architecture is fragile! A memory leak in
    the SCADA should not impact real-time device controls. The correct architecture would be to
    allocate separate O.S. processes to all these concerns. It’s not that the designers of this
    system were naive. They were very experienced Java developers. But the Java “container”
    culture led them to throw many applications into one container.

  33. Single Responsibility Principle
    Open/Closed Principle
    A box should do just one thing and do it well
    New (nonfunctional) requirements are met by
    adding new boxes, not by modifying existing boxes
    33
    These are two fundamental principles of good software design. I have reframed them at the
    system level. For “box” you might read “server” or “process”.

  34. The Operating System is
    your friend
    Use it to your advantage!
    34
    This is my concluding message. Leveraging the O.S. can make your applications faster,
    simpler and more robust.

  35. Call to action
    Find a box that does two things
    Analyze the problem
    Split it!
    35

  36. Enjoy your digestion!
    twitter @xpmatteo
    blog http://matteo.vaccari.name/
    email [email protected]
    Agile Coach Camp
    Italy 2012
    Agile Coach Camp Italy
    June 5-7, Cavalese (TN)
    http://accitaly.wordpress.com
    I’m a
    freelancer!
    Contact me!
    36