Single Threaded To Multi Threaded – A Lot to Consider

17 March 2009

You have a batch job that has been running in production for ages, and so far it has done a good job. But the business is growing, and you anticipate that this program might become a throughput bottleneck. After many rounds of thinking, you decide to convert the job into a multithreaded one. This is where the real challenge starts. Writing a new program would have been simple business, but changing an existing program is tricky. Here is my experience from one such assignment. We did eventually manage to convert the job into a multithreaded one, but it took several iterations. Most of these experiences should help you work out the design changes and considerations (and also the effort) you need to plan for before implementing such a change. As an example, this is how the single threaded processing may be working:

[Figure: Single Threaded Scenario]

Polling Input:

When the program ran in single threaded mode, it must have been reading one input message at a time from the input queue. Now there can be a separate thread that reads input messages and stores them in an in-memory queue. This can be an ArrayList or a Vector. Vector comes with built-in synchronization of its modifications, while with ArrayList you will have to manage the synchronization yourself.

Additionally, you can make the pool size configurable. At least two configuration parameters are possible: max and min size. The processors, which read messages from the pool and delete them, can notify the message reader when the size drops below the min size, so that it can start filling the pool again. Playing around with these parameters can give you performance variations.
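The min/max idea can be sketched with plain wait/notify. This is only a minimal sketch under assumptions: the class name `MessagePool`, its methods, and the single reader thread are all made up for illustration, and the reader is trusted to add no more messages than the free slots it was told about.

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Queue;

/** In-memory message pool: one reader thread refills it, many processors drain it. */
class MessagePool<T> {
    private final Queue<T> messages = new ArrayDeque<>();
    private final int minSize; // refill threshold (hypothetical config parameter)
    private final int maxSize; // capacity (hypothetical config parameter)

    MessagePool(int minSize, int maxSize) {
        if (minSize < 1 || maxSize < minSize) {
            throw new IllegalArgumentException("need 1 <= minSize <= maxSize");
        }
        this.minSize = minSize;
        this.maxSize = maxSize;
    }

    /** Reader thread: waits until the pool is below minSize, then returns the free slots. */
    synchronized int awaitRefill() throws InterruptedException {
        while (messages.size() >= minSize) {
            wait();
        }
        return maxSize - messages.size();
    }

    /** Reader thread: adds a batch read from the input queue and wakes waiting processors. */
    synchronized void addAll(Collection<T> batch) {
        if (messages.size() + batch.size() > maxSize) {
            throw new IllegalArgumentException("batch does not fit into the pool");
        }
        messages.addAll(batch);
        notifyAll();
    }

    /** Processor thread: waits for a message, removes it, and signals the reader when the pool runs low. */
    synchronized T take() throws InterruptedException {
        while (messages.isEmpty()) {
            wait();
        }
        T message = messages.remove();
        if (messages.size() < minSize) {
            notifyAll(); // the reader may be waiting in awaitRefill()
        }
        return message;
    }

    synchronized int size() {
        return messages.size();
    }
}
```

The reader thread would loop over `awaitRefill()`, read at most that many messages from the input queue, and hand them over with `addAll()`. On Java 5 or later, a `java.util.concurrent.BlockingQueue` covers the blocking part out of the box, though not the refill threshold.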

Thread Pool Management:

The following things can be managed in the thread pool:

  • Size of pool
  • Creation of new thread if any thread dies out/fails
  • Some cases of deadlock
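On Java 5 or later, the first two points can be sketched with `java.util.concurrent`. This is a rough sketch, not the design we used: the class and method names are invented, and the failing tasks are simulated. A fixed thread pool keeps its size and starts a replacement thread when a worker dies from an exception thrown by a task passed to `execute()`.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class WorkerPoolDemo {
    /**
     * Runs taskCount tasks on poolSize threads; every failEvery-th task throws.
     * Returns how many tasks completed normally.
     */
    static int runBatch(int poolSize, int taskCount, int failEvery) throws InterruptedException {
        AtomicInteger completed = new AtomicInteger();
        AtomicInteger threadNumber = new AtomicInteger();

        // Name the threads and report a dying worker instead of printing a full stack trace.
        ThreadFactory factory = task -> {
            Thread t = new Thread(task, "worker-" + threadNumber.incrementAndGet());
            t.setUncaughtExceptionHandler((thread, error) ->
                    System.err.println(thread.getName() + " died: " + error.getMessage()));
            return t;
        };

        ExecutorService pool = Executors.newFixedThreadPool(poolSize, factory);
        for (int i = 1; i <= taskCount; i++) {
            final int id = i;
            pool.execute(() -> {
                if (id % failEvery == 0) {
                    throw new IllegalStateException("simulated failure in task " + id);
                }
                completed.incrementAndGet();
            });
        }

        pool.shutdown(); // accept no new tasks, but finish the queued ones
        if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
            throw new IllegalStateException("pool did not finish in time");
        }
        return completed.get();
    }
}
```

The pool size here is a plain method argument, so it can come from a configuration file. Note that tasks handed over with `submit()` instead of `execute()` do not kill the worker; the exception is stored in the returned `Future`.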

Singleton Pattern:

If your existing job was single threaded, there will be many places where you must have implemented this pattern. You can continue with it in a few cases, e.g. the connection manager. Here the connection manager can stay a singleton that hands out connections synchronously.

But the rest of the cases will have to be non-singleton; otherwise the threads contend for the same instance and there will be serious performance problems. As we are going to create multiple instances of the previously singleton objects, we must consider the increasing load on garbage collection. Think of creating an object pool to reuse objects, especially those with a high creation cost. Just note that this pool will be at thread level and not at application level.
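The simplest form of a thread-level pool is one reusable instance per thread, which `ThreadLocal` gives you. The sketch below is an illustration only: `PerThreadFormatter` is a made-up name, and `SimpleDateFormat` stands in for any object that is costly to create and not safe to share between threads.

```java
import java.text.SimpleDateFormat;
import java.util.Date;

class PerThreadFormatter {
    // Each thread lazily gets its own formatter on first use and then keeps reusing it.
    private static final ThreadLocal<SimpleDateFormat> FORMAT =
            ThreadLocal.withInitial(() -> new SimpleDateFormat("yyyy-MM-dd HH:mm:ss"));

    /** Formats with the calling thread's private formatter; no locking is needed. */
    static String format(Date date) {
        return FORMAT.get().format(date);
    }

    /** Exposes the calling thread's instance, only to show that it is reused. */
    static SimpleDateFormat current() {
        return FORMAT.get();
    }
}
```

Because every thread sees only its own instance, nothing needs synchronizing, and the number of objects created is bounded by the number of threads rather than the number of messages. `ThreadLocal.withInitial` needs Java 8; on older versions you would override `initialValue()` instead.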

[Figure: Multi Threaded Program]

Static Implementations:

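This is another way of ensuring one instance in the JVM. Static methods are not synchronized by themselves, but the static state behind them is shared by every thread, so access to it has to be synchronized, and the threads then queue up just as they would on synchronized methods. The impact is the same: performance degradation. Treat static members the same way as the singletons.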
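As a small illustration of static state that genuinely has to stay shared, here is a sketch of a job-wide counter. The names are hypothetical. A plain `static long` incremented with `++` from several threads can lose updates, so the sketch uses an `AtomicLong`, which stays correct without a lock.

```java
import java.util.concurrent.atomic.AtomicLong;

class BatchStats {
    // One copy of this field is visible to every thread that uses the class.
    private static final AtomicLong PROCESSED = new AtomicLong();

    static void recordProcessed() {
        PROCESSED.incrementAndGet(); // atomic read-modify-write
    }

    static long processedCount() {
        return PROCESSED.get();
    }

    /** Starts the given number of threads, each recording perThread messages, and waits for them. */
    static void recordFromThreads(int threads, int perThread) throws InterruptedException {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    recordProcessed();
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
    }
}
```

Static state that does not need to be shared is better moved into per-thread or per-message objects, so that no coordination is needed at all.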

Object Sharing Across Threads:

Unintended object sharing results either in garbage data or, if you are lucky, in failure of the job. If it is the second, you are saved; otherwise God save you! By the time you find out that the data corruption is caused by object sharing, it will be very late. While designing, try to identify the objects that can get shared across threads and fix them.
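A typical case is a processor that keeps working state in an instance field. That is harmless with one thread and corrupting with many. The sketch below uses invented class names: the first class shows the problem pattern, the second keeps the state in a local variable so that one instance can safely serve all threads, and the helper counts wrong results.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Problem pattern: the scratch buffer is an instance field, so threads sharing one instance overwrite each other. */
class SharedScratchProcessor {
    private final StringBuilder scratch = new StringBuilder();

    String process(String message) {
        scratch.setLength(0);
        scratch.append('[').append(message).append(']');
        return scratch.toString();
    }
}

/** Fix: the scratch buffer is a local variable, confined to the calling thread. */
class ConfinedProcessor {
    String process(String message) {
        StringBuilder scratch = new StringBuilder();
        scratch.append('[').append(message).append(']');
        return scratch.toString();
    }
}

class SharingCheck {
    /** Calls the processor from several threads and counts results that do not match their input. */
    static int countCorrupted(Function<String, String> processor, int threads, int perThread)
            throws InterruptedException {
        AtomicInteger corrupted = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    String message = "t" + id + "-m" + i;
                    String result;
                    try {
                        result = processor.apply(message);
                    } catch (RuntimeException e) {
                        result = null; // a crash inside the processor also counts as corruption
                    }
                    if (!("[" + message + "]").equals(result)) {
                        corrupted.incrementAndGet();
                    }
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        return corrupted.get();
    }
}
```

Running `countCorrupted` against one shared `SharedScratchProcessor` with several threads may or may not report corruption on any given run, which is exactly what makes this kind of bug so hard to find in production.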

Failure/Restart Management:

In single threaded mode, a crash of the program must have put only one message in trouble. But now there are many in-memory messages, which you might already have deleted from the input queue. All of these messages will be lost if the program crashes. The same goes for a restart: you must know where the program last stopped. One option that I thought of was this: once you read a message from the input queue for processing, store it in a persistent store, and remove it from there as the last step after processing completes. If the program restarts, it will first pick up the items from this store, which are unprocessed but already removed from the input queue, before proceeding with new messages.
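The persistent store idea can be sketched with one file per in-flight message. This is a toy sketch under several assumptions: the class name `InFlightStore` is made up, message ids are assumed to be unique and safe to use as file names, and a real job would more likely use a database table or a transactional queue. The writes are also not forced to disk here.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.TreeMap;

/** Keeps a copy of every message that was taken off the input queue but not yet fully processed. */
class InFlightStore {
    private static final String SUFFIX = ".msg";
    private final Path dir;

    InFlightStore(Path dir) throws IOException {
        this.dir = Files.createDirectories(dir);
    }

    /** Call right after reading the message from the input queue. */
    void markInFlight(String id, String payload) throws IOException {
        Files.write(dir.resolve(id + SUFFIX), payload.getBytes(StandardCharsets.UTF_8));
    }

    /** Call as the very last step of processing. */
    void markDone(String id) throws IOException {
        Files.deleteIfExists(dir.resolve(id + SUFFIX));
    }

    /** Call once at startup: returns id and payload of every message that never completed. */
    Map<String, String> recover() throws IOException {
        Map<String, String> pending = new TreeMap<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir, "*" + SUFFIX)) {
            for (Path file : files) {
                String name = file.getFileName().toString();
                String id = name.substring(0, name.length() - SUFFIX.length());
                pending.put(id, new String(Files.readAllBytes(file), StandardCharsets.UTF_8));
            }
        }
        return pending;
    }
}
```

On restart the job would process whatever `recover()` returns before polling the input queue again. A message that crashed halfway may be processed twice this way, so the processing step itself should tolerate repeats.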

Java Version:

Since the release of Java 5, this consideration has become important. Java 5 brought a number of improvements around multithreading, most visibly the java.util.concurrent package with ready-made thread pools, concurrent collections, and locks, along with a thread-state model that you can query. In my experience it also interacts better with the underlying operating system on thread prioritization. Thread pool management can be done in a much better way. If you are working on an older version, you will have to do all of these things on your own.

Operating System:

Multithreading is not purely a function of the JVM; it depends heavily on the underlying operating system. How does the operating system support multithreading? Does it implement preemptive or cooperative multithreading? How does it respond to thread prioritization? Which parameters can you tune for the JVM on that operating system? These parameters also depend on the Java version. All of these attributes must be considered while designing the job.

Logging:

Logging has been a performance drag for ages, so make the log level a configurable parameter so that you can control the amount of logs your program generates. You need a good log file for batch jobs because, unlike a GUI based application, nothing is visible while they run, and the log is the only way to know what happened behind the scenes. Ensure you log what business each thread is doing.
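With the JDK's own `java.util.logging`, both points can be sketched in a few lines. The class name `BatchLog`, the logger name `batch`, and the message layout are assumptions for illustration; any logging framework offers the same two ingredients.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

class BatchLog {
    private static final Logger LOG = Logger.getLogger("batch");

    /** levelName would come from the job's configuration, e.g. "INFO" or "FINE". */
    static void configure(String levelName) {
        LOG.setLevel(Level.parse(levelName)); // throws IllegalArgumentException for unknown names
    }

    /** Prefixes every line with the thread name, so the log shows which thread did what. */
    static String describe(String action, String messageId) {
        return "[" + Thread.currentThread().getName() + "] " + action + " message " + messageId;
    }

    static void processing(String messageId) {
        // Skip building the string entirely when detailed logging is switched off.
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine(describe("processing", messageId));
        }
    }

    static boolean isEnabled(Level level) {
        return LOG.isLoggable(level);
    }
}
```

Giving the worker threads meaningful names makes these lines much easier to follow. Keep in mind that `java.util.logging` handlers have levels of their own, so the default console handler still hides `FINE` messages until its level is lowered as well.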

New Bottlenecks:

It is possible that you will find new bottlenecks which show that the performance cannot be improved beyond a certain level. These bottlenecks can be in the infrastructure (memory, processor speed, etc.) or at the application level (database interaction, file read time, the parser if you are reading XML, etc.). To improve performance further, you will need to fix these aspects of the infrastructure or the application.

Performance Tuning:

Finally, there are a few application configuration properties and a few JVM attributes to be tuned. Have fun with those and watch how the performance changes. If you find that the performance is exactly the same as in the single threaded model, it is possible that you have left some class or method working single threaded. Good luck with this interesting task!

 
