Kill the performance – Part 1
An engineer was called to fix an expensive piece of machinery. He came, thought deeply while looking at the machine, took a hammer, hit the machine, and it worked. Soon he presented his bill to the customer. "1000 bucks? For hitting it with a hammer?" the customer exclaimed.
"Nope, only 10 for the hitting. The rest is for figuring out where to hit."
Ask any customer how they would like their software to be: bug free, working all the time, and fast.
Engineers make the same mistakes time and again. That's why I thought of writing down some of the mistakes I made that killed the performance of apps I've designed or coded.
Performance blunder #420: The more threads, the better. Right?
Not quite. A multi-threaded application generally performs best when the number of threads equals the number of cores in the system. This holds as long as the application is non-blocking.
What exactly is blocking? Aren't system calls, a simple LOG statement to a file, or a printf all blocking? You may have noticed that enabling logging to the console (printf) can cut the performance of the app by half or more. Calling a system API traps from user mode into kernel mode (traditionally through a software interrupt), which gives the underlying OS a chance to context switch out the running thread and prioritize or time slice the CPU among other runnable threads. Hence, there is a chance that your thread gets context switched whenever you call a system API. But you see, that's not exactly blocking.
Any system API that can put the thread to sleep is a blocking call. For example, reading from or writing to a socket is a blocking call: the kernel puts the thread to sleep if the receive buffer is empty on a read or the send buffer is full on a write. Trying to write to an NFS file in sync mode is a blocking call. But writing a buffer to a local file may not be a blocking call, since the data usually lands in the OS cache first (citation needed).
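To make the socket case concrete, here is a minimal sketch in Python (my choice of language for illustration, not something the apps above used). The helper names `try_recv` and `demo` are made up for this example. The socket is switched to non-blocking mode, so the read that would otherwise put the thread to sleep raises `BlockingIOError` instead.

```python
import select
import socket


def try_recv(sock, nbytes=1024):
    """Return received bytes, or None if the call would have blocked."""
    try:
        return sock.recv(nbytes)
    except BlockingIOError:
        # On a blocking socket, the kernel would have put this thread
        # to sleep here until data arrived.
        return None


def demo():
    a, b = socket.socketpair()
    try:
        b.setblocking(False)
        first = try_recv(b)              # nothing has been sent yet
        a.sendall(b"ping")
        select.select([b], [], [], 1.0)  # wait until b is readable
        second = try_recv(b)
        return first, second
    finally:
        a.close()
        b.close()


if __name__ == "__main__":
    print(demo())
```

The first read finds an empty receive buffer, which is exactly the situation where a blocking socket would sleep.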
IMHO, the maximum number of threads spawned in the app should be twice the number of CPUs in the system, which leaves some headroom for threads that occasionally block. There are APIs available to find the number of cores (CPUs) in the system.
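As a rough sketch of that rule of thumb in Python: `os.cpu_count()` reports the number of CPUs, and the pool size is capped at twice that. The function names `max_worker_threads` and `run_all` are hypothetical, and the factor of 2 is just my opinion from above, not a universal constant.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def max_worker_threads(factor=2):
    """Upper bound on threads: `factor` times the number of CPUs."""
    cpus = os.cpu_count() or 1  # cpu_count() may return None
    return cpus * factor


def run_all(tasks):
    """Run zero-argument callables on a pool capped by the rule above."""
    with ThreadPoolExecutor(max_workers=max_worker_threads()) as pool:
        return list(pool.map(lambda task: task(), tasks))


if __name__ == "__main__":
    print("thread cap:", max_worker_threads())
    print(run_all([lambda: 1, lambda: 2]))
```

The point is only that the cap is derived from the machine, not hard coded.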
Performance blunder #421: Queue up.
For some unknown reason, developers tend to think that delegating work to a different thread pool is a better design.
I've seen the same mistake made at two different organizations.
Designing a network app? One thread pool receives data and queues up new connections onto a queue. A second thread pool picks up the work and finally queues it up for sending. A final thread pool does some more processing on the message before sending it out to the destination.
Another app had a user-defined pipeline with different components working on a message. Since the pipeline was user defined, the components could be invoked in any order. The design was to have a separate thread pool for each component, and the message was enqueued to each component in turn as the pipeline was processed.
Both apps performed poorly. The bottleneck was simple: queues.
Thread context switches should be dreaded by programmers as much as possible. Every time work is delegated to a different thread, hordes of performance pitfalls come into the picture. First, the producer thread has to enqueue the message into a shared queue, which will obviously be mutex protected. Taking a mutex is expensive in itself, especially under contention. Even if you have some sort of lock-free queue, the cost of the context switch involved will show up big time. All programmers should vow to do most processing inline, in the same thread, until the OS itself decides to schedule it out. That's the essence of performance.
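If you want to see the handoff cost for yourself, here is a small Python sketch that does the same trivial work inline and through a queue to a worker thread. Everything in it (`process`, `run_inline`, `run_via_queue`, `timed`) is a made-up stand-in, and Python's interpreter lock means the absolute numbers will not match a C or C++ app. Treat it as an illustration of the shape of the problem, not a benchmark.

```python
import queue
import threading
import time


def process(msg):
    return msg + 1  # stand-in for real per-message work


def run_inline(messages):
    """Process every message in the calling thread."""
    return [process(m) for m in messages]


def run_via_queue(messages):
    """Hand every message to a worker thread through a locked queue."""
    inbox, outbox = queue.Queue(), queue.Queue()

    def worker():
        while True:
            m = inbox.get()
            if m is None:  # sentinel: no more work
                break
            outbox.put(process(m))

    t = threading.Thread(target=worker)
    t.start()
    for m in messages:
        inbox.put(m)
    inbox.put(None)
    t.join()
    return [outbox.get() for _ in messages]


def timed(fn, messages):
    """Return (result, elapsed seconds) for fn(messages)."""
    start = time.perf_counter()
    result = fn(messages)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    msgs = list(range(100_000))
    for fn in (run_inline, run_via_queue):
        _, elapsed = timed(fn, msgs)
        print(f"{fn.__name__}: {elapsed:.4f}s")
```

Both versions produce the same results; the only difference is the enqueue, the lock, and the wake-up of another thread on every message.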
Take the queue out and get the processing done inline. But the design enthusiasts will ask: if every component calls the next one inline, won't it lead to spaghetti code? Yes, it might. And that's where you need a good design. Design the flow of code inline, without queues, but keep the components segregated. Each component calls the other through an interface. Keep each component's sources in its own directory, built as an individual .so or static library. All code shared between components should go into a common/ folder that everyone can use. Keeping the interfaces and the directory structure of the code clean is the key here.
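Here is one way the "inline but segregated" idea could look, sketched in Python. The `Component` interface and the `Trim` and `Upper` components are invented for this example; in a real app each would live in its own directory or library. The pipeline order is still user defined, but every component runs in the calling thread with no queue in between.

```python
from abc import ABC, abstractmethod


class Component(ABC):
    """The only thing components know about each other."""

    @abstractmethod
    def handle(self, msg: str) -> str:
        ...


class Trim(Component):
    def handle(self, msg: str) -> str:
        return msg.strip()


class Upper(Component):
    def handle(self, msg: str) -> str:
        return msg.upper()


def run_pipeline(components, msg):
    """Call each component inline, in the order the user configured."""
    for component in components:
        msg = component.handle(msg)
    return msg


if __name__ == "__main__":
    print(run_pipeline([Trim(), Upper()], "  hello  "))
```

Reordering the list changes the pipeline without touching any component, which is the separation the queues were supposed to buy in the first place.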
About this entry
You’re currently reading “Kill the performance – Part 1,” an entry on WoOd’s TechLog
- Published:
- July 21, 2008 / 3:17 am
- Category:
- TechPain, Techno-Jazz
- Tags:
- design, network app, performance, tips