Suprtool

Performance Tuning Suprtool

Suprtool extracts data very quickly; in fact, at times HP has asked us how to throttle it back. Suprtool tends to push the envelope on disc IO: it completes processing of each block it reads so quickly that, depending on the system configuration, some resources get pushed to the limit. This resource utilization became more of an issue with MPE/iX 6.5, due to changes in the memory manager, and is more prevalent on systems with large amounts of memory. Symptoms vary depending on whether you are working with an N-Class system or a non-N-Class one.

Happily, HP was most helpful in providing methods to help tune Suprtool for each system. First, some history on key developments that occurred for MPE/iX 6.5, 7.0 and the N-Class servers.

During the development of MPE/iX 6.5, HP asked us to come down and get Suprtool working with Large Files on MPE/iX 6.5; Large File support was a requirement for the release.

This proved to be a challenge, not because the work was particularly technical, but because the network on the first machine we used would not stay up long enough for us to work anywhere other than the console. The first trip was dedicated to seeing what would fail, detecting MPE/iX 6.5, and turning off certain options until HP had time to fix more issues.

The second trip was dedicated to turning some of those options back on, fixing sort to work with Large Files, and getting Suprlink and STExport to work. After the second trip, we were pretty much ready to release Suprtool with Large File support and MPE/iX 6.5 compliance.

However, as MPE/iX 6.5 rolled out, we began to get reports of performance issues, which were traced back to some "near release" changes to the memory manager. With help from some great lab people, such as Kevin Cooper and Bill Cadier, we quickly determined the source of the problem and implemented a solution on both sides.

The Problem

HP identified the performance problem as being related to our prefetching of pages of data so that subsequent reads complete quickly. Simply put, we make data available in memory ahead of where Suprtool's disc reads are done, in such a way that IO's are reduced. Because of a memory management change, these pages were being kept in memory longer than before, which forced the memory manager to read down a longer chain of pages and increased the amount of work the system had to do. At times, the problem presented itself as a high number of disc IO's waiting to complete, because the system was doing so much work between disc IO's. As a temporary workaround, we had customers turn off prefetching with:

>set prefetch 0

The Solution

The solution was to reduce the number of pages that the system had to chain through, by releasing them from memory earlier. HP made an enhancement to one of their low-level system calls, which allowed us to invoke an option that releases pages from memory when the prefetch feature in Suprtool is turned on.

Within Suprtool 4.4 and higher, you can turn this feature on with the MakeAbsent setting.

>set makeabsent on
This feature is only relevant if prefetch is turned on with a setting of 1 to 5, which controls the size of each prefetch; Set MakeAbsent On has no impact if prefetch is turned off.
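The interaction between prefetching and early page release can be illustrated with a toy model in Python. This is purely conceptual: it is not Suprtool's implementation, and the scan function and its parameters are invented for this sketch.

```python
# Toy model of read-ahead prefetching and early page release.
# Purely illustrative: this is NOT Suprtool's implementation; the
# scan() function and its parameters are invented for this sketch.

def scan(num_pages, prefetch, release_behind):
    """Scan pages 0..num_pages-1 sequentially.

    Returns (physical_reads, peak_resident_pages).
    """
    resident = set()       # pages currently held in memory
    physical_reads = 0
    peak = 0
    for page in range(num_pages):
        if page not in resident:
            # One physical IO fetches this page plus `prefetch` pages ahead.
            physical_reads += 1
            for p in range(page, min(page + 1 + prefetch, num_pages)):
                resident.add(p)
        if release_behind:
            # Like Set MakeAbsent On: release a page as soon as it is
            # consumed, keeping the chain of resident pages short.
            resident.discard(page)
        peak = max(peak, len(resident))
    return physical_reads, peak

print(scan(100, prefetch=0, release_behind=False))  # most IO's
print(scan(100, prefetch=2, release_behind=False))  # fewer IO's, pages pile up
print(scan(100, prefetch=2, release_behind=True))   # fewer IO's, short chain
```

In this model, prefetching cuts physical reads by roughly a factor of the prefetch quantum, while releasing pages behind the scan keeps the resident-page count small instead of letting it grow with the file.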

The MakeAbsent setting was initially undocumented because we wanted to gather metrics on whether, and in which cases, the setting was effective.

What is Happening Now

As mentioned, we originally saw the problem as long waits for disc IO's to complete, primarily on non-N-Class systems. Over time, N-Class systems have been widely deployed at customer sites, and the symptoms observed there are slightly different. Due to the dramatic improvement in disc IO, the symptom we see on N-Class machines is high memory utilization. (For more information on N-Class disc IO, please see my paper "Living with 400 I/O's per second".)

Some Metrics and Advice

So now that we have a general understanding of the symptoms and of the settings you can control to help alleviate the problem, let's discuss what to change, and when. Keep in mind that these are just general guidelines and your mileage may vary.

2Gb and less

For systems with 2Gb of memory or less, you probably do not need to do anything. Of course, you are free to experiment with setting MakeAbsent on, or with changing the prefetch quantum or even turning it off, but the defaults should usually be fine.

Please note that when testing, your results may naturally improve if you repeat the same extract over and over, since more portions of the file or dataset will be in memory. You may want to make a change to the settings and monitor some jobs that you run frequently, at various times of the day, to see how the changes impact your system over time.

My overall gut feeling is that for systems with less than 2Gb of memory, the default setting of Set Prefetch 2 should be adequate, and turning MakeAbsent on may hurt your system's performance.

Greater than 8Gb

For systems with 8Gb of memory and larger, we have found that
>set makeabsent on
has been helpful for memory consumption on N-class systems and has reduced disc IO queues for non N-class systems.

Between 2Gb and 8Gb

The results here have been mixed. As you get closer to 8Gb, turning MakeAbsent on has been better for N-Class systems, but I have no real results for non-N-Class systems. For some systems, turning prefetch off altogether proved to be the best option, but these systems were usually CPU-bound. I would recommend first turning off prefetch to see if that helps, and then turning it back on and trying MakeAbsent on. Again, leave each setting in place for a day or two and measure the results.
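The guidance above can be condensed into a small lookup. This is only a restatement of the advice in this article as code: the function name and thresholds are illustrative, and your own measurements should have the final say.

```python
def suggested_settings(memory_gb):
    """Suprtool settings worth trying, in the order suggested above.

    Illustrative summary of the article's guidance only; measure for a
    day or two after each change before deciding.
    """
    if memory_gb < 2:
        # Defaults are usually fine; MakeAbsent may even hurt here.
        return ["set prefetch 2"]
    if memory_gb >= 8:
        # Helps memory use on N-Class, disc IO queues on non-N-Class.
        return ["set prefetch 2", "set makeabsent on"]
    # 2Gb-8Gb: mixed results; try prefetch off first, then restore
    # prefetch and try MakeAbsent.
    return ["set prefetch 0", "set prefetch 2", "set makeabsent on"]
```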

Should I change anything?

If you do not detect a problem, then normally I would not recommend changing anything blindly. Most people who call or e-mail about this issue have discovered it by using either Glance (from HP) or SOS (from Lund).

How do I make changes?

You can make global changes to these settings for every Suprtool process, by putting the various set commands in the file:
SUPRMGR.PUB.SYS
So to turn on the MakeAbsent feature, you could add
>set Makeabsent on
to the suprmgr.pub.sys file. Of course, if you see Set Prefetch 0 in the file, MakeAbsent will have no effect. If you do not see a Set Prefetch command, the default setting is Set Prefetch 2, which is probably reasonable for most systems.
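For example, a SUPRMGR.PUB.SYS file that keeps the default prefetch quantum while enabling MakeAbsent might contain just these two commands (an illustrative sketch; the commands simply mirror the Set commands shown in this article):

```
set prefetch 2
set makeabsent on
```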

If you do experiment with this feature we would like to hear your results. Please send me an email at: Neil@robelle.com.


Performance Experience from a Customer

Neil Armstrong, Robelle

Recently we received feedback from a customer who had experienced some poor response times during peak hours after an upgrade from a 989-650 to an N4000-400-500 (both with 8Gb of memory). With an impending peak in resource demand, the customer was looking to solve the issue with some Suprtool tuning, as noted in our article:

http://www.robelle.com/tips/st-performance.html

They implemented SET MAKEABSENT ON with some excellent results: a reduction in IO queue waits and higher throughput. The real-world impact, however, was far more important.

"We have an N4000-400-500 with 8 Gb of memory running Ecometry 5.32N on MPE/iX 7.5pp1 using an EMC Symmetrix 8530 over 14 FWD-SCSI channels.

"We used a Disk-Tape-Disk migration path for our transition, and considering that we had been on disk for 9 years our databases went from having several sets with >8k extents to newly restored contiguous sets on the new array.

"When we migrated from our 989-650 to this machine in May we experienced some poor response times at peak hours for our users, generally numbering 400-500 sessions.

"After discussions, both Bill Lancaster at Lund and Sue Horvat at HP suggested that we put SET MAKEABSENT ON in the SUPRMGR file in July.

"Our response time issue went away with minimal negative impact elsewhere.

"Please note also that our annual Sale began in July and we experienced annual peak in throughput during that sale.

"Note also that GLANCE generally shows us running 400 to 1200 in Disk I/O/sec range for most of the day!

"We were VERY pleased with the result going in just before our sale. That made the MPE/iX system one of the highlights of a very busy time for us."