PFS (Shipping Container Management System)
  Activities: Rectify under-performing system - Software development (T-SQL, .NET, C#, ASP.NET)
Client: Patrick Corporation www.patrick.com.au

Impact: PFS had been under-performing for the previous 12 months, and the vendor had been unable to resolve the 1000 timeouts per day on the 200-user system. My first release, in production 3 weeks after I started, cut timeouts from 1000 per day to 50 per day. The performance improvements were achieved well ahead of the 3-month schedule.

Details: PFS is an intranet-based solution made up predominantly of Web browsers, web servers and a SQL Server. Preliminary analysis had already suggested that the key area for performance improvement was the SQL Server stored procedures. I quickly identified the most significant bottlenecks and implemented code changes to resolve them.

I developed tools to replay server traces, accurately simulating server load on the test systems and replicating the more complex bottleneck scenarios. This led to further performance improvements: timeouts are now down to 9-10 per day, and further development offers diminishing returns.
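The idea behind replaying a captured trace can be sketched as follows. This is a minimal illustration, not the tool described above: the `TraceEvent` type, the `replay` function and the `execute` callback are hypothetical names, and a real load simulator would also need to reproduce concurrent sessions rather than run statements one at a time.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass(frozen=True)
class TraceEvent:
    """One captured statement (hypothetical structure for this sketch)."""
    offset_s: float   # seconds since the start of the captured trace
    statement: str    # the SQL text that was captured


def replay(
    events: Iterable[TraceEvent],
    execute: Callable[[str], None],
    speed: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> List[Tuple[str, float]]:
    """Replay events in captured order, preserving their relative timing.

    `execute` is whatever sends a statement to the test server (assumed to
    be supplied by the caller). `speed` > 1 compresses the schedule.
    Returns (statement, elapsed seconds) for each replayed statement.
    """
    timings: List[Tuple[str, float]] = []
    start = clock()
    for event in sorted(events, key=lambda e: e.offset_s):
        # Wait until this statement is due, relative to the replay start.
        due = start + event.offset_s / speed
        delay = due - clock()
        if delay > 0:
            sleep(delay)
        began = clock()
        execute(event.statement)
        timings.append((event.statement, clock() - began))
    return timings
```

Recording the elapsed time per statement makes it possible to compare the same trace before and after a change, which is how a replay of this kind would help confirm that a bottleneck has actually been removed.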


IACS (Integrated Air Cargo System)
  Impact: IACS had also been under-performing for the previous 12 months, and the vendor had been unable to resolve the performance problems: page changes on the most common screens took more than 15 seconds on average. My first release, in production 1 week after commencement, cut response time from 15 seconds to 2.5 seconds. The performance improvements were achieved well ahead of the 3-week schedule.

Details: The dramatic turnaround was largely because the environment was similar to PFS, so I was able to run my SQL Server load simulator against the database immediately and identify the current bottlenecks. It took about a day to fix a single serious computational/performance error and 3 days to generate a suitable list of additional indexes. The business was astounded and delighted that I was able to solve these issues so rapidly.


Macquarie Bank CRM
  Provided ongoing technical expertise in performance, identifying and rectifying performance and architectural issues. I ‘drilled down’ on various technical issues and liaised with the relevant specialists to create action plans for resolution.

Impact: Solving these problems required (1) understanding the technical issues from various perspectives, (2) communicating effectively with the various stakeholders and (3) getting the parties involved to take action. Resolving these performance and instability issues was a team effort, and I played an important role in carrying out the activities above.

Details: The Janna system was made up of a customised front-end application, a number of middleware boxes (using DCOM connectivity) and various back-end SQL servers and host applications. The system was being delivered by Accenture but was under-performing in a number of areas and the middleware was unstable.

Remote branches used Citrix to connect to NT Servers in Sydney because of the network propagation delays involved in DCOM conversations, which are half-duplex and quite “chatty”.

One of the key breakthroughs for stability was identifying the consumption of thread resources in the middleware. Arriving at this bottleneck meant eliminating other potential causes such as bandwidth limitations on the WAN, the mix of burst traffic and load traffic, network latency, database performance and workstation performance.

While the above factors may have played a part, the performance issues were predominantly due to poor implementation of middleware messages. A key principle of middleware design is that messages should be kept short and should release resources quickly. Although the application developers were trained in middleware design, some messages were coded so that they normally took 30 seconds or more to complete, and sometimes these messages could take 5 minutes. This put a significant drain on worker threads in the middleware while the longer messages were being processed. In such cases the user’s PC appeared to hang and the user would reboot it, but the middleware would continue processing the message. This exposed some fundamental flaws in the way DCOM was being used and showed that the performance bottleneck was in the middleware.
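The drain on worker threads can be estimated with Little's law: the average number of threads in use equals the message arrival rate multiplied by the time each message holds a thread. The sketch below uses the 30-second and 5-minute durations mentioned above; the arrival rate of one long message every two seconds is purely an assumed figure for illustration, and the function name is hypothetical.

```python
def average_busy_threads(arrivals_per_second: float, service_seconds: float) -> float:
    """Little's law: average threads in use = arrival rate x time held."""
    return arrivals_per_second * service_seconds


# Assumed arrival rate for illustration only: one message every 2 seconds.
ASSUMED_RATE = 0.5

short_message = average_busy_threads(ASSUMED_RATE, 0.2)    # a short, well-behaved message
normal_long = average_busy_threads(ASSUMED_RATE, 30.0)     # "30 seconds or more"
worst_case = average_busy_threads(ASSUMED_RATE, 300.0)     # "5 minutes"

print(f"short: {short_message:.1f}, 30 s: {normal_long:.0f}, 5 min: {worst_case:.0f}")
```

At the same arrival rate, a message that holds a thread for 30 seconds ties up 150 times as many threads as one that completes in 0.2 seconds, which is why adding threads or servers alone could only be a partial remedy.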

A partial solution was to increase the number of worker threads and the number of middleware servers, but the application developers also had to rework their application components as best they could.

Macquarie was also looking at restructuring the longer requests so that they would be processed in batch mode.

Other performance issues related to Janna not using stored procedures. This made its back-end database truly portable, but it also meant there was very little control over the type of queries a user could send to the database. Many of these caused table locks (a first-name search on just two letters, for example). Table locks sometimes took 5-10 minutes to clear, and all active users would be impacted if the Customer table was table-locked. I spent a lot of time with the developers and database administrators identifying the causes of table locks, which meant understanding the technical issues involved and negotiating with both groups to resolve them.
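One common way to limit queries of this kind, not necessarily the approach taken on this project, is to validate the search term in the application before any SQL is issued. The sketch below is illustrative only: the column names, the three-character minimum and the `?` parameter placeholder are assumptions, and only the Customer table name comes from the account above.

```python
MIN_PREFIX_LENGTH = 3  # assumed threshold; a real value would be agreed with the DBAs


def build_first_name_search(prefix: str, min_length: int = MIN_PREFIX_LENGTH):
    """Return (sql, params) for a first-name prefix search, or raise ValueError.

    Very short prefixes match a large share of the table, so they are
    rejected before reaching the database.
    """
    term = prefix.strip()
    if len(term) < min_length:
        raise ValueError(
            f"Search term must be at least {min_length} characters, got {len(term)}"
        )
    # Escape LIKE wildcards so user input is matched literally.
    escaped = term.replace("\\", "\\\\")
    for wildcard in ("%", "_", "["):
        escaped = escaped.replace(wildcard, "\\" + wildcard)
    # Hypothetical column names; the value is passed as a parameter, not
    # concatenated into the SQL text.
    sql = (
        "SELECT CustomerId, FirstName FROM Customer "
        "WHERE FirstName LIKE ? ESCAPE '\\'"
    )
    return sql, (escaped + "%",)
```

For example, `build_first_name_search("Jon")` returns the query with the parameter `"Jon%"`, while `build_first_name_search("Jo")` raises `ValueError` instead of sending a broad scan to the server.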

Reporting to senior management used Key Performance Indicators (KPIs) that related specifically to business outcomes, while reporting to direct management also included technical KPIs. This ensured a balanced, accountable approach to tackling the business and technical issues of under-performance.


 