Scalability Planning

Scalability is the ability of a system to adapt to increased processing demands in a predictable way, without becoming too complex, expensive, or unmanageable. As you deploy a system to larger numbers of users, often in different locations and time zones and with different language requirements, scalability becomes increasingly important.

Cognos 8 was designed for scalability. It scales vertically using more powerful computers, and horizontally using a greater number of computers. How you install and configure Cognos 8 components can enhance its scalability.

Web Server and Gateway Scalability

All Web communication in Cognos 8 is through a Cognos 8 gateway installed on a Web server. To increase the scalability of your Cognos 8 system, you can run your Web server on a larger computer. You can also install the Cognos 8 gateway on more than one Web server and configure your servers to leverage load balancing features.

Load Balancing

Load balancing spreads tasks among all available processors. It is important in any system, and is a key to processing capacity and scalability. In Cognos 8, load balancing means ensuring that processing requests are distributed appropriately among all the available Cognos 8 servers. Cognos 8 does this automatically, but you can configure load balancing as well.

Automatic Load Balancing

In a distributed environment, Cognos 8 balances request load automatically. By default, as servers are added to the system, each server dispatcher processes the same number of requests. If there is more than one instance of a given service, the dispatcher distributes requests to all the enabled instances of the service that are registered in Content Manager.
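As a rough illustration of the default behavior described above, the following Python sketch rotates requests evenly across the enabled instances of a service. The registry structure, the host names, and the round-robin policy are assumptions made for this example; the text does not specify the algorithm the dispatcher actually uses.

```python
from itertools import cycle

# Hypothetical registry: service instances as they might be listed in
# Content Manager, each with an enabled flag.
REGISTRY = [
    {"host": "server1", "service": "reportService", "enabled": True},
    {"host": "server2", "service": "reportService", "enabled": True},
    {"host": "server3", "service": "reportService", "enabled": False},
]

def enabled_hosts(registry, service):
    """Return the hosts that run an enabled instance of the given service."""
    return [entry["host"] for entry in registry
            if entry["service"] == service and entry["enabled"]]

def assign(registry, service, n_requests):
    """Assign n_requests to the enabled instances in strict rotation."""
    hosts = enabled_hosts(registry, service)
    if not hosts:
        raise RuntimeError(f"no enabled instance of {service}")
    rotation = cycle(hosts)
    return [next(rotation) for _ in range(n_requests)]

assignments = assign(REGISTRY, "reportService", 4)
print(assignments)
```

In this sketch the disabled instance on server3 never receives a request, and the two enabled instances each receive the same number.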

Configuring Load Balancing

While automatic load balancing may be appropriate when hardware resources are identical throughout a server topology, it may not be ideal in environments that contain a mix of hardware resources with different capacity characteristics. In a hardware environment that contains servers with varying degrees of processing capacity, it is desirable to balance the processing load according to each server’s capacity.

In Cognos 8, you can specify the process capacity of each server using server administration options. For example, if you have two servers, one of which has twice the capacity of the other, you might assign the more powerful server a weight of two and the less powerful server a weight of one. Cognos 8 then submits twice as many requests to the more powerful server.
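The effect of the two-to-one weighting in this example can be sketched in Python as follows. This is only an illustration of weighted distribution in general: the server names and the weighted round-robin policy are assumptions, not a description of how the Cognos 8 dispatcher is implemented.

```python
from itertools import cycle

def weighted_rotation(weights):
    """Build a repeating order in which each server appears 'weight' times."""
    order = [server for server, weight in weights.items()
             for _ in range(weight)]
    return cycle(order)

def count_assignments(weights, n_requests):
    """Count how many of n_requests each server receives."""
    rotation = weighted_rotation(weights)
    counts = dict.fromkeys(weights, 0)
    for _ in range(n_requests):
        counts[next(rotation)] += 1
    return counts

# Hypothetical servers: one with twice the capacity of the other.
print(count_assignments({"large_server": 2, "small_server": 1}, 300))
```

With weights of two and one, 300 requests are split 200 to 100 between the two servers.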

For more information about Cognos 8 dispatcher settings, see the Administration and Security Guide.

Load Balancing Dispatchers

Without a software or hardware load balancing mechanism, each Cognos 8 gateway is aware of only one dispatcher, and distributes all requests to that dispatcher. The dispatcher then distributes the requests among Cognos 8 servers. Because every request initially goes through the same dispatcher on one server, the load on that server is increased. An extra step is needed to automatically balance the load, as shown in the following diagram.

This extra step can be avoided either by implementing load balancing without an external load-balancing mechanism or by using a router or other load-balancing mechanism.

Load Balancing Without an External Mechanism

Since gateway servers often have less load than Cognos 8 servers, you may achieve better performance by configuring dispatchers together with the gateways, as shown in the following diagram.

This ensures that the processing capacity of the Cognos 8 servers is directed toward serving report requests rather than load balancing requests.

You can also achieve load balancing by having gateways direct all traffic to a Cognos 8 server computer that is dedicated to dispatching, as shown in the following diagram.

This configuration also removes dispatching load from the Cognos 8 servers. However, it does require separate dispatching computers.

Using External Load-Balancing Mechanisms

You can use external load-balancing mechanisms, such as routers, to further distribute tasks in Cognos 8. Load-balancing routers can be used in either or both of these locations:

between the browser and Tier 1: Web Server
between Tier 1: Web Server and Tier 2: Cognos 8 Server

You can use an external load-balancing mechanism to distribute requests to dispatchers across all available servers, as shown in the following diagram.

You can also use routers with multiple gateways, as shown in the following diagram.

An ideal load-balancing mechanism provides the same capacity awareness as a Cognos 8 dispatcher.

To ensure that requests are not distributed by both an external load-balancing mechanism and the dispatcher, you must configure the dispatchers not to use their built-in load balancing for low affinity requests. This ensures that requests remain at the server where the external load-balancing mechanism directed them.

Request Affinity

Affinity refers to whether a request is assigned to a specific server or whether a load-balancing mechanism can assign it to another server. Affinity between request and server ensures that requests are routed to an appropriate computer for processing. Cognos 8 uses the following types of affinity: absolute, control, high, low, session, and server. The cancel operation is handled with a dedicated connection and does not have an affinity type.

To ensure that requests are managed efficiently and load is balanced, Cognos 8 uses request affinity to route some requests. For example, requests are routed back to the Cognos 8 server that handled earlier, related requests. Cognos 8 does this automatically. The use of one or more load-balancing mechanisms does not disrupt request affinity processing.

ReportService connections can be defined as AffineConnections or NonAffineConnections. AffineConnections accept only absolute and high affinity requests. NonAffineConnections accept all types of reportService requests.
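The acceptance rule for the two connection types can be written as a small lookup, shown here in Python. The table and the function name are illustrative only; the affinity names are taken from the list of affinity types given earlier in this section.

```python
# The six affinity types named in the text.
ALL_AFFINITIES = {"absolute", "control", "high", "low", "session", "server"}

# Which affinity types each kind of reportService connection accepts.
ACCEPTED = {
    "AffineConnection": {"absolute", "high"},
    "NonAffineConnection": ALL_AFFINITIES,
}

def accepts(connection_type, affinity):
    """Return True if the connection type accepts requests of this affinity."""
    if affinity not in ALL_AFFINITIES:
        raise ValueError(f"unknown affinity type: {affinity}")
    return affinity in ACCEPTED[connection_type]

print(accepts("AffineConnection", "high"))
print(accepts("AffineConnection", "low"))
print(accepts("NonAffineConnection", "low"))
```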

Absolute Affinity

Absolute affinity requests are always routed back to the server that processed the original request. If the server is not available, the request fails. For example, when a user cancels a running report, absolute affinity routes the cancel request back to the executing process. Absolute affinity is used to create an association between the client and the executing server to ensure that long-running requests do not time out.

Cognos 8 routes absolute affinity requests to a specific server, regardless of the load balancing used. An absolute affinity request is used with operations such as getOutput and release.

Control Affinity

Control affinity requests are routed in the same way as absolute affinity requests. A control affinity request is reserved for system operations such as wait and cancel.

High Affinity

High affinity requests can be processed on any of a number of servers, but resource consumption is minimized if the request is routed back to the executing process. The dispatcher routes a high affinity request to the server that is specified by the conversation context node ID. If the specified server is not available, the request is routed to any available server.

For example, when a pageDown command is run while reading a report, the command can be run most efficiently by using the process that served up the page that is shown. If that process is not available because the administrator shut down the computer or there was a network failure, the request is routed to another available process. The next page can still be served up, although the process will be slower.

Cognos 8 routes high affinity requests to a specific server regardless of the load balancing used. A high affinity request is used with the following operations: back, email, firstPage, forward, lastPage, nextPage, previousPage, print, render, save, and saveAs.

Low Affinity

Low affinity requests run equally efficiently on any computer. For example, a report request can run on any computer in the Cognos 8 system.

A low affinity request is used with the following operations: add, collectParameterValues, execute, getMetadata, getParameters, query, testDataSourceConnection, update, and validate.

For more information about affinity in Cognos 8, see Setting Affinity Connections.

Session Affinity

Session affinity requests are routed according to the conversation context node ID. If the node ID is present, they are routed in the same way as a high affinity request. If the node ID is absent, they are routed in the same way as a low affinity request. Session affinity is used with the query reuse feature: when query reuse is turned on and you run a report for the first time, the query is stored in the cache of your current session and reused the next time you run the report. For more information, see the Framework Manager User Guide.

Server Affinity

Server affinity requests are routed in the same way as absolute affinity requests. Server affinity is used for data source testing in administration: an administrator can test the connection to a new data source. For more information, see the topic about creating a data source in the Administration and Security Guide.
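The routing rules for the six affinity types described in the preceding sections can be summarized in a short Python sketch. The function, its parameters, and the random fallback policy are assumptions made for illustration; the sketch models only the routing decisions stated in the text, not the dispatcher's actual implementation.

```python
import random

# Affinity types that must return to the original server or fail.
STRICT = {"absolute", "control", "server"}

class NoServerAvailable(Exception):
    """Raised when a request cannot be routed."""

def route(affinity, available, node_id=None, pick=random.choice):
    """Return the server that should handle a request.

    affinity  -- one of the six affinity types named in the text
    available -- list of servers that are currently reachable
    node_id   -- server named by the conversation context, if any
    pick      -- load-balancing policy used when any server will do
    """
    if not available:
        raise NoServerAvailable("no servers available")
    if affinity in STRICT:
        # The request fails if the original server is not available.
        if node_id in available:
            return node_id
        raise NoServerAvailable(f"{node_id} is required but unavailable")
    if affinity == "session":
        # Routed as high affinity when a node ID is present, else as low.
        affinity = "high" if node_id is not None else "low"
    if affinity == "high":
        # Prefer the original server, fall back to any available server.
        return node_id if node_id in available else pick(available)
    if affinity == "low":
        return pick(available)
    raise ValueError(f"unknown affinity type: {affinity}")

servers = ["server1", "server2"]
print(route("high", servers, node_id="server2"))
print(route("high", servers, node_id="server3"))
```

In the second call the preferred server is not available, so the high affinity request falls back to one of the available servers. An absolute, control, or server affinity request in the same situation raises an error instead.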

Cognos 8 Server Scalability

The Cognos 8 application server has one or more Cognos 8 servers. Each Cognos 8 installation contains Content Manager to manage data stored in the content store. Each Cognos 8 server contains a dispatcher that runs the Cognos 8 presentation service, batch report and report services, job and schedule monitor service, and log service.

Only one Content Manager is active at a time. The others are on standby. A standby Content Manager becomes active only if the computer on which the active Content Manager is installed fails.
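The active and standby relationship can be sketched in Python as follows. The class, the host names, and the rule of promoting the first healthy standby are assumptions for the example; the text does not describe how Cognos 8 chooses which standby to activate.

```python
from dataclasses import dataclass

@dataclass
class ContentManager:
    host: str
    healthy: bool = True  # False simulates a failed computer

def select_active(managers, active):
    """Return the Content Manager that should be active.

    The current active instance keeps its role while its computer is up.
    A standby is promoted only when the active instance's computer fails.
    """
    if active is not None and active.healthy:
        return active
    for candidate in managers:
        if candidate.healthy:
            return candidate
    return None

cm1 = ContentManager("cm-host-1")
cm2 = ContentManager("cm-host-2")
managers = [cm1, cm2]

active = select_active(managers, None)    # cm1 becomes active
cm1.healthy = False                       # simulate a computer failure
active = select_active(managers, active)  # the standby cm2 is promoted
print(active.host)
```

Note that in this sketch a recovered computer does not automatically reclaim the active role, which matches the statement that a standby becomes active only on failure.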

To improve scalability, you can enable or disable Content Manager and the dispatcher services on individual application servers to balance the load for a given computer by request type. For example, if you have three application server computers, you might dedicate one to running interactive report requests, another to Content Manager, and the third to the other Cognos 8 services.

By targeting processing at specific computers in this way, you can control the load on each computer. For example, putting Content Manager on its own computer ensures that other requests do not degrade its performance by competing for resources. Because report runs tend to be resource intensive, we recommend that you isolate the report services from other activities, especially in larger Cognos 8 deployments. However, before making this type of tuning change, analyze your user requirements carefully.

Cognos 8 Services Scalability

Cognos 8 services operate as threads within the dispatcher. The report services differ from the other services in the way they contribute to scalability.

Report and Batch Report Services

The report and batch report services are multi-instance components of Cognos 8. As a result, one or more instances can be configured to operate on each Cognos 8 computer.

The same program is used for both the report service, which handles interactive requests, and the batch report service, which handles scheduled tasks. For information about configuring the number of instances of the report services and the number of threads that each instance handles, see the Administration and Security Guide.

Content Manager

Content Manager, which can be installed in Tier 2 or 3 of Cognos 8, stores information in the content store. To allow fast retrieval, Content Manager builds an in-memory cache to service many requests. This ensures optimal performance and enhances scalability by limiting the number of database reads required to meet user requests.
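The way an in-memory cache limits database reads can be shown with a minimal read-through cache in Python. The class name, the dictionary that stands in for the content store, and the object path are all hypothetical; the sketch ignores cache invalidation and memory limits, which a real implementation would need.

```python
class CachedContentStore:
    """Serve repeated requests from memory, reading the database once per item."""

    def __init__(self, database):
        self._database = database  # stands in for the content store database
        self._cache = {}
        self.database_reads = 0

    def get(self, path):
        """Return the stored object, reading the database only on a cache miss."""
        if path not in self._cache:
            self.database_reads += 1
            self._cache[path] = self._database[path]
        return self._cache[path]

store = CachedContentStore({"/content/report1": "report specification"})
for _ in range(3):
    store.get("/content/report1")
print(store.database_reads)
```

Three requests for the same object result in a single database read; the other two are served from the cache.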

In the single Content Manager process, multiple threads can concurrently service requests for content. Content Manager creates one or more threads for each user request. Performance depends on the power of the central processing unit (CPU) of the computer on which Content Manager is installed.

To increase scalability, use a larger computer capable of managing more concurrent request threads. When scaling up Content Manager, be sure to scale up the content store relational database management system so that it does not impede Content Manager performance.

Other Services

The scalability of the presentation service, job and schedule monitor service, and log service is primarily dependent on CPU capacity and available memory. These components can be scaled in two ways:

vertically, by using a larger computer capable of managing more concurrent request threads
horizontally, by running the services on additional computers

There is no specific configuration to tune these components. However, you can view the load-balancing configuration set by the server administrator to determine how much load is given to each computer running these services.