Clustering in MQ


Working with MQ cluster queue objects

Active Messaging allows you to use Adaptive Server as a client to communicate with the WebSphere MQ cluster feature. You can use msgsend to send messages to all the cluster queues on any cluster that is connected to a queue manager.

Note: The msgrecv function does not support remote queue connections.

A cluster can have more than one queue manager hosting an instance of the same queue. For example, two queue managers, named MASTER_MQ1 and SLAVE_MQ1, both host cluster queue CQ1. Both queue managers then join cluster INV_CQ1, resulting in two instances of the CQ1 cluster queue in the cluster INV_CQ1.

To specify your remote queue manager, use remote_qmgr in your endpoint syntax segment. You can omit the remote_qmgr option if you are sending a message to a cluster queue that has multiple instances and you do not care which instance is the destination, or you do not need to balance the workload between cluster queue instances yourself. In such cases, WebSphere MQ balances the workload on its own:

  • If there is an instance on the connected queue manager, WebSphere MQ automatically chooses it.

  • If there is no instance on the connected queue manager, WebSphere MQ determines which instance is suitable.

If you prefer not to use the default algorithm, define a cluster workload exit. An exit is a feature of WebSphere MQ that is similar to a trigger in a database. For more information on exits and how to define them, see your IBM WebSphere MQ documentation.
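As a rough sketch of how such an exit is hooked up on the queue manager side (the exit module path, entry point, and data string below are hypothetical; the exact format varies by platform), the MQSC looks something like:

* Hypothetical example only: register a cluster workload exit on the queue manager
ALTER QMGR CLWLEXIT('/var/mqm/exits/myclwl(ClwlExitFn)') CLWLDATA('example data')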

By using clusters with multiple instances of the same queue, you can route a message to any queue manager that hosts a copy of the correct queue. However, this adversely affects users who have multiple messages that need to maintain their sequential integrity. For example, a customer sends the following messages to a vendor:

  1. “Send 100 widgets,” sent at 9:00 a.m.

  2. “Send 50 widgets,” sent at 9:30 a.m.

  3. “Cancel the first request,” sent at 10:00 a.m.

In this example, the messages must maintain the correct sequence for the vendor to know that the final quantity the customer wishes to purchase is 50 widgets (that is, 100 + 50 – 100 = 50). If message 2 were to arrive before message 1, the vendor would erroneously believe the customer wished to purchase 100 widgets.

Users can address this issue by putting these messages in the same instance by specifying clustQBinding, an option_string type in the msgsend function. The options for clustQBinding are bind, nobind, and default. For a full description of these options as well as examples, see the reference pages for msgsend.

Source: https://infocenter.sybase.com/help/topic/com.sybase.infocenter.dc01120.1570/html/aseamug/CHDCDHFI.htm

Cluster deployment for high availability

A cluster deployment is a logical grouping of three RabbitMQ broker nodes behind a Network Load Balancer, each sharing users, queues, and a distributed state across multiple Availability Zones (AZ).

In a cluster deployment, Amazon MQ automatically manages broker policies to enable classic mirroring across all nodes, ensuring high availability (HA). Each mirrored queue consists of one main node and one or more mirrors. Each queue has its own main node. All operations for a given queue are first applied on the queue's main node and then propagated to mirrors. Amazon MQ creates a default system policy that sets the ha-mode to all and the ha-sync-mode to automatic. This ensures that data is replicated to all nodes in the cluster across different Availability Zones for better durability.

Note

During a maintenance window, all maintenance to a cluster is performed one node at a time, keeping at least two running nodes at all times. Each time a node is brought down, client connections to that node are severed and need to be re-established. You must ensure that your client code is designed to automatically reconnect to your cluster. For more information about connection recovery, see Automatically recover from network failures.

Because Amazon MQ sets ha-sync-mode to automatic, during a maintenance window, queues will synchronize when each node re-joins the cluster. Queue synchronization blocks all other queue operations. You can mitigate the impact of queue synchronization during maintenance windows by keeping queues short.

The default policy should not be deleted. If you do delete this policy, Amazon MQ will automatically recreate it. Amazon MQ will also ensure that HA properties are applied to all other policies that you create on a clustered broker. If you add a policy without the HA properties, Amazon MQ will add them for you. If you add a policy with different high availability properties, Amazon MQ will replace them. For more information about classic mirroring, see Classic mirrored queues.

Important

Amazon MQ does not support quorum queues. Enabling the quorum queue feature flag and creating quorum queues will result in data loss.

The following diagram illustrates a RabbitMQ cluster broker deployment with three nodes in three Availability Zones (AZ), each with its own Amazon EBS volume and a shared state. Amazon EBS provides block level storage optimized for low-latency and high throughput.

Source: https://docs.aws.amazon.com/amazon-mq/latest/developer-guide/rabbitmq-broker-architecture-cluster.html

WebSphere Remote MQ Clustering

MQ does not have a 'remote get' capability - i.e. you cannot use local bindings to a queue manager and get a message from another queue manager. If you want to do this, you need to use client bindings to go directly to the queue manager where the message resides.

At MQPUT time, a decision has to be made (on the putting queue manager), where to forward the message to (e.g. which local queue, or which transmission queue to pass it to another queue manager).

In a cluster setup, if you have a queue defined on one queue manager and put it in the cluster, anyone from any of the clustered queue managers can put to it as though it was a local queue. However, their MQPUTs result in the message arriving (via the cluster channels) on that one particular instance. Therefore, from a different queue manager, while you can put the message to the queue, you cannot get it.

You could have a queue with the same name defined on multiple queue managers and clustered, as per @JoshMc's suggestion, but this means that at MQPUT time, the message is routed to one, and only one, instance of that queue - if it was routed to the remote queue manager's clustered definition you still would not be able to get it from the local queue manager. Imagine you had a cluster of 3 qmgrs. You can create a clustered queue called 'FRED' in 2 of them. All of them can put to FRED - but 2 of them will default to put to their local queue only (unless you set CLWLUSEQ=ANY), and the other will (usually) alternate between the 2 remote instances. Each queue will definitely have different messages on it.

https://www.ibm.com/developerworks/community/blogs/messaging/entry/Undestanding_on_MQ_Cluster_Work_Load_Management_Algorithm_and_Attributes?lang=en
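As a minimal MQSC sketch of that scenario (the cluster name DEMO is an assumption), you would define the same clustered queue on two of the three queue managers, and add CLWLUSEQ(ANY) if local putters should also be balanced across the instances:

* Run on each of the two hosting queue managers (cluster name DEMO is illustrative)
DEFINE QLOCAL(FRED) CLUSTER(DEMO) DEFBIND(NOTFIXED) CLWLUSEQ(ANY)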

Source: https://stackoverflow.com/questions/48150492/websphere-remote-mq-clustering
Clustering in MQ Tutorial

 

WebSphere MQ commands for working with clusters


This section introduces MQSC commands that apply specifically to work with WebSphere MQ clusters:

  • DISPLAY CLUSQMGR
  • SUSPEND QMGR
  • RESUME QMGR
  • REFRESH CLUSTER
  • RESET CLUSTER

The PCF equivalents to these commands are:

  • MQCMD_INQUIRE_CLUSTER_Q_MGR
  • MQCMD_SUSPEND_Q_MGR_CLUSTER
  • MQCMD_RESUME_Q_MGR_CLUSTER
  • MQCMD_REFRESH_CLUSTER
  • MQCMD_RESET_CLUSTER

 

DISPLAY CLUSQMGR

Use the DISPLAY CLUSQMGR command to display cluster information about queue managers in a cluster. If you issue this command from a queue manager with a full repository, the information returned pertains to every queue manager in the cluster. If you issue this command from a queue manager that does not have a full repository, the information returned pertains only to the queue managers in which it has an interest. That is, every queue manager to which it has tried to send a message and every queue manager that holds a full repository.

The information includes most channel attributes that apply to cluster-sender and cluster-receiver channels, such as:

DEFTYPE How the queue manager was defined. DEFTYPE can be one of the following:

CLUSSDR
Defined explicitly as a cluster-sender channel

CLUSSDRA
Defined by auto-definition as a cluster-sender channel

CLUSSDRB
Defined as a cluster-sender channel, both explicitly and by auto-definition

CLUSRCVR
Defined as a cluster-receiver channel

QMTYPE Whether it holds a full repository or only a partial repository.
CLUSDATE The date at which the definition became available to the local queue manager.
CLUSTIME The time at which the definition became available to the local queue manager.
STATUS The current status of the cluster-sender channel for this queue manager.
SUSPEND Whether the queue manager is suspended.
CLUSTER What clusters the queue manager is in.
CHANNEL The cluster-receiver channel name for the queue manager.
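For example, to see a subset of these attributes for every queue manager known to the local repository:

DISPLAY CLUSQMGR(*) DEFTYPE QMTYPE STATUS SUSPEND CHANNEL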

 

SUSPEND QMGR and RESUME QMGR

Use the SUSPEND QMGR command and RESUME QMGR command to remove a queue manager from a cluster temporarily, for example for maintenance, and then to reinstate it. Use of these commands is discussed in Maintaining a queue manager.
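As an illustration (the cluster name INVENTORY is illustrative), a maintenance cycle might look like:

SUSPEND QMGR CLUSTER(INVENTORY) MODE(QUIESCE)
* ... perform the maintenance on the queue manager ...
RESUME QMGR CLUSTER(INVENTORY)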

 

REFRESH CLUSTER

You are unlikely to need to use this command, except in exceptional circumstances. Issue the REFRESH CLUSTER command from a queue manager to discard all locally held information about a cluster.

There are two forms of this command using the REPOS parameter.

Using REFRESH CLUSTER(clustername) REPOS(NO) provides the default behavior. The queue manager will retain knowledge of all cluster queue managers and cluster queues marked as locally defined, and of all cluster queue managers that are marked as full repositories. In addition, if the queue manager is a full repository for the cluster, it will also retain knowledge of the other cluster queue managers in the cluster. Everything else will be removed from the local copy of the repository and rebuilt from the other full repositories in the cluster. Cluster channels will not be stopped if REPOS(NO) is used; a full repository will use its CLUSSDR channels to inform the rest of the cluster that it has completed its refresh.

Using REFRESH CLUSTER(clustername) REPOS(YES) specifies that in addition to the default behavior, objects representing full repository cluster queue managers are also refreshed. This option may not be used if the queue manager is itself a full repository. If it is a full repository, you must first alter it so that it is not a full repository for the cluster in question. The full repository location will be recovered from the manually defined CLUSSDR definitions. After the refresh with REPOS(YES) has been issued the queue manager can be altered so that it is once again a full repository, if required.

You can issue REFRESH CLUSTER(*). This refreshes the queue manager in all of the clusters it is a member of. If used with REPOS(YES) this has the additional effect of forcing the queue manager to restart its search for full repositories from the information in the local CLUSSDR definitions, even if the CLUSSDR connects the queue manager to several clusters.
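For example (the cluster name INVENTORY is illustrative):

* Default refresh: keep locally defined objects and knowledge of the full repositories
REFRESH CLUSTER(INVENTORY) REPOS(NO)

* Also refresh the full repository objects (not valid if this queue manager is itself a full repository)
REFRESH CLUSTER(INVENTORY) REPOS(YES)

* Refresh the queue manager in every cluster it belongs to
REFRESH CLUSTER(*) REPOS(NO)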

For information on resolving problems with the REFRESH CLUSTER command see Resolving Problems.

 

RESET CLUSTER

You are unlikely to need to use this command, except in exceptional circumstances. Use the RESET CLUSTER command to forcibly remove a queue manager from a cluster. You can do this from a full repository queue manager by issuing either the command:

RESET CLUSTER(clustername) QMNAME(qmname) ACTION(FORCEREMOVE) QUEUES(NO)

or the command:

RESET CLUSTER(clustername) QMID(qmid) ACTION(FORCEREMOVE) QUEUES(NO)

You cannot specify both QMNAME and QMID.

Specifying QUEUES(NO) on a RESET CLUSTER command is the default. Specifying QUEUES(YES) means that references to cluster queues owned by the force-removed queue manager are removed from the cluster in addition to the cluster queue manager itself. The cluster queues are removed even if the cluster queue manager is not visible in the cluster, perhaps because it was previously force removed without the QUEUES option.

You might use the RESET CLUSTER command if, for example, a queue manager has been deleted but still has cluster-receiver channels defined to the cluster. Instead of waiting for WebSphere MQ to remove these definitions (which it does automatically) you can issue the RESET CLUSTER command to tidy up sooner. All other queue managers in the cluster are then informed that the queue manager is no longer available.

In an emergency where a queue manager is temporarily damaged, you might want to inform the rest of the cluster before the other queue managers try to send it messages. RESET CLUSTER can be used to remove the damaged queue manager. Later when the damaged queue manager is working again, you can use the REFRESH CLUSTER command to reverse the effect of RESET CLUSTER and put it back in the cluster again.

Using the RESET CLUSTER command is the only way to delete auto-defined cluster-sender channels. You are unlikely to need this command in normal circumstances, but your IBM(R) Support Center might advise you to issue the command to tidy up the cluster information held by cluster queue managers. Do not use this command as a short cut to removing a queue manager from a cluster. The correct way to do this is described in Removing a queue manager from a cluster.

You can issue the RESET CLUSTER command only from full repository queue managers.

If you use QMNAME, and there is more than one queue manager in the cluster with that name, the command is not actioned. Use QMID instead of QMNAME to ensure the RESET CLUSTER command is actioned.
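A sketch of that sequence, run on a full repository (the queue manager, cluster, and QMID values are illustrative): first display the QMID, then force-remove by QMID:

DISPLAY CLUSQMGR(OLDQM) QMID
RESET CLUSTER(INVENTORY) QMID('OLDQM_2005-03-04_11.07.01') ACTION(FORCEREMOVE) QUEUES(NO)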

Because repositories retain information for only 90 days, after that time a queue manager that was forcibly removed can reconnect to a cluster. It does this automatically (unless it has been deleted). If you want to prevent a queue manager from rejoining a cluster, you need to take appropriate security measures. See Preventing queue managers joining a cluster.

All cluster commands (except DISPLAY CLUSQMGR) work asynchronously. Commands that change object attributes involving clustering will update the object and send a request to the repository processor. Commands for working with clusters will be checked for syntax, and a request will be sent to the repository processor.

The requests sent to the repository processor are processed asynchronously, along with cluster requests received from other members of the cluster. In some cases, processing may take a considerable time if they have to be propagated around the whole cluster to determine if they are successful or not.

 

On z/OS

In both cases, message CSQM130I will be sent to the command issuer indicating that a request has been sent. This message is followed by message CSQ9022I to indicate that the command has completed successfully, in that a request has been sent. It does not indicate that the cluster request has been completed successfully.

Any errors are reported to the z/OS console on the system where the channel initiator is running; they are not sent to the command issuer.

This is not like channel operation commands, which operate synchronously, but in two stages.

 

Source: http://www.setgetweb.com/p/WAS51/MQ/clusters/csqzah0615.html


July 12, 2019

In this post, we will take a look at enabling TLS for IBM MQ inter-cluster communication. Enabling TLS on the MQ cluster is one of the easiest things you can do for the security of your MQ infrastructure.

Prerequisites

First of all, you need to have your key stores setup on all queue managers in the cluster and certificates loaded into them.

Benefits

Locks on doors

Second of all: why? Well, after you authenticated the users or applications connecting to the cluster and then checked their authorizations against OAM for various objects in your MQ infrastructure, they will start sending messages and the MQ cluster will start routing them. Those messages will be transmitted across the wire in the clear, and can be easily intercepted and reconstructed. For this very reason, whenever a message is in flight, it’s a good idea to secure both the origin and the destination with TLS v1.2 or above. If a message is intercepted, it will be encrypted and will take a very long time to decrypt.

Enabling TLS for IBM MQ

 

An MQ cluster consists of one or more full repository queue managers and one or more partial repository queue managers. Securing the communication in the MQ cluster means enabling TLS on the channels that queue managers use to talk to each other. Each member of the cluster has a cluster-receiver channel and 1 or more cluster-sender channels connecting it to full repositories.

Technical Steps

These steps have been tested on MQ v8 and v9.

  1. Update cluster-receivers first. Pick a TLS v1.2 cipher and make sure it’s the same on all channels. Run this command for every member of the cluster: full repositories as well as partial repositories.

ALTER CHANNEL (<channel name>) CHLTYPE (CLUSRCVR)
SSLCIPH (ECDHE_RSA_AES_256_GCM_SHA384)

  2. Take a 10-15 minute pause to allow MQ to catch up.
  3. Update cluster-senders second. Make sure you use the same cipher you used on cluster-receiver channels. Run this command for every member of the cluster: full repositories as well as partial repositories. If you have two full repositories serving your cluster, make sure you run this command on both cluster-sender channels on your partial repositories.

ALTER CHANNEL (<channel name>) CHLTYPE (CLUSSDR)
SSLCIPH (ECDHE_RSA_AES_256_GCM_SHA384)

That’s it! Now your cluster traffic is encrypted. Let me know if these instructions worked for you!
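One way to double-check (a sketch; run it on each queue manager in turn) is to display the cipher now set on the cluster channel definitions and confirm the channels have come back up:

DISPLAY CHANNEL(*) CHLTYPE(CLUSRCVR) SSLCIPH
DISPLAY CHANNEL(*) CHLTYPE(CLUSSDR) SSLCIPH
DISPLAY CHSTATUS(*) STATUS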

EDIT July 19, 2019:

A user pointed out in the comments that they are getting AMQ9288 after performing steps outlined in the post.
It required further investigation and, I believe, I found a solution. Appending it here for those that run into this issue.

In version 8.0.0.4 IBM changed how GCM ciphers behave (including the one I used in the example).
If you send a certain number of TLS records with the same cipher key the channel will end with AMQ9288.

In order to avoid that, you have two choices:

  1. Change the number of bytes sent across the channel before a session key is renegotiated by MQ. The default value is 0, which means: don’t renegotiate. Change it with this command:

ALTER QMGR SSLRKEYC(10000000)

It will now be renegotiated after every, approximately, 10MB of data is passed on the channel.
You can specify your own number of bytes up to 999,999,999.

You can also change it in MQ Explorer, under queue manager properties, in SSL tab.
Remember to refresh security of type SSL after you make changes to SSL configuration of the queue manager:

REFRESH SECURITY TYPE(SSL)

  2. Use a different cipher, e.g.: ECDHE_RSA_AES_256_CBC_SHA384.

Here is a table  of all supported MQ v9 ciphers you can pick from: Enabling CipherSpecs

But you need to remember that some of them might require different digital signature algorithms when requesting your certificate, e.g. ECDHE_ECDSA_* ciphers.

You can also enable an environment variable to circumvent this restriction (not recommended for production deployments).

Please refer to this article by IBM for the source of this explanation: Change in behavior for channels using GCM based CipherSpecs

Documentation on how to change SSL reset count: Resetting SSL and TLS secret keys

About the Author

Graduated from San Francisco State University with a Bachelor of Science degree in Computer Science. Worked at IBM as a WebSphere Technical Specialist from 2005 to 2010. Proud to be a team member at Perficient since 2014!

More from this Author

Source: https://blogs.perficient.com/2019/07/12/enabling-tls-for-ibm-mq-inter-cluster-communication/

In MQ 9.1.2 there is a new function called Uniform Clustering, which I thought looked interesting (with my background in performance and real customer usage of MQ).

Ive had a play with it, and written up what I have found.

What is it?

When Uniform Clustering is active and it detects an imbalance in the number of conversations across queue managers, it can send a request to a connected application to request disconnect and reconnect. This happens under the covers, and it means you do not need to write code to handle this.

MQ has supported client reconnect for a few years. In V8.0 you can stop a channel, or use endmqm -r to get the channels to automagically disconnect and reconnect to a different queue manager with no application code.

I would call it conversation balancing with a side effect of workload balancing. It helps solve the problem where one server is getting most of the work and other servers are under utilized.

By having the connections for an application spread across all of the available queue managers, it should spread the workload across the available queue managers, but the workload balancing depends on the spread of work on each connection.

The documentation originally talked about application balancing – which I think was confusing, as it does not balance applications, it balances where the applications connect to.

A good client has the following characteristics

  1. It connects for a long time, and avoids frequent short lived connections.
  2. It periodically disconnects and reconnects, so over time the connections are spread across all servers.
  3. More instances can be started if needed to service the queues. These instances can be spread around the available servers.
  4. Instances can shut down if there is no work for them. For example MQGET wait for 10 minutes and no message arrives.

The Uniform Clustering helps automate the periodic disconnect and reconnect (situation 2 above).

The IBM documentation says it simplifies the administration and set up – I cannot see how this helps, as you have to define the queues and channels anyway – they do not need to be clustered.

The IBM documentation says Uniform Clustering moves reconnection logic from the application to the queue manager. This is true, but production ready applications need to have additional logic in them to support this (see below).

You should not just turn on Uniform Clustering, you need to review your applications to check they can run in this environment. If you just turn it on, it may appear to work; the problems may be subtle, show up at a later date, and also make trouble shooting harder.

How does it work?

Once the queue managers have been set up, they monitor the number of instances of applications connected to the queue manager. If you have two queue managers and have 20 instances of serverprog connected to QMA, and 0 instances connected to QMC, then over time some of the connections to QMA will be told to disconnect and reconnect, some may reconnect to QMA, and some may reconnect to QMC. Over time the number of conversations should balance out across the available queue managers.

Below are some charts of showing how this balancing works. I had a number of “server” program connected as a client. They started and all sessions connected to QMA. They did not process any messages. From the reports produced by my MQCB program, I could see when application instances were asked to disconnect and reconnect.

The chart below shows the rate of reconnecting for 20 servers connecting as clients to 2 queue managers – doing no work. After 300 seconds there were 10 connections to each queue manager.

The chart below shows the rate of reconnecting for 80 servers connecting as clients to 2 queue managers – doing no work. After 468 seconds there were 40 connections to each queue manager.

We can see that balancing requests are sent out every minute or two. The number of conversations moved depends on how unbalanced the configuration is. The time before the connections were balanced varied from run to run, but the above charts are typical.

What gets balanced.

I had two applications running into my queue managers. If you use DIS CONN(*) APPLTAG, it shows you the names of the programs running.

My client programs had APPLTAG(myclient), my server programs had APPLTAG(serverprog).

The uniform clustering will balance myclient programs as a group, and serverprog programs as a group.

You may have many client programs, for example hundreds of sessions in a web server, and only a few server programs processing the requests from the clients, so they may get balanced at different rates.

This looks like a really useful capability, but you need to be careful.

The MQ reconnection code will open the queue names you were using, and it is transparent to the application.

A thread may get a request to disconnect and reconnect, while the application is processing an MQ request, waiting for a message, or doing other work. For some application patterns this may not matter, for others you may need to take action.

Where’s my reply?

For a server application which does MQGET, MQPUT, MQCOMMIT: if the reconnect request happens, the work can get backed out. Another application can process the work. Great – no problems.

For a client application, these typically do (MQPUT to server queue, MQCOMMIT), (MQGET wait on reply-to-queue, MQCOMMIT). The reconnection request can happen during the MQGET wait. The MQPUT request specified a reply-to queue and reply-to queue manager. If the application has a reconnect request, it may be connected to a different queue manager, so it will not be able to get the reply message (as the message is on the original queue manager).

This problem is due to the reconnection support, and has been around for a long time, so most people will have a process in place to handle this. Uniform Clustering makes no difference to this, it happens without you knowing.

Reporting the wrong queue manager.

Good applications report problems with enough information to identify the problems. For example queue manager name, queue and unexpected return code. If you did MQINQ to find the queue manager name at startup, and if your application instance has been reconnected, the queue manager name may now be wrong.

  1. You can use MQCB to capture and report these queue manager changes, so the reconnects and new queue manager name are written to the application log.
  2. You could issue MQINQ for the queue manager name when you report a problem, but the connection may have moved by the time you report the problem.
  3. You also need to handle which queue manager the MQPUT was done on, as this could be different to where the MQGET completed. This might just be a matter of saving the queue manager name in a MQPUT_QM variable every time you do an MQPUT. You need to do this when tracking down missing messages – you need to know which system the MQPUT was done on.
  4. You could keep the time of the MQPUT, report “Reply not received, MQPUT was put at 12:33:44” and then review the application log (1 above) to see what it was connected to at that time.

What gets balanced

Conversations get balanced. So if you have a channel with 4 shared conversations, (DIS CHS gives CURSHRCNV(4)), you might end up with a channel to QMA with one conversation, a channel to QMB with two conversations and a channel to QMC with one conversation. Some channels may have only one conversation per channel instance.

Are there any new commands?

I could not find any new commands.

Can I turn off this automatic rebalancing?

To put your queue manager in and out of maintenance mode, see here

This is a “challenge” with reconnection, not with Uniform Cluster support. If you change the qm.ini file and remove the

TuningParameters:
UniformClusterName=MYCLUSTER

statements, this just means the applications connected to this queue manager will not get told to rebalance. You will still get applications trying to connect to the queue manager.


Published by colin paice

I retired from IBM where I worked on MQ on z/OS, and did customer stuff. I retired, and keep my hand in with MQ, by playing with it! View all posts by colin paice


Source: https://colinpaice.blog/2019/04/05/uniform-clustering-in-9-1-2-gets-a-tick-and-a-caution-from-me/


MQ cluster




Concepts

Purpose : workload balancing & simplified administration & scalability

Problem : stuck message(s) @ Xmit Q(s)!

Requirement : "define ql(nom) DEFBIND(NOTFIXED)" or MQOO_BIND_NOT_FIXED, instead of "DEFBIND(OPEN)"

Architecture

An MQ cluster is ... You don't define a cluster as such; you define cluster attributes in each queue manager, and each queue manager becomes a member of a logical entity that is referred to as a "queue manager cluster".

It has 2 FRs and lots of PRs.

FR holds info about the cluster topology = participant qmgrs and shared queues.

How do you add a queue manager as a partial repository except by creating a cluster sender and a cluster receiver channel? A qmgr becomes a partial repository (PR) when an object (queue, channel, etc.) is defined with the CLUSTER() attribute naming the cluster in which the object is to be known.

Cluster routing and operation

At the SMQ qmgr we can display the cluster sender channels, listening on one transmit queue:

display qstatus(SYSTEM.CLUSTER.TRANSMIT.QUEUE) type(handle) APPLTAG CHANNEL
     6 : display qstatus(SYSTEM.CLUSTER.TRANSMIT.QUEUE) type(handle) APPLTAG CHANNEL
AMQ8450: Display queue status details.
   QUEUE(SYSTEM.CLUSTER.TRANSMIT.QUEUE)    TYPE(HANDLE)
   APPLTAG(C:\MQ\bin\amqrmppa.exe)         CHANNEL(TO.SMQ2)
AMQ8450: Display queue status details.
   QUEUE(SYSTEM.CLUSTER.TRANSMIT.QUEUE)    TYPE(HANDLE)
   APPLTAG(C:\MQ\bin\amqrmppa.exe)         CHANNEL(TO.SMQ3)
AMQ8450: Display queue status details.
   QUEUE(SYSTEM.CLUSTER.TRANSMIT.QUEUE)    TYPE(HANDLE)
   APPLTAG(C:\MQ\bin\amqrmppa.exe)         CHANNEL(TO.IB9QMGR)

SYSTEM.CLUSTER.TRANSMIT.QUEUE

  • holds outboud administrative messages
  • holds outbound user messages
  • CorrelId in MQMD added on transmission queue will contain the name of the channel that the message should be sent down

To optimize how this queue is drained, you can add the parameter PipeLineLength=2 to qm.ini.
PipeLineLength=2 enables overlap of putting messages onto TCP while waiting for acknowledgment of the previous batch. This enables overlap of sending messages while waiting for batch synchronization at the remote system.

URL : To allow an MCA to transfer messages using multiple threads, type the number of concurrent threads that the channel will use. The default is 1; if you type a value greater than 1, it is treated as 2. Make sure that you configure the queue manager at both ends of the channel to have a Pipeline length that is greater than 1. Pipelining is effective only for TCP/IP channels.
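As a minimal sketch, assuming the distributed-platform qm.ini layout, the parameter sits in the CHANNELS stanza:

CHANNELS:
   PipeLineLength=2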

Curious:

dis qstatus(SYSTEM.CLUSTER.TRANSMIT.QUEUE) type(handle) all
     5 : dis qstatus(SYSTEM.CLUSTER.TRANSMIT.QUEUE) type(handle) all
AMQ8101: WebSphere MQ error (18EBF0) has occurred.

We have to start the channels, and then we get:

dis qstatus(SYSTEM.CLUSTER.TRANSMIT.QUEUE) type(handle) all
    11 : dis qstatus(SYSTEM.CLUSTER.TRANSMIT.QUEUE) type(handle) all
AMQ8450: Display queue status details.
   QUEUE(SYSTEM.CLUSTER.TRANSMIT.QUEUE)    TYPE(HANDLE)
   APPLDESC(WebSphere MQ Channel)          APPLTAG(C:\MQ\bin\amqrmppa.exe)
   APPLTYPE(SYSTEM)                        BROWSE(YES)
   CHANNEL(TO.SMQ2)                        CONNAME(127.0.0.1(2417))
   ASTATE(NONE)                            HSTATE(ACTIVE)
   INPUT(SHARED)                           INQUIRE(YES)
   OUTPUT(YES)                             PID(9060)
   QMURID(0.155)                           SET(YES)
   TID(9)
   URID(XA_FORMATID[] XA_GTRID[] XA_BQUAL[])
   URTYPE(QMGR)


What data do I need to join a cluster

  • cluster name [1]
  • repository qmgr ip & port [2]
  • cluster-channel name [3]

DEFINE CHANNEL ('CLUSTER-NAME.MY-QM-NAME') +
       CHLTYPE(CLUSRCVR) +
       TRPTYPE(TCP) +
       CLUSTER('OUR-CLUSTER') +                    [1]

DEFINE CHANNEL ('CLUSTER-NAME.FR-QM-NAME') +       [3]
       CHLTYPE(CLUSSDR) +
       TRPTYPE(TCP) +
       CLUSTER('OUR-CLUSTER') +                    [1]
       CONNAME('remotehost.domain(1482)') +        [2]

Real sample :

*** Create the CLUSTER RCVR channel ***
def channel(TO.QM01) +
    chltype(CLUSRCVR) +
    trptype(TCP) +
    conname('my.hostname(1414)') +
    cluster(CLUSTERNAME) +                 [1]
    maxmsgl(104857600) +
    replace

*** Create the CLUSTER SDR channel ***
def channel(CLUSTERNAME.QMFR) +            [3]
    chltype(CLUSSDR) +
    trptype(TCP) +
    conname('host.remoto.fr(1415)') +      [2]
    cluster(CLUSTERNAME) +                 [1]
    maxmsgl(104857600) +
    replace

To receive information about the cluster configuration - the queues offered / visible to the cluster - we only need [1], the cluster name.
But to activate the CLUSRCVR channel we need to activate the CLUSSDR channel, which requires the remote ip/port [2], and therefore the channel name [3].

Preventing queue managers joining a cluster

It is difficult to stop a queue manager that is a member of a cluster from defining a queue. Therefore, there is a danger that a rogue queue manager can join a cluster, learn what queues are in it, define its own instance of one of those queues, and so receive messages that it should not be authorized to receive.

To prevent a queue manager receiving messages that it should not, you can write:

  • a channel exit program on each cluster-sender channel, which uses the connection name to determine the suitability of the destination queue manager to be sent the messages.
  • a cluster workload exit program, which uses the destination records to determine the suitability of the destination queue and queue manager to be sent the messages
  • a channel auto-definition exit program, which uses the connection name to determine the suitability of defining channels to the destination queue manager

MQ v7 queue manager clusters, csqzah09.pdf, pg 77


Naming / nomenclatura

First idea is to name

  • TO.LOCAL_QMGR_NAME = CLUSRCVR, cluster receiver channel
  • TO.REMOTE_QMGR_NAME = CLUSSDR, cluster sender channel
but when it comes to overlapping clusters, it is not good enough {see PDF}

Advanced naming convention

[TR] Naming convention I try to use instead is <cluster name>.<qmgr name>, as CLUSNAME.QMGRNAME, meaning "only one cluster per channel"

MQ cluster best practices {sagpdf}


Clustering demo

See commands in "\\MQ\Eines\Clustering_Demo\"

  1. MQ installation - product as is
  2. cluster configuration [administration] - makes adding new queues easy - no QREMOTE needed
  3. basic cluster operation - as MQ
  4. adding a new QM [scalability] - only 2 channels (+ the queues it offers)
  5. load test - 200 msg/second, of 1 KB
  6. workload balancing test [workload balancing] - without / with weights - CLWLWGHT/CLWLPRTY/CLWLRANK
  7. loss of a server [high availability] - QM down or queue unavailable (Put disabled)
  8. access to the cluster from an MQ Client outside the cluster [MQ client] - QALIAS at entry node

Minimum actions - create a cluster

LO qm  ALTER QMGR REPOS(INVENTORY)
NY qm  ALTER QMGR REPOS(INVENTORY)
LO qm  DEFINE CHANNEL(TO.LONDON) CHLTYPE(CLUSRCVR) TRPTYPE(TCP) CONNAME(LONDON.CHSTORE.COM) CLUSTER(INVENTORY)
NY qm  DEFINE CHANNEL(TO.NEWYORK) CHLTYPE(CLUSRCVR) TRPTYPE(TCP) CONNAME(NEWYORK.CHSTORE.COM) CLUSTER(INVENTORY)
LO qm  DEFINE CHANNEL(TO.NEWYORK) CHLTYPE(CLUSSDR) TRPTYPE(TCP) CONNAME(NEWYORK.CHSTORE.COM) CLUSTER(INVENTORY)
NY qm  DEFINE CHANNEL(TO.LONDON) CHLTYPE(CLUSSDR) TRPTYPE(TCP) CONNAME(LONDON.CHSTORE.COM) CLUSTER(INVENTORY)
NY qm  DEFINE QLOCAL(INVENTQ) CLUSTER(INVENTORY)

Add a QM (Paris) to the Cluster.

PA qm  DEFINE CHANNEL(TO.PARIS) CHLTYPE(CLUSRCVR) TRPTYPE(TCP) CONNAME(PARIS.CHSTORE.COM) CLUSTER(INVENTORY)    // clusRCVR must go first
PA qm  DEFINE CHANNEL(TO.LONDON) CHLTYPE(CLUSSDR) TRPTYPE(TCP) CONNAME(LONDON.CHSTORE.COM) CLUSTER(INVENTORY)   // clusSDR must go second

Add a QM+Q (Toronto + INVENTQ) to the Cluster.

TO qm  DEFINE CHANNEL(TO.TORONTO) CHLTYPE(CLUSRCVR) TRPTYPE(TCP) CONNAME(TORONTO.CHSTORE.COM) CLUSTER(INVENTORY)
TO qm  DEFINE CHANNEL(TO.NEWYORK) CHLTYPE(CLUSSDR) TRPTYPE(TCP) CONNAME(NEWYORK.CHSTORE.COM) CLUSTER(INVENTORY)
TO qm  DEFINE QLOCAL(INVENTQ) CLUSTER(INVENTORY)

Verify

NY qm  DIS QCLUSTER(*) CLUSTER (INVENTORY)
NY qm  DIS CLUSQMGR(*) CLUSTER (INVENTORY)
TO qm  DIS QCLUSTER(*) CLUSTER (INVENTORY)
TO qm  DIS CLUSQMGR(*) CLUSTER (INVENTORY)

Load Balance ( LA gets twice as many messages as NY )

LA qm  DEFINE CHANNEL(TO.LA) CHLTYPE(CLUSRCVR) TRPTYPE(TCP) CONNAME(LA.CHSTORE.COM) CLUSTER(INVENTORY) CLWLWGHT(2)
NY qm  ALTER CHANNEL(TO.NEWYORK) CHLTYPE(CLUSRCVR) CLWLWGHT(1)

Minimum actions - display the cluster

[fr/pr?] dis clusqmgr(*) conname qmtype status

Minimum actions - delete a cluster

You don't, as such. A Cluster isn't an "entity" that can be deleted.
Once you have altered the clustered objects to remove the cluster attribute, you can issue the REFRESH CLUSTER command on that qmgr.

Minimum actions - Remove a queue from a cluster


Sample cluster : 2xFR, 1 GW, 1 external qmgr

echo "DEFINE QLOCAL(INVENTQ) CLUSTER(INVENTORY) DEFBIND(NOTFIXED)" | runmqsc TQM4
echo "DEFINE QLOCAL(INVENTQ) CLUSTER(INVENTORY) DEFBIND(NOTFIXED)" | runmqsc TQM2
echo "DEFINE QREMOTE(ANY.INVENTQ) RNAME(' ') RQMNAME(' ')" | runmqsc TQM1
echo "ALTER QREMOTE(INVENTQ) RNAME(INVENTQ) RQMNAME(ANY.INVENTQ) XMITQ(TQM1)" | runmqsc TQM3

You have to be able to deduce that QM2 and 4 are the FR's, QM1 is the gateway, and QM3 is external to the cluster ...


Complete cluster : 2xFR, Nx ENT, Nx MB

See GNF

cluster GNF


Cluster resource's availability

  • endmqm -p qmgrname
    • first, messages go to the other cluster queue manager
    • finally, messages get stuck in SYSTEM.CLUSTER.TRANSMIT.QUEUE
  • alter ql(qname) put(disabled)
    • first, messages go to the other cluster queue manager
    • when all cluster queues are "put-disabled", the client application gets compcode '2' ('MQCC_FAILED') reason '2268' ('MQRC_CLUSTER_PUT_INHIBITED')

Cluster commands

Available commands are :

DISPLAY QCLUSTER(*) CLUSQMGR - displays queues in cluster

     1 : display qcluster(*) clusqmgr
AMQ8409: Ver detalles de la cola.
   QUEUE(QL.CLSAG.CLFR1.SEBAS)             TYPE(QCLUSTER)
   CLUSQMGR(CLFR1)
   QUEUE(QL.CLSAG.CLFR2.SEBAS)             TYPE(QCLUSTER)
   CLUSQMGR(CLFR2)
   QUEUE(QL.DELPHI.GRAW.IN)                TYPE(QCLUSTER)
   CLUSQMGR(QMAS)
   QUEUE(QL.DELPHI.GRAW.IN)                TYPE(QCLUSTER)
   CLUSQMGR(CLFR1)
   QUEUE(QL.DELPHI.GRAW.IN)                TYPE(QCLUSTER)
   CLUSQMGR(CLFR2)

DISPLAY CLUSQMGR(*) CONNAME QMTYPE STATUS - display queue managers in cluster

     1 : DISPLAY CLUSQMGR(*) CONNAME QMTYPE STATUS
AMQ8441: Ver detalles del gestor de colas de clúster.
   CLUSQMGR(CLFR1)                         CHANNEL(SAGCLUSTER.CLFR1)
   CLUSTER(SAGCLUSTER)                     CONNAME(99.137.164.25(2401))
   QMTYPE(REPOS)                           STATUS(RUNNING)
AMQ8441: Ver detalles del gestor de colas de clúster.
   CLUSQMGR(CLFR2)                         CHANNEL(SAGCLUSTER.CLFR2)
   CLUSTER(SAGCLUSTER)                     CONNAME(99.137.164.153(2401))
   QMTYPE(REPOS)                           STATUS(RUNNING)
AMQ8441: Ver detalles del gestor de colas de clúster.
   CLUSQMGR(QMAS)                          CHANNEL(SAGCLUSTER.QMAS)
   CLUSTER(SAGCLUSTER)                     CONNAME(6q(1491))
   QMTYPE(NORMAL)                          STATUS(RUNNING)

SUSPEND QMGR - use the SUSPEND QMGR command to remove a queue manager from a cluster temporarily, for example for maintenance

Syntax is : SUSPEND QMGR CLUSTER (cluster_name) [ MODE( QUIESCE | FORCE ) ]

I always conclude removing a QM from a cluster by issuing the REFRESH CLUSTER command on that QM leaving the cluster.

RESUME QMGR - use the RESUME QMGR command to reinstate a queue manager to a cluster, after temporarily having removed it

Syntax is : RESUME QMGR CLUSTER (cluster_name)

REFRESH CLUSTER

Issue the REFRESH CLUSTER command from a queue manager to discard all locally held information about a cluster.
Using REFRESH CLUSTER(clustername) REPOS(YES) specifies that in addition to the default behavior, objects representing full repository cluster queue managers are also refreshed. This option may not be used if the queue manager is itself a full repository.

Issuing REFRESH CLUSTER is disruptive to the cluster.
It is strongly recommended that all cluster sender channels for the cluster are stopped before the REFRESH CLUSTER command is issued.

RESET CLUSTER - used to forcibly remove a queue manager from a cluster. You can do this from a full repository queue manager by issuing either the command:

RESET CLUSTER(clustername) QMNAME(qmname) ACTION(FORCEREMOVE) QUEUES(NO)

or the command

RESET CLUSTER(clustername) QMID(qmid) ACTION(FORCEREMOVE) QUEUES(NO)

  publib RESET CLUSTER

Using the RESET CLUSTER command is the only way to delete auto-defined cluster-sender channels

Chapter 6, "Queue Manager Clusters", SC34-6589-00.

UK 2013 :

RESET CLUSTER(PC.ECOMM) QMID(QMN.USR.2_2009-08-24_13.20.39) ACTION(FORCEREMOVE) QUEUES(YES)   // on FR
REFRESH CLUSTER(PC.ECOMM) REPOS(YES)                                                          // on QMN

Summary by Mr Saper

RESET is used to forcibly remove the information from the Full repository.
SUSPEND leaves the information in the Full repository intact and merely signals that you no longer want to be included in the load balancing.

The same way: REFRESH checks the information in the FR and tries to add you if you are not there.
RESUME tells the FR that you are ready to rejoin the load balancing.

Cluster specific actions

How do you convert an FR into a PR?

alter qmgr repos(' ')

How do you convert a PR into an FR?

alter qmgr repos('mycluster')

Perform a "cold-start" of the cluster, that is, refresh the cluster config (use on a PR qmgr)

REFRESH CLUSTER(clustername) REPOS(YES)


Cluster troubleshooting

(one of the) Full Repository QM fails. When back, it does not see the remote cluster queues.

Solution : SUSPEND qmgr + RESUME qmgr

How do you know which queue manager(s) hold the "Full Repository" of a cluster?

Use the DISPLAY CLUSQMGR command to display cluster information about queue managers in a cluster. If you issue this command from a queue manager with a full repository, the information returned pertains to every queue manager in the cluster. If you issue this command from a queue manager that does not have a full repository, the information returned pertains only to the queue managers in which it has an interest. That is, every queue manager to which it has tried to send a message and every queue manager that holds a full repository.

Use the SUSPEND QMGR command and RESUME QMGR command to remove a queue manager from a cluster temporarily, for example for maintenance, and then to reinstate it.

In an emergency where a queue manager is temporarily damaged, you might want to inform the rest of the cluster before the other queue managers try to send it messages. RESET CLUSTER can be used to remove the damaged queue manager. Later when the damaged queue manager is working again, you can use the REFRESH CLUSTER command to reverse the effect of RESET CLUSTER and put it back in the cluster again.

Use the DISPLAY QCLUSTER(*) command to display all queues visible from a given cluster queue manager.

The DISPLAY QUEUE or DISPLAY QCLUSTER command returns the name of the queue manager that hosts the queue (or the names of all queue managers if there is more than one instance of the queue). It also returns the system name for each queue manager that hosts the queue, the queue type represented, and the date and time at which the definition became available to the local queue manager.

Cluster symptoms and solutions :

  • Symptom - Applications get rc=2085 MQRC_UNKNOWN_OBJECT_NAME when trying to open a queue in the cluster.
    Description : The queue manager where the object exists or this queue manager may not have successfully entered the cluster. Make sure that they can each display all of the full repositories in the cluster. Also make sure that the CLUSSDR channels to the full repositories are not in retry state.
    Solution : issue display clusqmgr(*) qmtype status command.
  • Symptom - Messages are not appearing on the destination queues.
    Description : The messages may be stuck at their origin queue manager. Make sure that the SYSTEM.CLUSTER.TRANSMIT.QUEUE is empty and also that the channel to the destination queue manager is running.
    Solution : issue display ql(SYSTEM.CLUSTER.TRANSMIT.QUEUE) curdepth command
  • Symptom - No changes in the cluster are being reflected in the local queue manager.
    Description : The repository manager process is not processing repository commands. Check that the SYSTEM.CLUSTER.COMMAND.QUEUE is empty.
    Solution : issue display ql(SYSTEM.CLUSTER.COMMAND.QUEUE) curdepth command

"Queue Manager Clusters", SC34-6589-00, csqzah07.pdf, apendix A.

Cluster change propagation - RESET CLUSTER

Quite often what we have seen occurring is that a queue manager is removed without first deleting its cluster resources. This leaves a situation where the rest of the cluster thinks the queue manager still exists. If you find this has occurred, you will need to use the RESET CLUSTER command to force the removed queue manager's definitions out of the cluster.

TMM10 - Introduction to WMQ Clustering.

2189 MQRC CLUSTER RESOLUTION ERROR

url

The queue is being opened for the first time and the queue manager cannot make contact with any full repositories. Make sure that the CLUSSDR channels to the full repositories are not in retry state.

     1 : display clusqmgr(*) qmtype status
AMQ8441: Display Cluster Queue Manager details.
   CLUSQMGR(QM1)                           CLUSTER(DEMO)
   CHANNEL(TO.QM1)                         QMTYPE(NORMAL)
AMQ8441: Display Cluster Queue Manager details.
   CLUSQMGR(QM2)                           CLUSTER(DEMO)
   CHANNEL(TO.QM2)                         QMTYPE(REPOS)
   STATUS(RUNNING)
AMQ8441: Display Cluster Queue Manager details.
   CLUSQMGR(QM3)                           CLUSTER(DEMO)
   CHANNEL(TO.QM3)                         QMTYPE(REPOS)
   STATUS(RUNNING)

url

Qmgr (new) values not updated in Cluster

If a queue manager has some values (such as the listener port) at the moment the cluster is created, a change in those values will not be propagated to the Cluster (repository), unless the following procedure is used:

  1. alter QM1 to be a Partial Repository (supposing it was FR)
  2. REFRESH CLUSTER with the REPOS(YES) option
  3. make QM1 to be Full Repository again (if needed)

Repeat with QM2, the other FR.

Problems with clustering when changing IP

pending to expand

Problem : display CLUSQMGR shows SYSTEM.TEMPQMGR.*

This is a temporary type of situation, in that the temporary name goes away once the repositories are brought in sync with each other. This is documented in the MQ Queue Manager Clusters manual.

url, url

     1 : DISPLAY CLUSQMGR(*) CONNAME QMTYPE STATUS
AMQ8441: Display Cluster Queue Manager details.
   CLUSQMGR(P7029)                         CHANNEL(SAGCLUSTER.P7029)
   CLUSTER(SAGCLUSTER)                     CONNAME(9.137.166.87(2415))
   QMTYPE(NORMAL)                          STATUS(INACTIVE)
AMQ8441: Display Cluster Queue Manager details.
   CLUSQMGR(SYSTEM.TEMPQMGR.9.137.164.25(2401))
   CHANNEL(SAGCLUSTER.CLFR1)               CLUSTER(SAGCLUSTER)
   CONNAME(9.137.164.25(2401))             QMTYPE(REPOS)
   STATUS(RUNNING)
One MQSC command read.

Let the cluster settle down while you verify all cluster channel status


Heartbeat and Keep Alive

When you are defining cluster-sender channels and cluster-receiver channels choose a value for HBINT or KAINT that will detect a network or queue manager failure in a useful amount of time but not burden the network with too many heartbeat or keep alive flows.

MQ v 5.3, "Clustering", SC34-6061-02, page 79 [95/183]

On platforms other than z/OS, if you need the functionality provided by the KAINT parameter (Keep Alive), use the Heartbeat Interval (HBINT) parameter.

MQ v 6.0, "MQSC Reference", SC34-6597-00, page 130 [150/501]


What about my applications?

You need not alter any of your applications if you are going to set up a simple MQ cluster. The applications name the target queue on the MQOPEN(queue_name) call as usual and need not be concerned about the location of the queue manager [MQCONNECT(qmgr_name)]


Using clusters for workload management + more than one instance of a queue

Clustering, SC34-6061-02, page 63/183

You can organize your cluster such that the queue managers in it are clones of each other, able to run the same applications and have local definitions of the same queues.

The advantages of using clusters in this way are:

  • increased availability of your queues and applications
  • faster throughput of messages
  • more even distribution of workload in your network

Any one of the queue managers that hosts an instance of a particular queue can handle messages destined for that queue. This means that applications need not explicitly name the queue manager when sending messages. A workload management algorithm determines which queue manager should handle the message.


Workload balancing

When you have clusters containing more than one instance of the same queue, MQ uses a workload management algorithm to determine the best queue manager to route a message to. The workload management algorithm selects the local queue manager as the destination whenever possible. If there is no instance of the queue on the local queue manager, the algorithm determines which destinations are suitable. Suitability is based on the state of the channel (including any priority you might have assigned to the channel), and also the availability of the queue manager and queue. The algorithm uses a round-robin approach to finalize its choice between the suitable queue managers.

If an application opens a target queue so that it can write messages to it, the MQOPEN call chooses between all available instances of the queue. Any local version of the queue is chosen in preference to other instances. This might limit the ability of your applications to exploit clustering.

If it is not appropriate to modify your applications to remove message affinities, there are a number of other possible solutions to the problem. For example, you can

  • Name a specific destination on the MQOPEN call.
    One solution is to specify the remote-queue name and the queue manager name on each MQOPEN call. If you do this, all messages put to the queue using that object handle go to the same queue manager, which might be the local queue manager.
  • Return the queue-manager name in the reply-to queue manager field.
  • Use the MQOO_BIND_ON_OPEN option on the MQOPEN call.

Clustering, SC34-6061-02, page 65 to 70/183
v6, pg 60 [78/201]


The cluster workload management algorithm

If a local queue within the cluster becomes unavailable while a message is in transit, the message is forwarded to another instance of the queue, but only if the queue was opened (MQOPEN) with the MQOO_BIND_NOT_FIXED open option, or if the MQOPEN specified MQOO_BIND_AS_Q_DEF and the queue's DEFBIND attribute value is NOTFIXED.

MQ 6.0 Queue Manager Clusters, csqzah07.pdf, SC34-6589-00, page 51

To route all messages put to a queue using MQPUT to the same queue manager by the same route, use the MQOO_BIND_ON_OPEN option on the MQOPEN call. To specify that a destination is to be selected at MQPUT time, that is, on a message-by-message basis, use the MQOO_BIND_NOT_FIXED option on the MQOPEN call.

MQ 6.0 Programming Guide, page 96 [116/601]

The workload management algorithm selects the local queue manager as the destination whenever possible.

from MQ 5.3 Clustering, SC34-6061-02, page 49

On v6 you can change the workload balancing algorithm so that it does not use a preferred-local strategy.

On v5.x, you can use a cluster workload exit, or you can use a different queue manager for your PUTS than you do for your GETS, and this other qmgr would be in the cluster but not have a qlocal X.

CLWLUSEQ := ANY ; { Local, Any, Queue Manager }
The queue manager treats the local queue as another instance of the cluster queue for the purposes of workload distribution.

MQ v6 "MQSC" SC34-6587-00, pg 50 [70/501]
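A small sketch of both ways to set it (the queue name INVENTQ is illustrative):

* Per queue: treat the local instance like any other cluster instance
ALTER QLOCAL(INVENTQ) CLWLUSEQ(ANY)

* Or queue-manager wide, picked up by queues that specify CLWLUSEQ(QMGR)
ALTER QMGR CLWLUSEQ(ANY)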

WorkLoad algorithm detailed

  • [4] (CLWLRANK)
    All queues (not queue manager aliases) with a rank (CLWLRANK) less than the maximum rank of all remaining queues are eliminated.
  • [11] (CLWLPRTY)
    If a queue is being chosen: all queues other than those with the highest priority (CLWLPRTY) are eliminated, and channels are kept.

WorkLoad algorithm


Client access to a cluster

SET MQSERVER=QMS3.SVRCONN/tcp/localhost(1423)
DEFINE QALIAS(QSAGCLU) TARGQ(QSEBAS)
amqsputc QSAGCLU

[ client ] <---> [ gw ] --- [ server_1 ]
                        \-- [ server_2 ]

How to code it


Straight WLB

Let's make it run !

  • TQM1 has shared queue WLMQ1
  • TQM2 has shared queue WLMQ1
  • TQM3 sees "two" remote cluster queues WLMQ1
  • an external application (WLG.EXE) does :
    • MQCONN() to queue manager TQM3
    • MQOPEN() to queue WLMQ1
    • MQPUT() messages addressed to WLMQ1 on TQM3
  • messages get split between TQM1 and TQM2 (better if larger than 512 bytes)

Let's use Alias

If TQM3 has an alias queue WLMAQ, whose TARGETQ is WLMQ1, then WLG.EXE can write to it, and the messages still get to the (split) queues.

External access (fail)

If another (external to the cluster) qm TQM4 writes into RMQ99, a remote queue pointing to queue WLMAQ and manager TQM3, the messages go into TQM3DLQ, TQM3's Dead Letter Queue, with Reason d'2082 = MQRC_UNKNOWN_ALIAS_BASE_Q in Dead-Letter Header, because the message carries the destination Queue Manager field ... and there is no such queue there !

Solution : in the Gateway queue manager (TQM3), set a queue manager alias

See Put & Destination !
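A minimal sketch of that fix, following the ANY.INVENTQ pattern shown earlier (the alias name ANY.SAG is hypothetical): the external queue manager addresses the alias instead of TQM3, and the blank queue manager alias on TQM3 lets cluster name resolution pick an instance of WLMQ1:

* On the gateway TQM3: queue manager alias that blanks out the destination qmgr
DEFINE QREMOTE(ANY.SAG) RNAME(' ') RQMNAME(' ')

* On the external TQM4: address the alias rather than TQM3 itself
ALTER QREMOTE(RMQ99) RNAME(WLMQ1) RQMNAME(ANY.SAG) XMITQ(TQM3)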


SAGCLUSTER

The cluster I have for testing is like this

hostname   IP        Port  Op Sys              MQ version  Qmgr Name  MB version  MB name    FR/PR
---------  --------  ----  ------------------  ----------  ---------  ----------  ---------  ----------------------
patan      .164.249  2401  wxp                 7.5.0.1     CLSPATAN   -           -          PR (server)
lab005     .164.25   2401  wxp                 7.5.0.1     CLFR1      -           -          FR (main)
6Q         .164.234  1491  w2008 SR2 {64-bit}  7.5.0.1     QMAS       7.0.0.1     BKAS       PR (moves to BISC net)
p9111      .166.86   2416  SLES 10 (ppc)       7.5         P9111      -           -          PR
p7029      .166.87   2415  SLES 10 (ppc)       7.5         P7029      -           -          PR
labss2     .         .     RH v4               7.5         CLSSS2     -           -          PR
rhv6-64b   .164.32   2401  RH v6.1 {64-bit}    7.5.0.1     CLFR2      8.0.0.2     MB64B      FR (mix 32/64 bits)
t400       .165.248  1491  wxp                 7.5.0.1     (MB7QMGR)  7.0.0.1     MB7BROKER  PR (or Client)

Shared queues are QL.DELPHI.GRAW.IN & QL.DELPHI.GRAW.OUT, user is MQ_USER_RAW of group MQ_GROUP_RAW

(ASCII diagram of the SAGCLUSTER topology: the MQ client T400 connects via patan/CLSPATAN; p9111 and p7029 host QL.IN, p7029 also holds QR.MH; 6q/QMAS hosts QL.RSP; 005/CLFR1 and rh64b/CLFR2 are the full repositories.)

Some definitions I have

p7029 : DEFINE QREMOTE(QR.MH) RNAME(QL.RSP) RQMNAME(' ')
p7029 : define ql(QL.IN) CLUSTER(SAGCLUSTER)
QMAS  : define ql(QL.RSP) CLUSTER(SAGCLUSTER)
p9111 : define qalias(QL.DADES.TRIG) target(QL.P9111) cluster(SAGCLUSTER) defbind(notfixed) replace

A message sent by T400 into the cluster is addressed to queue QL.IN, so it gets to p7029 via the cluster. It has ReplyToQueue(QR.MH) and ReplyToQmgr(PATAN), so we get "mqrc = 2087" (MQRC_UNKNOWN_REMOTE_Q_MGR), as there is no QL.RSP at the PATAN qmgr.

Peter's idea :
Have the putting application, the pseudo-requester, specify the real reply queue name in the Reply To Queue field of the MQMD of the 'request' message, and fill in the Reply To QM field with a value called VITOR_WUZ_HERE, or any other value you like. Just don't leave it blank, and don't fill it in with the name of a real QM.
The message will arrive at the 'replying' app with the reply to queue field filled in with the real reply q name, and the Reply To QM filled in with VITOR_WUZ_HERE. When the app 'replies', it opens the reply queue specifying both the destination queue (the real reply q) and the destination QM (VITOR_WUZ_HERE).
Ensure there is a QM Alias called VITOR_WUZ_HERE that routes messages to an XMITQ that gets you back to a Queue Manager in the cluster. I'm assuming the replying app is connected to a QM outside the cluster. On the QM in the cluster that has the RCVR channel from the QM outside the cluster, create a QM Alias called VITOR_WUZ_HERE that has a blank Remote Q, blank Remote QM Name and blank XMITQ attribute. As messages arrive destined for a QM called VITOR_WUZ_HERE, this alias will blank out the destination QM and MQ name resolution kicks in, looking for that reply queue without a specific QM, and the message will load-balance inside the cluster.
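A sketch of the definition Peter describes, created on the in-cluster queue manager that owns the receiver channel from the outside queue manager:

* Blank queue manager alias: strips the VITOR_WUZ_HERE destination so cluster resolution takes over
DEFINE QREMOTE(VITOR_WUZ_HERE) RNAME(' ') RQMNAME(' ') XMITQ(' ')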


Cluster Scripting - discovering cluster params

On any qmgr:

set QMN=QMAS
echo 1
echo DISPLAY CHANNEL(*) CHLTYPE(CLUSSDR) ALL | runmqsc %QMN%
echo 2
echo DISPLAY CHANNEL(*) CHLTYPE(CLUSSDR) ALL | runmqsc %QMN% | find "CLUSTER("
echo 3
echo DISPLAY CHANNEL(*) CHLTYPE(CLUSSDR) ALL | runmqsc %QMN% | find "CONNAME("

On FR:

set QMN=QMFR1
echo DISPLAY CLUSQMGR(*) CONNAME QMTYPE STATUS | runmqsc %QMN%   - display queue managers in cluster
echo DISPLAY QCLUSTER(*) CLUSQMGR | runmqsc %QMN%                - display queues in cluster


Cluster monitoring

  • since, in the default configuration, all messages leaving the queue manager pass through the SYSTEM.CLUSTER.TRANSMIT.QUEUE, monitor its depth appropriately :

    DIS CHSTATUS(*) WHERE(XQMSGSA GT 1)

  • SYSTEM.CLUSTER.COMMAND.QUEUE depth should tend to 0
  • monitor the depths of the SYSTEM.CLUSTER.* queues

    display ql(SYSTEM.CLUSTER.*) CURDEPTH
         1 : display ql(SYSTEM.CLUSTER.*) CURDEPTH
    AMQ8409: Display Queue details.
       QUEUE(SYSTEM.CLUSTER.COMMAND.QUEUE)     TYPE(QLOCAL)
       CURDEPTH(0)
    AMQ8409: Display Queue details.
       QUEUE(SYSTEM.CLUSTER.HISTORY.QUEUE)     TYPE(QLOCAL)
       CURDEPTH(0)
    AMQ8409: Display Queue details.
       QUEUE(SYSTEM.CLUSTER.REPOSITORY.QUEUE)  TYPE(QLOCAL)
       CURDEPTH(4)
    AMQ8409: Display Queue details.
       QUEUE(SYSTEM.CLUSTER.TRANSMIT.QUEUE)    TYPE(QLOCAL)
       CURDEPTH(0)

  • monitor the status of all cluster channels (CLUSSDR/CLUSRCVR) - all of them must be "running"

    display chstatus(*) STATUS
         6 : display chstatus(*) STATUS
    AMQ8417: Display Channel Status details.
       CHANNEL(TO.IB9QMGR)                     CHLTYPE(CLUSSDR)
       CONNAME(127.0.0.1(2415))                CURRENT
       RQMNAME(IB9QMGR)                        STATUS(RUNNING)
       SUBSTATE(MQGET)                         XMITQ(SYSTEM.CLUSTER.TRANSMIT.QUEUE)
    AMQ8417: Display Channel Status details.
       CHANNEL(TO.SMQ)                         CHLTYPE(CLUSRCVR)
       CONNAME(127.0.0.1)                      CURRENT
       RQMNAME(IB9QMGR)                        STATUS(RUNNING)
       SUBSTATE(RECEIVE)

MQ cluster best practices {sagpdf}, publib

Cluster Health monitoring tool (Delphi, of course)

Input - a file with

SET CNAME=      // cluster name
SET FR1NAME=    // full repository (1) qmgr name
SET FR2NAME=    // full repository (2) qmgr name
SET NUMQMCL=    // number of queue managers in cluster (apart from FR's)
:: "N" times
SET QM01NM=     // 1st queue manager - name
SET QM01LS=     // 1st queue manager - listener port
SET QM02NM=     // 2nd queue manager - name
SET QM02LS=     // 2nd queue manager - listener port

Output shall be

Cluster connectivity & availability monitoring tool

In all PR's of the cluster, we can install a "responder" waiting on a specific queue.

The monitor program shall have a list of queue managers and shall send a message to each of them, verifying that a message can reach the destination and come back.

This gives some level of assurance that the shared cluster objects are available.
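
A minimal round-trip check in that spirit, using the amqsput/amqsget samples; it assumes a responder is already deployed, and QL.PING.REQUEST / QL.PING.REPLY are purely illustrative queue names (not part of the setup above):

:: put one test message into the clustered request queue (amqsput reads lines from stdin)
echo PING %DATE% %TIME% | amqsput QL.PING.REQUEST CLSPATAN

:: wait for the echoed reply; amqsget waits up to 15 seconds and then ends
amqsget QL.PING.REPLY CLSPATAN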

AMQSCLM - the cluster queue monitoring sample program

*** publib ***


Multiple cluster XMITQ

All you do to use multiple cluster transmission queues is to change the default cluster transmission queue type on the gateway queue manager: change the value of the queue manager attribute DEFCLXQ.

Changing the default to separate cluster transmission queues to isolate message traffic

The default cluster transmission queue is set as a queue manager attribute, DEFCLXQ. Its value is either SCTQ or CHANNEL. New and migrated queue managers are set to SCTQ. You can alter the value to CHANNEL.
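
For example, to switch a queue manager to per-channel cluster transmission queues and verify the setting:

    ALTER QMGR DEFCLXQ(CHANNEL)
    DISPLAY QMGR DEFCLXQ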

Cluster transmission queues and cluster-sender channels

The values of DefClusterXmitQueueType are MQCLXQ_SCTQ or MQCLXQ_CHANNEL.

  • MQCLXQ_SCTQ
    All cluster-sender channels send messages from SYSTEM.CLUSTER.TRANSMIT.QUEUE.
    The correlID of messages placed on the transmission queue identifies which cluster-sender channel the message is destined for.
    SCTQ is set when a queue manager is defined. This behavior is implicit in versions of WebSphere MQ earlier than version 7.5; in earlier versions, the queue manager attribute DefClusterXmitQueueType was not present.
  • MQCLXQ_CHANNEL
    Each cluster-sender channel sends messages from a different transmission queue.
    Each transmission queue is created as a permanent dynamic queue from the model queue SYSTEM.CLUSTER.TRANSMIT.MODEL.QUEUE.
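
To see which transmission queue each cluster channel is actually using, the channel status already reports XMITQ (as in the monitoring example above); the SYSTEM.CLUSTER.TRANSMIT.* pattern for the per-channel dynamic queues is my assumption from the model queue name:

    DISPLAY CHSTATUS(*) XMITQ
    display ql(SYSTEM.CLUSTER.TRANSMIT.*) CURDEPTH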

DefClusterXmitQueueType (MQLONG)

You have some choices to make when you are planning how to configure a queue manager to select a cluster transmission queue.

Clustering: Planning how to configure cluster transmission queues

If you set the queue manager attribute DEFCLXQ to CHANNEL, a different cluster transmission queue is created automatically from SYSTEM.CLUSTER.TRANSMIT.MODEL.QUEUE for each cluster-sender channel.

Cluster queues

display qmgr
     1 : display qmgr
AMQ8408: Display Queue Manager details.
   QMNAME(SMQ)                             ACCTCONO(DISABLED)
   DEADQ(QL.DLQ)                           DEFCLXQ(SCTQ)


Enabling SSL in an existing WebSphere MQ cluster, developerWorks, Ian Vanstone : runmqckm commands, complete sample

About cluster security on a tricky configuration


{bestp}

Clustering best practices, hints, etc

  • all full repositories must have senders manually defined to all others (if more than 2)
  • "Time to settle" - a good hint that updates have been processed is no messages waiting on the SYSTEM.CLUSTER.COMMAND.QUEUE on any repositories (full or partial)
  • if taking a queue manager down temporarily (e.g. for this kind of maintenance), remember to use SUSPEND first (see the MQSC sketch after this list)
    If it is an FR, release it from that role first : alter qmgr repos(' ')
  • [TR] use a dedicated queue manager when you are unable to obtain a dedicated host (for a FR)
  • [TR] only one explicitly defined CLUSSDR (on all PR's)
  • ReplyToQueue cannot be a clustered queue
    Solution (Peter) :
    1. on the Request message, specify a fake qmgr name, such as "VITOR_WUZ_HERE"
    2. on the responding qmgr, define a QREMOTE so the Response message uses clustering again :

      define qremote(VITOR_WUZ_HERE) RNAME(' ') RQMNAME(' ') XMITQ(' ') replace

  • MQ cluster best practices {sagpdf}, as
    • general cluster "hygiene"
    • performance
    • avoiding problems before they arise

    Also "moving full repositories"

    Read this:
    Never pretend that two different installations are the same queue manager (by trying to give a new installation the same QMGR name, IP address etc)

    • this is one of the most common mistakes in working with clusters. The cache knows about the QMID, and state may end up corrupted
    • if you accidentally end up in this scenario, RESET CLUSTER is your friend (see the sketch after this list)

  • to stop a cluster queue manager gracefully, follow this procedure:
    • disable cluster queues
    • stop cluster channels
    • if you have to modify it, remove qmgr from cluster (make FR first)
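
A hedged MQSC sketch of the suspend / removal commands referenced in the items above (SAGCLUSTER and the QMID value are placeholders):

    * if it is a full repository, release it from that role first
    ALTER QMGR REPOS(' ')

    * take the queue manager out of the cluster workload before maintenance, and back in afterwards
    SUSPEND QMGR CLUSTER(SAGCLUSTER)
    RESUME QMGR CLUSTER(SAGCLUSTER)

    * repair for the "two installations pretending to be one qmgr" mistake:
    * run on a full repository against the stale QMID
    RESET CLUSTER(SAGCLUSTER) ACTION(FORCEREMOVE) QMID('stale-qmid') QUEUES(YES)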

Repository Query command

If you want to have a look inside the cluster repository, use this command:

c:\> amqrfdm /?
WebSphere MQ Repository Query Program
written by Paul Clarke
Usage : amqrfdm [-m QMgrName] [-d]

Interesting summary

#1 Regardless of how many FRs you have, each FR should have a manual CLUSSDR defined to every other FR.

#2 If every FR has a CLUSSDR to every other FR, each FR will know about every cluster attribute on every QM in the cluster.

#3 A PR will only ever publish info to 2 FRs. A PR will only ever subscribe to 2 FRs. Period. It doesn't matter how many manual CLUSSDRs you define on that PR. A PR will only ever send its info (publish) to 2 FRs and will only get updates (subscribe) from 2 FRs.

#4 You should only define one CLUSSDR to one FR from a PR.

#5 If 2 FRs go down in your cluster, your cluster will be able to send messages just fine. But any changes to cluster definitions become a problem. Any PRs that used both of these down FRs will still function for messaging, but they will not be made aware of any changes in the cluster because both of their FRs are N/A.

#6 If two of your FRs are down, and you still have other FRs, you could go to your PRs and delete the CLUSSDR to the down FR, define a CLUSSDR to an available FR and issue REFRESH CLUSTER(*) REPOS(YES). This would cause your PR to register with an available FR and thus pick up cluster changes (sketched below).

#7 In a properly designed system the likelihood of 2 FRs being down is next to zero, so the need for more than 2 FRs is next to zero. And even if both FRs are down it doesn't mean your cluster will come to a screeching halt.

Just use 2 FRs.
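
A sketch of the re-pointing described in #6; channel names and CONNAME are purely illustrative and follow the <cluster>.<qmgr> naming hint from the links section below:

    * on the affected partial repository
    STOP CHANNEL(SAGCLUSTER.FRDOWN)
    DELETE CHANNEL(SAGCLUSTER.FRDOWN)
    DEFINE CHANNEL(SAGCLUSTER.FRUP) CHLTYPE(CLUSSDR) TRPTYPE(TCP) +
           CONNAME('host(port)') CLUSTER(SAGCLUSTER) REPLACE
    REFRESH CLUSTER(SAGCLUSTER) REPOS(YES)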

Replace FR steps

If you want to keep the IP or the QMGR name, keep in mind that the QMID (which includes CRDATE and CRTIME) will certainly be different.

On the local qmgr, use DISPLAY Q(*) WHERE (CLUSTER NE ' ') to see which queues are shared in the cluster. An MQSC sketch of the whole sequence follows the steps below.

  1. stop sharing objects by setting CLUSTER(' ')
  2. make sure all queues are empty (as probably this qmgr will never come back)
  3. (pend) stop CLUSSDR channel (?)
  4. remove qmgr from cluster, using SUSPEND command
  5. take down qmgr
  6. add new qmgr (with same name) to cluster
  7. set CLUSTER('cluster-name') property on objects that need to be shared
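
A hedged MQSC sketch of the sequence (QL.RSP and SAGCLUSTER are taken from the test setup above purely as examples; step 3 is left pending, as in the notes):

    * 1. stop sharing the objects
    ALTER QLOCAL(QL.RSP) CLUSTER(' ')

    * 2. check the queues are empty
    DISPLAY QLOCAL(QL.RSP) CURDEPTH

    * 4. remove the qmgr from the cluster
    SUSPEND QMGR CLUSTER(SAGCLUSTER)

    * 5./6. take the old qmgr down, create the new one and join it to the cluster
    *       (CLUSRCVR/CLUSSDR definitions as usual, not repeated here)

    * 7. share the objects again on the new qmgr
    ALTER QLOCAL(QL.RSP) CLUSTER(SAGCLUSTER)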

Garbage collector problems

When objects in the cluster repository cache are modified (for example, changing an attribute on a cluster queue), the details for that object are republished to the cluster. Previous records for the object may persist for some time in the cluster cache, so that applications currently using them (for instance having opened the queue for output) can continue processing without interruption.
Periodically, the repository process attempts to 'garbage collect' these older records, checking whether they are still in use. Where multiple such records exist for a particular cluster queue manager object (the record in the cache which stores information about the channel definition to reach a remote queue manager), and these are held in use for a prolonged period, an error in the logic leads to the possibility that the storage for parts of these queue manager records can be reused (for example overwritten to hold another object) while actually still required.

Solution:

REFRESH CLUSTER(*) REPOS(YES)

wwqa

A few Q&A

  • how many FRs? 2
  • DISCINT=0? no
  • 2 qmgrs just to be FRs? yes, on a separate server
  • does the "response" message use the cluster? No

My open questions about clustering

  • QUEUE(SYSTEM.CLUSTER.REPOSITORY.QUEUE) - does it hold the cluster configuration?
  • what is the difference between SUSPEND and RESET CLUSTER? Both take the queue manager out of the cluster ...
    Similarly, the difference between RESUME and REFRESH ...
  • how to know how many queue managers are in the cluster ... and which IP they have?
    Answer: "DISPLAY CLUSQMGR(*) CONNAME QMTYPE STATUS"
  • how to know which objects are being offered to the cluster and who their owner is?
    Answer: "DISPLAY QCLUSTER(*) CLUSQMGR" to see the queues
  • if the same queue is offered to the cluster from several queue managers and one of them is set PUT_INHIBITED, will the data be distributed among the remaining queues?
  • how to remove a queue manager from the cluster? And what if it is an FR?
    Answer: "suspend qmgrname"

Links

  • advanced tasks, as how to modify a simple cluster in various ways
  • task 10: Removing a queue manager from a cluster [*****]
  • A. Beardsmore "migrating v6 qmgr to v7" (developerWorks)
  • good intro : getting started with queue manager clusters
  • good article : cluster design and operation
  • hints and tips, as
    • stay current on MQ maintenance
    • remember to include the cluster parameter on all of the cluster object definitions (queues, channels, queue manager, and so on)
    • verify that your cluster channels connecting to the repository queue manager go to running status when the queue manager starts
  • T Rob's cluster design and operation, cluster health check, as
    • unique CLUSRCVR names per cluster : don't use "TO.<qmn>" but "<cluster name>.<qmgr name>"
    • use MCAUSER on CLUSRCVR channel to restrict administrative access
    • also on RCVR, RQSTR, CLUSRCVR or SVRCONN channel, including the ones named SYSTEM.DEF.* and SYSTEM.AUTO.*, even on queue managers where channel auto-definition is disabled
  • MD05 - design considerations for large clusters
  • configuring request/reply to a cluster
  • collect and analyze WebSphere MQ data to solve problems with clusters
  • use Omegamon and Tivoli to set PUT(DISABLED) and change the cluster workload algorithm
    I think managing the CLWLPRTY queue attribute is far better - you never end up in a "no queue available" situation

    We need an MB admin agent flow !

Books

  • SC34-6061 : Queue Manager Clusters. Online url (v 5.3)
  • [\\MQ\Books\V6] SC34-6589 : Queue Manager Clusters. PDF (v 6.0)
  • [\\MQ\MQ_V7\Llibres] Clustering_and_HA_in_ESB_22491360.pdf
  • [\\MQ\MQ_V7\Llibres] WebSphere_MQ_7.0_Creating_a_Cluster.pdf
  • [\\dep03x\SWU\BCN_WSTC_2008\Materials] TMM02 - Advanced_Clustering.pdf
  • [\\dep03x\SWU\BCN_WSTC_2008\Materials] TMM10 - Introduction to WMQ Clustering.pdf



