NSD -monitor on Linux

NSD is a great tool for finding the cause of a crash or hang (if you are familiar with how to do the analysis).
But sometimes the log file created by NSD does not include enough information to find the root cause of a problem.

In this case, you can start NSD in monitor mode.

On Windows, you would:

  • stop the Domino server
  • from inside the Domino data directory, run the command nsd -monitor
  • start the Domino server (as a normal application)

After you have reproduced the problem:

  • type detach to detach NSD from the running processes
  • quit NSD
  • restart your Domino server (as a service)

I tried to do this on Linux (and inside a Docker container), but the nsd -monitor command always returned an error:

[notes@serv07 notesdata]$ /opt/hcl/domino/bin/nsd -monitor

INFO: NSD Monitor : Started
ERROR: NSD Monitor : No Processes Found or Specified
/opt/hcl/domino/notes/latest/linux/nsd.sh: line 8020: 11040 User defined signal 1 "$Nsd" "$@" -wrapper

I opened a case with HCL support and got a reply.

Here is what you have to do.

If you are running Domino 12 as a participant in the HCL Domino Early Access program, you have to get into your Domino V12 container first:

[root@docker ~]# docker exec -it container-id bash

The next steps are the same for Docker and non-Docker environments.

Identify the running Domino processes. Switch to /local/notesdata and issue the command:

ps -fu $USER

You will get something like this:

[notes@serv07 notesdata]$ ps -fu $USER
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
notes 2636 0.0 0.0 19200 2332 pts/1 Ss 11:58 0:00 bash
notes 2651 0.0 0.0 51808 1928 pts/1 R+ 11:59 0:00 _ ps -fu
notes 1 0.0 0.0 11988 1720 pts/0 Ss+ 11:52 0:00 /bin/bash /local/start.sh
notes 843 0.0 0.0 11988 1032 pts/0 S+ 11:52 0:00 /bin/bash /local/start.sh
notes 895 0.4 1.1 1466332 92908 pts/0 Sl+ 11:52 0:01 _ /opt/hcl/domino/notes/latest/linux/server
notes 903 0.0 0.3 339272 29448 pts/0 Sl+ 11:52 0:00 _ /opt/hcl/domino/notes/latest/linux/logasio NOTESLOGGER reserved
notes 913 0.2 0.6 1374008 48860 pts/0 Sl+ 11:52 0:01 _ /opt/hcl/domino/notes/latest/linux/event
notes 1732 0.0 0.4 417372 38232 pts/0 Sl+ 11:52 0:00 _ /opt/hcl/domino/notes/latest/linux/replica
notes 1733 0.1 0.5 624500 42024 pts/0 Sl+ 11:52 0:00 _ /opt/hcl/domino/notes/latest/linux/router
notes 1734 0.0 0.4 486132 39296 pts/0 Sl+ 11:52 0:00 _ /opt/hcl/domino/notes/latest/linux/update
notes 1735 0.0 0.5 422108 41700 pts/0 Sl+ 11:52 0:00 _ /opt/hcl/domino/notes/latest/linux/amgr -s
notes 1975 0.0 0.4 421952 36988 pts/0 Sl+ 11:52 0:00 | _ /opt/hcl/domino/notes/latest/linux/amgr -e 1
notes 1737 0.1 0.5 619644 43332 pts/0 Sl+ 11:52 0:00 _ /opt/hcl/domino/notes/latest/linux/adminp
notes 1738 0.0 0.4 487588 35024 pts/0 Sl+ 11:52 0:00 _ /opt/hcl/domino/notes/latest/linux/daosmgr
notes 2424 0.0 0.5 417392 42164 pts/0 Sl+ 11:52 0:00 _ /opt/hcl/domino/notes/latest/linux/cldbdir
notes 2595 0.2 0.5 890580 47804 pts/0 Sl+ 11:52 0:01 _ /opt/hcl/domino/notes/latest/linux/clrepl

Write down the PIDs of interest. In my case, these were 1735 and 1975 for the Agent Manager (amgr).

Now you can start NSD in monitor mode with:

[notes@serv07 notesdata]$ /opt/hcl/domino/bin/nsd -monitor -pidlist 1735,1975
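
If you do not want to copy the PIDs by hand, the lookup can be scripted. The following is only a minimal sketch and not something that came from HCL support: pidlist_for is a hypothetical helper name, and it assumes that the task appears under its plain executable name (for example amgr) in the second column of the input.

```shell
# Build a comma-separated PID list for one Domino task name.
# Reads "PID COMMAND" pairs on stdin and prints the PIDs whose
# command name equals $1, joined with commas (nothing if none match).
# pidlist_for is a hypothetical helper, not part of NSD.
pidlist_for() {
    task="$1"
    awk -v task="$task" '
        $2 == task { printf "%s%s", sep, $1; sep = "," }
        END        { if (sep) print "" }
    '
}
```

One possible way to feed it is ps -u "$(id -un)" -o pid=,comm= | pidlist_for amgr, and then to pass the result to nsd -monitor -pidlist. Check the list against the ps output before you attach NSD.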

You can now start to reproduce the problem. In addition to the well-known nsd … .log files in IBM_TECHNICAL_SUPPORT, you will also find additional information in the nsd.notes folder.
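
To see which NSD logs were written most recently, a small helper like the one below can be used. This is a sketch under assumptions: newest_nsd_logs is a hypothetical name, the nsd*.log pattern is my guess at how the log files are named, and the directory has to be passed in by you.

```shell
# List the newest NSD log files in a directory, most recent first.
# $1 = directory to search, $2 = how many files to show (default 5)
# newest_nsd_logs is a hypothetical helper; nsd*.log is an assumed pattern.
newest_nsd_logs() {
    dir="$1"
    count="${2:-5}"
    # ls -t sorts by modification time, newest first
    ls -t "$dir"/nsd*.log 2>/dev/null | head -n "$count"
}
```

From inside the data directory, this could be called as newest_nsd_logs IBM_TECHNICAL_SUPPORT 3.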