MATtool has been evolving for over five years based on feedback from users. Your comments and suggestions are important. 

Email your comments, suggestions, and bug reports to sblack@ee.ryerson.ca. Be sure to include MAT in the subject line to prevent your message from being filtered as spam, and check the known issues below first.

Known Issues: 

- The x86_64 port is very new, so there are probably issues I have not found yet.

- The MATd daemon kills all running jobs if the config file is changed. For an unknown reason, the SIGCHLD from one of the MATd child processes is missed.

- The HPUX port does not support user-added functions.

- The Solaris X86 port is very new, so there are probably issues I have not found yet.

- This is still a beta release of the tape module. There are bugs and known limitations:

  1. End of Tape (EOT) is not handled.
  2. Tape library support is not available.

- NIS (YP) netgroup is not fully supported.

- The IRIX MATd gets "odd" data on some of the monitored parameters. 

- On Solaris the crontab does not appear the first time the icon is double-clicked. 

- With shadow passwords, versions older than 0.28 do not show whether an account is locked.

Change Log: 

Version 0.34

- Packaged MATtool as a Rocks roll. The roll will discover and group compute nodes according to rack number.

- Added an ip.allow file. Only those IP addresses listed in this file will be able to connect to the MATtool agent. It is available when the MATtool agent is run as a daemon instead of from inetd. For example, an entry like "192.168.211." would allow all hosts on subnet 192.168.211 to access the agent. If you use this file, the command relay and any backup or replication servers MUST be included in the list.

- Added the option of running the agent as a daemon. This is needed for the above feature, and for SSL.

- Added a few sections to the online help.

- Changed the OS command section so you can enter many commands at once. Set the default to non-backgrounded.

- Upgrade no longer depends on the uudecode binary being on the hosts. SUSE Linux users can now upgrade from the console without having to install uudecode on all hosts.

- Fixed password handling to allow non-alphanumeric characters.

- Added MD5 password support. If root uses an MD5 password, then all others will use MD5.

- Fixed the network probing so it only gathers data on real physical interfaces on Linux.

- Added some optimizations to the command relay code to improve the speed. The Nagle algorithm is turned off for the connection phase, and on again to transfer multi-line commands.

- Fixed a problem on Linux hosts reporting WIFEXITED when the child was still running. This affected the command relay.

- Fixed buffer overrun, which would have affected many commands.
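
The ip.allow entry format described in this release reads like a plain prefix test against the client address. The following Python sketch only illustrates that idea and is not MATtool's code; the function name `is_allowed` and the one-prefix-per-line layout are assumptions.

```python
def is_allowed(client_ip, allow_entries):
    """Return True if client_ip starts with any entry from ip.allow.

    Assumption: each entry is a literal address prefix such as
    "192.168.211." (trailing dot included), one per line.
    """
    entries = [entry.strip() for entry in allow_entries if entry.strip()]
    return any(client_ip.startswith(entry) for entry in entries)


allow = ["192.168.211."]
print(is_allowed("192.168.211.17", allow))  # host on the listed subnet
print(is_allowed("192.168.21.1", allow))    # host on a different subnet
```

With pure prefix matching, the trailing dot matters: an entry of "192.168.21" would also match 192.168.211.17. Whether the real agent behaves this way is not stated in the change log.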

Version 0.33

- Fixed a problem with the fast probing on hosts using xinetd. This issue has forced the release of 0.33.

- Enhanced the file manager so you can type in a file or directory name in the position. If it is a file, it will open the directory it is in, display the file, and select it. If it is a directory, it will open that directory and display the contents.

- Stopped logging the addition of new MATtool users.

- Fixed bug deleting MATtool user entries.

- Simplified the selection code for finding objects to delete. Hopefully nothing is broken.

- Fixed memory leak in console with probing enabled and a configured command relay.

- Fixed the console updating so it would not run twice in the same minute every other probing cycle.

- Fixed background color in unselected items in the file manager.

Version 0.32

- Changed the status update to use the command relay instead of its own probing. This makes the status update much faster on bigger clusters. The default console update period is every 2 minutes.

- Changed the MATtool passwords to use an md5 hash.

- Changed the console so you do not have to be in the mat/bin directory to run it. Just put the MATtool/bin directory in your path. NOTE: It writes to the $MATtool/defaults and $MATtool/hosts files.

- Changed the search to automatically wrap around at the bottom if a match was found.

- Changed the DNS, Ping, FTP, SMTP, POP3, and HTTP probes to not send emails by default.

- Forced the display to update after selecting a possibly large number of hosts.

- Added the ability to select based on OS.

- Added database and web server start/stop scripts for future use.

- Added a check to validate the session password if a command relay is defined. If a command relay does not exist, the user is prompted to add one.

- Increased the buffer space for user-defined macros to allow scripts up to 100KB.

- Fixed the discovery which is triggered on an upgrade to keep any existing MATd tasks and their thresholds and only add new ones.

- Fixed the Scan IP option for finding MATtool hosts when the IP address could not be resolved.

- Fixed the Auto Probe so it now works when turned on without restarting the console.

- Fixed problem dragging host icons.

- Fixed the console so dead hosts have no thermometer.

- Fixed the MATtool host list so that the ..# after some hosts and containers is only produced when the host or container name appears somewhere else.

- Fixed the probing so it would turn off when you start updating hosts, and restart when you drill into a host. This stops the loss of changes when updating the MATtool hosts.

- Fixed focus lock problem which prevented multi-host updates on MATd monitoring task advanced properties.

- Added uudecode into the tcl module for use in the upgrade process for the next releases, because SUSE Linux does not install uudecode by default.

- Fixed missing array problem when changing hosts icon.
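
The upgrade discovery fix in this release (keep existing MATd tasks and their thresholds, add only new ones) amounts to a non-overwriting merge. A minimal Python sketch of that rule follows; the task names, the threshold fields, and the function `merge_tasks` are hypothetical and do not reflect MATd's real configuration format.

```python
def merge_tasks(existing, discovered):
    """Keep every existing task untouched; add only tasks not yet present."""
    merged = dict(existing)
    for name, settings in discovered.items():
        # setdefault leaves an existing entry (and its thresholds) alone
        merged.setdefault(name, settings)
    return merged


existing = {"disk": {"warn": 80, "error": 95}}
discovered = {"disk": {"warn": 70, "error": 90},
              "swap": {"warn": 50, "error": 75}}
merged = merge_tasks(existing, discovered)
print(merged["disk"])   # the user's tuned thresholds survive
print(sorted(merged))   # the newly discovered task is added
```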

Version 0.31

- Added new file management functions, including a completely new file management UI. The file operations are also performed on all the selected hosts. Copy, move, delete, change permissions, run, edit files, etc. on one or hundreds of hosts, all from a simple UI.

- Added simple hosts file locking. This allows running of multiple consoles from an NFS installation without damaging the hosts file.

- Added support for alternate host and host group icons. You can now use more descriptive icons for hosts, blade systems, and host groups.

- Added a Ping probe to the set of probes. With this, a host can ping a list of other nodes to see if they are running. The probe result will affect the host's status (temperature). The list of hosts to ping is set in the Instance Arguments property.

- Added a DNS probe to the probes. If the host is a DNS server (it has /etc/named.conf or /etc/named.boot), then a DNS probe will be added. By default the probe will try to resolve the host's name, but a list of hosts can be used to test internal and external host name resolution. The probe result will affect the host's status (temperature). The list of hosts to resolve is set in the Instance Arguments property.

- Fixed deletion problem for MATtool hosts which were part of more than one group.

- Fixed the Auto Probe so selected hosts would not be cleared when the probe updated the host status.

- Fixed email formatting problem in MATd monitoring tasks.
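
The change log does not say how the simple hosts file locking in this release works. One common approach is a lock file created with an exclusive open, sketched below in Python under that assumption; the names `acquire_lock`, `release_lock`, and `hosts.lock` are hypothetical.

```python
import os
import tempfile


def acquire_lock(lock_path):
    """Try to create the lock file atomically; return True on success."""
    try:
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False            # another console already holds the lock
    with os.fdopen(fd, "w") as handle:
        handle.write(str(os.getpid()))   # record the owner for debugging
    return True


def release_lock(lock_path):
    """Remove the lock file so another console can take the lock."""
    os.remove(lock_path)


lock = os.path.join(tempfile.mkdtemp(), "hosts.lock")
first = acquire_lock(lock)    # succeeds: no lock file yet
second = acquire_lock(lock)   # fails: the lock file already exists
release_lock(lock)
```

Exclusive creation has historically not been reliable on every NFS version, so a tool run from an NFS installation may need a different primitive.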

Version 0.30

- Ported MATtool to Solaris X86. Solaris X86 is an excellent OS for clustered computing. With the right hardware, IMHO Solaris X86 requires less administrative effort than Linux, so it is now supported by MATtool.

- Fixed install problem which prevented creation of command macros.

- Fixed install problem which stopped macros from running as other users.

- Fixed macro environment creation problem.

- Fixed problem with xinetd flags for Redhat users.

- Fixed upgrade problem with IRIX.

- Changed the password field for systems with shadow passwords for compatibility with other tools.

- Finished backend work to support file operations for something like Windows Explorer. Click on Agent -> What's coming in the console to see a preview.

Version 0.29

- Added a new scheduled task to reprocess the command relay tasks on any host which failed in a previous try. For example, if a task was run on 100 hosts but 2 were down, the command relay host would retry the task on those 2 hosts every hour until it succeeds or the maximum number of retries is exhausted. It does not retry while the 2 hosts cannot be pinged.

- Added a new UI for About so a user can retry running a task on failed hosts on demand, or prevent the command relay from trying to rerun the task on the down/failed hosts.

- New OS command macros have been added.  Import  your own admin scripts into MATtool, and let it take care of distribution, and running over many hosts at once.

- Changed the data gathering so it does not use a persistent process, so resource usage by Matd appears smaller.

- Changed Matd log cleaner default timeout to 300 seconds.  Previous value would cause the job to fail while waiting for the other tasks to shutdown.

- Changed the file sorter so it is much faster.

- Changed the way command relay user credentials were stored.  The previous method would have allowed an admin to spoof a command relay task.

- Rewrote the command parser to reduce duplicated functionality. This could be a source of bugs in this release.

- The maximum number of hosts in a replication/backup host group is increased to 128.

- Moved the macro editor under the Command Replication area.

- Added a dialog to verify that a command should be run on all selected hosts, to avoid accidents.

- Added an upgrade summary in case some hosts were down during the upgrade.

- Fixed the upgrade so if only one host was selected it would upgrade that host instead of localhost.

- Fixed problem with command replication which caused some hosts to not get the commands when there were more than 40 hosts selected.

- Increased default warning and error thresholds for the ethernet, runqueue and memory use alarms.

- Fixed matd so it works out the location of the MAT directory without needing to be started with the full path.

- Fixed bug that would leave shared memory when all matd processes had died.

- Fixed logic error in matd which could cause multiple monitoring tasks of the same type to be spawned.

- Fixed bug where moving a host on the screen would move another in a different view.

- Fixed buffer overflow in Matd task command parser.

- Dropping support for Solaris 2.5.1.  The minimum is now Solaris 2.6.
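
The retry task added in this release (retry failed hosts until success or until the retries run out, skipping hosts that cannot be pinged) can be sketched roughly as follows. This is an illustration in Python, not the command relay's code; `run_task`, `can_ping`, and the default of 3 retries are assumptions, and the hourly wait is left as a comment.

```python
def retry_failed_hosts(failed, run_task, can_ping, max_retries=3):
    """Return the hosts that are still failed after all retry passes."""
    remaining = set(failed)
    for _ in range(max_retries):
        if not remaining:
            break
        for host in sorted(remaining):
            if not can_ping(host):
                continue                 # unreachable: skip this pass
            if run_task(host):
                remaining.discard(host)  # task finally succeeded
        # a real scheduler would wait an hour here before the next pass
    return remaining


attempts = {}

def flaky_task(host):
    """Stub task that succeeds on the second attempt."""
    attempts[host] = attempts.get(host, 0) + 1
    return attempts[host] >= 2

still_failed = retry_failed_hosts({"node07", "node42"}, flaky_task,
                                  can_ping=lambda host: host != "node42")
print(still_failed)   # node42 never answered ping, so it was never retried
```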

Version 0.28 

- Fixed problem with command replication which caused some hosts to not get the commands when there were over 160 hosts selected. 

- Added the session prompt at the console startup. 

- Added a host down indicator to the host groups. It indicates when one or more hosts in a group is down. 

- Added the ability to resize the different panes of the console independently. 

- Added the ability to create OS command macros for commands that are commonly run over many hosts. 

- Added an administrative log. Changes and fixes applied can be logged in an append only log, as a record of what has been done on a host. 

- Added a Host Notes section to the System Info section. This is for storing host information such as config, serial numbers, warranty info etc. 

- Changed the Search so it would select the line matching the search. 

- Changed the File Editor to stop users from trying to type the file name in the directory box. 

- Fixed Command Relay debug info appearing in the Command Output pane. 

- Fixed the Save Host Locations so it would not strip out the host descriptions. 

- Removed caching of host status. On double-clicking a host it will now check the status of the host every time. 

- Fixed cumulative bar graph so it will draw without requiring the user to select a type.

- Fixed the installation so it will get the directory correct for alternate install locations. 

- Fixed the upgrade script so it will try to stop MATd before upgrading, then restart it. 

- Fixed the host editor to stop it converting host group names to all lower case. 

- Fixed the host warning and error indicators so they would reset. 

- Fixed pointer under-run in many file parsing routines. 

- Fixed IRIX upgrade procedure, so it will properly extract the upgrade package. This means upgrades from the console should work on IRIX in versions 0.28 and higher. The timeout can be ignored.

- Increased default warning and error thresholds for the ethernet, runqueue and memory use alarms. 

Version 0.27 

- Added patch sorting to Solaris patch inventory 

- Added the ability to run OS commands as an alternate user. 

- Added $USER, $HOSTNAME, $HOME, $PWD environment variables to the Run OS command environment. 

- Added the ability to send OS command output to command output window instead of redirected to a file. 

- Added the ability to upgrade all hosts in the cluster just by doing Select-All and Upgrade.

- Added the ability to save the environment for OS commands.

- The install script now has a no-questions install. Open the script to see how to use it.

- Changed the MAThosts editor so the parent box (group) displays the parent group name instead of the id. 

- Changed the command relay so it will send the commands to up to 40 hosts at a time. 

- Deleting a monitoring task from MATd will remove the icon on subsequent visits to that host.

- Changed the monitoring thresholds so that out of the box they would not generate too many false alarms.

- Fixed the command relay so it would not block on connecting to dead hosts. 

- Fixed the console blocking problem when hosts were down. 

- Fixed a multi-host command Select-all problem when the error or warning thermometer was showing.

- Fixed command replication to work with more than 20 hosts at a time. 

- Fixed NIS user add so "0" was removed from login entry. 

- Fixed NIS so after a user add it would not flip back to the /etc/passwd. 

- Fixed NIS user editor so adding works. 

- Fixed Solaris reboot and shutdown functions. 

- Fixed install.agent script. Removed 0x0d characters. 

- Fixed several errors in the Mathost editor 

- Fixed a right-click problem with undefined $labels 

- Fixed upgrade message which displayed RAW file extraction problem message. 

- Fixed division by zero error messages in the line graphs 
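
This release changed the command relay to send commands to up to 40 hosts at a time. The batching idea can be shown with a short Python sketch; the generator name `batches` and the host names are hypothetical, and the sketch says nothing about how the relay actually dispatches each batch.

```python
def batches(hosts, size=40):
    """Yield consecutive slices of at most `size` hosts."""
    for start in range(0, len(hosts), size):
        yield hosts[start:start + size]


hosts = [f"node{i:03d}" for i in range(100)]
sizes = [len(group) for group in batches(hosts)]
print(sizes)  # [40, 40, 20]
```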

Version 0.26 

- Added the ability to run commands on one or more hosts at the same time. Included a command history so the command status on the hosts can be seen. In the next release this will be used to replay commands to hosts which were down the first time.

- Added resource status "temperature gauges". These show when a monitored object is in the Warning or Error state. The resource status is propagated to the lowest display level, so a host group will show an error when a host in that group has an error, or a host in a subgroup of the current group has an error.

- Added the ability to specify the period of historic data to graph from a host.  This makes the graphs faster.  Added X scaling, and different graph types. 

- Added Reboot and shutdown support, for one or more hosts at once. 

- Added status support to the Auto Probe. This is used to show when a host is in trouble, i.e. when one or more monitored resources are in warning or error status. The temperature gauges and an upcoming web status display rely on this.

- Added System inventory support for some of the OSes. It shows hardware and software on a host.

- Added TNM module. This adds multi-protocol support to TCL. In this release it is used to add SNMP traps and syslog messages to the resource error and warn alarms.

- Added nsswitch.conf support. 

- Added SMTP (Email) Server monitoring script. 

- Added POP3 Server monitoring script. 

- Added FTP Server monitoring script. 

- Added HTTP Server monitoring script. 

- Added Right-click actions for historic objects. 

- Added File sorting to the file lists. 

- Fixed polling bug in Tcl monitoring scripts which caused two monitoring scripts to be called within 1 minute even if the interval was greater than 1 minute.

- Fixed New button for new MAT hosts. 

- Fixed IP Scanner, to allow start and stop IP's to be the same. 

- Fixed backup host file list selection, to allow selection from other hosts. 

- Created multi-threaded matd for the TCL monitoring scripts, but shelved it, because of lack of support on HPUX 10.20 

- Created hooks for a web status display.  The next release will use these to show the hosts/groups status on a web browser. 
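
The temperature gauge behaviour described in this release, where a group shows an error if any host in it or in one of its subgroups has an error, is a worst-status roll-up over a tree. A minimal Python sketch under assumed data structures (nested dictionaries with hypothetical `hosts` and `groups` keys, and three status names):

```python
LEVELS = {"ok": 0, "warning": 1, "error": 2}


def group_status(group):
    """Return the worst status among a group's hosts and all its subgroups."""
    statuses = list(group.get("hosts", {}).values())
    statuses += [group_status(sub) for sub in group.get("groups", [])]
    # an empty group is treated as "ok" (an assumption of this sketch)
    return max(statuses, key=LEVELS.__getitem__, default="ok")


cluster = {
    "hosts": {"head": "ok"},
    "groups": [
        {"hosts": {"n1": "ok", "n2": "warning"}},
        {"hosts": {"n3": "ok"}, "groups": [{"hosts": {"n4": "error"}}]},
    ],
}
print(group_status(cluster))               # error, from the nested subgroup
print(group_status(cluster["groups"][0]))  # warning
```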

Version 0.25 

- Added an IP scanner to detect MAT clients on a network and automatically build the console hosts file. 

- Added a host status scanner to the console.  This goes out and tests each host known by the console.  If a host is down, when the host view is redrawn the host icon will indicate that the host was unavailable. 

- Re-wrote the console iconic display section. This was needed to support the new concept of managed objects. Each icon now represents a managed object. This will be used later to speed the development of new management functions. Today it allows users to make their own "status" objects. These report the current status of something.

- Added a startup splash screen.  It ain't pretty, and a better one is welcome. 

- Added the ability to run MATd tasks as a non-root user. 

- Fixed memory leak in console code.  It used to re-create an image as a user switched between hosts.  The images are now loaded at startup.  The display is now MUCH  faster. 

- Added a workaround for the xinetd used in Redhat 7.0. Xinetd was not passing argv[0], so it will be taken as argv[1], if provided.

- Added new graphing and display functions.  This makes extending the console with your own status, and historic functions easy. 

- Dropped Linux libc 5.x support.  Many strains of Linux are using glibc instead.  Lost the hard-disk on my libc 5.x machine. 

- Dropped SunOS 4.1.x support.  My Sun SLC died. 

Version 0.24 

- Changed host configuration code so it would be faster for large sites. 

- Changed the immediate replication and backup so they would not terminate when the console connection closed. 

- Re-coded MATd UI to support Tcl scripts. 

- Added a new UI for creating TCL monitoring scripts. 

- Added namespace graphing code for graphing data gathered from monitoring scripts. 

- Added introspection into the matagent. The remote host now tells the console what it can do. This allows users to add their own scripts.
  NOTE: This will cause an error to appear when using the status section of a new console with an old matagent.

- Embedded Tcl interpreter into MATd. MATd can periodically run Tcl scripts to gather performance information.

- Embedded Tcl interpreter into matagent. 

- Limited encryption to 128 bits.  Higher encryption may have export restrictions. 

Version 0.23 

- Added the ability to replicate files to alternate locations. For example replicate /www/htdocs/ to /usr/local/webroot/ or /opt/Perl/perl_5_005 to /usr/local/bin/perl. 

- Added an encryption library for secure file replication. It can have from 32 to 448 bit encryption. 

- Added the ability to run the console from a browser. The TCL browser plugin from Scriptics is needed, and can be downloaded from http://www.scriptics.com. Invoke the browser from the bin directory, and rename mat to mat.tcl. 

- Updated the User account UI. It is now more intuitive. Added default account values. Added locking reasons i.e. reasons why the user account was locked. 

- Fixed shadow password locking/unlocking for both local and YP shadow files. 

- Fixed post User add/modify so it would not fail if Perl was not available to run the AddUser, or ModUser scripts. 

- Fixed potential buffer overrun in group modify function. 

- Fixed password change error 702. Can now change password from login UI. 

- Fixed host add bug which would cause duplicate icons to appear. 

- Fixed file permissions change when adding, modifying, or deleting an entry. It now keeps the file permissions it started with.

- Changed MAT user UI so you can now edit the root user. 

- Changed local password editor, so you can modify root's password entry, but not delete. 

- Changed the command output window so it is part of the main screen, on a suggestion from JN. 

Version 0.22 

- Added multi-host commands. A single MAT command can be issued to one or more hosts. This is a real time-saver. Examples:

  1. Change the root password on all machines at once!
  2. Update resolv.conf on all your machines with a few clicks!
  3. Edit a file on one machine, and MAT will copy it to all the others.

- Added a more capable mail alias GUI. This one permits alias entries that are more natural.

- Fixed bugs introduced in 0.21. The 0.21 version had many changes to the command syntax, to allow commands to be issued to many hosts. The changes caused the following problems:

  1. MAT passwords could not be changed or added.
  2. The modify button did not work for some commands, but double-clicking did.
  3. Install.agent script was missing/broken in the release files, causing it not to install, or MATd to fail.
  4. Shadow password files were not being updated.

- Fixed Solaris upgrade bug. Upgrading a Solaris version of MAT now works correctly.

- Fixed MATd core dump. A configuration file was missing from the install.agent script. 

- Fixed error message from the file edit window when you double-click on a file. 

- Fixed shadow password locking for local password/shadow files. 

- Changed the mouse bindings on the host display. The right-button now selects a host or group. The middle button now has the new, and about items. 

- Changed autoLogin read/write to readn/writen. 

- Fixed fgets return type compares, and casting. 

- Fixed shadow password locking/unlocking 
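
The move from read/write to readn/writen in this release most likely refers to the usual helpers that loop until the full byte count has been transferred, since a single read or write on a socket may move fewer bytes than requested. The Python sketch below illustrates that pattern under this assumption; it is not the MAT source, and the socket pair stands in for the console and agent connection.

```python
import socket


def readn(sock, n):
    """Read exactly n bytes, or fewer only if the peer closes early."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            break                      # peer closed the connection
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def writen(sock, data):
    """Write all of data, looping over short writes."""
    view = memoryview(data)
    while len(view) > 0:
        sent = sock.send(view)
        view = view[sent:]             # drop the bytes already sent


left, right = socket.socketpair()
writen(left, b"get matpass\n")
reply = readn(right, 12)               # the message above is 12 bytes long
left.close()
right.close()
```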

Version 0.21 

- Added a restore GUI.  The GUI gets the backup history from the tape  server and displays it in a logical fashion.  It uses the online catalogs so you can see which tape is appropriate. 

- Added ability to drag host icons into logical groupings, and a save location button so it would recall where you placed the hosts icon. 

- Added a new host grouping scheme. This permits grouping of hosts into logical/organizational groups. Updated the MAT host GUI to create them.

- Added code to select console fonts based on OS.  This was needed for  Win 95/NT. 

- Added backup history logging.  This was needed for the restore GUI and will be used for tape management. 

- Added code so MAT will now try to determine the OS type of newly added hosts.

- Added MAT daemon logging for debugging. Invoke with:  /var/mat/bin/matd -d {0|5|10|50}
  This was undocumented in version 0.20.

- Added code to catch unfinished commands from the console.  These are generated when the user does not wait for the command output before issuing another command.  Problem is more apparent on slow machines. 

- Added code to support MAT license management. For future use.

- Added file editing for arbitrary files. A GUI browses the files on the remote host, and retrieves the file for editing.

- Changed the Crontab and Mount command set.  This is needed for a future feature which will allow commands to be issued across many hosts. 

- Changed NIS code to allow large groups. 

- Changed Backup GUI, so it would be more intuitive.

- Changed Replication GUI so it looked better. 

- Changed TAPE get cat command so it accepts the real volume number, and not (N - 1)/2. 

- Changed log so it records the user as well as the action. 

- Changed the "cmd" program so it would accept commands as arguments, e.g. running:
      ./cmd CON get matpass 
  will get the MAT password file.  This will allow scripting MAT functions. 

- Fixed MAT upgrade feature.  You can upgrade clients directly from the console without having to login to the host. 

- Fixed buffer overflow in timestr.  It was too short in all functions. 

- Fixed bug in mount code which caused user scripts to run before changing file. 

- Fixed bug in mail code which caused newalias script to run before changing file. 

- Fixed month field in MATd date stamp. 

- Fixed time-out bug in backup code. This caused the backup to stop on files greater than 40 MB.

- Fixed bug in directory restores. It would not restore directory contents if several hosts were backed up at the same time.

- Fixed MAT host editor.  Changing the services offered by a host will not erase the host type now. 

- Fixed logging bug in tape library.  Log file was not always opened. 

Version 0.20 

- An HP UNIX 10.10, 10.20 port is now available. It includes most of the functionality of the other ports. It does not have the user-added-functions support yet, because it does not seem to have dlopen.

- A file/directory replication module has been added.  It allows a replication server to replicate its files to up to 64 clients.  Replication is done in parallel, and can be set to only replicate changed files.  This is a nifty module.  It can use as much network/disk bandwidth as you've got.  It's 

- The GNU libc is missing the crypt function. MAT used this function for its passwords. This has been replaced with a new password algorithm. It should now work with Redhat and other Linux distributions based on GNU libc.

- The monitoring scripts have been updated to support 64bit IRIX. 

- The online help has been updated for the replication module, and to fix an error in the tape restore example.

- The online tape catalog structures had to be changed due to a bug in the backup group structure. This means 0.19 backup records are not compatible with 0.20.

Version 0.19 

- This is the first release of the network backup module. This module allows you to back up your UNIX machines across a network. It has many of the features of expensive backup packages including:
  • The ability to back up across a network.
  • The ability to back up many hosts at the same time.
  • Online indexes to simplify restores.

  Most of the development work since the last release has been to produce this module. An easy-to-use GUI is included to make managing backups simple. It has been tested backing up 6 different hosts at a time, with IRIX and Sun tape drives.

- All scripts now include version numbers.  This will allow you to customize them without fear of later upgrades wiping them. 

- Updates to aliases would fail running newAliases.   Fixed typo, so it works in this release. 

- Bugs in the SGI disk monitoring scripts have been fixed. 

- The robustness of MATd has been greatly increased. 

Version 0.18 

- Added Swap and Memory use monitoring to MATd for Linux.  The alarm threshold is (free - used). 

- Created "hooks" into the GUI to allow users to add their own modules to MAT. The file userdef1.tcl contains an example and instructions on how to add new administrative functions to MAT. Files libuser1.[ch] contain an example and instructions for adding the administrative functions to the mat agent. Used in conjunction with the GUI code, it allows a user to add new administrative functions to MAT.

- Wrote a parameter discovery script for MATd.  This was worth doing a new release for.  When initially installed MATd will attempt to automatically discover which parameters it can monitor, and will build the configuration file accordingly.  Discovery is also added as a MATd task, so it can detect new hardware/services and configure MATd to monitor them. 

- Fixed bug in the Status screen. Some queries would time out. Just catching SIGCHLD is not reliable, so it had to be augmented with waitpid.

- Converted email alias tool to use the general config library of functions. This is necessary to allow multiple hosts to be updated from a single command.

- Changed MATd path so it is relative to $MATHOME instead of bin/$MATHOME. 

- Re-wrote WWW pages. 

- Added MATd task status into MATd, and the MAT agent.  This is for later use so we can see the "health" of a host just by looking at the MAT console.  It will probably be incorporated into the Probe function. 
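
The Status screen fix in this release notes that catching SIGCHLD alone is not reliable and had to be augmented with waitpid. Signals of the same type can be merged while one is pending, so a common remedy is to call waitpid in a non-blocking loop until no exited child is left. The Python sketch below shows that loop for POSIX systems; it is an illustration of the general technique, not MAT's C code, and the name `reap_children` is hypothetical.

```python
import os


def reap_children():
    """Collect every child that has already exited, without blocking."""
    reaped = []
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break               # this process has no children at all
        if pid == 0:
            break               # children exist, but none has exited yet
        reaped.append((pid, status))   # raw wait status, undecoded
    return reaped
```

Calling such a function both from the SIGCHLD handler and periodically from the main loop means a missed signal only delays the clean-up instead of losing it.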

Version 0.17 

- Show-stopping bug fixed. Symbolic links to the actual config files were not followed; as a result, changes were never made to the real config files.

- Changed code so there is no migration needed for: /etc/passwd, /etc/hosts, /etc/group, /etc/services 

- Fixed modification of a user's password. The old version put an unencrypted copy in the password file.

- Fixed lastlog display 
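
The symbolic link bug fixed in this release is typical of tools that write a new copy of a config file and rename it into place: the rename replaces the link instead of the file it points to. The change log does not describe MAT's actual write strategy, so the Python sketch below simply assumes write-then-rename and shows how resolving the link first keeps it intact; `replace_config` is a hypothetical name.

```python
import os
import tempfile


def replace_config(path, text):
    """Replace the real file behind path, leaving any symlink in place."""
    real_path = os.path.realpath(path)           # follow symbolic links
    directory = os.path.dirname(real_path)
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        handle.write(text)
    os.replace(tmp_path, real_path)              # rename over the real file
    return real_path
```

A production version would also restore the original owner and permissions, since mkstemp creates the temporary file with restrictive permissions.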

Version 0.16 

- Changed text display procedure so that on communication problems such as timeouts and "host not found" it does not pop up a Tk error box.

- Text display was affected by previous colour scheme and bindings. Changed it so that most items will remove old bindings first.  For performance the larger lists (passwd, YP passwd) do not wipe the bindings first.  This should not be a problem, because they should be replaced with new ones anyway. 

- SunOS 4.1.x disk usage status fixed. 

- Updated logs so they are more meaningful. 

- NIS netgroup fails when entries cannot be parsed.  Changed it so these entries are now ignored. 

- Password did not work with some special characters notably "#". 

- Error in install.agent script. Did not create /etc/inetd.conf entry properly. 

- Modify of some services entries failed. 

Version 0.15 

- Lastlog monitoring does not work (Don't see the cause yet). 

- The diskspace usage reports NFS filesystems as well. 

Version 0.14 

- Mount GUI is not making the mount point, and trying to mount the filesystem. 

- Mount GUI only for Linux 

- Documentation is not available on all topics.