Micro Focus Server Automation provides comprehensive, compliant server management at the heart of Data Center Automation.
This post is the monthly Micro Focus Server Automation Tips and Tricks roundup, a consolidation of common issues seen in Micro Focus Server Automation. Do check out this article for troubleshooting tips and tricks for other tools.
Micro Focus Server Automation – Tips and Tricks – Jan 2021
1. When the user is unable to delete an SA agent on Windows
Users are often unable to delete an SA agent on Windows because the uninstall .bat file cannot delete all of the agent's folders, files, and data.
WatchdogOpswareAgent.log:
[ 06/19/20 14:47:09 ] [ 9172 ] Service control manager requests service stop.
[ 06/19/20 14:47:09 ] [ 9172 ] Setting watchdog terminate event.
[ 06/19/20 14:47:09 ] [ 9172 ] Waiting for an agent to terminate.
[ 06/19/20 14:47:09 ] [ 9184 ] Received watchdog service shutdown request, terminating the agent.
[ 06/19/20 14:47:09 ] [ 9184 ] Terminating the service (signaling...).
[ 06/19/20 14:47:09 ] [ 9172 ] Agent terminated.
[ 06/19/20 14:47:09 ] [ 9184 ] Server process terminated with exit status 0.
[ 06/19/20 14:47:09 ] [ 9172 ] Set SERVICE_STOP_PENDING ...
[ 06/19/20 14:47:09 ] [ 9184 ] Set SERVICE_STOPPED.
[ 06/19/20 14:47:09 ] [ 9172 ] Error: unable to set service status to SERVICE_STOPPED: The handle is invalid.
This error usually occurs because McAfee antivirus is in the lockdown-enabled state, which indicates that the “Application change control protection” feature is on. While this mode is enabled and no exclusions are defined, files in directories under the C:\ drive cannot be deleted or changed, either through cmd or manually. Some folders and file paths need to be excluded from this feature.
The primary folder to exclude is:
%SystemDrive%\Program Files\Common Files\Opsware
If PowerShell integration is in use, the folder given below may also need to be whitelisted:
2. When an opsware-agent module is lost on the primary core
Users sometimes encounter an error when the opsware-agent module is lost on the primary core. It can be solved by reinstalling the core agent with the steps given below:
1. # service opsware-sas stop opsware-agent
2. # /opt/opsware/agent/bin/agent_uninstall.sh --no_deactivate
3. # mv /var/opt/opsware/crypto/agent/ /tmp
# mv /etc/opt/opsware/agent/opswgw.args /tmp
4. Export the agent .srv file from the SA client:
- Navigate to the Library.
- Under Folder, select Opsware.
- Find Tools and click on Agent Support.
- Select agent.srv from the list provided.
- Click on the Actions menu, then select Export Software.
- Place the .srv file in the same directory where the installer was initially placed.
In this case, that is /var/opt/opsware/agent_installers
5. # /opt/opsware/oi_util/curl/bin/curl -k --cert /var/opt/opsware/crypto/httpsProxy/spin.srv "https://localhost:1004/spinrpc.py?method=Device.update&id=10001&allow_recert=1"
6. # ./opsware-agent-70.0.76120.1-linux-OEL6 --logfile /tmp/opsware_agent
3. Solving the error “word_uploads have not been completely installed” while upgrading
When initiating an upgrade to SA 2018.08, users may come across an error that says:
Core Errors Detected
The error appears because a few important components have not been completely installed. They are given below:
word_uploads on xx.xx.xx.xx
The most common cause of this error lies in the /var/opt/opsware/install_opsware/inv/install.inv file. The upgrade media reads this file and detects the problem with word_uploads. When the user checks the file manually, they will come across:
%word_uploads
build_id: opsware_70.0.74585.0
state: incomplete
This is not what the upgrade needs in order to proceed. The entry should show the following details:
%word_uploads
build_id: opsware_70.0.74585.0
state: complete
script: pre
script: post
This can happen for several reasons, such as the ones given below:
- Often the inv file was not updated properly.
- Sometimes the RPM versions were not updated.
You can try editing the inv file so that the word_uploads entry ends with the three lines state: complete, script: pre, and script: post. Then restart the system and run the upgrade again.
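Before editing the inv file by hand, it can help to confirm what state is actually recorded for the component. The following is a minimal sketch, not an official SA tool: the function name component_state is hypothetical, and it assumes that each component in install.inv starts with a "%name" line followed by "key: value" lines, as in the entry shown above.

```shell
# Print the "state:" value(s) recorded for one component in an
# install.inv-style file.
# Assumption: a component block starts with a "%name" line and is
# followed by "key: value" lines until the next "%" line.
component_state() {
    local inv_file="$1"
    local component="$2"
    awk -v target="%$component" '
        /^%/ { in_block = ($1 == target) }        # a new component block begins
        in_block && $1 == "state:" { print $2 }   # state line inside the wanted block
    ' "$inv_file"
}
```

For example, component_state /var/opt/opsware/install_opsware/inv/install.inv word_uploads prints the recorded state, and prints nothing if the component has no state line.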
4. Resolving a full file system
Users often come across a file system that is too full to take any more files or data. The most efficient solution is to increase the disk space available to the /var/opt/opsware filesystem. To prevent similar errors in the future, add extra space to the filesystems that need it.
Under /var/opt/opsware, there are many directories whose disk consumption grows in ways that are hard to estimate in advance. Users can start by extending the filesystem with a much larger amount of disk space to avoid space-related issues.
Put simply, the longer SA is in use, the more space the filesystem consumes, and intensive usage speeds this up.
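To see which directories are consuming the space before extending the filesystem, a small helper like the one below can be used. This is only a sketch built on the standard du, sort, and head utilities; the function name largest_dirs and its default arguments are illustrative choices, not part of SA.

```shell
# List the largest entries directly under a directory, biggest first.
# Sizes are reported in KiB (du -k). Arguments and defaults are
# illustrative assumptions: $1 = directory, $2 = number of entries.
largest_dirs() {
    local dir="${1:-/var/opt/opsware}"
    local count="${2:-10}"
    # du -s prints one summary line per entry; sort -rn orders by size
    du -sk "$dir"/* 2>/dev/null | sort -rn | head -n "$count"
}
```

Running largest_dirs /var/opt/opsware 5 lists the five largest entries, which shows where the growth is concentrated.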
You can follow the steps given below to clean the cache and free up some space in the filesystem:
To remove unneeded files from the word cache:
- Edit these parameters in /etc/opt/opsware/mm_wordbot/mm_wordbot.args on all cores and satellites:
cache_max_size – the maximum size of the cache directory
cache_min_size – the minimum size of the cache directory
cache_cleanup_rate – the clean-up rate of the cache directory (the default value is 720 minutes, or 12 hours; users can change it to 1 hour).
For /var/log/opsware specifically, the growth is most likely due to /var/log/opsware/waybot/debug/; the waybot/debug/ directory can be deleted safely.
- From the Java client, navigate to System Configuration.
- Find Command Engine (WAY) and search for way.debug_size.
- Set it to “0” if it is currently “1”. This disables most of the additional logging.
Then set way.debug_staleness_threshold to the number of seconds after which these cache files are cleared automatically. Make sure to set the value accurately.
5. Fixing the error when Server Automation fails to find the primary oracle version file during installation
During the installation of a secondary Server Automation (SA) core into an existing SA environment, the following error may be displayed on the screen: “Failed to find the primary oracle version file”.
When a secondary SA core is installed into an existing SA environment, the hpsa_install.sh script is run on the server that is to house the secondary core. When the script fails, it shows the messages given below:
Executing post oracle install step ......
I could not find the database version of primary core.
Failed to find the primary oracle version file
This issue can be easily fixed with the steps given below:
- Create the directory /var/opt/opsware/truth on the server that is housing the secondary core:
mkdir -p /var/opt/opsware/truth
- Copy the cdf.<newcore>.xml and truth.<newcore>.tar.gz files to /var/opt/opsware/truth.
Once the files are in place, if the hpsa_install.sh script is still sitting at the prompt shown after the error, select the “previous” option. After all the questions have been answered again, the installation will start and finish without any error.
If the hpsa_install.sh script was exited after the error appeared, run it again, this time specifying the cdf.xml file created by the first, aborted attempt:
<sa_iso_directory>/disk001/opsware_installer/hpsa_install.sh -c /var/opt/opsware/install_opsware/cdf/cdf.xml
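Before retrying the installer, you may want to confirm that both files really are in place on the secondary core server. The sketch below is a hypothetical helper (the name missing_truth_files is not an SA command); it only checks for the two file names mentioned in the steps above, with the core name passed in as an argument.

```shell
# Print the path of each required file that is missing from the truth
# directory; print nothing when both files are present.
# $1 = new core name as used in the file names, $2 = directory (optional).
missing_truth_files() {
    local core="$1"
    local dir="${2:-/var/opt/opsware/truth}"
    local f
    for f in "cdf.$core.xml" "truth.$core.tar.gz"; do
        [ -f "$dir/$f" ] || echo "$dir/$f"
    done
}
```

An empty result from missing_truth_files <newcore> means the copy step is complete and the installer can be retried.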
6. Tuning the Twist heap size for better performance
Increasing the heap size is the most efficient way to allocate more memory to the Twist component on the server.
Solution
1. Get the output of the commands given below:
free -m
cat /proc/meminfo
cat /proc/cpuinfo
top
2. Increase the Twist heap size to at least 4 GB:
a. Using a text editor, open the file that contains the twist.mxMem entry. Make sure to back up this file before editing it.
b. Modify the following entry to the needed allocation:
twist.mxMem=<memory size>
Example: twist.mxMem=8192m
If the value is already set to 4 GB or more, double the current value, making sure that the memory available on the server can accommodate it.
c. Save the changes and then restart the Twist:
/etc/init.d/opsware-sas restart twist
To check that the change has taken effect after the restart:
Find the Twist PID with cat /var/opt/opsware/twist/twist.pid
and then run:
ps -aux | grep <pid>
- Repeat steps a to c on all of the Cores/Slices.
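The sizing rule described above (at least 4 GB, or double the current value if it is already 4 GB or more, provided the server has the memory) can be sketched as two small helpers. Both function names are hypothetical, values are in megabytes, and the memory check simply compares the proposed heap against MemTotal from /proc/meminfo, which is a rough assumption rather than an official sizing formula.

```shell
# Suggest a new Twist heap size in MB from the current one:
# below 4096 MB -> 4096 MB, otherwise double the current value.
suggest_twist_heap_mb() {
    local current_mb="$1"
    if [ "$current_mb" -lt 4096 ]; then
        echo 4096
    else
        echo $(( current_mb * 2 ))
    fi
}

# Succeed only if the proposed heap (MB) is smaller than the server's
# total memory as reported by MemTotal (kB) in a meminfo-style file.
heap_fits() {
    local proposed_mb="$1"
    local meminfo="${2:-/proc/meminfo}"
    local total_kb
    total_kb=$(awk '/^MemTotal:/ { print $2 }' "$meminfo")
    [ $(( proposed_mb * 1024 )) -lt "$total_kb" ]
}
```

For instance, if the current setting is twist.mxMem=4096m, suggest_twist_heap_mb 4096 proposes 8192, and heap_fits 8192 tells you whether the server has more total memory than that before you apply it.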
7. Solving the prerequisite failure seen at the time of upgrade
When upgrading an existing Server Automation (SA) environment to SA 2020.11, users may face a failure during the prerequisite check. The error displayed on the screen says:
FAILURE Oracle NLS_CHARACTERSET = UTF8 ; required to be = AL32UTF8
The Oracle database used by the previous SA version has NLS_CHARACTERSET set to UTF8. SA 2020.11 can run with Oracle databases using either UTF8 or AL32UTF8. However, AL32UTF8 is the safer of the two, and SA 2020.11 was designed to run with databases using this value.
This failure is probably caused by a bug, which is why it keeps being reported. If it is the only failure reported in the prerequisite check by hpsa_upgrade.sh, select Continue.
Then proceed with the upgrade. This failure is a minor issue; it does not hinder any of the processes and will not recur afterwards.
8. Instructions to disable the buildmgr component
When the buildmgr component starts, its output may include a warning that the JKS keystore uses a proprietary format, along with a recommendation to migrate to PKCS12, which is an industry-standard format, using:
"keytool -importkeystore -srckeystore /opt/opsware/buildmgr/etc/keystore -destkeystore /opt/opsware/buildmgr/etc/keystore -deststoretype pkcs12".
Starting Jetty: Console log can be found on /var/log/opsware/buildmgr/console.log
nohup: redirecting stderr to stdout
Jetty running pid=10008
Verifying "localhost" is serving HTTPS on
The buildmgr component is needed only if you are using the deprecated OS Sequences feature. Otherwise you do not need it, because OS Build Plans (which replaced OS Sequences) do not need the buildmgr component. In that case, it is better to shut down the buildmgr component.
The buildmgr can be removed from the startup with the steps given below:
- Stop the buildmgr component:
/etc/init.d/opsware-sas stop buildmgr
- Comment out buildmgr in /opt/opsware/oi_util/startup/components.config
- Rename /etc/opt/opsware/startup/buildmgr to something like .buildmgr (/etc/opt/opsware/startup/.buildmgr)
Note that the “.” before the buildmgr name hides the file from ordinary directory listings. Also be aware that you may need to enable the build manager component again before installing any rollup or hotfix bundles, or applying CORD patches.
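The "comment out" step can be done in any text editor; the sketch below shows one scripted way to do it while keeping a backup. It is an illustration only: the function name comment_out_component is hypothetical, and it assumes that components.config lists components one per line and treats lines starting with "#" as comments, which you should verify against your own file first.

```shell
# Comment out every non-comment line that mentions a component name,
# keeping the original file as <config>.bak.
# Assumption: "#" at the start of a line marks a comment in this file.
comment_out_component() {
    local config="$1"
    local component="$2"
    cp -p "$config" "$config.bak"
    awk -v name="$component" '
        /^[ \t]*#/      { print; next }          # already a comment, keep as is
        index($0, name) { print "#" $0; next }   # disable lines naming the component
        { print }                                # everything else unchanged
    ' "$config.bak" > "$config"
}
```

To re-enable the component later, for example before applying a rollup or CORD patch, restore the file from the .bak copy.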
9. Instructions on identifying SA versions
Server Automation (SA) contains many internal components with a variety of version numbers that may seem hard to equate to actual SA versions.
There are several ways to find out which SA product version is in use:
Through the SA Java client
This method can be used only from a PC that has the SA client installed:
- Start up and log in to the SA client.
- Navigate to the Help tab at the top of the client window.
- Select the About HPE Server Automation or About Server Automation option.
- In the resulting window, under the Version tab, the edition of the SA product and its internal build version are visible on the second line.
Through the version command
This method applies only to a Linux server acting as one of the SA core servers. Execute the command: /opt/opsware/support/bin/version
If the message “No such file or directory” appears, the SA software support bundle “OPSWtools” has not been installed on this SA core server. Installing this bundle (which does not require product downtime) is strongly advised, as it provides the version utility and other tools that are very effective in maintaining the SA product.
Through the cora command
This method applies only to Linux operating systems. Log in to the server and execute the command given below:
/opt/opsware/support/bin/cora -c SAVersion -i
The product version will be visible at the top of the output. The same information can be pulled out of a .cora file that was created earlier by the cora utility by executing the command given below:
/opt/opsware/support/bin/cora -c SAVersion <cora.file>
If a “No such file or directory” error is shown when running this command, the OPSWtools support bundle mentioned earlier is not installed.
Through the component versions in the install.inv file
The different Server Automation core components are listed with their build versions in the file /var/opt/opsware/install_opsware/inv/install.inv on a Server Automation core server. Log in to any of the SA core servers and run the command given below:
grep -i build_id /var/opt/opsware/install_opsware/inv/install.inv | sort | uniq
The version reported is the Server Automation build version. Note that if multiple versions are reported, the SA product has gone through upgrades. The older versions belong either to components that are no longer used in the current Server Automation version or to components that were not upgraded properly.
Through the Server Automation product ISO
When the Server Automation product is initially downloaded, its components come in a gzip file named “T8900-150nn.tgz” (where nn is any two-digit number). When uncompressed and unarchived, this file creates several directories, all with names that contain the same “T8900-150nn” part number. Note that the ISOs for CORD releases (minor upgrades to a primary SA version, such as SA versions 10.21, 10.22, 10.23, and 10.51) do not have files with a “T8900-150xx” name. Instead, those files carry the SA build versions mentioned earlier, in the format nn.n.nnnnn.n.
10. Fixing the “the system cannot find install_tool_x64.exe” error
Users may come across error messages that block the agent installation. Some sample messages are given below:
[01/Aug/2019 12:39:48] [TRACE] RunCommand('install_tool_x64.exe --zap --loglevel=trace')
[01/Aug/2019 12:39:48] [ERROR] RunCommand() - Popen failed : 'install_tool_x64.exe --zap --loglevel=trace' : Error : (2) : 'The system cannot find the file specified.
[01/Aug/2019 12:39:48] [ERROR] zap_agent: FAIL
...
[01/Aug/2019 12:39:48] [ERROR] RunCommand() - Popen failed : 'install_tool_x64.exe --unpack="C:\Users\X18303~1\AppData\Local\Temp\3\~5504-1.WRK\opsware-agent.exe","C:\Program Files\Opsware\agent" --install="C:\Program Files\Opsware\agent","C:\Users\X18303~1\AppData\Local\Temp\3\~5504-1.WRK" --loglevel=trace' : Error : (2) : 'The system cannot find the file specified.
[01/Aug/2019 12:39:48] [TRACE] install_tool: FAIL
[01/Aug/2019 12:39:48] [ERROR] Opsware agent installation failed.
These errors can be solved with the steps given below:
On the server that displayed the error, it was set like:
And on the rest of our servers, it was set like:
- Open CMD as administrator on the Windows server that has the error.
- Then run the command below:
If there are multiple paths, involve your Windows admin team and let them change the value in the registry. After the value is changed, you can confirm the change from C:\windows\system32\cmd.exe, which indicates that it is working correctly.