Practice Free NCP-AII Exam Online Questions
You are a network administrator responsible for configuring an East-West (E/W) Spectrum-X fabric using SuperNICs. The BlueField-3 devices in your network should be set to NIC mode with RoCE enabled to optimize data flow between servers. You have access to the Spectrum-X management tools and the necessary documentation, and you need to use specific configuration commands to achieve this setup.
Which of the following steps and commands are necessary to configure the BlueField-3 devices in NIC mode for the E/W Spectrum-X fabric using SuperNICs? Select the two correct responses.
- A . Use the command sudo mlxconfig -d /dev/mst/<device> set INTERNAL_CPU_OFFLOAD_ENGINE=1 to configure the SuperNIC to operate in NIC mode.
- B . Use the command sudo mlxconfig -d /dev/mst/<device> set LINK_TYPE_P1=2 to enable Ethernet on the BlueField-3 devices.
- C . Use the command sudo mlxconfig -d /dev/mst/<device> set DPU_MODE=1 to set up the BlueField-3 devices in DPU (Data Processing Unit) mode.
- D . Use the command sudo mlxconfig -d /dev/mst/<device> set DISABLE_SPECTRUM_X=1 to reduce overhead.
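The options above all follow the same `mlxconfig -d <device> set PARAM=VALUE` shape. As a minimal sketch, the helper below only assembles such a command line and never runs it. The function name `build_mlxconfig_cmd` is hypothetical, the device path is left as the placeholder from the question, and the two parameters shown are simply taken from options A and B for illustration.

```python
import shlex


def build_mlxconfig_cmd(device, settings):
    """Return the argv list for an mlxconfig 'set' call (hypothetical helper).

    device   -- MST device path, e.g. the /dev/mst/<device> placeholder
    settings -- mapping of firmware parameter name to value
    """
    if not settings:
        raise ValueError("at least one PARAM=VALUE setting is required")
    args = ["sudo", "mlxconfig", "-d", device, "set"]
    # Each setting becomes one PARAM=VALUE token, in insertion order.
    args += [f"{name}={value}" for name, value in settings.items()]
    return args


# Illustration only: parameters copied from options A and B of the question.
cmd = build_mlxconfig_cmd(
    "/dev/mst/<device>",
    {"INTERNAL_CPU_OFFLOAD_ENGINE": 1, "LINK_TYPE_P1": 2},
)
print(shlex.join(cmd))  # shell-quoted form for review before running by hand
```

Building the argument list separately from execution makes it easy to review the exact command before it touches firmware settings, which typically take effect only after a reset or power cycle.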
An engineer needs to validate 400G DAC cable signal integrity in a DGX cluster.
Which CVT metric best identifies marginal cables needing replacement?
- A . Effective BER > 1.5E-254 during a ≤6-hour monitoring window.
- B . Temperature fluctuations > 5°C during validation.
- C . Transceiver model matching QSFP-DD specifications.
- D . Lane power variance < 3dB across all transceivers.
A system administrator noticed a failure on a DGX H100 server. After a reboot, only the BMC is available.
What could be the reason for this behavior?
- A . A boot disk has failed.
- B . The network card has no link / connection.
- C . There are more than two failed power supplies.
- D . Multiple GPUs have failed.
A systems engineer is updating firmware across a large DGX cluster using automation.
What is the best practice for minimizing risk and ensuring cluster health during and after the process?
- A . Drain nodes from the scheduler, update firmware in batches, skip diagnostics and verify health post-update before scaling to the next batch.
- B . Drain nodes from the scheduler, run pre-update diagnostics, update firmware in batches, and verify health post-update before scaling to the next batch.
- C . Update nodes that have reported faults, leaving others on older firmware.
- D . To save time, simultaneously update all nodes in the cluster without draining or diagnostics.
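A batched rollout of the kind described in the options (drain, diagnose, update, verify, then move on) can be sketched as below. This is an assumption-laden outline, not a vendor procedure: `rolling_update` and its four callables (`drain`, `diagnose`, `update`, `verify`) are hypothetical hooks that a real automation tool would supply.

```python
def rolling_update(nodes, batch_size, drain, diagnose, update, verify):
    """Update nodes batch by batch, stopping at the first failed check.

    drain, diagnose, update and verify are hypothetical callables taking a
    node name; diagnose and verify return True when the node is healthy.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    updated = []
    for start in range(0, len(nodes), batch_size):
        batch = nodes[start:start + batch_size]
        # 1. Remove the batch from the scheduler so no jobs are running.
        for node in batch:
            drain(node)
        # 2. Pre-update diagnostics: do not flash a node that is already sick.
        unhealthy = [n for n in batch if not diagnose(n)]
        if unhealthy:
            raise RuntimeError(f"pre-update diagnostics failed: {unhealthy}")
        # 3. Apply the firmware update to the batch.
        for node in batch:
            update(node)
        # 4. Verify health before scaling to the next batch.
        failed = [n for n in batch if not verify(n)]
        if failed:
            raise RuntimeError(f"post-update verification failed: {failed}")
        updated.extend(batch)
    return updated
```

Raising on the first failed batch keeps the blast radius to one batch; the remaining nodes stay on the old firmware until the failure is understood.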
A system administrator receives an alert about a potential hardware fault on an NVIDIA DGX A100. The GPU performance seems degraded, and the system fans are operating loudly.
What step should be recommended to identify and troubleshoot the hardware fault?
- A . Increase the fan speed to maximum and check whether the performance improves.
- B . Check the NVIDIA System Management Interface (nvidia-smi) for GPU status and temperatures.
- C . Run a deep learning workload to stress test the GPUs and check whether the issue persists.
- D . Power-drain the DGX, restart it, and check whether the performance degradation resolves.
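For reference, GPU status and temperatures can be read from `nvidia-smi` in a machine-readable form with its `--query-gpu` and `--format=csv` options. The sketch below parses that CSV output and flags hot GPUs. The helper names and the 85 °C limit are assumptions chosen for illustration, not NVIDIA thresholds.

```python
import csv
import io
import subprocess

# Real nvidia-smi options; the selected fields are index, temperature, utilization.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,temperature.gpu,utilization.gpu",
    "--format=csv,noheader,nounits",
]


def read_gpu_status():
    """Run nvidia-smi and return its raw CSV text (requires an NVIDIA driver)."""
    return subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout


def _to_int(field):
    # nvidia-smi prints markers such as [N/A] when a value is unavailable.
    return int(field) if field.isdigit() else None


def parse_gpu_status(text):
    """Parse 'index, temp, util' lines into a list of dicts."""
    rows = []
    for rec in csv.reader(io.StringIO(text), skipinitialspace=True):
        if len(rec) != 3:
            continue  # skip blank or unexpected lines
        index, temp, util = (field.strip() for field in rec)
        rows.append({"index": int(index), "temp_c": _to_int(temp), "util_pct": _to_int(util)})
    return rows


def hot_gpus(rows, limit_c=85):
    """Return indices of GPUs at or above limit_c (85 is an assumed example limit)."""
    return [r["index"] for r in rows if r["temp_c"] is not None and r["temp_c"] >= limit_c]
```

On a live system, `hot_gpus(parse_gpu_status(read_gpu_status()))` would list the GPU indices worth inspecting first.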
For an NVIDIA Enterprise AI Factory with 256 GPUs, which storage solution characteristic is most critical to validate during scaling tests?
- A . Consistent per-node throughput ≥8 GiB/s.
- B . RAID rebuild times under disk failure.
- C . Single-node write performance during idle clusters.
- D . Maximum 4K random read IOPS exceeding 1 million.
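To make the idea of a per-node throughput check concrete, the sketch below compares measured rates against a target during a scaling run. The function names and the sample measurements are hypothetical, and the 8 GiB/s default is taken from option A purely as an example target.

```python
def nodes_below_target(per_node_gib_s, target_gib_s=8.0):
    """Return the sorted names of nodes whose measured throughput misses the target."""
    return sorted(node for node, rate in per_node_gib_s.items() if rate < target_gib_s)


def throughput_spread(per_node_gib_s):
    """Return max minus min throughput, a rough indicator of consistency."""
    rates = list(per_node_gib_s.values())
    return max(rates) - min(rates)


# Hypothetical measurements in GiB/s from one scaling step.
measured = {"node01": 8.4, "node02": 7.1, "node03": 8.0}
slow = nodes_below_target(measured)
```

Repeating the check as more nodes join the test shows whether per-node throughput holds steady or degrades as the cluster scales.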
A team is validating a DGX BasePOD deployment. Using cmsh, they run a command to check GPU health across all nodes.
What indicates that the system is ready for AI workloads?
- A . Only the head node’s GPUs need to be healthy.
- B . The command output is ignored if the system powers on without errors.
- C . At least half of the GPUs report Status_Health = OK.
- D . All GPUs report Status_Health = OK and Health = OK for each device.
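An all-devices readiness rule can be expressed as a short check over already-parsed health records. This sketch does not parse real cmsh output, whose exact format is not shown here; the record layout and the function name `cluster_ready` are assumptions, with the field names `Status_Health` and `Health` borrowed from the option text.

```python
def cluster_ready(devices):
    """Return True only if every device reports OK in both health fields.

    devices is an assumed list of dicts such as
    {"node": "dgx01", "gpu": 0, "Status_Health": "OK", "Health": "OK"}.
    An empty list is treated as not ready, since nothing was verified.
    """
    if not devices:
        return False
    return all(
        d.get("Status_Health") == "OK" and d.get("Health") == "OK"
        for d in devices
    )
```

Requiring every record to pass, rather than a majority, means a single degraded GPU blocks the readiness sign-off.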
A network engineer is tasked with configuring the management, storage, and compute networks for a new DGX BasePOD deployment.
Which statement best describes the network segmentation required for optimal operation?
- A . Two networks: one for management and one for compute.
- B . A single VLAN for all types of network traffic.
- C . Four networks: compute, storage, out-of-band, and management.
A leaf switch shows "FW Version Mismatch" alerts for transceivers after cluster expansion.
Which tool validates transceiver firmware against expected versions?
- A . mlxconfig
- B . iblinkinfo
- C . ethtool
- D . flint
A financial services firm is deploying an AI model for fraud detection that requires rapid inference and data retrieval across multiple sites.
Which feature should their storage system prioritize?
- A . Multi-protocol data access with low latency.
- B . High capacity with moderate speed.
- C . Tape backup systems.
- D . Low-cost HDD solutions.
