Working with NVIDIA GPU Node Group


Overview

  • The NVIDIA GPU Operator is an operator that simplifies the deployment and management of GPU nodes in Kubernetes clusters. It provides a set of Kubernetes custom resources and controllers that work together to automate the management of GPU resources in a Kubernetes cluster.

  • In this guide, we will show you how to:

    • Create a nodegroup with NVIDIA GPUs in a VKS cluster.

    • Install the NVIDIA GPU Operator in a VKS cluster.

    • Deploy your GPU workload in a VKS cluster.

    • Configure GPU Sharing in a VKS cluster.

    • Monitor GPU resources in a VKS cluster.

    • Autoscale GPU resources in a VKS cluster.

Create a nodegroup with NVIDIA GPUs in a VKS cluster

  • To follow along, you need a VKS cluster with at least one NVIDIA GPU nodegroup.

  • kubectl command-line tool installed on your machine. For more information, see Install and Set Up kubectl.

  • helm command-line tool installed on your machine. For more information, see Installing Helm.

  • (Optional) Other tools and libraries that you can use to monitor and manage your Kubernetes resources:

    • kubectl-view-allocations plugin for monitoring cluster resources. For more information, see kubectl-view-allocations.

  • The following commands check the tool versions on the machine used throughout this guide:

    # Check kubectl CLI version
    kubectl version
    
    # Check Helm version
    helm version
    
    # Check kubectl-view-allocations version
    kubectl-view-allocations --version
  • This guide uses a VKS cluster with one NVIDIA GPU nodegroup. Execute the following command to check the nodegroups in your cluster:

    kubectl get nodes -owide

Installing the GPU Operator

  • This guide focuses only on installing the NVIDIA GPU Operator; for more information about it, see the NVIDIA GPU Operator Documentation. We install the operator manually in a VKS cluster by using Helm charts. Execute the following command to install the NVIDIA GPU Operator in your VKS cluster:

    helm install nvidia-gpu-operator --wait --version v24.3.0 \
      -n gpu-operator --create-namespace \
      oci://vcr.vngcloud.vn/81-vks-public/vks-helm-charts/gpu-operator \
      --set dcgmExporter.serviceMonitor.enabled=true
  • You MUST wait for the installation to complete (about 5-10 minutes). Execute the following command to check that all the pods in the gpu-operator namespace are running:

    kubectl -n gpu-operator get pods -owide
  • The operator labels GPU nodes with the nvidia.com/gpu label, which can be used to filter the nodes that have GPUs. The NVIDIA GPU Operator uses this label to identify GPU nodes and deploys the NVIDIA GPU device plugin only on nodes that carry it.

    kubectl get node -o json | jq '.items[].metadata.labels' | grep "nvidia.com"
    • In the output above, the single node in the cluster has the nvidia.com/gpu label, which means that the node has GPUs.

    • These labels also show that the node has one RTX 2080Ti card, the number of available GPUs, the GPU memory, and other information.

  • Inside the nvidia-device-plugin-daemonset pod in the gpu-operator namespace, you can execute the nvidia-smi command to check the GPU information of the node:

    POD_NAME=$(kubectl -n gpu-operator get pods -l app=nvidia-device-plugin-daemonset -o jsonpath='{.items[0].metadata.name}')
    kubectl -n gpu-operator exec -it $POD_NAME -- nvidia-smi

Deploy your GPU workload

Cuda VectorAdd Test

  • In this section, we will show you how to deploy a GPU workload in a VKS cluster. We will use the cuda-vectoradd-test workload as an example. The cuda-vectoradd-test workload is a simple CUDA program that adds two vectors together. The program is provided as a container image that you can deploy in your VKS cluster. See file cuda-vectoradd-test.yaml.

    # Apply the manifest
    kubectl apply -f \
    https://raw.githubusercontent.com/vngcloud/kubernetes-sample-apps/main/nvidia-gpu/manifest/cuda-vectoradd-test.yaml
    
    # Check the pods
    kubectl get pods
    
    # Check the logs of the pod
    kubectl logs cuda-vectoradd
    
    # [Optional] Clean the resources
    kubectl delete deploy cuda-vectoradd
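
  • For reference, a manifest of this kind is essentially a Deployment whose pod template requests one GPU. A minimal sketch, assuming NVIDIA's public cuda-sample image (the actual cuda-vectoradd-test.yaml in the repository may differ):

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: cuda-vectoradd
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: cuda-vectoradd
      template:
        metadata:
          labels:
            app: cuda-vectoradd
        spec:
          containers:
          - name: cuda-vectoradd
            # Assumed sample image; the manifest in the repository may pin a different tag
            image: nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda11.7.1-ubuntu20.04
            resources:
              limits:
                nvidia.com/gpu: 1    # request one GPU (or one shared GPU slice)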

TensorFlow Test

  • In this section, we apply a Deployment manifest for a TensorFlow GPU application. The purpose of this Deployment is to create and manage a single pod running a TensorFlow container that uses GPU resources to compute the sum of random values drawn from a normal distribution of size \( 100000 \times 100000 \). For more detail about the manifest, see file tensorflow-gpu.yaml.

    # Apply the manifest
    kubectl apply -f \
      https://raw.githubusercontent.com/vngcloud/kubernetes-sample-apps/main/nvidia-gpu/manifest/tensorflow-gpu.yaml
    
    # Check the pods
    kubectl get pods
    
    # Check processes are running using nvidia-smi
    kubectl -n gpu-operator exec -it <put-your-nvidia-driver-daemonset-pod-name> -- nvidia-smi
    
    # Check the logs of the TensorFlow pod
    kubectl logs <put-your-tensorflow-gpu-pod-name> --tail 20
    
    # [Optional] Clean the resources
    kubectl delete deploy tensorflow-gpu

Configure GPU Sharing

  • GPU sharing strategies allow multiple containers to efficiently use your attached GPUs and save running costs. The following table summarizes the differences between the GPU sharing modes supported by NVIDIA GPUs:

    | Sharing mode | Supported by VKS | Workload isolation level | Pros | Cons | Suitable for these workloads |
    | --- | --- | --- | --- | --- | --- |
    | Multi-instance GPU (MIG) | ❌ | Best | Processes are executed in parallel. Full isolation (dedicated memory and compute resources). | Supported by fewer GPU models (only Ampere or more recent architectures). Coarse-grained control over memory and compute resources. | Recommended for workloads running in parallel that need resiliency and QoS. For example, when running AI inference workloads, MIG allows multiple inference queries to run simultaneously for quick responses, without slowing each other down. |
    | GPU time-slicing | ✅ | None | Processes are executed concurrently. Supported by older GPU architectures (Pascal or newer). | No resource limits. No memory isolation. Lower performance due to context-switching overhead. | Recommended for bursty and interactive workloads that have idle periods; such workloads are not cost-effective with a fully dedicated GPU, and time-sharing gives them quick access to the GPU during active phases. Optimal for avoiding idling costly GPUs where full isolation and continuous GPU access are not necessary, for example when multiple users test or prototype workloads. Workloads that use time-sharing need to tolerate certain performance and latency compromises. |
    | Multi-process server (MPS) | ✅ | Medium | Processes are executed in parallel. Fine-grained control over memory and compute resource allocation. | No error isolation or memory protection. | Recommended for batch processing of small jobs, because MPS maximizes throughput and concurrent use of a GPU; batch jobs for small to medium workloads can efficiently process in parallel. Optimal for cooperative processes acting as a single application, for example MPI jobs with inter-MPI-rank parallelism, where each small CUDA process (typically an MPI rank) runs concurrently to fully saturate the GPU. Workloads that use CUDA MPS need to tolerate the memory protection and error containment limitations. |
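
  • Whichever sharing mode is enabled, workloads request the GPU (or a shared slice of it) through the standard nvidia.com/gpu resource name. A minimal sketch, with a hypothetical pod name and a placeholder CUDA-capable image:

    apiVersion: v1
    kind: Pod
    metadata:
      name: gpu-consumer               # hypothetical name, for illustration only
    spec:
      restartPolicy: OnFailure
      containers:
      - name: app
        image: nvcr.io/nvidia/cuda:12.2.0-base-ubuntu22.04   # any CUDA-capable image
        command: ["nvidia-smi"]        # print GPU info and exit
        resources:
          limits:
            nvidia.com/gpu: 1          # one physical GPU, or one time-slicing/MPS replica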

GPU time-slicing

  • VKS uses the built-in timesharing ability provided by the NVIDIA GPU and the software stack. Starting with the Pascal architecture, NVIDIA GPUs support instruction level preemption. When doing context switching between processes running on a GPU, instruction-level preemption ensures every process gets a fair timeslice. GPU time-sharing provides software-level isolation between the workloads in terms of address space isolation, performance isolation, and error isolation.

Configure GPU time-slicing

  • To enable GPU time-slicing, you need to configure a ConfigMap with the following settings:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: gpu-sharing-config
    data:
      any: |-
        version: v1
        flags:
          migStrategy: none            # Disable MIG; MUST be none if your GPU does not support MIG
        sharing:
          timeSlicing:
            resources:
            - name: nvidia.com/gpu     # Applies only to nodes whose node.status contains 'nvidia.com/gpu'
              replicas: 4              # Allow 4 pods to share the GPU; SHOULD be at most 48
  • The above manifest allows 4 pods to share each GPU. The replicas field specifies how many pods can share a GPU and SHOULD be at most 48. The nvidia.com/gpu resource name selects the GPU resource to share, and the migStrategy field is set to none to disable MIG.

  • This configuration will apply to all nodes in the cluster that have the nvidia.com/gpu label. To apply the configuration, execute the following command:

    kubectl -n gpu-operator create -f \
      https://raw.githubusercontent.com/vngcloud/kubernetes-sample-apps/main/nvidia-gpu/manifest/time-slicing-config-all.yaml
  • Then patch the ClusterPolicy to enable GPU time-slicing using the any configuration entry:

    # Patch the ClusterPolicy
    kubectl patch clusterpolicies.nvidia.com/cluster-policy \
      -n gpu-operator --type merge \
      -p '{"spec": {"devicePlugin": {"config": {"name": "gpu-sharing-config", "default": "any"}}}}'
    
    # Disable the DCGM exporter; time-slicing does not support it
    kubectl patch clusterpolicies.nvidia.com/cluster-policy \
      -n gpu-operator --type merge \
      -p '{"spec": {"dcgmExporter": {"enabled": false}}}'
    • Your new configuration will be applied to all nodes in the cluster that have the nvidia.com/gpu label.

    • The configuration is considered successful if the ClusterPolicy STATUS is ready.

    • Because sharing.timeSlicing.resources.replicas is set to 4, you can deploy up to 4 pods that share the GPU.

    • My cluster has only 1 GPU node, so I can deploy up to 4 pods that share the GPU.
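
  • As a quick sanity check, the node's allocatable resources should now advertise 4 GPU slices instead of 1:

    # The nvidia.com/gpu capacity/allocatable count should now be 4
    kubectl describe node <node-name> | grep "nvidia.com/gpu"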

Verify GPU time-slicing

  • Now that GPU time-slicing is configured, we will deploy 5 pods that request the GPU through a Deployment. Because only 4 pods can share the GPU, the 5th pod will remain in the Pending state. See file time-slicing-verification.yaml.

    # Apply the manifest
    kubectl apply -f \
      https://raw.githubusercontent.com/vngcloud/kubernetes-sample-apps/main/nvidia-gpu/manifest/time-slicing-verification.yaml
    
    # Check the pods
    kubectl get pods
    
    # Check the logs of the TensorFlow pod
    kubectl logs <put-your-time-slicing-verification-pod-name> --tail 10
    
    # Get the event of pending pod
    kubectl events | grep "FailedScheduling"
    
    # [Optional] Clean the resources
    kubectl delete deploy time-slicing-verification

Multi-process server (MPS)

  • VKS uses NVIDIA's Multi-Process Service (MPS). NVIDIA MPS is an alternative, binary-compatible implementation of the CUDA API designed to transparently enable co-operative multi-process CUDA workloads to run concurrently on a single GPU device. GPU with NVIDIA MPS provides software-level isolation in terms of resource limits (active thread percentage and pinned device memory).

Configure MPS

  • To enable GPU MPS, you need to update the previous ConfigMap with the following settings:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: gpu-sharing-config
    data:
      any-mps: |-
        version: v1
        flags:
          migStrategy: none            # MIG strategy is not used; this field SHOULD depend on your GPU model
        sharing:
          mps:                         # Enable MPS for the GPU
            resources:
            - name: nvidia.com/gpu     # Applies only to nodes whose node.status contains 'nvidia.com/gpu'
              replicas: 4              # Allow 4 pods to share the GPU
  • Now let's apply this new ConfigMap and then patch the ClusterPolicy in the same way as in the GPU time-slicing section.

    # Delete the old configmap
    kubectl -n gpu-operator delete cm gpu-sharing-config
    kubectl -n gpu-operator create -f \
      https://raw.githubusercontent.com/vngcloud/kubernetes-sample-apps/main/nvidia-gpu/manifest/mps-config-all.yaml
    
    # Patch the ClusterPolicy
    kubectl patch clusterpolicies.nvidia.com/cluster-policy \
      -n gpu-operator --type merge \
      -p '{"spec": {"devicePlugin": {"config": {"name": "gpu-sharing-config", "default": "any-mps"}}}}'
    
    # Disable the DCGM exporter; MPS does not support it
    kubectl patch clusterpolicies.nvidia.com/cluster-policy \
      -n gpu-operator --type merge \
      -p '{"spec": {"dcgmExporter": {"enabled": false}}}'
    
    # Check whether the MPS server is running
    kubectl -n gpu-operator get pods
    • Your new configuration will be applied to all nodes in the cluster that have the nvidia.com/gpu label.

    • The configuration is considered successful if the ClusterPolicy STATUS is ready.

    • Because sharing.mps.resources.replicas is set to 4, you can deploy up to 4 pods that share the GPU.

Verify MPS

  • Now that GPU MPS is configured, we will deploy 5 pods that request the GPU through a Deployment. Because only 4 pods can share the GPU, the 5th pod will remain in the Pending state. See file mps-verification.yaml.

    # Apply the manifest
    kubectl apply -f \
      https://raw.githubusercontent.com/vngcloud/kubernetes-sample-apps/main/nvidia-gpu/manifest/mps-verification.yaml
    
    # Check the pods
    kubectl get pods
    
    # Check the logs of the TensorFlow pod
    kubectl logs -l job-name=nbody-sample
    
    # [Optional] Clean the resources
    kubectl delete job nbody-sample

Applying Multiple Node-Specific Configurations

  • An alternative to applying one cluster-wide configuration is to specify multiple time-slicing configurations in the ConfigMap and to apply labels node-by-node to control which configuration is applied to which nodes.

  • In this guide, I add a new RTX 4090 node to the cluster.

  • This configuration is useful if your cluster has multiple nodes with different GPU models. For example:

    • NodeGroup 1 includes the instance of GPU RTX 2080Ti.

    • NodeGroup 2 includes the instance of GPU RTX 4090.

  • If you want to run multiple GPU sharing strategies in the same cluster, you can assign a different configuration to each node by using labels. For example:

    • NodeGroup 1 includes the instance of GPU RTX 2080Ti with 4 pods sharing the GPU using time-slicing.

    • NodeGroup 2 includes the instance of GPU RTX 4090 with 8 pods sharing the GPU using MPS.

Configure Multiple Node-Specific Configurations

  • To use this feature, you need to update the previous ConfigMap with the following settings:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: gpu-multi-sharing-config
    data:
      rtx-2080ti: |-                                # Must match the label value applied to the GPU node (see below)
        version: v1
        flags:
          migStrategy: none                         # MIG strategy is not used; this field SHOULD depend on your GPU model
        sharing:
          timeSlicing:
            resources:
            - name: nvidia.com/gpu
              replicas: 4                           # Allow the node using this GPU to be shared by 4 pods
      rtx-4090: |-                                  # Must match the label value applied to the GPU node (see below)
        version: v1
        flags:
          migStrategy: none                         # MIG strategy is not used; this field SHOULD depend on your GPU model
        sharing:
          mps:
            resources:
            - name: nvidia.com/gpu
              replicas: 8                           # Allow the node using this GPU to be shared by 8 pods
  • Apply the above configuration:

    kubectl -n gpu-operator create -f \
      https://raw.githubusercontent.com/vngcloud/kubernetes-sample-apps/main/nvidia-gpu/manifest/multiple-gpu-sharing.yaml
    
    # Patch the ClusterPolicy
    kubectl patch clusterpolicies.nvidia.com/cluster-policy \
      -n gpu-operator --type merge \
      -p '{"spec": {"devicePlugin": {"config": {"name": "gpu-multi-sharing-config"}}}}'
    
    # Disable DCGM exporter
    kubectl patch clusterpolicies.nvidia.com/cluster-policy \
      -n gpu-operator --type merge \
      -p '{"spec": {"dcgmExporter": {"enabled": false}}}'
    
    # Check the ClusterPolicy
    kubectl get clusterpolicy
  • Now, we need to label each GPU node with the configuration name that you specified in the ConfigMap:

    # Get the node names
    kubectl get nodes
    
    # Label each GPU node with the configuration name it should use
    kubectl label node <rtx-2080ti-node-name> nvidia.com/device-plugin.config=rtx-2080ti
    kubectl label node <rtx-4090-node-name> nvidia.com/device-plugin.config=rtx-4090
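
  • You can verify that the labels were applied; the -L flag adds a column showing each node's value for the label:

    kubectl get nodes -L nvidia.com/device-plugin.config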

Verify Multiple Node-Specific Configurations

  • In this example, we will train an MNIST model in TensorFlow using both the RTX 2080Ti and the RTX 4090. The RTX 2080Ti will be shared by 4 pods using time-slicing, and the RTX 4090 will be shared by 8 pods using MPS. See file tensorflow-mnist-sample.yaml.

    # Apply the manifest
    kubectl apply -f \
      https://github.com/vngcloud/kubernetes-sample-apps/raw/main/nvidia-gpu/manifest/tensorflow-mnist-sample.yaml
    
    # Check the pods
    kubectl get pods -owide
    
    # Check the logs of the TensorFlow pod
    kubectl logs <put-your-favourite-tensorflow-mnist-pod-name> --tail 20
    
    # [Optional] Clean the resources
    kubectl delete deploy tensorflow-mnist
    • The pods are running on the nodes with the RTX 2080Ti and RTX 4090 GPUs under different GPU sharing strategies.

Monitoring GPU Resources

  • Monitoring NVIDIA GPU resources in a Kubernetes cluster is essential for ensuring optimal performance, efficient resource utilization, and proactive issue resolution. This overview provides a comprehensive guide to setting up and leveraging Prometheus and the NVIDIA Data Center GPU Manager (DCGM) to monitor GPU resources in a Kubernetes environment.

  • First, we need to install the Prometheus Stack and the Prometheus Adapter to integrate with the Kubernetes API server. Execute the following command to install them in your VKS cluster:

    # Install Prometheus Stack using Helm
    helm install --wait prometheus-stack \
      --namespace prometheus --create-namespace \
      oci://vcr.vngcloud.vn/81-vks-public/vks-helm-charts/kube-prometheus-stack \
      --version 60.0.2 \
      --set prometheus.prometheusSpec.serviceMonitorSelectorNilUsesHelmValues=false
    
    # Install and configure Prometheus Adapter using Helm 
    prometheus_service=$(kubectl get svc -n prometheus -lapp=kube-prometheus-stack-prometheus -ojsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}')
    helm install --wait prometheus-adapter \
      --namespace prometheus --create-namespace \
      oci://vcr.vngcloud.vn/81-vks-public/vks-helm-charts/prometheus-adapter \
      --version 4.10.0 \
      --set prometheus.url=http://${prometheus_service}.prometheus.svc.cluster.local
  • After the installation is complete, execute the following command to check that the Prometheus resources are running:

    # Check the resources of Prometheus are running
    kubectl -n prometheus get all 
  • Now, we need to enable the DCGM exporter to monitor the GPU resources in the VKS cluster. Execute the following command to enable the DCGM exporter in your VKS cluster:

    # Enable the DCGM exporter
    kubectl patch clusterpolicies.nvidia.com/cluster-policy \
      -n gpu-operator --type merge \
      -p '{"spec": {"dcgmExporter": {"enabled": true}}}'
    
    # Confirm Prometheus can scrape the DCGM exporter metrics; sometimes you MUST wait a few minutes
    # (about 1-3 mins) for the DCGM exporter to be ready
    kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1 | jq -r . | grep DCGM
  • Let's forward Prometheus to your local machine and check the GPU metrics by visiting http://localhost:9090:

    # Forward Prometheus to your local machine
    kubectl -n prometheus \
      port-forward svc/prometheus-stack-kube-prom-prometheus 9090:9090
  • The following table lists some observable GPU metrics. For details about more metrics, see Field Identifiers.

    • Table 1: Usage

      | Metric Name | Metric Type | Unit | Description |
      | --- | --- | --- | --- |
      | DCGM_FI_DEV_GPU_UTIL | Gauge | Percentage | GPU usage. |
      | DCGM_FI_DEV_MEM_COPY_UTIL | Gauge | Percentage | Memory usage. |
      | DCGM_FI_DEV_ENC_UTIL | Gauge | Percentage | Encoder usage. |
      | DCGM_FI_DEV_DEC_UTIL | Gauge | Percentage | Decoder usage. |

    • Table 2: Memory

      | Metric Name | Metric Type | Unit | Description |
      | --- | --- | --- | --- |
      | DCGM_FI_DEV_FB_FREE | Gauge | MB | Number of remaining frame buffers. The frame buffer is called VRAM. |
      | DCGM_FI_DEV_FB_USED | Gauge | MB | Number of used frame buffers. The value is the same as the value of memory-usage in the nvidia-smi command. |

    • Table 3: Temperature and power

      | Metric Name | Metric Type | Unit | Description |
      | --- | --- | --- | --- |
      | DCGM_FI_DEV_GPU_TEMP | Gauge | °C | Current GPU temperature of the device. |
      | DCGM_FI_DEV_POWER_USAGE | Gauge | W | Power usage of the device. |
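
  • With the port-forward from the previous step still running, you can also query any of these metrics through the standard Prometheus HTTP API, for example:

    # Current GPU utilization per device, as reported by the DCGM exporter
    curl -s 'http://localhost:9090/api/v1/query?query=DCGM_FI_DEV_GPU_UTIL' | jq .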

Autoscaling GPU Resources

  • To enable this feature, you MUST:

    • Enable Autoscale for GPU Nodegroups that you want to scale on the VKS portal.

    • Install Keda using its Helm chart in your VKS cluster.

  • If you DO NOT install Keda in your cluster, the VKS autoscaler will detect the Pending pods and scale the GPU Nodegroup automatically. This happens when the number of replicas of the Deployment is greater than the number of available GPUs that you configured in the ConfigMap.

  • If you have already installed Keda in your cluster, you can use a ScaledObject to scale the GPU Nodegroup based on the metrics that you want, such as GPU usage, memory usage, or any other metric. For example:

    apiVersion: keda.sh/v1alpha1
    kind: ScaledObject
    metadata:
      name: scaled-object
    spec:
      scaleTargetRef:
        name: scaling-app   # The name of the Deployment; MUST be in the same namespace
      minReplicaCount: 1   # Optional. Default: 0
      maxReplicaCount: 3   # Optional. Default: 100
      triggers: # Fires if either of these triggers is true
        - type: prometheus
          metadata: # prometheus-stack-kube-prom-prometheus
            serverAddress: http://prometheus-stack-kube-prom-prometheus.prometheus.svc.cluster.local:9090
            metricName: engine_active
            query: sum(DCGM_FI_DEV_GPU_UTIL) / count(DCGM_FI_DEV_GPU_UTIL) / 100
            threshold: '0.5'  # Scale the GPU Nodegroup when the GPU usage is greater than 50%
        - type: prometheus
          metadata: # prometheus-stack-kube-prom-prometheus
            serverAddress: http://prometheus-stack-kube-prom-prometheus.prometheus.svc.cluster.local:9090
            metricName: engine_active
            query: sum(DCGM_FI_DEV_MEM_COPY_UTIL) / count(DCGM_FI_DEV_MEM_COPY_UTIL) / 100
            threshold: '0.5'  # Scale the GPU Nodegroup when the GPU memory usage is greater than 50%
  • The above manifest scales the GPU Nodegroup based on the GPU usage and memory usage. The query field specifies the query to fetch the metrics from Prometheus. The threshold field specifies the threshold value to scale the GPU Nodegroup. The minReplicaCount and maxReplicaCount fields specify the minimum and maximum number of replicas that the GPU Nodegroup can scale to.

  • Now let's install Keda in your cluster by executing the below command:

    helm install --wait kedacore \
      --namespace keda --create-namespace \
      oci://vcr.vngcloud.vn/81-vks-public/vks-helm-charts/keda \
      --version 2.14.2
    
    kubectl -n keda get all
  • Apply the scaling-app.yaml manifest to generate resources for testing the autoscaling feature. This manifest runs 1 pod of the CUDA VectorAdd Test, and the GPU Nodegroup will be scaled to 3 when the GPU usage is greater than 50%.

    kubectl apply -f \
      https://github.com/vngcloud/kubernetes-sample-apps/raw/main/nvidia-gpu/manifest/scaling-app.yaml
  • Apply the scale-gpu.yaml manifest to create the ScaledObject for the above application. This manifest will scale the GPU Nodegroup based on the GPU usage.

    kubectl apply -f \
      https://github.com/vngcloud/kubernetes-sample-apps/raw/main/nvidia-gpu/manifest/scale-gpu.yaml
    
    kubectl get deploy
    
    # Check the ScaledObject
    kubectl get scaledobject
    • When the ScaledObject Ready value is True, the GPU Nodegroup will be scaled based on the GPU usage.
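
    • Under the hood, Keda drives scaling through an HPA that it creates for the ScaledObject; you can watch both the HPA and the nodes react as load increases:

      # Keda creates an HPA (named keda-hpa-<scaledobject-name>) for the ScaledObject
      kubectl get hpa

      # Watch new nodes join as the GPU Nodegroup scales out
      kubectl get nodes -w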
