Gcore, an edge AI, cloud, network, and security solutions provider, has launched a new service, “Inference at the Edge,” which enhances AI application deployment by bringing machine learning (ML) capabilities closer to end users, ensuring ultra-low latency and seamless real-time performance.
Gcore’s “Inference at the Edge” is designed to address the increasing demand for fast, secure, and cost-effective AI model deployment. By leveraging Gcore’s global network, businesses can now deploy pre-trained machine learning models closer to their end users, reducing latency and enhancing performance. “Gcore Inference at the Edge empowers customers to focus on getting their machine learning models trained, rather than worrying about the costs, skills, and infrastructure required to deploy AI applications globally,” said Andre Reitenbach, CEO of Gcore.
This solution is beneficial for industries such as automotive, manufacturing, retail, and technology, where applications like generative AI, object recognition, real-time behavioral analysis, virtual assistants, and production monitoring require immediate and reliable responses.
Ultra-fast AI performance
“Inference at the Edge” delivers real-time inference from high-performance nodes strategically placed at the edge of Gcore’s network. Smart routing technology directs each request to the nearest available node, typically achieving response times of less than 30 milliseconds. “At Gcore, we believe the edge is where the best performance and end-user experiences are achieved, and that is why we are continuously innovating to ensure every customer receives unparalleled scale and performance,” said Reitenbach.
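The routing idea described above can be sketched in a few lines: measure round-trip latency to candidate edge nodes and send the request to the closest one. This is a minimal illustration, not Gcore’s actual routing logic; the node names and latency figures are hypothetical.

```python
# Hypothetical sketch of latency-based edge routing. Node names and
# latency values are illustrative, not Gcore's actual topology.

def pick_edge_node(latencies_ms: dict) -> str:
    """Return the edge node with the lowest measured round-trip latency."""
    return min(latencies_ms, key=latencies_ms.get)

# Example: measured latencies (ms) from one user to nearby inference nodes.
measured = {"frankfurt": 12.4, "amsterdam": 9.1, "paris": 18.7}
print(pick_edge_node(measured))  # amsterdam — well under the 30 ms target
```

In practice, such measurements would be taken continuously (or derived via anycast), but the selection rule stays the same: route to the node that minimizes user-perceived latency.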
“Inference at the Edge delivers all the power with none of the headache, providing a modern, effective, and efficient AI inference experience,” said Reitenbach. The service runs on NVIDIA L40S GPUs, chips purpose-built for AI workloads, making it easy and efficient to handle complex machine learning tasks.
The platform supports a variety of foundational and custom machine learning models, including popular open-source foundation models such as LLaMA Pro 8B, Mistral 7B, and Stable Diffusion XL. This flexibility lets businesses choose and train models that fit their needs.
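To make the model-selection idea concrete, the sketch below builds a JSON request body for a deployed text-generation model. The payload fields and model identifier are illustrative assumptions for this article, not Gcore’s documented inference API.

```python
import json

# Hypothetical sketch: the payload shape and model name below are
# illustrative assumptions, not Gcore's actual inference API.

def build_inference_request(model: str, prompt: str, max_tokens: int = 256) -> bytes:
    """Serialize a JSON request body for a deployed text-generation model."""
    payload = {"model": model, "prompt": prompt, "max_tokens": max_tokens}
    return json.dumps(payload).encode("utf-8")

body = build_inference_request("mistral-7b", "Summarize edge inference in one line.")
print(body.decode("utf-8"))
```

The same request shape would apply whichever supported model a business deploys; only the `model` field changes.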
Efficiency and security
Gcore will offer “Inference at the Edge” through a flexible pricing structure, ensuring that customers only pay for the resources they use. This cost-effective approach makes it easier for businesses of all sizes to leverage advanced AI capabilities without significant upfront investments. Security is a top priority for Gcore, as the solution includes built-in protection against DDoS attacks, ensuring that machine learning endpoints remain secure.
Furthermore, it complies with major standards such as GDPR, PCI DSS, and ISO/IEC 27001, providing peace of mind regarding data privacy and security. To accommodate varying workloads, Gcore will feature auto-scaling capabilities, ensuring that models can handle peak demand and unexpected surges without performance degradation.
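A simple way to picture the auto-scaling behavior is a rule that sizes the replica count to the observed request rate, bounded by a minimum and maximum. This is a minimal sketch under assumed capacity figures, not Gcore’s actual scaling policy.

```python
import math

# Minimal sketch of request-rate-based autoscaling. The per-replica
# capacity and bounds are illustrative assumptions, not Gcore's policy.

def desired_replicas(current_rps: float, rps_per_replica: float,
                     min_replicas: int = 1, max_replicas: int = 20) -> int:
    """Scale the replica count to the observed request rate, within bounds."""
    needed = math.ceil(current_rps / rps_per_replica)
    return max(min_replicas, min(max_replicas, needed))

print(desired_replicas(450, 50))  # 9 replicas to absorb a 450 req/s surge
```

Clamping to a minimum keeps the model warm during quiet periods, while the maximum caps spend during unexpected surges.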
Finally, Gcore offers unlimited, scalable, S3-compatible object storage, allowing businesses to grow their storage alongside their evolving model requirements.