With the rapid development of large language models and generative AI technologies, AI inference services are becoming the core business of cloud computing and data centers. This article takes an AI inference platform upgrade project of an Internet enterprise as an example to demonstrate the practical application of the LRSV9501-2E PCIe 5.0 Retimer expansion card in AI server storage expansion scenarios, providing reference for enterprises facing similar infrastructure challenges.
An Internet enterprise operates AI assistant and intelligent customer service platforms for consumer users. As the user base grew rapidly, the volume of AI inference requests surged, placing higher demands on the performance and scalability of the underlying infrastructure.
1. Storage Performance Bottleneck
AI inference services require rapid loading of large model files (a single model can reach tens of GB) and efficient access to vector databases during inference. The original servers used PCIe 4.0 NVMe SSDs. Although the performance was already excellent, under high-concurrency scenarios, storage access latency became a system bottleneck, affecting inference response speed.
2. Insufficient Storage Capacity
The platform needs to deploy multiple versions of AI models to support A/B testing and canary (gray) releases. Combined with vector databases and log data, the storage capacity required per server exceeds 10 TB. Standard 2U servers have a limited number of drive bays and cannot meet this capacity requirement.
3. Chassis Space Constraints
The enterprise uses standardized 2U rack-mounted servers as AI inference nodes, each equipped with 4 GPUs. The internal chassis space is already occupied by GPUs and power supplies, leaving only 1 PCIe expansion slot. Traditional storage expansion solutions are not applicable.
4. Signal Integrity Issues
The enterprise plans to place some storage devices externally and connect them via cables to break through chassis space limitations. However, PCIe 5.0 signals attenuate severely during high-speed transmission, so a signal enhancement solution is required to ensure connection stability.
Solution Design
Technical evaluation selected the LRSV9501-2E PCIe 5.0 x16 dual-port MCIO Retimer expansion card as the storage expansion solution. Combined with external NVMe SSD expansion backplanes and PCIe 5.0 NVMe SSDs, a high-performance, high-capacity storage architecture was constructed.
System architecture highlights:
LRSV9501-2E installed in the server's PCIe 5.0 x16 slot, configured in 4x4 lane bifurcation mode
Connected to external NVMe SSD expansion backplane via two MCIO 8i cables
Expansion backplane populated with 8 PCIe 5.0 NVMe SSDs (4 SSDs per MCIO cable)
Retimer chip ensures PCIe 5.0 signal integrity during long-distance transmission
Using 4x4 lane bifurcation mode, 16 PCIe 5.0 lanes are divided into four x4 links. Each x4 link connects to two NVMe SSDs (via backplane switching), fully utilizing PCIe bandwidth. The advantages of this configuration are:
High device density: A single expansion card supports connecting 8 NVMe SSDs, significantly improving storage density
Balanced performance: Each x4 PCIe 5.0 link provides approximately 16 GB/s per direction, shared by the two SSDs behind it, meeting high-performance requirements
Flexible expansion: The number of connected devices can be adjusted according to demand without hardware replacement
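The bandwidth figures behind this configuration can be checked with a short calculation. The sketch below is only a theoretical estimate: it assumes the standard PCIe 5.0 rate of 32 GT/s per lane with 128b/130b encoding and ignores packet and protocol overhead, so real throughput will be somewhat lower. The two-SSDs-per-link figure comes from the architecture described above.

```python
# Back-of-the-envelope PCIe 5.0 bandwidth for the bifurcated x16 slot.
GT_PER_S = 32.0        # PCIe 5.0 raw signaling rate per lane, in GT/s
ENCODING = 128 / 130   # 128b/130b line-encoding efficiency


def link_bandwidth_gbps(lanes: int) -> float:
    """Theoretical one-direction bandwidth in GB/s, before protocol overhead."""
    return GT_PER_S * ENCODING / 8 * lanes


x4 = link_bandwidth_gbps(4)
x16 = link_bandwidth_gbps(16)
ssds_per_link = 2  # from the article: two SSDs behind each x4 link

print(f"x4 link:  {x4:.2f} GB/s")
print(f"x16 slot: {x16:.2f} GB/s")
print(f"per SSD when both drives on a link are busy: {x4 / ssds_per_link:.2f} GB/s")
```

This puts an x4 link at roughly 15.75 GB/s and the whole x16 slot at roughly 63 GB/s, which is the ceiling for all eight drives combined.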
Complete hardware deployment according to the following steps:
Step 1: Power off the server, disconnect power cables, and take anti-static precautions
Step 2: Open the chassis and locate the available PCIe 5.0 x16 expansion slot
Step 3: Install the LRSV9501-2E expansion card, selecting 2U or 3U brackets based on chassis height
Step 4: Install the external NVMe SSD expansion backplane in the rack
Step 5: Connect the expansion card to the external backplane using MCIO 8i cables
Step 6: Install 8 PCIe 5.0 NVMe SSDs in the backplane
Step 7: Close the chassis, connect power, and power on for self-test
Enter the server BIOS setup interface and configure the PCIe slot lane bifurcation mode to 4x4. After saving the configuration and rebooting, the system recognizes 8 independent NVMe SSDs.
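On Linux, drive recognition and the negotiated link can be confirmed from sysfs after the reboot. The following is a minimal sketch, assuming the standard `/sys/class/nvme` layout and the PCI `current_link_speed` and `current_link_width` attributes; the function name is illustrative, and the count it prints includes any NVMe boot drive the server already had.

```python
from pathlib import Path
from typing import Dict, List


def list_nvme_links(sysfs_root: str = "/sys/class/nvme") -> List[Dict[str, str]]:
    """Return the PCIe link speed and width of each NVMe controller in sysfs."""
    results: List[Dict[str, str]] = []
    root = Path(sysfs_root)
    if not root.is_dir():
        return results
    for ctrl in sorted(root.iterdir()):
        pci_dev = ctrl / "device"  # symlink to the underlying PCI device
        entry = {"controller": ctrl.name}
        for attr in ("current_link_speed", "current_link_width"):
            attr_file = pci_dev / attr
            entry[attr] = attr_file.read_text().strip() if attr_file.is_file() else "unknown"
        results.append(entry)
    return results


if __name__ == "__main__":
    links = list_nvme_links()
    for link in links:
        print(link["controller"], link["current_link_speed"], "x" + link["current_link_width"])
    print(f"{len(links)} NVMe controller(s) found")
```

A drive that trained correctly at PCIe 5.0 should report a link speed of 32.0 GT/s. A lower value may indicate that the link fell back to an earlier generation and that cabling or BIOS settings are worth rechecking.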
Use the fio tool to test storage system performance. Results are as follows:
Test item | Result | Improvement |
Single-drive sequential read | 12.8 GB/s | approximately 2x improvement |
Single-drive sequential write | 10.2 GB/s | approximately 2x improvement |
Multi-drive aggregate sequential throughput | 48 GB/s | approximately 4x improvement |
4K random read IOPS | 2,400K | approximately 3x improvement |
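The article does not list the fio parameters that were used, so the sketch below shows only one plausible way to run comparable tests. The block sizes, queue depths, 60-second runtime, helper names, and the device path in the usage note are all assumptions. Write tests against a raw device destroy its data, so the example should only be pointed at empty drives.

```python
import json
import subprocess


def fio_command(device: str, rw: str, block_size: str, iodepth: int, numjobs: int = 1) -> list:
    """Build an fio command line that reports its results as JSON."""
    return [
        "fio", "--name=bench", f"--filename={device}",
        f"--rw={rw}", f"--bs={block_size}", f"--iodepth={iodepth}",
        f"--numjobs={numjobs}", "--ioengine=libaio", "--direct=1",
        "--runtime=60", "--time_based", "--group_reporting",
        "--output-format=json",
    ]


def summarize(fio_json: dict, direction: str = "read") -> dict:
    """Extract bandwidth and IOPS for one direction from fio's JSON output."""
    stats = fio_json["jobs"][0][direction]  # one aggregated entry with --group_reporting
    return {"GB/s": stats["bw_bytes"] / 1e9, "kIOPS": stats["iops"] / 1e3}


def run(device: str, rw: str, block_size: str, iodepth: int, numjobs: int = 1) -> dict:
    """Run fio on the given device and return the summarized result."""
    cmd = fio_command(device, rw, block_size, iodepth, numjobs)
    out = subprocess.run(cmd, check=True, capture_output=True, text=True).stdout
    return summarize(json.loads(out), "read" if "read" in rw else "write")
```

For example, `run("/dev/nvme0n1", "read", "1M", 32)` would measure single-drive sequential read and `run("/dev/nvme0n1", "randread", "4k", 64, numjobs=8)` would measure 4K random read, where `/dev/nvme0n1` is a placeholder device name.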
Application Effects and Benefits
After the storage upgrade, large AI model files load significantly faster. For a 70B-parameter large language model, for example, the shorter loading time greatly reduced model switching and service restart times, improving platform operational efficiency.
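Since no loading times are reported, a rough lower bound can be estimated from the fio results above. The sketch assumes the weights are read sequentially at the measured rate with no other overhead. The precisions shown (2 bytes per parameter for FP16, 0.5 bytes for 4-bit quantization) are illustrative assumptions, not details from the project.

```python
def weights_size_gb(params_billion: float, bytes_per_param: float) -> float:
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes)."""
    return params_billion * bytes_per_param


def min_load_seconds(size_gb: float, read_gbps: float) -> float:
    """Lower bound on load time if reads are purely bandwidth-limited."""
    return size_gb / read_gbps


# Assumed precisions for a 70B-parameter model (not stated in the article).
for precision, bytes_per_param in (("FP16", 2.0), ("4-bit", 0.5)):
    size = weights_size_gb(70, bytes_per_param)
    # 12.8 GB/s and 48 GB/s are the sequential figures from the fio results.
    for label, bandwidth in (("12.8 GB/s", 12.8), ("48 GB/s", 48.0)):
        seconds = min_load_seconds(size, bandwidth)
        print(f"{precision}: {size:.0f} GB at {label} -> at least {seconds:.1f} s")
```

Real loading times will be longer, because deserialization, host-to-GPU copies, and file system overhead are not modeled here.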
Vector database query speed directly affects AI inference response time. The upgraded storage system reduced vector retrieval latency from an average of 15ms to 5ms, shortening end-to-end inference response time by approximately 30%, significantly improving user experience.
Storage capacity per server increased significantly, meeting the needs of multi-version model deployment and large-scale data storage. The external expansion backplane design also leaves room for further expansion.
The LRSV9501-2E's Retimer function ensures PCIe 5.0 signal integrity when transmitted through MCIO cables. Post-implementation signal quality monitoring showed reduced bit error rates, meeting enterprise-grade reliability standards.
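The article does not say how signal quality was monitored. One common proxy on Linux is the PCIe Advanced Error Reporting (AER) correctable-error counters exposed in sysfs for each device. The sketch below assumes a kernel and device that provide the `aer_dev_correctable` attribute; the helper names and the device address in the usage note are hypothetical.

```python
from pathlib import Path


def parse_aer_counters(text: str) -> dict:
    """Parse 'NAME COUNT' lines from an AER sysfs attribute into a dict."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def read_correctable_errors(bdf: str, sysfs_root: str = "/sys/bus/pci/devices") -> dict:
    """Read correctable AER counters for one PCI device, or {} if unavailable."""
    path = Path(sysfs_root) / bdf / "aer_dev_correctable"
    return parse_aer_counters(path.read_text()) if path.is_file() else {}


def error_delta(before: dict, after: dict) -> dict:
    """Counters that increased between two samples of the same device."""
    return {k: after[k] - before.get(k, 0) for k in after if after[k] > before.get(k, 0)}
```

Sampling `read_correctable_errors("0000:41:00.0")` before and after a long fio run and passing both samples to `error_delta` shows whether receiver or bad-packet errors grew under load. Because the Retimer is protocol-transparent, the counters are read on the SSDs and root ports rather than on the card itself.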
Summary and Experience Sharing
This project verified the practical value of the LRSV9501-2E in AI server storage expansion scenarios. The key lessons are summarized below:
1. Fully Utilize PCIe 5.0 Bandwidth
The bandwidth improvement of PCIe 5.0 brings new possibilities for storage expansion. Through reasonable lane bifurcation configuration, a single expansion card can connect multiple high-performance SSDs, achieving linear storage performance scaling.
2. Retimer Solves Signal Integrity Issues
The biggest challenge of external storage expansion is signal integrity. The Broadcom BCM85657 Retimer chip built into the LRSV9501-2E effectively solves PCIe 5.0 signal attenuation issues, making external connections possible.
3. Convenience of MCIO Interface
The MCIO cable connection solution breaks through chassis space limitations, making storage expansion no longer limited by internal server space.
4. Plug-and-Play Deployment Experience
As a protocol-transparent device, the LRSV9501-2E requires no dedicated drivers and achieves plug-and-play on both CentOS and Ubuntu systems, significantly shortening deployment cycles.
Based on the implementation experience of this project, the LRSV9501-2E can also be applied to the following similar scenarios:
Large Model Training Platforms: Provide high-speed data loading capabilities for GPU training nodes, shortening data preprocessing time
Real-time Recommendation Systems: Support high-concurrency feature vector retrieval, improving recommendation service response speed
Video Processing Services: Provide high-throughput storage access capabilities for video transcoding and analysis
Scientific Computing Clusters: Support high-speed read/write of large-scale datasets, accelerating simulation and modeling tasks
CXL Memory Expansion: Connect CXL memory expansion modules to provide large-capacity memory pools for memory-intensive applications
The LRSV9501-2E PCIe 5.0 Retimer expansion card provides a high-performance, highly reliable storage expansion solution for the Internet enterprise's AI inference platform. Through the high-speed bandwidth of PCIe 5.0 and the signal enhancement capabilities of the Retimer, the enterprise achieved several-fold storage performance improvements while breaking through chassis space limitations. For enterprises building or upgrading AI infrastructure, the LRSV9501-2E provides a high-speed signal expansion solution that balances performance, scalability, and reliability. As PCIe 5.0 and CXL technologies continue to evolve rapidly, choosing an expansion solution with signal regeneration capabilities leaves ample room for future technology upgrades.