Measurement of IP forwarding performance on complex computer architectures
2011 (English). In: Swedish National Computer Networking Workshop, SNCNW 2011, 2011. Conference paper (Refereed)
Open-source routers on new PC hardware allow forwarding speeds of 10 Gb/s and above. We present detailed performance measurements using Linux on two complex PC hardware platforms. Both platforms use PCIe gen2 and dual I/O bridges, and both support non-uniform memory access (NUMA). The AMD platform uses four processors equipped with eight cores and four nodes of local memory. The Intel platform has two quad-core CPUs, each with local memory.
Forwarding a packet through a PC-based router can be separated into three steps: receive-dma, lookup, and transmit-dma. Each step was studied individually. In particular, we studied how varying the CPU core and memory node affects the forwarding speed.
Our results show that performance depends strongly on the selection of CPU cores and memory nodes. In particular, DMA works best with the memory nodes closest to the I/O bridge to which the interface card is connected. Correspondingly, CPU access is most efficient on local memory. Consequently, a poor choice of CPU core and memory node leads to a significant performance decrease.
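As an illustration of the placement question raised above, the following is a minimal sketch, not taken from the paper, of how one might look up which NUMA node a network interface is attached to and which CPU cores are local to that node. It assumes a Linux system that exposes this information through sysfs (`/sys/class/net/<iface>/device/numa_node` and `/sys/devices/system/node/node<N>/cpulist`). The function names and the `sysfs_root` parameter are hypothetical and exist only for this example.

```python
from pathlib import Path


def parse_cpulist(text):
    """Expand a Linux cpulist string such as '0-3,8-11' into a list of ints."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue  # an empty file means the node has no CPUs
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return cpus


def nic_numa_node(iface, sysfs_root="/sys"):
    """Return the NUMA node the interface's device reports (-1 if unknown)."""
    path = Path(sysfs_root) / "class" / "net" / iface / "device" / "numa_node"
    return int(path.read_text().strip())


def local_cpus_for_nic(iface, sysfs_root="/sys"):
    """Return the CPU cores on the same NUMA node as the interface.

    Returns an empty list when the kernel reports no NUMA affinity (-1).
    """
    node = nic_numa_node(iface, sysfs_root)
    if node < 0:
        return []
    cpulist = Path(sysfs_root) / "devices" / "system" / "node" / f"node{node}" / "cpulist"
    return parse_cpulist(cpulist.read_text())
```

On a real machine, `local_cpus_for_nic("eth0")` would list the candidate cores for handling that interface's traffic; the interface name is only an example. How the result is then applied, for instance through interrupt affinity settings, is outside the scope of this sketch.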
Identifiers
URN: urn:nbn:se:kth:diva-66396
OAI: oai:DiVA.org:kth-66396
DiVA: diva2:483934
SNCNW 2011, Linköping, Sweden
QC 20120130
2012-01-26, 2012-01-26, 2012-01-30. Bibliographically approved