nfx-cpu 0.1.3
Cross-platform C++ CPU feature detection and system identification library
nfx-cpu TODO

Project roadmap and task tracking for the nfx-cpu library.

Todo

  • Extended SIMD detection:
    • SSE3, SSSE3, SSE4.1 detection
      • Add hasSSE3Support(), hasSSSE3Support(), hasSSE41Support()
      • Add unit tests in TESTS_FeatureDetection.cpp
      • Add benchmarks in BM_FeatureDetection.cpp
    • FMA (Fused Multiply-Add) detection
      • Add hasFmaSupport()
      • Add verifyFmaSupport() compile-time helper
      • Add unit tests and benchmarks
    • AVX-512 detection (F, CD, ER, PF, BW, DQ, VL, IFMA, VBMI variants)
      • Add granular AVX-512 variant detection functions
      • Add unit tests for each variant
      • Add performance benchmarks
    • ARM NEON detection (cross-platform)
      • Requires ARM CI/CD infrastructure
      • Platform-specific CPUID equivalent
      • Full test coverage on ARM hardware
  • Cache topology:
    • CacheInfo struct with L1/L2/L3 sizes, line size
    • cacheInfo() - Complete cache hierarchy
    • Add TESTS_CacheTopology.cpp test suite
    • Add BM_CacheTopology.cpp benchmark suite
    • Sample demonstrating cache-aware optimization
  • TSC and frequency:
    • hasTSC(), hasInvariantTSC()
    • readTSC() - Timestamp counter access
    • baseFrequency(), maxFrequency() (MHz)
    • hasTurboBoost() detection
    • Add unit tests for frequency detection
    • Add TSC benchmark for overhead measurement
  • Additional x86 instructions:
    • hasBMI() - Bit manipulation instructions (BMI1/BMI2)
    • hasPOPCNT() - Population count
    • hasFMA() - Already covered by hasFmaSupport() under Extended SIMD detection
    • Add unit tests for each instruction set
    • Add benchmarks comparing with/without hardware support
  • ARM extensions:
    • hasSVE() - Scalable Vector Extension
    • hasCRC32() - ARM CRC32 extension (distinct from the x86 CRC32 instruction introduced with SSE4.2)
  • Security features:
    • hasAESNI() - Hardware AES encryption
    • hasSHA() - SHA-1/SHA-256 acceleration
    • hasRDRAND(), hasRDSEED() - Hardware RNG
    • hasSGX() - Software Guard Extensions
    • hasSMEP() - Supervisor Mode Execution Prevention
    • Add security feature tests
    • Add benchmarks for crypto hardware acceleration
    • Sample demonstrating AES-NI usage detection
  • NUMA topology:
    • numaNodeCount() - NUMA nodes
    • NUMA-aware memory allocation helpers
    • Add NUMA detection tests
    • Add NUMA-aware allocation benchmarks
    • Platform-specific implementation (Windows vs Linux)
  • Power management:
    • hasSpeedStep(), hasThermalMonitor()
    • thermalThreshold() - Thermal limits
  • Low-level intrinsics:
    • prefetch(addr) - Cache prefetch hint
    • pause() - Spinlock optimization
    • serialize() - Memory fence
    • Add unit tests for intrinsic wrappers
    • Add benchmarks measuring prefetch impact
    • Cross-platform intrinsic abstraction
  • Add CpuInfo struct:
    • const CpuInfo& cpuInfo() - Singleton with full CPU capabilities
    • Structured organization: identification, topology, cache, features, frequency
    • Single detection pass, cache everything at once
    • Keep free function API (hasAvx2Support(), etc.) for backward compatibility
    • Hybrid approach: free functions OR struct depending on use case
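
The CpuInfo item above could be sketched roughly as follows. This is a minimal illustration, not the library's design: the `sketch` namespace, the nested struct layout, the field names, and `detect()` are all hypothetical, and only a handful of CPUID leaf 1 feature bits are read. On compilers or architectures without `<cpuid.h>` the sketch leaves every field at its default.

```cpp
#include <cstring>
#include <string>

// <cpuid.h> is a GCC/Clang header for x86; other targets fall back to defaults.
#if (defined(__GNUC__) || defined(__clang__)) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>
#define SKETCH_HAS_CPUID 1
#else
#define SKETCH_HAS_CPUID 0
#endif

namespace sketch
{
    // Hypothetical layout: group results by concern, as the roadmap suggests.
    struct CpuInfo
    {
        struct Identification
        {
            std::string vendor; // e.g. the 12-character CPUID vendor string
        } identification;

        struct Features
        {
            bool sse3 = false;
            bool ssse3 = false;
            bool sse41 = false;
            bool popcnt = false;
        } features;
    };

    // Single detection pass: query CPUID once and fill every field.
    inline CpuInfo detect()
    {
        CpuInfo info;
#if SKETCH_HAS_CPUID
        unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
        if (__get_cpuid(0, &eax, &ebx, &ecx, &edx))
        {
            // Leaf 0 returns the vendor string in EBX, EDX, ECX order.
            char vendor[13] = {};
            std::memcpy(vendor + 0, &ebx, 4);
            std::memcpy(vendor + 4, &edx, 4);
            std::memcpy(vendor + 8, &ecx, 4);
            info.identification.vendor = vendor;
        }
        if (__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        {
            // Leaf 1 ECX feature bits: SSE3 = 0, SSSE3 = 9, SSE4.1 = 19, POPCNT = 23.
            info.features.sse3 = (ecx & (1u << 0)) != 0;
            info.features.ssse3 = (ecx & (1u << 9)) != 0;
            info.features.sse41 = (ecx & (1u << 19)) != 0;
            info.features.popcnt = (ecx & (1u << 23)) != 0;
        }
#endif
        return info;
    }

    // Function-local static: initialized once, thread-safe since C++11.
    inline const CpuInfo& cpuInfo()
    {
        static const CpuInfo instance = detect();
        return instance;
    }

    // Free function kept alongside the struct (the "hybrid approach").
    inline bool hasSSE3Support()
    {
        return cpuInfo().features.sse3;
    }
}
```

Callers that need one flag can keep using the free function, while code that inspects several capabilities can hold the `const CpuInfo&` and read fields directly. Both paths share the same cached detection result.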

In Progress

  • NIL

Done ✓

v0.1.3 (2025-11-27)

  • CMake Packaging - Bug Fixes and Cleanup
    • Remove incorrect runtime dependencies from DEB/RPM packages (header-only library)
    • Remove duplicate WiX tool detection
    • Simplify RPM architecture detection to use uname directly

v0.1.2 (2025-11-25)

  • CPU Feature Detection - Performance Optimization
    • Skip runtime CPUID checks when features are compile-time guaranteed
    • Return immediately when __AVX2__, __AVX__, or __SSE4_2__ macros are defined
    • Eliminate unnecessary static lambda initialization in optimized builds
    • Zero overhead detection when compiled with feature flags (-mavx2, etc.)
    • Maintain backward compatibility with runtime detection behavior
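
The v0.1.2 shortcut can be illustrated with a small sketch. It assumes a hypothetical `hasSse42Support()` in a `sketch` namespace and is not the library's actual code: when the compiler already defines `__SSE4_2__`, the function returns `true` without touching CPUID; otherwise it falls back to a cached runtime query. SSE4.2 is used here because, unlike AVX, it needs no additional OS state check.

```cpp
#if (defined(__GNUC__) || defined(__clang__)) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>
#define SKETCH_HAS_CPUID 1
#else
#define SKETCH_HAS_CPUID 0
#endif

namespace sketch
{
    // True when the build itself targets SSE4.2 (e.g. -msse4.2 or -mavx2).
#if defined(__SSE4_2__)
    constexpr bool kCompiledWithSse42 = true;
#else
    constexpr bool kCompiledWithSse42 = false;
#endif

    // Runtime path: CPUID leaf 1, ECX bit 20 reports SSE4.2.
    inline bool detectSse42AtRuntime()
    {
#if SKETCH_HAS_CPUID
        unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
        if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        {
            return false;
        }
        return (ecx & (1u << 20)) != 0;
#else
        return false; // Sketch only: no detection on other compilers/targets.
#endif
    }

    inline bool hasSse42Support()
    {
#if defined(__SSE4_2__)
        // Compile-time guaranteed: no CPUID call, no static initialization.
        return true;
#else
        // Runtime detection, evaluated once and cached.
        static const bool cached = detectSse42AtRuntime();
        return cached;
#endif
    }
}
```

With the feature flag enabled the function body reduces to a constant, so an optimizing compiler can fold calls away entirely. Without the flag, behavior matches plain runtime detection.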

v0.1.1 (2025-11-12)

  • CPU Feature Detection - OS Support Verification
    • Add OSXSAVE flag check for AVX/AVX2 detection
    • Verify XCR0 register state (XMM/YMM bits) via xgetbv instruction
    • Use inline assembly for GCC/Clang to avoid -mxsave flag dependency
    • Fix MSVC AVX2 detection to use __cpuidex() properly
    • Refactor magic numbers into named constants in internal namespace
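
The v0.1.1 checks might look roughly like the sketch below for GCC/Clang on x86. All names (`sketch`, the `k...` constants, `osSavesAvxState()`, `readXcr0()`) are hypothetical and do not come from the library. The MSVC path is omitted, so the function simply reports `false` on other toolchains. The order matters: `xgetbv` is only executed after CPUID confirms OSXSAVE, since the instruction faults otherwise.

```cpp
#include <cstdint>

#if (defined(__GNUC__) || defined(__clang__)) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>
#define SKETCH_X86_GNU 1
#else
#define SKETCH_X86_GNU 0
#endif

namespace sketch
{
    // Named constants instead of magic numbers.
    constexpr std::uint32_t kCpuidOsxsaveBit = 1u << 27; // CPUID.1:ECX
    constexpr std::uint32_t kCpuidAvxBit = 1u << 28;     // CPUID.1:ECX
    constexpr std::uint32_t kCpuidAvx2Bit = 1u << 5;     // CPUID.(EAX=7,ECX=0):EBX
    constexpr std::uint64_t kXcr0XmmBit = 1u << 1;       // SSE state enabled by OS
    constexpr std::uint64_t kXcr0YmmBit = 1u << 2;       // AVX state enabled by OS

    // The OS must save both XMM and YMM state for AVX code to be safe.
    constexpr bool osSavesAvxState(std::uint64_t xcr0)
    {
        return (xcr0 & (kXcr0XmmBit | kXcr0YmmBit)) == (kXcr0XmmBit | kXcr0YmmBit);
    }

#if SKETCH_X86_GNU
    // Inline assembly avoids needing -mxsave for the _xgetbv intrinsic.
    // Only call this after CPUID has confirmed OSXSAVE.
    inline std::uint64_t readXcr0()
    {
        std::uint32_t lo = 0, hi = 0;
        __asm__ volatile("xgetbv" : "=a"(lo), "=d"(hi) : "c"(0));
        return (static_cast<std::uint64_t>(hi) << 32) | lo;
    }
#endif

    inline bool hasAvx2Support()
    {
#if SKETCH_X86_GNU
        unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
        if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        {
            return false;
        }
        // The CPU must support AVX and the OS must have enabled XSAVE.
        if ((ecx & kCpuidOsxsaveBit) == 0 || (ecx & kCpuidAvxBit) == 0)
        {
            return false;
        }
        if (!osSavesAvxState(readXcr0()))
        {
            return false;
        }
        // AVX2 lives in leaf 7, sub-leaf 0, which needs the sub-leaf-aware query.
        if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
        {
            return false;
        }
        return (ebx & kCpuidAvx2Bit) != 0;
#else
        return false; // Sketch only: MSVC and non-x86 paths are not shown.
#endif
    }
}
```

Keeping the XCR0 test in a separate `constexpr` helper makes the bit logic checkable without executing `xgetbv`, which is convenient for unit tests on machines that lack AVX.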