Large-scale optimization problems, graph search, machine learning, digital twins, and more

Former title: Ultra-fast and stable computation for optimization problems

Graph500 on a GPU cluster

We ran the Graph500 benchmark on the GPU compute cluster described below.
Using 4 nodes, 8 CPUs, and 16 GPUs, we obtained the following results.

Scale 23
median_TEPS: 1.31088e+09

Scale 26
median_TEPS: 2.85776e+09

Scale 28
median_TEPS: 3.04957e+09
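
To put the three scales in perspective, the following is a minimal sketch that converts each SCALE into its nominal graph size and divides the median TEPS by the number of GPUs. It relies on the Graph500 convention that a graph has 2^SCALE vertices and edgefactor × 2^SCALE edges, with edgefactor 16 taken from the result logs. The helper name `nominal_size` is hypothetical, and the per-GPU figure assumes the work is spread evenly over the 16 GPUs (one MPI process per GPU), which the logs suggest but do not state.

```python
# Nominal Graph500 problem size and per-GPU throughput for the three runs.
EDGEFACTOR = 16  # "edgefactor" field in the result logs
NUM_GPUS = 16    # 4 nodes x 4 GPUs; assumed to match num_mpi_processes: 16

# SCALE -> median_TEPS, copied from the summary above
MEDIAN_TEPS = {23: 1.31088e+09, 26: 2.85776e+09, 28: 3.04957e+09}


def nominal_size(scale, edgefactor=EDGEFACTOR):
    """Return (vertices, edges) for a Graph500 graph of the given SCALE."""
    vertices = 2 ** scale
    return vertices, edgefactor * vertices


for scale, teps in sorted(MEDIAN_TEPS.items()):
    vertices, edges = nominal_size(scale)
    per_gpu_mteps = teps / NUM_GPUS / 1e6  # even split is an assumption
    print(f"SCALE {scale}: {vertices:,} vertices, {edges:,} edges, "
          f"{teps / 1e9:.2f} GTEPS total, {per_gpu_mteps:.1f} MTEPS per GPU")
```

The nedge values in the logs (for example 134216177 at SCALE 23) are slightly below the nominal edge count of 134217728, which is what one would expect if only edges in the traversed component are counted.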



GPU compute cluster for optimization problems (SDP)
Intel Xeon + 4-GPU machines (4 units)
CPU: Xeon X5690 (3.46 GHz, 6 cores) × 2
Memory: 192 GB (16 GB × 12)
HDD: SATA 500 GB × 2 (system, system backup)
NIC: GbE × 1 and InfiniBand (FDR) × 1
GPGPU: Tesla C2075 (C2070) × 4
OS: CentOS 6.3 for x86_64

============= Result ==============
SCALE: 23
edgefactor: 16
NBFS: 64
graph_generation: 3.30924296379
num_mpi_processes: 16
construction_time: 5.1928229332
redistribution_time: 0.710233926773
min_time: 0.0684769
firstquartile_time: 0.0892055
median_time: 0.102386
thirdquartile_time: 0.114382
max_time: 0.20391
mean_time: 0.103512
stddev_time: 0.0215733
min_nedge: 134216177
firstquartile_nedge: 134216177
median_nedge: 134216177
thirdquartile_nedge: 134216177
max_nedge: 134216177
mean_nedge: 134216177
stddev_nedge: 0
min_TEPS: 6.58213e+08
firstquartile_TEPS: 1.1734e+09
median_TEPS: 1.31088e+09
thirdquartile_TEPS: 1.50457e+09
max_TEPS: 1.96002e+09
harmonic_mean_TEPS: 1.29662e+09
harmonic_stddev_TEPS: 3.40463e+07
min_validate: 0.972839
firstquartile_validate: 1.05163
median_validate: 1.07257
thirdquartile_validate: 1.10354
max_validate: 1.14581
mean_validate: 1.07595
stddev_validate: 0.0357237
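
The harmonic_mean_TEPS field in the block above can be cross-checked against mean_time. When every BFS run traverses the same number of edges (stddev_nedge is 0 here), the harmonic mean of the per-run TEPS values equals nedge divided by the mean time. The sketch below illustrates this identity with a few hypothetical run times (they are not measurements) and then applies it to the SCALE 23 values. The function name `teps_from_mean_time` is an illustrative assumption.

```python
from statistics import harmonic_mean, mean

# Values copied from the SCALE 23 result block.
NEDGE = 134216177
MEAN_TIME = 0.103512
REPORTED_HMEAN_TEPS = 1.29662e+09


def teps_from_mean_time(nedge, mean_time):
    """Harmonic-mean TEPS implied by a constant edge count and a mean time."""
    return nedge / mean_time


# Hypothetical per-run times, used only to illustrate the identity.
times = [0.07, 0.09, 0.10, 0.11, 0.20]
per_run_teps = [NEDGE / t for t in times]
identity_lhs = harmonic_mean(per_run_teps)  # harmonic mean of per-run TEPS
identity_rhs = NEDGE / mean(times)          # nedge / arithmetic mean of times

# Apply the same relation to the logged values.
estimate = teps_from_mean_time(NEDGE, MEAN_TIME)
rel_err = abs(estimate - REPORTED_HMEAN_TEPS) / REPORTED_HMEAN_TEPS
print(f"estimate = {estimate:.5e}, relative difference = {rel_err:.1e}")
```

The small remaining difference comes from the rounding of the printed log values.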

============= Result ==============
SCALE: 26
edgefactor: 16
NBFS: 64
graph_generation: 13.4707419872
num_mpi_processes: 16
construction_time: 43.9261419773
redistribution_time: 5.98754906654
min_time: 0.345117
firstquartile_time: 0.367482
median_time: 0.375724
thirdquartile_time: 0.386964
max_time: 0.487818
mean_time: 0.378976
stddev_time: 0.0186412
min_nedge: 1073731075
firstquartile_nedge: 1073731075
median_nedge: 1073731075
thirdquartile_nedge: 1073731075
max_nedge: 1073731075
mean_nedge: 1073731075
stddev_nedge: 0
min_TEPS: 2.20109e+09
firstquartile_TEPS: 2.77476e+09
median_TEPS: 2.85776e+09
thirdquartile_TEPS: 2.92186e+09
max_TEPS: 3.11121e+09
harmonic_mean_TEPS: 2.83324e+09
harmonic_stddev_TEPS: 1.7558e+07
min_validate: 8.5733
firstquartile_validate: 8.8123
median_validate: 8.88245
thirdquartile_validate: 8.98263
max_validate: 9.16695
mean_validate: 8.89243
stddev_validate: 0.122703

============= Result ==============
SCALE: 28
edgefactor: 16
NBFS: 64
graph_generation: 55.9925642014
num_mpi_processes: 16
construction_time: 185.245222807
redistribution_time: 23.8423101902
min_time: 1.3443
firstquartile_time: 1.38642
median_time: 1.40837
thirdquartile_time: 1.42572
max_time: 1.56603
mean_time: 1.40905
stddev_time: 0.0370526
min_nedge: 4294927670
firstquartile_nedge: 4294927670
median_nedge: 4294927670
thirdquartile_nedge: 4294927670
max_nedge: 4294927670
mean_nedge: 4294927670
stddev_nedge: 0
min_TEPS: 2.74256e+09
firstquartile_TEPS: 3.01247e+09
median_TEPS: 3.04957e+09
thirdquartile_TEPS: 3.09786e+09
max_TEPS: 3.19491e+09
harmonic_mean_TEPS: 3.04811e+09
harmonic_stddev_TEPS: 1.00984e+07
min_validate: 35.8008
firstquartile_validate: 36.5694
median_validate: 36.9938
thirdquartile_validate: 37.2026
max_validate: 37.8319
mean_validate: 36.8881
stddev_validate: 0.474247
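
As a way to work with these logs programmatically, here is a minimal sketch that parses result blocks of this "key: value" form and checks that median_TEPS is close to median_nedge / median_time. The function `parse_results` and the abbreviated `SAMPLE` text (a few fields copied from each block above) are illustrative assumptions, not part of the Graph500 tooling. The check is only approximate, since the median of the per-run ratios need not equal the ratio of the medians exactly and the printed values are rounded.

```python
def parse_results(text):
    """Split Graph500 output into one dict of float fields per result block."""
    blocks = []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("====="):
            blocks.append({})  # a banner line starts a new block
        elif ":" in line and blocks:
            key, value = line.split(":", 1)
            blocks[-1][key.strip()] = float(value)
    return blocks


# Abbreviated excerpt of the three result blocks above.
SAMPLE = """\
============= Result ==============
SCALE: 23
median_time: 0.102386
median_nedge: 134216177
median_TEPS: 1.31088e+09
============= Result ==============
SCALE: 26
median_time: 0.375724
median_nedge: 1073731075
median_TEPS: 2.85776e+09
============= Result ==============
SCALE: 28
median_time: 1.40837
median_nedge: 4294927670
median_TEPS: 3.04957e+09
"""

results = parse_results(SAMPLE)
rel_diffs = []
for block in results:
    estimate = block["median_nedge"] / block["median_time"]
    rel_diff = abs(estimate - block["median_TEPS"]) / block["median_TEPS"]
    rel_diffs.append(rel_diff)
    print(f"SCALE {int(block['SCALE'])}: nedge/time = {estimate:.5e}, "
          f"reported = {block['median_TEPS']:.5e}")
```

The same parser should also accept the full blocks, since every field in them is a single numeric value after the colon.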