
PyTorch build: /pytorch/torch/lib/THD/base/data_channels/DataChannelNccl.cpp:31:17: error: ‘ncclInt8’ was not declared in this scope

I posted the same answer under my username (mtxing69) at https://github.com/pytorch/pytorch/issues/13962

/pytorch/torch/lib/THD/base/data_channels/DataChannelNccl.cpp:31:17: error: ‘ncclInt8’ was not declared in this scope

Failed to run 'bash ../tools/build_pytorch_libs.sh --use-cuda --use-nnpack --use-mkldnn --use-qnnpack caffe2'

Cause: the build is picking up the system-installed NCCL.

The fixes suggested online fall into roughly two kinds:

1. Change NO_SYSTEM_NCCL in setup.py

2. Change USE_SYSTEM_NCCL in tools/setup_helpers/nccl.py

After making either change, however, the build still always entered the “if USE_CUDA and not check_negative_env_flag('USE_SYSTEM_NCCL'):” branch in tools/setup_helpers/nccl.py, and the cmake info then contained:

-DUSE_SYSTEM_NCCL=ON -DNCCL_INCLUDE_DIR=/usr/local/include -DNCCL_ROOT_DIR=/usr/local/ -DNCCL_SYSTEM_LIB=/usr/local/lib/libnccl.so
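Why that branch keeps being taken can be sketched in a few lines of Python. The helper below is an assumed re-implementation written for illustration, not copied from PyTorch; the real check_negative_env_flag lives in tools/setup_helpers and may differ in detail. The function name uses_system_nccl and the set NEGATIVE_VALUES are hypothetical. Under this assumption, the condition is true whenever the environment variable is simply unset, so the system NCCL is selected by default.

```python
import os

# Assumption: values treated as "negative" by the helper. The real list in
# PyTorch's tools/setup_helpers may differ.
NEGATIVE_VALUES = {"OFF", "0", "NO", "FALSE", "N"}


def check_negative_env_flag(name, default=""):
    """Return True only if the variable is explicitly set to a negative value."""
    return os.getenv(name, default).upper() in NEGATIVE_VALUES


def uses_system_nccl(use_cuda=True):
    # Same shape as the condition quoted from tools/setup_helpers/nccl.py.
    return use_cuda and not check_negative_env_flag("USE_SYSTEM_NCCL")


# Case 1: variable not set at all, so the flag is not "negative".
os.environ.pop("USE_SYSTEM_NCCL", None)
unset_result = uses_system_nccl()

# Case 2: variable explicitly set to a negative value.
os.environ["USE_SYSTEM_NCCL"] = "0"
disabled_result = uses_system_nccl()

# Leave the environment as we found it.
os.environ.pop("USE_SYSTEM_NCCL", None)

print("unset   ->", unset_result)
print("set to 0 ->", disabled_result)
```

If the helper really behaves this way, only an explicitly negative USE_SYSTEM_NCCL in the environment of the build process would make the original condition false.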

 

Solution:

Remove the “not” from the “if USE_CUDA and not check_negative_env_flag('USE_SYSTEM_NCCL'):” statement in tools/setup_helpers/nccl.py. With that change the build completed successfully.
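The effect of dropping the “not” can be checked with a small truth table. This is only a sketch: original_condition and patched_condition are hypothetical names, and flag_is_negative stands in for the result of check_negative_env_flag('USE_SYSTEM_NCCL').

```python
from itertools import product


def original_condition(use_cuda, flag_is_negative):
    # if USE_CUDA and not check_negative_env_flag('USE_SYSTEM_NCCL'):
    return use_cuda and not flag_is_negative


def patched_condition(use_cuda, flag_is_negative):
    # if USE_CUDA and check_negative_env_flag('USE_SYSTEM_NCCL'):
    return use_cuda and flag_is_negative


# Enumerate every combination of the two boolean inputs.
rows = [
    (use_cuda, flag, original_condition(use_cuda, flag), patched_condition(use_cuda, flag))
    for use_cuda, flag in product([True, False], repeat=2)
]

print("USE_CUDA  flag_negative  original  patched")
for use_cuda, flag, original, patched in rows:
    print(f"{use_cuda!s:<9} {flag!s:<14} {original!s:<9} {patched!s}")
```

With CUDA enabled and the flag not negative (the usual case when the variable is unset), the original condition selects the system NCCL and the patched one does not. The patch also inverts the meaning of the flag, so it is best treated as a local workaround.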