Relative training speed by device
Google Colab lets you choose a hardware accelerator, so I was curious how the relative training speed compares across devices.
Test environment
Windows 11
Visual Studio 2022 Community
Google Colab
Surface Pro 7
Dataset
MNIST handwritten digits
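For reference, the epoch logs below (468 steps per epoch, 12 epochs, one summary line per epoch) are consistent with a small Keras CNN trained on MNIST with a batch size around 128. The post does not include the actual training code, so the following is only a minimal sketch of what such a benchmark script might look like; the layer sizes and the optimizer are my assumptions, not the code that produced the numbers below.

```python
# Minimal sketch of an MNIST benchmark similar to the runs below.
# Assumptions: TensorFlow 2.x / Keras, batch size 128, 12 epochs;
# the model actually used for the measurements is not shown in the post.
import tensorflow as tf
from tensorflow import keras

(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
x_train = x_train[..., None].astype("float32") / 255.0
x_test = x_test[..., None].astype("float32") / 255.0

model = keras.Sequential([
    keras.layers.Conv2D(32, 3, activation="relu", input_shape=(28, 28, 1)),
    keras.layers.Conv2D(64, 3, activation="relu"),
    keras.layers.MaxPooling2D(),
    keras.layers.Flatten(),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# verbose=2 prints one line per epoch, like the logs quoted below.
model.fit(x_train, y_train,
          batch_size=128, epochs=12,
          validation_data=(x_test, y_test),
          verbose=2)
```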
Test results (local GPU is an estimate)
local GPU + jupyter > local GPU + VS > colab GPU >>> local CPU + jupyter > local CPU + VS > colab TPU
If you have a GPU...
The Surface Pro 7 has no GPU, so I couldn't test that case, but if you have a laptop or desktop with a reasonably capable GPU, running on a local GPU + Jupyter Notebook is probably the fastest option. Whether TensorFlow actually sees a local GPU can be checked as shown below.
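This is a generic check, not something taken from this post's test code:

```python
# Quick check of which devices TensorFlow can use on the local machine.
import tensorflow as tf

print("GPUs:", tf.config.list_physical_devices("GPU"))
print("CPUs:", tf.config.list_physical_devices("CPU"))
```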
On Colab...
On the free tier, the resources Colab gives you fluctuate. The paid Colab Pro/Pro+ plans are said to be faster and more stable.
https://research.google.com/colaboratory/faq.html#gpu-availability
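Because the free tier can hand out different GPUs at different times, it is worth checking which accelerator a given session actually received before comparing timings. On a GPU runtime, something like the following works (get_device_details is available in TensorFlow 2.3+); this is a generic snippet, not part of the original test code:

```python
# Print the GPU model assigned to this Colab session (if any).
import tensorflow as tf

gpus = tf.config.list_physical_devices("GPU")
if gpus:
    details = tf.config.experimental.get_device_details(gpus[0])
    print("Assigned GPU:", details.get("device_name", "unknown"))
else:
    print("No GPU in this runtime")
```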
Colab TPU training was the slowest here, which got me thinking about NPUs (Neural Processing Units). Looking at the processing power, power consumption, and price of Google's TPU and NVIDIA's Jetson, the concept seems to be: train on a GPU, then run inference on an NPU. The fact that even the iPhone 12 ships with an NPU seems to point the same way.
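One caveat when reading the TPU number: on Colab, TensorFlow does not use the TPU automatically; the model has to be built under tf.distribute.TPUStrategy, otherwise training quietly runs on the Colab VM's CPU. Whether that applies to this run isn't clear from the logs, but for reference the usual setup looks roughly like this (assuming TensorFlow 2.x on a Colab TPU runtime; the small model here is just a placeholder):

```python
# Typical Colab TPU setup: without TPUStrategy, Keras training
# runs on the Colab VM's CPU rather than on the TPU.
import tensorflow as tf

resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu="")
tf.config.experimental_connect_to_cluster(resolver)
tf.tpu.experimental.initialize_tpu_system(resolver)
strategy = tf.distribute.TPUStrategy(resolver)

with strategy.scope():
    # Build and compile the Keras model inside the scope so its
    # variables are placed on the TPU; train with model.fit() as usual.
    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28, 1)),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
```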
Detailed test results
local CPU + VS = approx. 110-115 s/epoch
I started out with local CPU + VS and didn't think to record the times for comparison at that point; from rough memory it was around 110-115 seconds per epoch.
local CPU + Jupyter Notebook = approx. 95.6 s/epoch
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Epoch 1/12\n",
"468/468 - 43s - loss: 1.2217 - accuracy: 0.5867 - val_loss: 0.1783 - val_accuracy: 0.9491 - 43s/epoch - 92ms/step\n",
"Epoch 2/12\n",
"468/468 - 82s - loss: 0.6216 - accuracy: 0.8018 - val_loss: 0.1402 - val_accuracy: 0.9559 - 82s/epoch - 175ms/step\n",
"Epoch 3/12\n",
"468/468 - 83s - loss: 0.4807 - accuracy: 0.8495 - val_loss: 0.1057 - val_accuracy: 0.9659 - 83s/epoch - 177ms/step\n",
"Epoch 4/12\n",
"468/468 - 99s - loss: 0.4092 - accuracy: 0.8734 - val_loss: 0.0935 - val_accuracy: 0.9689 - 99s/epoch - 212ms/step\n",
"Epoch 5/12\n",
"468/468 - 104s - loss: 0.3633 - accuracy: 0.8891 - val_loss: 0.1006 - val_accuracy: 0.9669 - 104s/epoch - 223ms/step\n",
"Epoch 6/12\n",
"468/468 - 102s - loss: 0.3351 - accuracy: 0.8977 - val_loss: 0.1019 - val_accuracy: 0.9687 - 102s/epoch - 218ms/step\n",
"Epoch 7/12\n",
"468/468 - 107s - loss: 0.3119 - accuracy: 0.9053 - val_loss: 0.0918 - val_accuracy: 0.9727 - 107s/epoch - 230ms/step\n",
"Epoch 8/12\n",
"468/468 - 104s - loss: 0.2913 - accuracy: 0.9109 - val_loss: 0.0728 - val_accuracy: 0.9766 - 104s/epoch - 223ms/step\n",
"Epoch 9/12\n",
"468/468 - 104s - loss: 0.2764 - accuracy: 0.9161 - val_loss: 0.0758 - val_accuracy: 0.9753 - 104s/epoch - 223ms/step\n",
"Epoch 10/12\n",
"468/468 - 108s - loss: 0.2666 - accuracy: 0.9207 - val_loss: 0.0687 - val_accuracy: 0.9771 - 108s/epoch - 231ms/step\n",
"Epoch 11/12\n",
"468/468 - 104s - loss: 0.2606 - accuracy: 0.9215 - val_loss: 0.0897 - val_accuracy: 0.9726 - 104s/epoch - 222ms/step\n",
"Epoch 12/12\n",
"468/468 - 107s - loss: 0.2526 - accuracy: 0.9240 - val_loss: 0.0683 - val_accuracy: 0.9770 - 107s/epoch - 228ms/step\n"
]
}
Google Colab TPU = approx. 153.3 s/epoch
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Epoch 1/12\n",
"468/468 - 155s - loss: 1.1228 - accuracy: 0.6213 - val_loss: 0.1823 - val_accuracy: 0.9538 - 155s/epoch - 330ms/step\n",
"Epoch 2/12\n",
"468/468 - 153s - loss: 0.6196 - accuracy: 0.8030 - val_loss: 0.1296 - val_accuracy: 0.9611 - 153s/epoch - 327ms/step\n",
"Epoch 3/12\n",
"468/468 - 153s - loss: 0.4830 - accuracy: 0.8478 - val_loss: 0.1055 - val_accuracy: 0.9684 - 153s/epoch - 327ms/step\n",
"Epoch 4/12\n",
"468/468 - 153s - loss: 0.4084 - accuracy: 0.8753 - val_loss: 0.0985 - val_accuracy: 0.9715 - 153s/epoch - 328ms/step\n",
"Epoch 5/12\n",
"468/468 - 153s - loss: 0.3644 - accuracy: 0.8902 - val_loss: 0.0871 - val_accuracy: 0.9720 - 153s/epoch - 327ms/step\n",
"Epoch 6/12\n",
"468/468 - 153s - loss: 0.3260 - accuracy: 0.9018 - val_loss: 0.0804 - val_accuracy: 0.9745 - 153s/epoch - 327ms/step\n",
"Epoch 7/12\n",
"468/468 - 153s - loss: 0.3090 - accuracy: 0.9067 - val_loss: 0.0750 - val_accuracy: 0.9750 - 153s/epoch - 326ms/step\n",
"Epoch 8/12\n",
"468/468 - 154s - loss: 0.2859 - accuracy: 0.9140 - val_loss: 0.0793 - val_accuracy: 0.9756 - 154s/epoch - 330ms/step\n",
"Epoch 9/12\n",
"468/468 - 153s - loss: 0.2746 - accuracy: 0.9162 - val_loss: 0.0716 - val_accuracy: 0.9790 - 153s/epoch - 327ms/step\n",
"Epoch 10/12\n",
"468/468 - 153s - loss: 0.2659 - accuracy: 0.9202 - val_loss: 0.0621 - val_accuracy: 0.9803 - 153s/epoch - 327ms/step\n",
"Epoch 11/12\n",
"468/468 - 153s - loss: 0.2538 - accuracy: 0.9246 - val_loss: 0.0836 - val_accuracy: 0.9736 - 153s/epoch - 327ms/step\n",
"Epoch 12/12\n",
"468/468 - 153s - loss: 0.2457 - accuracy: 0.9264 - val_loss: 0.0632 - val_accuracy: 0.9790 - 153s/epoch - 326ms/step\n"
]
}
Google Colab GPU = approx. 25.2 s/epoch
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Epoch 1/12\n",
"468/468 - 35s - loss: 1.1841 - accuracy: 0.5997 - val_loss: 0.1678 - val_accuracy: 0.9573 - 35s/epoch - 76ms/step\n",
"Epoch 2/12\n",
"468/468 - 24s - loss: 0.6185 - accuracy: 0.8051 - val_loss: 0.1163 - val_accuracy: 0.9661 - 24s/epoch - 52ms/step\n",
"Epoch 3/12\n",
"468/468 - 24s - loss: 0.4820 - accuracy: 0.8504 - val_loss: 0.1016 - val_accuracy: 0.9690 - 24s/epoch - 52ms/step\n",
"Epoch 4/12\n",
"468/468 - 24s - loss: 0.4110 - accuracy: 0.8740 - val_loss: 0.1188 - val_accuracy: 0.9628 - 24s/epoch - 52ms/step\n",
"Epoch 5/12\n",
"468/468 - 24s - loss: 0.3719 - accuracy: 0.8885 - val_loss: 0.0845 - val_accuracy: 0.9728 - 24s/epoch - 52ms/step\n",
"Epoch 6/12\n",
"468/468 - 24s - loss: 0.3390 - accuracy: 0.8979 - val_loss: 0.0889 - val_accuracy: 0.9729 - 24s/epoch - 52ms/step\n",
"Epoch 7/12\n",
"468/468 - 25s - loss: 0.3125 - accuracy: 0.9054 - val_loss: 0.0736 - val_accuracy: 0.9765 - 25s/epoch - 53ms/step\n",
"Epoch 8/12\n",
"468/468 - 24s - loss: 0.2910 - accuracy: 0.9121 - val_loss: 0.0755 - val_accuracy: 0.9767 - 24s/epoch - 52ms/step\n",
"Epoch 9/12\n",
"468/468 - 25s - loss: 0.2829 - accuracy: 0.9140 - val_loss: 0.0766 - val_accuracy: 0.9736 - 25s/epoch - 52ms/step\n",
"Epoch 10/12\n",
"468/468 - 25s - loss: 0.2697 - accuracy: 0.9201 - val_loss: 0.0668 - val_accuracy: 0.9776 - 25s/epoch - 53ms/step\n",
"Epoch 11/12\n",
"468/468 - 24s - loss: 0.2605 - accuracy: 0.9224 - val_loss: 0.0612 - val_accuracy: 0.9795 - 24s/epoch - 52ms/step\n",
"Epoch 12/12\n",
"468/468 - 24s - loss: 0.2484 - accuracy: 0.9245 - val_loss: 0.0601 - val_accuracy: 0.9805 - 24s/epoch - 52ms/step\n"
]
}
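The per-epoch figures quoted above appear to be simply the mean of the 12 epoch times in each log. A quick check, using the epoch times copied from the logs above, reproduces the ~95.6, ~153.3, and ~25.2 second figures:

```python
# Recompute the "seconds per epoch" averages from the logged epoch times.
cpu_jupyter = [43, 82, 83, 99, 104, 102, 107, 104, 104, 108, 104, 107]
colab_tpu   = [155, 153, 153, 153, 153, 153, 153, 154, 153, 153, 153, 153]
colab_gpu   = [35, 24, 24, 24, 24, 24, 25, 24, 25, 25, 24, 24]

for name, times in [("local CPU + Jupyter", cpu_jupyter),
                    ("Colab TPU", colab_tpu),
                    ("Colab GPU", colab_gpu)]:
    print(f"{name}: {sum(times) / len(times):.2f} s/epoch")
# -> roughly 95.58, 153.25, and 25.17 seconds per epoch
```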
Added 2022.04.27
Local GPU test: https://j2b2blog.tistory.com/2