PhysicsAI Error
Daehee Park_21836
Altair Community Member
Hi, I'm using HyperWorks 2024.
I am getting the error below when trying to train a PhysicsAI model on the GPU. (Training on the CPU works without error.)
[07:49:51] (INFO): ************************************************************************
[07:49:51] (INFO): ** **
[07:49:51] (INFO): ** **
[07:49:51] (INFO): ** Altair PhysicsAI 2024.0 **
[07:49:51] (INFO): ** **
[07:49:51] (INFO): ** Advanced Machine Learning Software **
[07:49:51] (INFO): ** from Altair Engineering, Inc. **
[07:49:51] (INFO): ** **
[07:49:51] (INFO): ** Build: 27.1ac51437e **
[07:49:51] (INFO): ************************************************************************
[07:49:51] (INFO): ** COPYRIGHT (C) 2023-2023 Altair Engineering, Inc. **
[07:49:51] (INFO): ** All Rights Reserved. Copyright notice does not imply publication. **
[07:49:51] (INFO): ** Contains trade secrets of Altair Engineering, Inc. **
[07:49:51] (INFO): ** Decompilation or disassembly of this software strictly prohibited. **
[07:49:51] (INFO): ************************************************************************
[07:49:51] (INFO):
[07:49:52] (INFO): Matched subcases: 1
[07:49:52] (INFO): - subcase 1: Subcase 1 (loadstep1)
[07:50:26] (INFO): ------------------------------------------------------------------------
[07:50:26] (INFO): 1. Building features and labels
[07:50:26] (INFO): ------------------------------------------------------------------------
[08:10:34] (INFO): Node features:
[08:10:34] (INFO): name: cae.coord
[08:10:34] (INFO): type: CONTINOUS
[08:10:34] (INFO): length: 3
[08:10:34] (INFO):
[08:10:34] (INFO): name: cae.part_label
[08:10:34] (INFO): type: CATEGORICAL
[08:10:34] (INFO): length: 1
[08:10:34] (INFO):
[08:10:34] (INFO): Edge features:
[08:10:34] (INFO): name: cae.direction
[08:10:34] (INFO): type: CONTINOUS
[08:10:34] (INFO): length: 4
[08:10:34] (INFO):
[08:10:34] (INFO): Node labels:
[08:10:34] (INFO): name: cae.results
[08:10:34] (INFO): subcase: Subcase 1 (loadstep1)
[08:10:34] (INFO): field: Displacement
[08:10:34] (INFO): type: CONTINOUS
[08:10:34] (INFO): length: 3
[08:10:34] (INFO): Masks:
[08:10:34] (INFO): - cae.nonshape_node_mask - active
[08:10:34] (INFO):
[08:10:34] (INFO): Vector features:
[08:10:34] (INFO): Vector labels:
[08:11:58] (INFO): ------------------------------------------------------------------------
[08:11:58] (INFO): 2. Training novelty detector
[08:11:58] (INFO): ------------------------------------------------------------------------
[08:11:58] (INFO): ------------------------------------------------------------------------
[08:11:58] (INFO): 3. Training/Validation split
[08:11:58] (INFO): ------------------------------------------------------------------------
[08:11:58] (INFO): Fraction : 0.85
[08:11:58] (INFO): # training : 37
[08:11:58] (INFO): # validation : 7
[08:11:58] (INFO):
[08:11:58] (INFO): ------------------------------------------------------------------------
[08:11:58] (INFO): 4. Initializing model
[08:11:58] (INFO): ------------------------------------------------------------------------
[08:12:06] (INFO): Width: 128
[08:12:06] (INFO): Depth: 8
[08:12:06] (INFO): Batch size: 2
[08:12:06] (INFO): Learning rate: 0.001
[08:12:06] (INFO): Early stopping enabled with a patience of: 400.0
[08:12:06] (INFO):
[08:12:06] (INFO): Total params: 787,267
[08:12:06] (INFO): Trainable params: 787,267
[08:12:06] (INFO): Non-trainable params: 0
[08:12:06] (INFO):
[08:12:06] (INFO): ------------------------------------------------------------------------
[08:12:06] (INFO): 5. Training
[08:12:06] (INFO): ------------------------------------------------------------------------
[08:12:29] (ERROR): *** UNEXPECTED ERROR ***
Module: execute
Line: 58
Type: InternalError
CUDA v11.8 and cuDNN 8.7 are installed correctly, and I have checked the GPU configuration.
nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:41:10_Pacific_Daylight_Time_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0
Python GPU configuration:
from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 7197899504812016186
xla_global_id: -1
, name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 6176714752
locality {
  bus_id: 1
  links {
  }
}
incarnation: 7711125656020260411
physical_device_desc: "device: 0, name: Quadro RTX 4000, pci bus id: 0000:21:00.0, compute capability: 7.5"
xla_global_id: 416903419
]
Answers
Hello,
Please can you set an env variable EDS_DEBUG=1 and provide any console output?
Kind Regards,
Paola.
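For readers who prefer to script this, the following is a minimal Python sketch of running a command with EDS_DEBUG=1 set and capturing its console output. The thread does not say how PhysicsAI training is launched, so the command passed to run_with_debug is a placeholder, and both helper names are hypothetical.

```python
import os
import subprocess
import sys


def debug_env(base_env=None):
    """Return a copy of the environment with EDS_DEBUG=1 added."""
    env = dict(os.environ if base_env is None else base_env)
    env["EDS_DEBUG"] = "1"
    return env


def run_with_debug(command):
    """Run command (a list of arguments) with EDS_DEBUG=1 and return
    its combined stdout/stderr as text."""
    result = subprocess.run(
        command,
        env=debug_env(),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Placeholder command: replace with however you start the training run.
    # Here a child Python process simply echoes the variable back.
    child = [sys.executable, "-c", "import os; print(os.environ['EDS_DEBUG'])"]
    print(run_with_debug(child))
```

Setting the variable only for the child process keeps the debug flag from leaking into other sessions; setting it system-wide through the Windows environment variable dialog works as well.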
Paola Alvarez_21959 said:
Hello,
Please can you set an env variable EDS_DEBUG=1 and provide any console output?
Kind Regards,
Paola.
I just resolved this problem.
I added the following directory as an environment variable:
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\extras\CUPTI\lib64
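To verify a fix like this, here is a minimal Python sketch that checks whether the CUPTI directory exists and whether it appears in PATH. It assumes the directory was added to the Windows PATH variable, which the post does not state explicitly; dir_on_path is a hypothetical helper name.

```python
import ntpath
import os

# Directory quoted in the post above.
CUPTI_DIR = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\extras\CUPTI\lib64"


def dir_on_path(directory, path_value=None, sep=";"):
    """Return True if directory is one of the entries of a Windows-style
    PATH string. Comparison ignores case, slash direction, and trailing
    separators."""
    if path_value is None:
        # Assumption: the relevant variable is PATH.
        path_value = os.environ.get("PATH", "")
    target = ntpath.normcase(ntpath.normpath(directory))
    entries = [e.strip().strip('"') for e in path_value.split(sep) if e.strip()]
    return any(ntpath.normcase(ntpath.normpath(e)) == target for e in entries)


if __name__ == "__main__":
    print("CUPTI directory exists:", os.path.isdir(CUPTI_DIR))
    print("CUPTI directory on PATH:", dir_on_path(CUPTI_DIR))
```

If the second check prints False after the variable has been edited, restart the application (or the terminal it was launched from) so that it picks up the new environment.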