It seems that in datasets/datatset.py, AudioVisualDataset also expects "landmarks" for each video, which I assume refers to lip landmarks. However, I could not find any description of how to obtain the landmarks for the CREMA-D videos. Could you please explain how to obtain the audio encodings, how to organize the dataset folder structure, and how to include the landmarks in the training process?