To simplify the process of training a new model, we provide our customers with DGTools.
To train a new model, assuming audio and text files are in a directory named <trainingdir>/rawdata/custom, run:
bash$ nvidia-docker run -t -i --volume <local training dir>:/big/dgtraining --user $(id -u):$(id -g) --rm deepgram/onprem-dgtools:latest dgautotrain custom
This will standardize your data, cut it into trainable utterances, create a new model with a vocabulary customized to the new dataset, and train it on the new dataset. The -t and -i options allocate a tty and allow the user to gracefully halt training (Ctrl+C in a Linux terminal).
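Because the training command expects the data under rawdata/custom inside the local training directory, it may help to confirm that layout before starting the container. The following is a minimal sketch, not part of DGTools: check_layout is a hypothetical helper name, and it only tests that the directory exists, not that its contents are valid audio and text files.

```shell
# Hypothetical helper (not part of DGTools): verify that the dataset
# directory exists where the training command expects to find it.
#   $1 = local training directory
#   $2 = dataset name (for example: custom)
check_layout() {
    if [ ! -d "$1/rawdata/$2" ]; then
        echo "error: directory not found: $1/rawdata/$2" >&2
        return 1
    fi
    echo "ok: found $1/rawdata/$2"
}
```

Calling check_layout with the local training directory and the dataset name custom, and launching the nvidia-docker command only when it succeeds, avoids starting a container against a misplaced or misnamed data directory.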
Deepgram training tools incorporate many advanced features. To learn more, contact Deepgram Technical Support.