fast.ai ULMFiT helpers to easily use pretrained models
Get model and vocab files from path.
Get the tokenizer from the model config. Tokenizer parameters in model.json are passed to the Tokenizer. Currently SentencePiece and spaCy are supported.
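As a sketch of how such a config might be read (the function name and the model.json keys here are assumptions for illustration, not the library's confirmed schema):

```python
import json
from pathlib import Path

def read_tokenizer_config(model_dir):
    # Hypothetical model.json layout: a "tokenizer" section whose
    # entries are forwarded to the Tokenizer as keyword arguments.
    config = json.loads((Path(model_dir) / 'model.json').read_text())
    tok = config.get('tokenizer', {})
    cls = tok.get('class', 'SentencePieceTokenizer')  # or 'SpacyTokenizer'
    params = tok.get('params', {})
    return cls, params
```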
Create a language_model_learner from a pretrained model URL. All parameters are passed through to language_model_learner. The following parameters are set automatically: arch, pretrained and pretrained_fnames. By default, accuracy and perplexity are passed as metrics.
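The argument handling described above can be sketched in plain Python. This is a hypothetical illustration of the forwarding logic only (the function name and the AWD_LSTM/filename values are assumptions), not the library's actual implementation:

```python
def build_learner_kwargs(**kwargs):
    # Sketch: caller-supplied kwargs pass through to language_model_learner,
    # metrics default to accuracy and perplexity unless overridden, and the
    # helper pins arch, pretrained and pretrained_fnames itself.
    kwargs.setdefault('metrics', ['accuracy', 'perplexity'])
    kwargs['arch'] = 'AWD_LSTM'                              # set automatically (assumed value)
    kwargs['pretrained'] = True                              # set automatically
    kwargs['pretrained_fnames'] = ('lm_model', 'lm_vocab')   # set automatically (assumed names)
    return kwargs
```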
Saves the following model files to path:
- Model (lm_model.pth)
- Encoder (lm_encoder.pth)
- Vocab from dataloaders (lm_vocab.pkl)
- SentencePiece model (spm/)
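The layout above can be expressed as paths relative to the save directory (a plain-pathlib sketch; the helper function name is hypothetical, the file names come from the list above):

```python
from pathlib import Path

def expected_lm_files(path):
    # Files and directories written to `path` by learn.save_lm(),
    # per the list above.
    base = Path(path)
    return [base / 'lm_model.pth',    # full language model weights
            base / 'lm_encoder.pth',  # encoder-only weights
            base / 'lm_vocab.pkl',    # vocab pickled from the dataloaders
            base / 'spm']             # SentencePiece model directory
```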
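Loading the saved vocab is a pickle read of lm_vocab.pkl. A runnable sketch of that step; in the library the directory comes from _get_model_path(learn, path), which is replaced here by a plain path argument so the example is self-contained:

```python
import pickle
from pathlib import Path

def load_lm_vocab(path):
    # Reads the vocab that learn.save_lm() pickled to lm_vocab.pkl.
    # `path` stands in for the result of _get_model_path(learn, path).
    with open((Path(path) / 'lm_vocab.pkl').absolute(), 'rb') as f:
        return pickle.load(f)
```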
Create a text_classifier_learner from a fine-tuned language model path (saved with learn.save_lm()).