fast.ai ULMFiT helpers to easily use pretrained models
Get model and vocab files from `path`.
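Resolving those files might look like the following sketch; the helper name `get_model_files` and the glob patterns are assumptions for illustration, not the library's actual API:

```python
from pathlib import Path

def get_model_files(path):
    # Hypothetical helper (name and behaviour assumed): pick the model
    # weights (*.pth) and vocab (*.pkl) out of a pretrained-model folder.
    path = Path(path)
    model_file = next(path.glob('*.pth'), None)
    vocab_file = next(path.glob('*.pkl'), None)
    if model_file is None or vocab_file is None:
        raise FileNotFoundError(f'no model/vocab files in {path}')
    # fastai's pretrained_fnames expects the file stems, without suffixes
    return model_file.stem, vocab_file.stem
```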
Get tokenizer from the model config. Tokenizer parameters in `model.json` will be passed to the fastai `Tokenizer`. Currently, SentencePiece and Spacy are supported.
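For illustration, a tokenizer section of `model.json` might look like the fragment below; the exact key names are assumptions, not the file format the library actually uses:

```python
import json

# Hypothetical model.json fragment; real key names may differ.
model_json = '''{
  "tokenizer": {
    "class": "SentencePieceTokenizer",
    "params": {"lang": "de", "vocab_sz": 15000}
  }
}'''

config = json.loads(model_json)['tokenizer']
tokenizer_cls = config['class']      # SentencePiece or Spacy tokenizer class
tokenizer_params = config['params']  # forwarded to the fastai Tokenizer
```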
Create a `language_model_learner` from a pretrained model URL. All parameters are passed through to `language_model_learner`. The following parameters are set automatically: `arch`, `pretrained` and `pretrained_fnames`. By default, `accuracy` and `perplexity` are passed as `metrics`.
Saves the following model files to `path`:
- Model (`lm_model.pth`)
- Encoder (`lm_encoder.pth`)
- Vocab from dataloaders (`lm_vocab.pkl`)
- SentencePiece model (`spm/`)
```python
import pickle

def _load_vocab(learn, path=None):
    # Reconstructed from commented-out fragments; the name `_load_vocab`
    # is assumed. `_get_model_path` resolves the model directory for the
    # learner, then the vocab saved by learn.save_lm() is unpickled.
    path = _get_model_path(learn, path)
    with open((path/'lm_vocab.pkl').absolute(), 'rb') as f:
        return pickle.load(f)
```
Create a `text_classifier_learner` from a fine-tuned model path (saved with `learn.save_lm()`).
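As a sketch, the classifier step mainly needs the encoder and vocab files that `learn.save_lm()` wrote; the helper name `classifier_inputs` below is an assumption for illustration:

```python
from pathlib import Path

def classifier_inputs(path):
    # Hypothetical helper: locate the files from learn.save_lm() that the
    # text classifier loads (encoder weights and the dataloaders vocab).
    path = Path(path)
    enc, vocab = path/'lm_encoder.pth', path/'lm_vocab.pkl'
    missing = [p.name for p in (enc, vocab) if not p.exists()]
    if missing:
        raise FileNotFoundError(f'missing {missing} - run learn.save_lm() first')
    return enc, vocab
```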