Pretrained model
An even simpler option than the rule-based model, with better performance (although a bit more compute-intensive), is to use the public pretrained model. This model is available on the Hugging Face model hub at AP-HP/eds-pseudo-public and was trained on the synthetic data described in the Dataset page. You can also test it directly on the demo.
Installation
1. Install the latest version of edsnlp:

    ```shell
    pip install "edsnlp[ml]" -U
    ```
2. Get access to the model at AP-HP/eds-pseudo-public
3. Create and copy a token at https://hf.co/settings/tokens?new_token=true
4. Register the token (only once) on your machine:

    ```python
    import huggingface_hub

    huggingface_hub.login(
        token=YOUR_TOKEN,
        new_session=False,
        add_to_git_credential=True,
    )
    ```

5. Load the model:

    ```python
    import edsnlp

    nlp = edsnlp.load("AP-HP/eds-pseudo-public", auto_update=True)
    doc = nlp(
        "En 2015, M. Charles-François-Bienvenu "
        "Myriel était évêque de Digne. C’était un vieillard "
        "d’environ soixante-quinze ans ; il occupait le "
        "siège de Digne depuis 2006."
    )

    for ent in doc.ents:
        print(ent, ent.label_, str(ent._.date))
    ```
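The detected entities can then be used to pseudonymize the text itself. The snippet below is a minimal sketch (it is not part of the eds-pseudo API) that replaces each detected span with a placeholder built from its label, using only standard spaCy span attributes:

```python
def pseudonymize(doc):
    """Replace every detected entity with a placeholder such as [DATE]."""
    text = doc.text
    # Walk the entities from right to left so earlier character offsets stay valid
    for ent in sorted(doc.ents, key=lambda e: e.start_char, reverse=True):
        text = text[: ent.start_char] + f"[{ent.label_}]" + text[ent.end_char:]
    return text

print(pseudonymize(doc))
```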
To apply the model in parallel on many documents using one or more GPUs, refer to the Inference page.
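For a quick batch run on a small list of in-memory texts, a minimal sketch could rely on `nlp.pipe`, which processes documents in batches (the example texts below are made up; multiprocessing and GPU dispatch are configured as described on the Inference page):

```python
import edsnlp

nlp = edsnlp.load("AP-HP/eds-pseudo-public", auto_update=True)

# Hypothetical texts: replace with your own documents
texts = [
    "Le patient M. Dupont a été admis le 12/03/2021.",
    "Mme Martin, née le 3 janvier 1980, habite à Lyon.",
]

# nlp.pipe yields one annotated Doc per input text
for doc in nlp.pipe(texts):
    for ent in doc.ents:
        print(repr(ent.text), ent.label_)
```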