Suggestions Quick Start

In this section, we’ll quickly set up suggestions for pre-annotation in LightTag. By the end of this process you should have something that looks like this:

[Screenshot: pre-annotations]

As with all of the other examples, we’ll be interacting with LightTag’s API using the LTSession convenience class, which you can download from the gist linked in the code below.

[14]:
from ltsession import LTSession # Get ltsession here https://gist.github.com/talolard/793563397c48dca32f75c9d4b6f8f560
from pprint import pprint # To print things pretty
session = LTSession(workspace="demo",user="lighttag",pwd="lighttag")


In this quick start we’ll assume you already have a Dataset and Schema defined and want to pre-annotate the Dataset with Tags from the Schema. To do so, we’ll:

  1. Retrieve our Dataset and Schema from LightTag

  2. Register a new model

  3. Create the suggestions

  4. Upload the suggestions

  5. Attach the Model to a Task so that its suggestions are shown to your annotators

  6. Review your model’s output in LightTag’s review mode

1. Fetching the dataset and the schema

[25]:
dataset = session.get('v1/projects/default/datasets/fedreg/').json() # Use the slug of the dataset to fetch it from the datasets endpoint
examples = session.get('v1/projects/default/datasets/fedreg/examples/').json() # Fetch the examples of that dataset from its examples endpoint
schema = session.get('v1/projects/default/schemas/basis-model-comparison/').json() # Use the slug of the schema to fetch it from the schemas endpoint
pprint(examples[0])
{'content': 'The Trade Agreement Act of 1979 prohibits Federal agencies from '
            'engaging in any standards or related activities that create '
            'unnecessary obstacles to the foreign commerce of the United '
            'States. Legitimate domestic objectives, such as safety, are not '
            'considered unnecessary obstacles. The statute also requires '
            'consideration of international standards and where appropriate, '
            'that they be the basis for U.S. standards.',
 'dataset': '42d4181d-934f-4c58-850d-ecdf6fdeb830',
 'id': '40e46279-6602-4d97-bf61-66878d565d1d',
 'metadata': {'agency': 'DEPARTMENT OF TRANSPORTATION',
              'doc_id': '[FR Doc. 2017-08944 Filed 5-2-17; 8:45 am]',
              'end': 23,
              'key': 'agreement',
              'order_code': 67,
              'phrase': 'The Trade Agreement Act',
              'start': 0,
              'title': 'Training, Qualification, and Oversight for '
                       'Safety-Related Railroad Employees',
              'type': 'rule'}}
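
The metadata on this example happens to carry start, end and phrase fields that were uploaded with the dataset. A quick way to see how such offsets relate to the content is to slice the content with them. The sketch below uses a shortened copy of the example printed above; the helper name metadata_span is made up for illustration, and other datasets may not have these metadata fields at all.

```python
def metadata_span(example: dict) -> str:
    """Return the slice of the content that the metadata offsets point at."""
    meta = example['metadata']
    # Python slices are end-exclusive: characters start .. end-1 are returned
    return example['content'][meta['start']:meta['end']]


# Shortened copy of the example printed above
sample_example = {
    'content': 'The Trade Agreement Act of 1979 prohibits Federal agencies from engaging in ...',
    'metadata': {'phrase': 'The Trade Agreement Act', 'start': 0, 'end': 23},
}

print(metadata_span(sample_example) == sample_example['metadata']['phrase'])
```

For this example the slice matches the phrase, which suggests the offsets are character positions with an exclusive end.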

2. Registering a SuggestionModel

A SuggestionModel (or model for short) in LightTag is a container that holds all of the suggestions that came from a single source. Typically, a model corresponds to an ML model, a dictionary or a set of regular expressions that you have.

  • Models belong to a particular Schema

  • Models can provide suggestions with Tags from that Schema only.

  • A single Model may not have overlapping suggestions.

  • You can have multiple Models in the same Schema, and suggestions from different Models may conflict.
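
Because a single Model may not have overlapping suggestions, it can be worth checking your own output before uploading it. The following is a minimal sketch, not LightTag's own validation: it assumes suggestions are dicts with the example_id, start and end fields described in step 3, that end offsets are exclusive, and that two spans which merely touch do not count as overlapping. The function name find_overlaps is hypothetical.

```python
from collections import defaultdict


def find_overlaps(suggestions):
    """Return (earlier, later) pairs where a span starts before an
    earlier span on the same example has ended."""
    by_example = defaultdict(list)
    for suggestion in suggestions:
        by_example[suggestion['example_id']].append(suggestion)

    overlaps = []
    for spans in by_example.values():
        spans.sort(key=lambda s: (s['start'], s['end']))
        furthest = None  # the span seen so far that reaches furthest right
        for current in spans:
            if furthest is not None and current['start'] < furthest['end']:
                overlaps.append((furthest, current))
            if furthest is None or current['end'] > furthest['end']:
                furthest = current
    return overlaps
```

An empty result means no span starts inside an earlier span on the same example, under the assumptions above.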

We’ll use the new LightTag Model endpoint to register our model.

[26]:
model_definition = {
    'schema':schema['id'],
    'name':'My Newer than new new LightTag Suggestions Model',
    'metadata':{
        'anything':'here' #You can add an arbitrary JSON of metadata to your model
    }

}
response = session.post('v2/models/',json=model_definition)


LightTag will return the model object, including the schema id and the list of tags it is allowed to use.

[27]:

model = response.json()
pprint(model)
{'id': '4fcd67ab-7a61-4258-8b93-43ee6be0527d',
 'metadata': {'anything': 'here'},
 'name': 'My Newer than new new LightTag Suggestions Model',
 'schema': '9cc0de35-708f-4886-8f4b-c52353a1fc27',
 'tags': [{'description': 'LAW',
           'id': 'dfe5bb00-0101-4a87-9eb9-a54a39d1c702',
           'name': 'LAW'},
          {'description': 'DATE',
           'id': '4f8f7141-2a25-4e62-99e5-6cb11eb0333d',
           'name': 'DATE'},
          {'description': 'GPE',
           'id': '03032a16-223c-414a-81c7-b1ce77faa017',
           'name': 'GPE'},
          {'description': 'NUMBER',
           'id': '462fffb1-134b-4f06-ae9f-bbe0b248258e',
           'name': 'NUMBER'},
          {'description': 'ORG',
           'id': 'e19d2092-db18-40b0-a77b-25f2981376ed',
           'name': 'ORG'},
          {'description': 'MONEY',
           'id': '63a62c6e-691f-4aff-b20f-013f51220d21',
           'name': 'MONEY'},
          {'description': 'MISC',
           'id': '932a1aa0-47b5-42a6-85e9-adc440f6799b',
           'name': 'MISC'},
          {'description': 'PERSON',
           'id': 'ca7277f4-5a2c-4003-9a04-d446ae894efb',
           'name': 'PERSON'},
          {'description': 'NORP',
           'id': 'e376b420-5d52-4fa1-9d81-bdc25b028787',
           'name': 'NORP'},
          {'description': 'TIME',
           'id': '6bb7422c-3f0a-4a30-b0a9-ce09b451b468',
           'name': 'TIME'},
          {'description': 'EVENT',
           'id': '6b1af7b2-e09c-48fe-b990-25c9dfb167b1',
           'name': 'EVENT'},
          {'description': 'FAC',
           'id': '439c5a16-4ede-4e25-ab98-f0c875d93186',
           'name': 'FAC'},
          {'description': 'PERCENT',
           'id': 'a69b086c-252e-47d4-9fb8-cf423e7bb7bf',
           'name': 'PERCENT'},
          {'description': 'TITLE',
           'id': 'f64b8cad-7890-4099-a64f-f77dfb865615',
           'name': 'TITLE'},
          {'description': 'DURATION',
           'id': 'ea110c2d-bbaa-436d-9a9d-a26e56482c00',
           'name': 'DURATION'},
          {'description': 'SET',
           'id': 'fa882030-99be-4fc6-931d-41e3e4708373',
           'name': 'SET'},
          {'description': 'URL',
           'id': 'fe90f7fc-3f25-4eb7-b0c0-ef44b004ff4f',
           'name': 'URL'},
          {'description': 'CRIMINAL_CHARGE',
           'id': '03b29930-0d6c-4258-9532-2ad462d67be6',
           'name': 'CRIMINAL_CHARGE'},
          {'description': 'HANDLE',
           'id': 'b620f40a-3264-4c7e-8cfe-8e11f286c96d',
           'name': 'HANDLE'},
          {'description': 'CAUSE_OF_DEATH',
           'id': 'e1874c02-823a-404c-8c4b-e80fce97415f',
           'name': 'CAUSE_OF_DEATH'},
          {'description': 'LATITUDE_LONGITUDE',
           'id': 'dd1129d0-439d-451b-8f06-2b03e39d7ed7',
           'name': 'LATITUDE_LONGITUDE'}],
 'url': 'https://demo.lighttag.io/api/v2/models/4fcd67ab-7a61-4258-8b93-43ee6be0527d/'}
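
Since the returned model lists every tag with its name and id, a small lookup table can make it easier to move between the two later on. This is only an illustrative sketch: the helper name tag_ids_by_name is made up, and sample_model is a trimmed copy of the response shown above with two of its tags.

```python
def tag_ids_by_name(model: dict) -> dict:
    """Map each tag name in a model response to its LightTag id."""
    return {tag['name']: tag['id'] for tag in model['tags']}


# Trimmed copy of the model response printed above
sample_model = {
    'tags': [
        {'description': 'LAW', 'id': 'dfe5bb00-0101-4a87-9eb9-a54a39d1c702', 'name': 'LAW'},
        {'description': 'ORG', 'id': 'e19d2092-db18-40b0-a77b-25f2981376ed', 'name': 'ORG'},
    ]
}

lookup = tag_ids_by_name(sample_model)
print(sorted(lookup))  # the tag names this model may use
```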

3. Create your suggestions

This step is mostly independent of LightTag. You need to create a list of objects that represent your suggestions. Each object should have:

  1. example_id: the LightTag-provided id of the example you are suggesting on

  2. tag OR tag_id: the name of the tag (e.g. PERSON) or the LightTag id of the tag

  3. start: the start offset of the text you are suggesting on

  4. end: the end offset of the text you are suggesting on
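
The field list above can be turned into a quick local check before anything is sent to LightTag. This is a sketch under assumptions: the function name validation_errors is hypothetical, and the rule that start must be smaller than end is a guess about what a sensible span looks like rather than a documented LightTag constraint.

```python
REQUIRED_KEYS = ('example_id', 'start', 'end')


def validation_errors(suggestion: dict) -> list:
    """Return a list of human-readable problems; empty if none were found."""
    errors = [f'missing {key}' for key in REQUIRED_KEYS if key not in suggestion]
    # Either the tag name or the tag id must be present
    if 'tag' not in suggestion and 'tag_id' not in suggestion:
        errors.append('needs tag or tag_id')
    start, end = suggestion.get('start'), suggestion.get('end')
    # Assumption: a usable span is non-empty and starts at a non-negative offset
    if isinstance(start, int) and isinstance(end, int) and not 0 <= start < end:
        errors.append('expected 0 <= start < end')
    return errors


print(validation_errors({'example_id': 'abc', 'start': 0, 'end': 4, 'tag': 'PERSON'}))
```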

Example

Below, we’ll create a few suggestions using spaCy, a popular NLP library with built-in NER.

[28]:
import spacy
nlp = spacy.load("en_core_web_sm")

def process_example_with_spacy_and_return_lighttag_suggestions(example:dict):
    '''
        Example function that shows how to create suggestions for LightTag using Spacy.
        The way you get suggestions will vary, but the output format is the same, a list of dicts
        with the properties start,end,tag and example_id

        This function takes an Example (as returned from LightTag's API) as input.
        The example contains the id and content fields as well as metadata you may have uploaded with it
    '''
    content = example['content']
    doc = nlp(content) # Run spacy on the content of the example
    suggestions = [] # Empty list to store results
    for entity in doc.ents: # Spacy exposes the ents property which has named entities it found
        start = entity.start_char
        end = entity.end_char
        text = content[start:end] # Not required, but useful to see what your model is outputting
        suggestion = {
            'example_id':example['id'], # The LightTag example_id
            'start':start, # The start offset of the span
            'end':end, # The end offset of the span
            'tag':entity.label_, # The name of the tag being applied to the span
            'text':text # Not required, but useful to see what your model is outputting

        }

        suggestions.append(suggestion)
    return suggestions
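
Suggestions do not have to come from an ML model. As a rough illustration of the regular-expression route mentioned earlier, the sketch below produces the same output format using only Python's re module. The pattern, the function name regex_suggestions, the example id and the choice of the DATE tag are all assumptions made for the example.

```python
import re

# Hypothetical pattern: four-digit years from 1900 to 2099
YEAR_PATTERN = re.compile(r'\b(?:19|20)\d{2}\b')


def regex_suggestions(example: dict, pattern, tag: str) -> list:
    """Create one suggestion per regex match in the example's content."""
    return [
        {
            'example_id': example['id'],
            'start': match.start(),  # character offset where the match begins
            'end': match.end(),      # exclusive end offset, like spaCy's end_char
            'tag': tag,
            'text': match.group(0),  # not required, handy for debugging
        }
        for match in pattern.finditer(example['content'])
    ]


sample = {'id': 'example-1', 'content': 'The Trade Agreement Act of 1979 prohibits ...'}
year_suggestions = regex_suggestions(sample, YEAR_PATTERN, 'DATE')
pprint_ready = [(s['start'], s['end'], s['text']) for s in year_suggestions]
print(pprint_ready)
```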

[29]:
suggestions = process_example_with_spacy_and_return_lighttag_suggestions(examples[2])
pprint(suggestions[:3])
[{'end': 45,
  'example_id': '410d83bb-93e0-4285-9277-92a10f87b1be',
  'start': 26,
  'tag': 'ORG',
  'text': 'Marketing Order Nos'},
 {'end': 50,
  'example_id': '410d83bb-93e0-4285-9277-92a10f87b1be',
  'start': 47,
  'tag': 'CARDINAL',
  'text': '916'},
 {'end': 58,
  'example_id': '410d83bb-93e0-4285-9277-92a10f87b1be',
  'start': 55,
  'tag': 'CARDINAL',
  'text': '917'}]
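
Note that the output above contains the label CARDINAL, while the tag list returned for the model earlier includes NUMBER but no CARDINAL. Since models can only suggest Tags from their Schema, you may want to translate or drop such labels before uploading. The sketch below shows one way to do that; the mapping from CARDINAL to NUMBER is an assumption about what is appropriate for this schema, the function name is hypothetical, and how LightTag itself treats unknown tag names is not covered here.

```python
# Hypothetical mapping from spaCy labels to tag names in the schema
LABEL_TO_TAG = {'CARDINAL': 'NUMBER'}


def keep_schema_tags(suggestions, allowed_tags, label_to_tag=None):
    """Rename mapped labels and drop suggestions whose tag is not allowed."""
    label_to_tag = label_to_tag or {}
    kept = []
    for suggestion in suggestions:
        tag = label_to_tag.get(suggestion['tag'], suggestion['tag'])
        if tag in allowed_tags:
            kept.append({**suggestion, 'tag': tag})  # copy, original untouched
    return kept


raw = [
    {'example_id': 'x', 'start': 26, 'end': 45, 'tag': 'ORG'},
    {'example_id': 'x', 'start': 47, 'end': 50, 'tag': 'CARDINAL'},
    {'example_id': 'x', 'start': 55, 'end': 58, 'tag': 'CARDINAL'},
]
allowed = {'ORG', 'NUMBER'}  # in practice, the names in model['tags']
print([s['tag'] for s in keep_schema_tags(raw, allowed, LABEL_TO_TAG)])
```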

4. Upload Your Suggestions

Now that we have suggestions, we upload them to the model. When you registered the model, LightTag responded with the Model object, which includes its URL. We’ll post our suggestions there.

[32]:
session.post(model['url']+'suggestions/',json=suggestions)
[32]:
<Response [201]>
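
The call above sends every suggestion in a single request. If you have a large number of suggestions, one option is to send them in chunks and stop at the first failed request. The sketch below is an assumption-laden illustration: the function name and batch size are made up, it assumes the session returns requests-style responses with a raise_for_status method, and nothing here implies that LightTag requires batching.

```python
def upload_in_batches(session, url, suggestions, batch_size=500):
    """POST suggestions in chunks and return the number of requests sent."""
    requests_sent = 0
    for offset in range(0, len(suggestions), batch_size):
        batch = suggestions[offset:offset + batch_size]
        response = session.post(url, json=batch)
        response.raise_for_status()  # raises on 4xx/5xx for requests-style responses
        requests_sent += 1
    return requests_sent
```

With the objects from this notebook, a call might look like upload_in_batches(session, model['url'] + 'suggestions/', suggestions).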

If all went well, you’ll get back a 201 response. Let’s fetch those suggestions back from LightTag.

[34]:
resp = session.get(model['url']+'suggestions/')
pprint(resp.json()[:1])
[{'end': 45,
  'example_id': '410d83bb-93e0-4285-9277-92a10f87b1be',
  'id': '54e9e9b9-16f1-4301-9fbf-2daf1b6f80be',
  'start': 26,
  'tag': 'ORG',
  'tag_id': 'e19d2092-db18-40b0-a77b-25f2981376ed',
  'value': ' Marketing Order Nos'}]
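
To confirm that everything you uploaded is now stored on the model, you can compare the local list with what the endpoint returns. The sketch below compares on example_id, start, end and tag, which appear in both the uploaded and the returned objects shown above. The function names are hypothetical, and it assumes the GET request returns all suggestions in a single response.

```python
def span_key(suggestion: dict) -> tuple:
    """Identify a suggestion by the fields present both locally and remotely."""
    return (suggestion['example_id'], suggestion['start'],
            suggestion['end'], suggestion['tag'])


def missing_after_upload(local: list, remote: list) -> list:
    """Return the local suggestions that do not appear in the remote list."""
    remote_keys = {span_key(s) for s in remote}
    return [s for s in local if span_key(s) not in remote_keys]


local = [
    {'example_id': '410d83bb-93e0-4285-9277-92a10f87b1be', 'start': 26, 'end': 45, 'tag': 'ORG'},
    {'example_id': '410d83bb-93e0-4285-9277-92a10f87b1be', 'start': 47, 'end': 50, 'tag': 'CARDINAL'},
]
remote = [
    {'example_id': '410d83bb-93e0-4285-9277-92a10f87b1be', 'start': 26, 'end': 45,
     'tag': 'ORG', 'tag_id': 'e19d2092-db18-40b0-a77b-25f2981376ed'},
]
print(len(missing_after_upload(local, remote)))
```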

5. Attach Your Suggestions to a Task

In order to display your suggestions to your annotators, you need to explicitly tell LightTag to do so. This is done by attaching the model to your task, which you can do either through the API or the UI. We’ll show both ways here.

5.1 Attaching a model to an existing task in the UI

  1. Go to the tasks section in the LightTag UI

  2. Find the task and make sure it’s on the same schema and dataset that your model is operating on

  3. Open the Models section and select the model you just created

5.2 Attaching a model to an existing task with the API

[11]:
# Retrieve the tasks from the LightTag API
allTasks = session.get('v1/projects/default/task_definitions').json()
pprint(allTasks[0])


{'active': True,
 'allow_suggestions': True,
 'annotators_per_example': 3,
 'archived': False,
 'async_status': 'done',
 'complete_tasks': 0,
 'complete_tasksets': 0,
 'created_at': '2019-10-17T09:07:19.794526Z',
 'dataset_id': '2a6e0c06-017d-473c-b3d5-bbb24c28b71c',
 'guidelines': None,
 'id': '71b9280d-c132-4cda-a847-97878e1a4b46',
 'name': 'My Suggestions Task',
 'priority': 1,
 'progress': 0.0,
 'project_id': '2753ca38-69d9-4c96-9d31-df6d4069b027',
 'relationSchema_id': None,
 'remaining_tasks': 885,
 'schema_id': 'e7e7de79-1623-4803-9aac-7e664f12117a',
 'slug': 'my-suggestions-task',
 'status': 'active',
 'suggestion_models': ['9c26b461-c2ab-40ab-a2dc-cc454abe7d39'],
 'teams': ['c0d457f3-2609-4d1c-946b-7ab1dc72796a'],
 'total_tasks': 885,
 'total_tasksets': 295,
 'url': 'https://demo.lighttag.io/api/v1/projects/default/task_definitions/my-suggestions-task/'}
[14]:
# Find your task by its name
my_task_name = "My Suggestions Task"
my_task = next(filter(lambda task: task['name'] == my_task_name,allTasks))
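
One caveat with the lookup above: next raises StopIteration when no task has the given name. A slightly more forgiving variant, sketched here with a hypothetical helper name, passes a default to next so that a missing task shows up as None instead.

```python
def find_task_by_name(tasks, name):
    """Return the first task with the given name, or None if there is none."""
    return next((task for task in tasks if task['name'] == name), None)


sample_tasks = [
    {'name': 'Another Task', 'slug': 'another-task'},
    {'name': 'My Suggestions Task', 'slug': 'my-suggestions-task'},
]
found = find_task_by_name(sample_tasks, 'My Suggestions Task')
print(found['slug'] if found else 'task not found')
```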
[15]:

model
[15]:
{'schema': 'e7e7de79-1623-4803-9aac-7e664f12117a',
 'name': 'My Newest LightTag Suggestions Model',
 'metadata': {'anything': 'here'},
 'tags': [{'id': '43ce3951-7ea1-423a-869f-ab07848d7f55',
   'name': 'PERSON',
   'description': ''},
  {'id': 'e332528b-dcb6-4414-9d14-fa61b12ab127',
   'name': 'LOCATION',
   'description': ''},
  {'id': '6e323f1d-1aa0-4e09-b64f-1ec0cd6dbed8',
   'name': 'ORGANIZATION',
   'description': ''},
  {'id': '978dfe2d-fdc3-4ca7-80d9-2baa0472f818',
   'name': 'TITLE',
   'description': ''}],
 'url': 'https://demo.lighttag.io/api/unstable/models/40f990db-b375-4292-8415-5621613b088f/'}
[20]:
my_task_models = my_task['suggestion_models']
my_task_models.append(model['id']) # Add the id of the model you created to this task

session.put(my_task['url'],json={'models':my_task_models})
[20]:
<Response [200]>
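
If you run the cell above more than once, the same model id is appended to the list each time. A small helper can build the request body without duplicates and without modifying the task dict in place. This is a sketch: the function name is hypothetical, and the 'models' key simply mirrors the put call shown above.

```python
def models_payload(task: dict, model_id: str) -> dict:
    """Build the PUT body that attaches model_id to the task at most once."""
    model_ids = list(task.get('suggestion_models', []))  # copy, task untouched
    if model_id not in model_ids:
        model_ids.append(model_id)
    return {'models': model_ids}


sample_task = {'suggestion_models': ['9c26b461-c2ab-40ab-a2dc-cc454abe7d39']}
payload = models_payload(sample_task, '4fcd67ab-7a61-4258-8b93-43ee6be0527d')
print(len(payload['models']))
```

The result could then be sent with session.put(my_task['url'], json=payload).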

5.3 Attaching a model when creating a task in the UI

Sometimes you’ll have defined the model before creating a task. In that case you can attach the model to your task at creation time.

  1. Go to the New Task dialog

  2. Select the Schema your model belongs to

  3. In the advanced tab, select your model