{"cells":[{"cell_type":"markdown","execution_count":null,"metadata":{},"outputs":[],"source":"# Suggestions Quick Start\n\nIn this section, we'll quickly set up suggestions for use in pre-annotation in LightTag. By the end of this process, you should have something that looks like this:\n\n\n![pre-annotations](./img/pre-annotations2.gif)\n\nAs with all of the other examples, we'll interact with LightTag's API using the LTSession convenience class, which you can [download here](https://gist.github.com/talolard/793563397c48dca32f75c9d4b6f8f560).\n\n\n\n\n"},{"cell_type":"code","execution_count":14,"metadata":{},"outputs":[],"source":"from ltsession import LTSession # Get ltsession here https://gist.github.com/talolard/793563397c48dca32f75c9d4b6f8f560\nfrom pprint import pprint # To print things pretty\nsession = LTSession(workspace=\"demo\",user=\"lighttag\",pwd=\"lighttag\")\n\n"},{"cell_type":"markdown","execution_count":null,"metadata":{},"outputs":[],"source":"\n\nIn this quick start we'll assume you already have a Dataset and Schema defined and want to pre-annotate the Dataset with Tags from the Schema. \nTo do so, we'll:\n\n1. Retrieve our Dataset and Schema from LightTag \n2. Register a new model \n3. Create the suggestions\n4. Upload the suggestions\n5. Attach the Model to a Task so that its suggestions are shown to your annotators\n6. 
Review your model's output in LightTag's review mode\n"},{"cell_type":"code","execution_count":1,"metadata":{},"outputs":[],"source":""},{"cell_type":"markdown","execution_count":null,"metadata":{},"outputs":[],"source":"\n## 1. Fetching the dataset and the schema\n"},{"cell_type":"code","execution_count":25,"metadata":{},"outputs":[{"name":"stdout","output_type":"stream","text":"{'content': 'The Trade Agreement Act of 1979 prohibits Federal agencies from '\n 'engaging in any standards or related activities that create '\n 'unnecessary obstacles to the foreign commerce of the United '\n 'States. Legitimate domestic objectives, such as safety, are not '\n 'considered unnecessary obstacles. The statute also requires '\n 'consideration of international standards and where appropriate, '\n 'that they be the basis for U.S. standards.',\n 'dataset': '42d4181d-934f-4c58-850d-ecdf6fdeb830',\n 'id': '40e46279-6602-4d97-bf61-66878d565d1d',\n 'metadata': {'agency': 'DEPARTMENT OF TRANSPORTATION',\n 'doc_id': '[FR Doc. 2017-08944 Filed 5-2-17; 8:45 am]',\n 'end': 23,\n 'key': 'agreement',\n 'order_code': 67,\n 'phrase': 'The Trade Agreement Act',\n 'start': 0,\n 'title': 'Training, Qualification, and Oversight for '\n 'Safety-Related Railroad Employees',\n 'type': 'rule'}}\n"}],"source":"dataset = session.get('v1/projects/default/datasets/fedreg/').json() # Use the slug of the dataset to fetch it from the datasets endpoint\nexamples = session.get('v1/projects/default/datasets/fedreg/examples/').json() # Fetch that dataset's examples from its examples endpoint\nschema = session.get('v1/projects/default/schemas/basis-model-comparison/').json() # Use the slug of the schema to fetch it from the schemas endpoint\npprint(examples[0])"},{"cell_type":"markdown","execution_count":null,"metadata":{},"outputs":[],"source":"## 2. 
Registering a SuggestionModel\nA SuggestionModel (or model for short) in LightTag is a container that contains all of the suggestions that came from a single source. Typically, a model corresponds to some ML model, dictionary or regular expressions that you have. \n\n* Models belong to a particular Schema\n* Models can provide suggestions with Tags from that Schema **only**. \n* A single Model may not have overlapping suggestions. \n* You can have multiple Models in the same Schema and they can conflict\n\nWe'll use the new LightTag Model endpoint to register our model"},{"cell_type":"code","execution_count":26,"metadata":{},"outputs":[],"source":"model_definition = {\n 'schema':schema['id'],\n 'name':'My Newer than new new LightTag Suggestions Model',\n 'metadata':{\n 'anything':'here' #You can add an arbitrary JSON of metadata to your model \n }\n \n}\nresponse = session.post('v2/models/',json=model_definition)\n\n"},{"cell_type":"markdown","execution_count":null,"metadata":{},"outputs":[],"source":"LightTag will return the model object, including the schema id and the list of tags it is allowed to use\n"},{"cell_type":"code","execution_count":27,"metadata":{},"outputs":[{"name":"stdout","output_type":"stream","text":"{'id': '4fcd67ab-7a61-4258-8b93-43ee6be0527d',\n 'metadata': {'anything': 'here'},\n 'name': 'My Newer than new new LightTag Suggestions Model',\n 'schema': '9cc0de35-708f-4886-8f4b-c52353a1fc27',\n 'tags': [{'description': 'LAW',\n 'id': 'dfe5bb00-0101-4a87-9eb9-a54a39d1c702',\n 'name': 'LAW'},\n {'description': 'DATE',\n 'id': '4f8f7141-2a25-4e62-99e5-6cb11eb0333d',\n 'name': 'DATE'},\n {'description': 'GPE',\n 'id': '03032a16-223c-414a-81c7-b1ce77faa017',\n 'name': 'GPE'},\n {'description': 'NUMBER',\n 'id': '462fffb1-134b-4f06-ae9f-bbe0b248258e',\n 'name': 'NUMBER'},\n {'description': 'ORG',\n 'id': 'e19d2092-db18-40b0-a77b-25f2981376ed',\n 'name': 'ORG'},\n {'description': 'MONEY',\n 'id': '63a62c6e-691f-4aff-b20f-013f51220d21',\n 'name': 
'MONEY'},\n {'description': 'MISC',\n 'id': '932a1aa0-47b5-42a6-85e9-adc440f6799b',\n 'name': 'MISC'},\n {'description': 'PERSON',\n 'id': 'ca7277f4-5a2c-4003-9a04-d446ae894efb',\n 'name': 'PERSON'},\n {'description': 'NORP',\n 'id': 'e376b420-5d52-4fa1-9d81-bdc25b028787',\n 'name': 'NORP'},\n {'description': 'TIME',\n 'id': '6bb7422c-3f0a-4a30-b0a9-ce09b451b468',\n 'name': 'TIME'},\n {'description': 'EVENT',\n 'id': '6b1af7b2-e09c-48fe-b990-25c9dfb167b1',\n 'name': 'EVENT'},\n {'description': 'FAC',\n 'id': '439c5a16-4ede-4e25-ab98-f0c875d93186',\n 'name': 'FAC'},\n {'description': 'PERCENT',\n 'id': 'a69b086c-252e-47d4-9fb8-cf423e7bb7bf',\n 'name': 'PERCENT'},\n {'description': 'TITLE',\n 'id': 'f64b8cad-7890-4099-a64f-f77dfb865615',\n 'name': 'TITLE'},\n {'description': 'DURATION',\n 'id': 'ea110c2d-bbaa-436d-9a9d-a26e56482c00',\n 'name': 'DURATION'},\n {'description': 'SET',\n 'id': 'fa882030-99be-4fc6-931d-41e3e4708373',\n 'name': 'SET'},\n {'description': 'URL',\n 'id': 'fe90f7fc-3f25-4eb7-b0c0-ef44b004ff4f',\n 'name': 'URL'},\n {'description': 'CRIMINAL_CHARGE',\n 'id': '03b29930-0d6c-4258-9532-2ad462d67be6',\n 'name': 'CRIMINAL_CHARGE'},\n {'description': 'HANDLE',\n 'id': 'b620f40a-3264-4c7e-8cfe-8e11f286c96d',\n 'name': 'HANDLE'},\n {'description': 'CAUSE_OF_DEATH',\n 'id': 'e1874c02-823a-404c-8c4b-e80fce97415f',\n 'name': 'CAUSE_OF_DEATH'},\n {'description': 'LATITUDE_LONGITUDE',\n 'id': 'dd1129d0-439d-451b-8f06-2b03e39d7ed7',\n 'name': 'LATITUDE_LONGITUDE'}],\n 'url': 'https://demo.lighttag.io/api/v2/models/4fcd67ab-7a61-4258-8b93-43ee6be0527d/'}\n"}],"source":"\nmodel = response.json()\npprint(model)"},{"cell_type":"markdown","execution_count":null,"metadata":{},"outputs":[],"source":"## 3. Create your suggestions \nThis step is mostly independent of LightTag. You need to create a list of objects that represent your suggestions. Each object should have \n\n1. **example_id** The LightTag provided id of the example you are suggesting on\n2. 
**tag** OR **tag_id** The name of the tag (PERSON) or the LightTag id of the Tag\n3. **start** The start offset of the text you are suggesting on \n4. **end** The end offset of the text you are suggesting on\n\n#### Example\nBelow, we'll create a few suggestions using [spaCy](https://spacy.io/), a popular NLP library with built-in NER.\n"},{"cell_type":"code","execution_count":28,"metadata":{},"outputs":[],"source":"import spacy\nnlp = spacy.load(\"en_core_web_sm\")\n\ndef process_example_with_spacy_and_return_lighttag_suggestions(example:dict):\n    '''\n    Example function that shows how to create suggestions for LightTag using spaCy.\n    The way you get suggestions will vary, but the output format is the same: a list of dicts\n    with the properties start, end, tag and example_id\n\n    This function takes an Example (as returned from LightTag's API) as input.\n    The example contains the id and content fields as well as metadata you may have uploaded with it\n    '''\n    content = example['content']\n    doc = nlp(content) # Run spaCy on the content of the example\n    suggestions = [] # Empty list to store results\n    for entity in doc.ents: # spaCy exposes the ents property, which has the named entities it found\n        start = entity.start_char\n        end = entity.end_char\n        text = content[start:end] # Not required, but useful to see what your model is outputting\n        suggestion = {\n            'example_id':example['id'], # The LightTag example_id\n            'start':start, # The start offset of the span\n            'end':end, # The end offset of the span\n            'tag':entity.label_, # The name of the tag being applied to the span\n            'text':text # Not required, but useful to see what your model is outputting\n\n        }\n\n        suggestions.append(suggestion)\n    return suggestions\n"},{"cell_type":"code","execution_count":29,"metadata":{},"outputs":[{"name":"stdout","output_type":"stream","text":"[{'end': 45,\n 'example_id': '410d83bb-93e0-4285-9277-92a10f87b1be',\n 'start': 26,\n 'tag': 'ORG',\n 'text': 'Marketing Order Nos'},\n {'end': 50,\n 
'example_id': '410d83bb-93e0-4285-9277-92a10f87b1be',\n 'start': 47,\n 'tag': 'CARDINAL',\n 'text': '916'},\n {'end': 58,\n 'example_id': '410d83bb-93e0-4285-9277-92a10f87b1be',\n 'start': 55,\n 'tag': 'CARDINAL',\n 'text': '917'}]\n"}],"source":"suggestions = process_example_with_spacy_and_return_lighttag_suggestions(examples[2])\npprint(suggestions[:3])"},{"cell_type":"markdown","execution_count":null,"metadata":{},"outputs":[],"source":"## 4. Upload Your Suggestions\nNow that we have suggestions, we upload them to the model. \nWhen you registered the model, LightTag responded with the Model object, which includes its URL. We'll post our suggestions there."},{"cell_type":"code","execution_count":32,"metadata":{},"outputs":[{"data":{"text/plain":""},"execution_count":32,"metadata":{},"output_type":"execute_result"}],"source":"session.post(model['url']+'suggestions/',json=suggestions)"},{"cell_type":"markdown","execution_count":null,"metadata":{},"outputs":[],"source":"If all went well, you'll get back a 201 response. Let's **get** those suggestions back from the model.\n"},{"cell_type":"code","execution_count":34,"metadata":{},"outputs":[{"name":"stdout","output_type":"stream","text":"[{'end': 45,\n 'example_id': '410d83bb-93e0-4285-9277-92a10f87b1be',\n 'id': '54e9e9b9-16f1-4301-9fbf-2daf1b6f80be',\n 'start': 26,\n 'tag': 'ORG',\n 'tag_id': 'e19d2092-db18-40b0-a77b-25f2981376ed',\n 'value': ' Marketing Order Nos'}]\n"}],"source":"resp = session.get(model['url']+'suggestions/')\npprint(resp.json()[:1])"},{"cell_type":"markdown","execution_count":null,"metadata":{},"outputs":[],"source":"## 5. Attach Your Suggestions to a Task\nIn order to display your suggestions to your annotators, you need to explicitly tell LightTag to do so. \nThis is done by attaching the model to your task, which you can do either through the API or the UI. We'll show both ways here.\n\n\n#### 5.1 Attaching a model to an existing task in the UI\n\n1. 
Go to the tasks section in the LightTag UI\n ![taskSection](./img/tasksSection.png)\n\n2. Find the task and make sure it uses the same schema and dataset that your model operates on \n ![taskNoModel](./img/taskNoModel.png)\n \n3. Open the Models section and select the model you just created\n ![taskWithModel](./img/taskWithModel.png)"},{"cell_type":"markdown","execution_count":null,"metadata":{},"outputs":[],"source":"#### 5.2 Attaching a model to an existing task with the API\n"},{"cell_type":"code","execution_count":11,"metadata":{},"outputs":[{"name":"stdout","output_type":"stream","text":"{'active': True,\n 'allow_suggestions': True,\n 'annotators_per_example': 3,\n 'archived': False,\n 'async_status': 'done',\n 'complete_tasks': 0,\n 'complete_tasksets': 0,\n 'created_at': '2019-10-17T09:07:19.794526Z',\n 'dataset_id': '2a6e0c06-017d-473c-b3d5-bbb24c28b71c',\n 'guidelines': None,\n 'id': '71b9280d-c132-4cda-a847-97878e1a4b46',\n 'name': 'My Suggestions Task',\n 'priority': 1,\n 'progress': 0.0,\n 'project_id': '2753ca38-69d9-4c96-9d31-df6d4069b027',\n 'relationSchema_id': None,\n 'remaining_tasks': 885,\n 'schema_id': 'e7e7de79-1623-4803-9aac-7e664f12117a',\n 'slug': 'my-suggestions-task',\n 'status': 'active',\n 'suggestion_models': ['9c26b461-c2ab-40ab-a2dc-cc454abe7d39'],\n 'teams': ['c0d457f3-2609-4d1c-946b-7ab1dc72796a'],\n 'total_tasks': 885,\n 'total_tasksets': 295,\n 'url': 'https://demo.lighttag.io/api/v1/projects/default/task_definitions/my-suggestions-task/'}\n"}],"source":"# Retrieve the tasks from the LightTag API\nallTasks = session.get('v1/projects/default/task_definitions').json()\npprint(allTasks[0])\n\n"},{"cell_type":"code","execution_count":14,"metadata":{},"outputs":[],"source":"# Find your task by its name\nmy_task_name = \"My Suggestions Task\"\nmy_task = next(filter(lambda task: task['name'] == my_task_name,allTasks))"},{"cell_type":"code","execution_count":15,"metadata":{},"outputs":[{"data":{"text/plain":"{'schema': 
'e7e7de79-1623-4803-9aac-7e664f12117a',\n 'name': 'My Newest LightTag Suggestions Model',\n 'metadata': {'anything': 'here'},\n 'tags': [{'id': '43ce3951-7ea1-423a-869f-ab07848d7f55',\n 'name': 'PERSON',\n 'description': ''},\n {'id': 'e332528b-dcb6-4414-9d14-fa61b12ab127',\n 'name': 'LOCATION',\n 'description': ''},\n {'id': '6e323f1d-1aa0-4e09-b64f-1ec0cd6dbed8',\n 'name': 'ORGANIZATION',\n 'description': ''},\n {'id': '978dfe2d-fdc3-4ca7-80d9-2baa0472f818',\n 'name': 'TITLE',\n 'description': ''}],\n 'url': 'https://demo.lighttag.io/api/unstable/models/40f990db-b375-4292-8415-5621613b088f/'}"},"execution_count":15,"metadata":{},"output_type":"execute_result"}],"source":"\nmodel"},{"cell_type":"code","execution_count":20,"metadata":{},"outputs":[{"data":{"text/plain":""},"execution_count":20,"metadata":{},"output_type":"execute_result"}],"source":"my_task_models = my_task['suggestion_models']\nmy_task_models.append(model['id']) # Add the id of the model you created to this task\n\nsession.put(my_task['url'],json={'models':my_task_models})"},{"cell_type":"markdown","execution_count":21,"metadata":{},"outputs":[],"source":"#### 5.3 Attaching a model when creating a task in the UI\nSometimes, you'll have predefined the model before starting a task. In that case you can attach the model to your task at creation time. \n\n1. Go to the New Task dialog\n ![newTask](./img/newTask.png)\n \n2. Select the Schema your model belongs to \n ![taskWithSchema](./img/taskWithSchema.png)\n3. In the advanced tab, select your model\n ![taskWithModel](./img/taskWithSugModel.png)\n"}],"nbformat":4,"nbformat_minor":2,"metadata":{"language_info":{"name":"python","codemirror_mode":{"name":"ipython","version":3}},"orig_nbformat":2,"file_extension":".py","mimetype":"text/x-python","name":"python","npconvert_exporter":"python","pygments_lexer":"ipython3","version":3}}