Extractors

Question Extractor

This module provides a function to extract question tasks (single and multiple) from panoptes annotations.

panoptes_aggregation.extractors.question_extractor.question_extractor(classification, **kwargs)

Extract annotations from a question task into a Counter object

Parameters:classification (dict) – A dictionary containing an annotations key that is a list of panoptes annotations
Returns:extraction – A dictionary (formatted like a Counter) indicating what annotations were made
Return type:dict

Examples

>>> classification_multiple = {'annotations': [
    {
        'value': ['Blue', 'Green']
    }
]}
>>> question_extractor(classification_multiple)
{'blue': 1, 'green': 1}
>>> classification_single = {'annotations': [
    {'value': 'Yes'}
]}
>>> question_extractor(classification_single)
{'yes': 1}

panoptes_aggregation.extractors.question_extractor.slugify_or_null(s)

Slugify a value, casting null to a string first
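
The counting behaviour described above can be illustrated with a small standalone sketch. This is not the library's implementation: count_answers and simple_slug are hypothetical names, and simple_slug is only a rough standard-library stand-in for whatever slugifier the package actually uses.

```python
import re
from collections import Counter


def simple_slug(value):
    """Hypothetical stand-in slugifier: cast to str, lowercase, hyphenate."""
    text = str(value).lower()
    return re.sub(r'[^a-z0-9]+', '-', text).strip('-')


def count_answers(classification):
    """Count each slugified answer found in the first annotation."""
    counts = Counter()
    annotations = classification.get('annotations', [])
    if annotations:
        value = annotations[0]['value']
        # A multiple-answer task stores a list; a single-answer task stores a scalar.
        answers = value if isinstance(value, list) else [value]
        for answer in answers:
            counts[simple_slug(answer)] += 1
    return dict(counts)


print(count_answers({'annotations': [{'value': ['Blue', 'Green']}]}))
print(count_answers({'annotations': [{'value': 'Yes'}]}))
# Casting to str first means a null answer is counted rather than raising.
print(count_answers({'annotations': [{'value': None}]}))
```

In this sketch a null answer ends up under the key 'none', which shows why the cast to a string happens before slugifying.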


Slider Extractor

This module provides a function to extract slider tasks from panoptes annotations.

panoptes_aggregation.extractors.slider_extractor.slider_extractor(classification, **kwargs)

Extract annotations from a slider task

Parameters:classification (dict) – A dictionary containing an annotations key that is a list of panoptes annotations
Returns:extraction – A dictionary indicating what annotation was made
Return type:dict

Point Extractor

This module provides a function to extract drawn points from panoptes annotations.

panoptes_aggregation.extractors.point_extractor.point_extractor(classification, **kwargs)

Extract annotations from a point drawing tool into lists. This extractor does not support multi-frame subjects or subtask extraction. If either is needed, use panoptes_aggregation.extractors.point_extractor_by_frame.

Parameters:classification (dict) – A dictionary containing an annotations key that is a list of panoptes annotations
Returns:extraction – A dictionary with two keys, x and y, each containing a list of x and y positions for each marked point
Return type:dict

Examples

>>> classification = {'annotations': [
    {
        'task': 'T0',
        'value': [{'tool': 0, 'x': 5, 'y': 10}]
    }
]}
>>> point_extractor(classification)
{'T0_tool0_x': [5], 'T0_tool0_y': [10]}
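
The key layout in the example output can be reproduced with a short standalone sketch. It assumes each point carries tool, x, and y keys as in the example; collect_points is a hypothetical name, not part of the package.

```python
from collections import defaultdict


def collect_points(classification):
    """Group point coordinates into lists keyed by task and tool."""
    extraction = defaultdict(list)
    for annotation in classification['annotations']:
        task = annotation['task']
        for point in annotation['value']:
            # Key prefix mirrors the documented output, e.g. 'T0_tool0'.
            prefix = f"{task}_tool{point['tool']}"
            extraction[f'{prefix}_x'].append(point['x'])
            extraction[f'{prefix}_y'].append(point['y'])
    return dict(extraction)


classification = {'annotations': [
    {
        'task': 'T0',
        'value': [
            {'tool': 0, 'x': 5, 'y': 10},
            {'tool': 1, 'x': 7, 'y': 3},
        ]
    }
]}
print(collect_points(classification))
```

Each tool gets its own pair of lists, so points drawn with different tools stay separate.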

Point Extractor By Frame

This module provides a function to extract drawn points from panoptes annotations.

panoptes_aggregation.extractors.point_extractor_by_frame.point_extractor_by_frame(classification, **kwargs)

Extract annotations from a point drawing tool into lists.

Parameters:classification (dict) – A dictionary containing an annotations key that is a list of panoptes annotations
Returns:extraction – A dictionary with one key per frame. Each frame has two keys, x and y, each containing a list of x and y positions for each marked point
Return type:dict

Examples

>>> classification = {'annotations': [
    {
        'task': 'T0',
        'value': [{'tool': 0, 'x': 5, 'y': 10, 'frame': 0}],
    }
]}
>>> point_extractor_by_frame(classification)
{'frame0': {'T0_tool0_x': [5], 'T0_tool0_y': [10]}}
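
The per-frame nesting can be sketched the same way. This standalone illustration assumes each point carries a frame key as in the example; group_points_by_frame is a hypothetical name, and the real extractor may handle missing frames or subtasks differently.

```python
def group_points_by_frame(classification):
    """Nest point coordinates under one 'frameN' key per frame."""
    extraction = {}
    for annotation in classification['annotations']:
        task = annotation['task']
        for point in annotation['value']:
            # Create the frame bucket on first use, then reuse it.
            frame = extraction.setdefault(f"frame{point['frame']}", {})
            prefix = f"{task}_tool{point['tool']}"
            frame.setdefault(f'{prefix}_x', []).append(point['x'])
            frame.setdefault(f'{prefix}_y', []).append(point['y'])
    return extraction


classification = {'annotations': [
    {
        'task': 'T0',
        'value': [
            {'tool': 0, 'x': 5, 'y': 10, 'frame': 0},
            {'tool': 0, 'x': 8, 'y': 2, 'frame': 1},
        ],
    }
]}
print(group_points_by_frame(classification))
```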

Rectangle Extractor

This module provides a function to extract drawn rectangles from panoptes annotations.

panoptes_aggregation.extractors.rectangle_extractor.rectangle_extractor(classification, **kwargs)

Extract rectangle data from annotations

Parameters:classification (dict) – A dictionary containing an annotations key that is a list of panoptes annotations
Returns:extraction – A dictionary containing one key per frame. Each frame contains the x, y, width, and height values for each tool used in the annotation. These are lists that contain one value for each rectangle drawn for each tool.
Return type:dict

Shape Extractor

This module provides a function to extract drawn shapes from panoptes annotations.

panoptes_aggregation.extractors.shape_extractor.shape_extractor(classification, **kwargs)

Extract shape data from annotations

Parameters:
  • classification (dict) – A dictionary containing an annotations key that is a list of panoptes annotations
  • shape (str, keyword, required) – A string indicating what shape the annotation contains. This should be the name of one of the pre-defined shape tools.
Returns:extraction – A dictionary containing one key per frame. Each frame contains the shape-defining values for each tool used in the annotation. These are lists that contain one value for each shape drawn for each tool.
Return type:dict


Survey Extractor

This module provides a function to extract choices and sub-questions from panoptes survey tasks.

panoptes_aggregation.extractors.survey_extractor.survey_extractor(classification, **kwargs)

Extract annotations from a survey task into a list

Parameters:classification (dict) – A dictionary containing an annotations key that is a list of panoptes annotations
Returns:extraction – A list of dicts, each with choice and answers as keys. Each choice made in an annotation is extracted to a different element of the list.
Return type:list

Examples

>>> classification = {'annotations': [
        {'value':
            [{'choice': 'AGOUTI', 'answers': {'HOWMANY': '1'}}]
        }
    ]}
>>> survey_extractor(classification)
[{'choice': 'agouti', 'answers_howmany': {'1': 1}}]
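
The flattening shown in the example can be sketched without the library. This is an illustration only: flatten_survey is a hypothetical name, and lowercasing is used here as a simplified stand-in for slugifying choices, questions, and answers.

```python
from collections import Counter


def flatten_survey(classification):
    """Turn each survey choice into one dict with counted sub-answers."""
    extraction = []
    for annotation in classification['annotations']:
        for choice in annotation['value']:
            row = {'choice': choice['choice'].lower()}
            for question, answer in choice.get('answers', {}).items():
                # A sub-question may hold one answer or a list of answers.
                answers = answer if isinstance(answer, list) else [answer]
                counted = Counter(str(a).lower() for a in answers)
                row[f'answers_{question.lower()}'] = dict(counted)
            extraction.append(row)
    return extraction


classification = {'annotations': [
    {'value': [
        {'choice': 'AGOUTI', 'answers': {'HOWMANY': '1'}},
        {'choice': 'PECCARY', 'answers': {'HOWMANY': '2'}},
    ]}
]}
print(flatten_survey(classification))
```

Two choices in one annotation produce two list elements, matching the "one element per choice" rule stated in the Returns description.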

Polygon As Line Tool for Text Extractor

This module provides a function to extract panoptes annotations from projects using a polygon tool to mark words in a transcribed document and provide the transcribed text as a sub-task.

panoptes_aggregation.extractors.poly_line_text_extractor.poly_line_text_extractor(classification, dot_freq='word', **kwargs)

Extract annotations from a polygon tool with a text sub-task

Parameters:classification (dict) – A dictionary containing an annotations key that is a list of panoptes annotations
Returns:extraction – A dictionary with one key for each frame. The value for each frame is a dict with text, a list-of-lists of transcribed words; points, a dict with the list-of-lists of x and y positions of each space between words; and slope, a list of the slopes (in deg) of each line drawn. For points and text there is one inner list for each annotation made on the frame.
Return type:dict

Examples

>>> classification = {'annotations': [{
    'value': [
        {
            'frame': 0,
            'points': [
                {'x': 756, 'y': 197},
                {'x': 856, 'y': 197}
            ],
            'details': [
                {'value': '[unclear]Cipher[/unclear]'}
            ],
        },
        {
            'frame': 0,
            'points': [
                {'x': 756, 'y': 97},
                {'x': 856, 'y': 97},
                {'x': 956, 'y': 97}
            ],
            'details': [
                {'value': 'A word'}
            ],
        }
    ]
}]}
>>> poly_line_text_extractor(classification)
{'frame0': {
    'points': {'x': [[756, 856], [756, 856, 956]], 'y': [[197, 197], [97, 97, 97]]},
    'text': [['[unclear]Cipher[/unclear]'], ['A', 'word']],
    'slope': [0, 0]
}}
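
The structure of this output can be approximated with a standalone sketch. Two points are assumptions rather than documented behaviour: the text is split on whitespace, and the slope is taken as the angle of the segment joining the first and last points of each line. The function name extract_marked_lines is hypothetical, and the dot_freq option is not modelled.

```python
import math


def extract_marked_lines(classification):
    """Collect points, words, and a slope (in degrees) for each marked line."""
    extraction = {}
    for annotation in classification['annotations']:
        for line in annotation['value']:
            frame = extraction.setdefault(
                f"frame{line['frame']}",
                {'points': {'x': [], 'y': []}, 'text': [], 'slope': []},
            )
            xs = [p['x'] for p in line['points']]
            ys = [p['y'] for p in line['points']]
            frame['points']['x'].append(xs)
            frame['points']['y'].append(ys)
            # Assumption: words are separated by whitespace.
            frame['text'].append(line['details'][0]['value'].split())
            # Assumption: slope is the angle between the first and last points.
            angle = math.degrees(math.atan2(ys[-1] - ys[0], xs[-1] - xs[0]))
            frame['slope'].append(angle)
    return extraction


classification = {'annotations': [{
    'value': [
        {
            'frame': 0,
            'points': [{'x': 756, 'y': 97}, {'x': 856, 'y': 97}, {'x': 956, 'y': 97}],
            'details': [{'value': 'A word'}],
        }
    ]
}]}
print(extract_marked_lines(classification))
```

A horizontal line yields a slope of zero here, consistent with the example output above; the library may compute slope differently for lines that are not straight.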

Line Tool for Text Extractor

This module provides a function to extract panoptes annotations from projects using a line tool to mark lines of text in a transcribed document and provide the text as a sub-task.

panoptes_aggregation.extractors.line_text_extractor.line_text_extractor(classification, **kwargs)

Extract annotations from a line tool with a text sub-task

Parameters:classification (dict) – A dictionary containing an annotations key that is a list of panoptes annotations
Returns:extraction – A dictionary with one key for each frame. The value for each frame is a dict with text, a list-of-lists of transcribed lines; points, a dict with the list-of-lists of x and y positions of each line; and slope, a list of the slopes (in deg) of each line drawn. For points and text there is one inner list for each annotation made on the frame.
Return type:dict

Shakespeares World Text Extractor

This module provides a function to extract the text data from annotations made on Shakespeares World and AnnoTate.

panoptes_aggregation.extractors.sw_extractor.clean_text(s)

Clean text from Shakespeares World and AnnoTate classification to prepare it for aggregation. Unicode characters, xml, and html are removed.

Parameters:s (string) – A string to be cleaned
Returns:clean_s – The string with all unicode, xml, and html removed
Return type:string
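
The kind of cleaning described for clean_text can be sketched with the standard library. This is not the package's implementation: strip_markup is a hypothetical name, "unicode characters" is interpreted here as non-ASCII characters, and the tag pattern is a simple approximation.

```python
import re


def strip_markup(s):
    """Remove xml/html-style tags and non-ASCII characters from a string."""
    # Drop anything that looks like <tag> or </tag>.
    no_tags = re.sub(r'<[^>]+>', '', s)
    # Assumption: 'unicode characters' means anything outside ASCII.
    ascii_only = no_tags.encode('ascii', 'ignore').decode('ascii')
    return ascii_only.strip()


print(strip_markup('I <sw-ex>Wm</sw-ex> Shakespeare\u2019s'))
```

The text inside a tag pair is kept while the tags themselves and the curly apostrophe are dropped.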

panoptes_aggregation.extractors.sw_extractor.sw_extractor(classification, **kwargs)

Extract text annotations from Shakespeares World and AnnoTate.

Parameters:classification (dict) – A dictionary containing an annotations key that is a list of panoptes annotations
Returns:extraction – A dictionary with one key for each frame. The value for each frame is a dict with text, a list-of-lists of transcribed words; points, a dict with the list-of-lists of x and y positions of each space between words; and slope, a list of the slopes (in deg) of each line drawn. For points and text there is one inner list for each annotation made on the frame.
Return type:dict

Shakespeares World Variants Extractor

This module provides a function to extract the variants data from annotations made on Shakespeares World.

panoptes_aggregation.extractors.sw_variant_extractor.sw_variant_extractor(classification, **kwargs)

Extract all variants in a classification into one list

Parameters:classification (dict) – A dictionary containing an annotations key that is a list of Shakespeares World annotations
Returns:extraction – A dictionary with at most one key, variants, holding the list of all variants in the classification
Return type:dict

Shakespeares World Graphic Extractor

This module provides a function to extract the graphic data from annotations made on Shakespeares World and AnnoTate.

panoptes_aggregation.extractors.sw_graphic_extractor.sw_graphic_extractor(classification, **kwargs)

Extract all graphics data from a classification

Parameters:classification (dict) – A dictionary containing an annotations key that is a list of Shakespeares World or AnnoTate annotations
Returns:extraction – A dictionary containing one key per frame. Each frame contains the x, y, width, and height values for each tool used in the annotation. These are lists that contain one value for each rectangle drawn for each tool.
Return type:dict