Extractors

Question Extractor

This module provides a function to extract question tasks (single and multiple) from panoptes annotations.

panoptes_aggregation.extractors.question_extractor.question_extractor(classification, **kwargs)

Extract annotations from a question task into a Counter object

Parameters

classification (dict) – A dictionary containing an annotations key that is a list of panoptes annotations

Returns

extraction – A dictionary (formatted like a Counter) indicating what annotations were made

Return type

dict

Examples

>>> classification_multiple = {'annotations': [
    {
        'value': ['Blue', 'Green']
    }
]}
>>> question_extractor(classification_multiple)
{'blue': 1, 'green': 1}
>>> classification_single = {'annotations': [
    {'value': 'Yes'}
]}
>>> question_extractor(classification_single)
{'yes': 1}

panoptes_aggregation.extractors.question_extractor.slugify_or_null(s)

Slugify value while casting null as a string first
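The counting behaviour shown in the examples above can be sketched without the library. This is only an illustration under stated assumptions, not the package's implementation: simple_slug is a hypothetical stand-in for whatever slugify routine the package uses, sketch_slugify_or_null and sketch_question_extractor are names invented here, and the sketch counts answers across every annotation it is given.

```python
import re
from collections import Counter


def simple_slug(text):
    """Hypothetical stand-in for a slugify routine: lowercase, hyphen-separated."""
    return re.sub(r'[^a-z0-9]+', '-', str(text).lower()).strip('-')


def sketch_slugify_or_null(value):
    """Cast a null (None) answer to the string 'None'; slugify anything else."""
    if value is None:
        return str(value)
    return simple_slug(value)


def sketch_question_extractor(classification):
    """Count each slugified answer in the annotations of a question task."""
    counts = Counter()
    for annotation in classification['annotations']:
        value = annotation['value']
        # Multiple-answer tasks hold a list; single-answer tasks hold one value.
        answers = value if isinstance(value, list) else [value]
        for answer in answers:
            counts[sketch_slugify_or_null(answer)] += 1
    return dict(counts)


print(sketch_question_extractor({'annotations': [{'value': ['Blue', 'Green']}]}))
# {'blue': 1, 'green': 1}
print(sketch_question_extractor({'annotations': [{'value': None}]}))
# {'None': 1}
```

Casting None to a string first keeps an unanswered question as its own countable key instead of raising an error inside the slug routine.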


Slider Extractor

This module provides a function to extract slider tasks from panoptes annotations.

panoptes_aggregation.extractors.slider_extractor.slider_extractor(classification, **kwargs)

Extract annotations from a slider task

Parameters

classification (dict) – A dictionary containing an annotations key that is a list of panoptes annotations

Returns

extraction – A dictionary indicating what annotation was made

Return type

dict


Point Extractor

This module provides a function to extract drawn points from panoptes annotations.

panoptes_aggregation.extractors.point_extractor.point_extractor(classification, **kwargs)

Extract annotations from a point drawing tool into lists. This extractor does not support extraction from multi-frame subjects or subtask extraction. If either of these is needed, use panoptes_aggregation.extractors.point_extractor_by_frame.

Parameters

classification (dict) – A dictionary containing an annotations key that is a list of panoptes annotations

Returns

extraction – A dictionary with an x key and a y key for each tool (for example T0_tool0_x and T0_tool0_y), each containing a list of the x or y positions of the marked points

Return type

dict

Examples

>>> classification = {'annotations': [
    {
        'task': 'T0',
        'value': [{'tool': 0, 'x': 5, 'y': 10}]
    }
]}
>>> point_extractor(classification)
{'T0_tool0_x': [5], 'T0_tool0_y': [10]}
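The key layout in the example above can be reproduced with a short sketch. It is not the library's code: sketch_point_extractor is a hypothetical name, and skipping marks that lack coordinates is an assumption made here.

```python
def sketch_point_extractor(classification):
    """Collect x and y positions into lists keyed by task and tool index."""
    extraction = {}
    for annotation in classification['annotations']:
        task = annotation['task']
        for mark in annotation['value']:
            # Assumption of this sketch: marks without coordinates are skipped.
            if 'x' not in mark or 'y' not in mark:
                continue
            prefix = '{0}_tool{1}'.format(task, mark['tool'])
            extraction.setdefault(prefix + '_x', []).append(mark['x'])
            extraction.setdefault(prefix + '_y', []).append(mark['y'])
    return extraction


classification = {'annotations': [
    {'task': 'T0', 'value': [{'tool': 0, 'x': 5, 'y': 10},
                             {'tool': 0, 'x': 7, 'y': 12}]}
]}
print(sketch_point_extractor(classification))
# {'T0_tool0_x': [5, 7], 'T0_tool0_y': [10, 12]}
```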

Point Extractor By Frame

This module provides a function to extract drawn points from panoptes annotations.

panoptes_aggregation.extractors.point_extractor_by_frame.point_extractor_by_frame(classification, **kwargs)

Extract annotations from a point drawing tool into lists.

Parameters

classification (dict) – A dictionary containing an annotations key that is a list of panoptes annotations

Returns

extraction – A dictionary with one key per frame. Each frame has two keys, x and y, each containing a list of the x or y positions of the marked points

Return type

dict

Examples

>>> classification = {'annotations': [
    {
        'task': 'T0',
        'value': [{'tool': 0, 'x': 5, 'y': 10, 'frame': 0}],
    }
]}
>>> point_extractor_by_frame(classification)
{'frame0': {'T0_tool0_x': [5], 'T0_tool0_y': [10]}}
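Grouping by frame adds one level of nesting to the same idea. The sketch below is illustrative only: sketch_point_extractor_by_frame is a hypothetical name, and treating a mark with no frame entry as belonging to frame 0 is an assumption, not documented behaviour.

```python
def sketch_point_extractor_by_frame(classification):
    """Collect x and y positions into lists, nested under one key per frame."""
    extraction = {}
    for annotation in classification['annotations']:
        task = annotation['task']
        for mark in annotation['value']:
            # Assumption: a mark with no 'frame' entry belongs to frame 0.
            frame = 'frame{0}'.format(mark.get('frame', 0))
            prefix = '{0}_tool{1}'.format(task, mark['tool'])
            frame_data = extraction.setdefault(frame, {})
            frame_data.setdefault(prefix + '_x', []).append(mark['x'])
            frame_data.setdefault(prefix + '_y', []).append(mark['y'])
    return extraction


classification = {'annotations': [
    {'task': 'T0', 'value': [{'tool': 0, 'x': 5, 'y': 10, 'frame': 0},
                             {'tool': 0, 'x': 1, 'y': 2, 'frame': 1}]}
]}
print(sketch_point_extractor_by_frame(classification))
# {'frame0': {'T0_tool0_x': [5], 'T0_tool0_y': [10]},
#  'frame1': {'T0_tool0_x': [1], 'T0_tool0_y': [2]}}
```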

Rectangle Extractor

This module provides a function to extract drawn rectangles from panoptes annotations.

panoptes_aggregation.extractors.rectangle_extractor.rectangle_extractor(classification, **kwargs)

Extract rectangle data from annotations

Parameters

classification (dict) – A dictionary containing an annotations key that is a list of panoptes annotations

Returns

extraction – A dictionary containing one key per frame. Each frame contains the x, y, width, and height values for each tool used in the annotation. These are lists that contain one value for each rectangle drawn for each tool.

Return type

dict


Shape Extractor

This module provides a function to extract drawn shapes from panoptes annotations.

panoptes_aggregation.extractors.shape_extractor.shape_extractor(classification, **kwargs)

Extract shape data from annotations

Parameters
  • classification (dict) – A dictionary containing an annotations key that is a list of panoptes annotations

  • shape (str, keyword, required) – A string indicating what shape the annotation contains. This should be the name of one of the pre-defined shape tools.

Returns

extraction – A dictionary containing one key per frame. Each frame contains the shape defining values for each tool used in the annotation. These are lists that contain one value for each shape drawn for each tool.

Return type

dict
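One way to picture the role of the shape keyword is a lookup from shape name to the values that define it. The sketch below is hypothetical throughout: SKETCH_SHAPE_PARAMS, its three entries, the key naming (borrowed from the point extractor examples), and the frame 0 default are all assumptions for illustration, not the package's table or output format.

```python
# Hypothetical lookup table: the values assumed to define each shape.
SKETCH_SHAPE_PARAMS = {
    'point': ['x', 'y'],
    'circle': ['x', 'y', 'r'],
    'rectangle': ['x', 'y', 'width', 'height'],
}


def sketch_shape_extractor(classification, shape):
    """Collect the defining values of each drawn shape, per frame and tool."""
    if shape not in SKETCH_SHAPE_PARAMS:
        raise KeyError('unknown shape: {0}'.format(shape))
    params = SKETCH_SHAPE_PARAMS[shape]
    extraction = {}
    for annotation in classification['annotations']:
        task = annotation['task']
        for mark in annotation['value']:
            # Assumption: marks missing any defining value are ignored.
            if not all(param in mark for param in params):
                continue
            frame = 'frame{0}'.format(mark.get('frame', 0))
            frame_data = extraction.setdefault(frame, {})
            for param in params:
                key = '{0}_tool{1}_{2}'.format(task, mark['tool'], param)
                frame_data.setdefault(key, []).append(mark[param])
    return extraction


classification = {'annotations': [
    {'task': 'T0', 'value': [{'tool': 1, 'frame': 0, 'x': 3, 'y': 4, 'r': 2}]}
]}
print(sketch_shape_extractor(classification, shape='circle'))
# {'frame0': {'T0_tool1_x': [3], 'T0_tool1_y': [4], 'T0_tool1_r': [2]}}
```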


Survey Extractor

This module provides a function to extract choices and sub-questions from panoptes survey tasks.

panoptes_aggregation.extractors.survey_extractor.survey_extractor(classification, **kwargs)

Extract annotations from a survey task into a list

Parameters

classification (dict) – A dictionary containing an annotations key that is a list of panoptes annotations

Returns

extraction – A list of dicts, each with choice and answers as keys. Each choice made in an annotation is extracted to a different element of the list.

Return type

list

Examples

>>> classification = {'annotations': [
        {'value':
            [{'choice': 'AGOUTI', 'answers': {'HOWMANY': '1'}}]
        }
    ]}
>>> survey_extractor(classification)
[{'choice': 'agouti', 'answers_howmany': {'1': 1}}]
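The shape of the output above (one element per choice, with each sub-question tallied like a Counter) can be sketched as follows. This is an illustration, not the library's implementation: simple_slug is a hypothetical stand-in for the package's slugify routine, and sketch_survey_extractor is a name invented here.

```python
import re
from collections import Counter


def simple_slug(text):
    """Hypothetical stand-in for a slugify routine: lowercase, hyphen-separated."""
    return re.sub(r'[^a-z0-9]+', '-', str(text).lower()).strip('-')


def sketch_survey_extractor(classification):
    """Turn each survey choice into a dict of slugified, tallied answers."""
    extraction = []
    for annotation in classification['annotations']:
        for value in annotation['value']:
            entry = {'choice': simple_slug(value['choice'])}
            for question, answer in value.get('answers', {}).items():
                # A sub-question may hold one answer or a list of answers.
                answers = answer if isinstance(answer, list) else [answer]
                tally = Counter(simple_slug(a) for a in answers)
                entry['answers_' + simple_slug(question)] = dict(tally)
            extraction.append(entry)
    return extraction


classification = {'annotations': [
    {'value': [{'choice': 'AGOUTI', 'answers': {'HOWMANY': '1'}}]}
]}
print(sketch_survey_extractor(classification))
# [{'choice': 'agouti', 'answers_howmany': {'1': 1}}]
```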

Polygon As Line Tool for Text Extractor

This module provides a function to extract panoptes annotations from projects using a polygon tool to mark words in a transcribed document and provide the transcribed text as a sub-task.

panoptes_aggregation.extractors.poly_line_text_extractor.poly_line_text_extractor(classification, dot_freq='line', gold_standard=False, **kwargs)

Extract annotations from a polygon tool with a text sub-task

Parameters

classification (dict) – A dictionary containing an annotations key that is a list of panoptes annotations

Returns

extraction – A dictionary with one key for each frame. The value for each frame is a dict with text, a list-of-lists of transcribed words; points, a dict with the list-of-lists of x and y positions of each space between words; slope, a list of the slopes (in deg) of each line drawn; and gold_standard, a bool indicating whether the annotation was made in gold standard mode in the classifier. For points and text there is one inner list for each annotation made on the frame.

Return type

dict

Examples

>>> classification = {'annotations': [{
    'value': [
        {
            'frame': 0,
            'points': [
                {'x': 756, 'y': 197},
                {'x': 856, 'y': 197}
            ],
            'details': [
                {'value': '[unclear]Cipher[/unclear]'}
            ],
        },
        {
            'frame': 0,
            'points': [
                {'x': 756, 'y': 97},
                {'x': 856, 'y': 97},
                {'x': 956, 'y': 97}
            ],
            'details': [
                {'value': 'A word'}
            ],
        }
    ]
}]}
>>> poly_line_text_extractor(classification)
{'frame0': {
    'points': {'x': [[756, 856], [756, 856, 956]], 'y': [[197, 197], [97, 97, 97]]},
    'text': [['[unclear]Cipher[/unclear]'], ['A', 'word']],
    'slope': [0, 0],
    'gold_standard': False
}}
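The per-frame structure described above can be sketched in a few lines. This is not the library's method: sketch_poly_line_text_extractor is a hypothetical name, splitting the text on whitespace is an assumption, and the slope here is simply the angle from the first point to the last, which may differ from how the package fits a line (the dot_freq option is not modelled at all).

```python
import math


def sketch_poly_line_text_extractor(classification, gold_standard=False):
    """Group polygon points, transcribed words, and line slopes by frame."""
    extraction = {}
    for annotation in classification['annotations']:
        for mark in annotation['value']:
            frame = 'frame{0}'.format(mark.get('frame', 0))
            data = extraction.setdefault(frame, {
                'points': {'x': [], 'y': []},
                'text': [],
                'slope': [],
                'gold_standard': gold_standard,
            })
            xs = [point['x'] for point in mark['points']]
            ys = [point['y'] for point in mark['points']]
            # Assumption: slope is the angle (in degrees) from first to last point.
            slope = math.degrees(math.atan2(ys[-1] - ys[0], xs[-1] - xs[0]))
            data['points']['x'].append(xs)
            data['points']['y'].append(ys)
            # Assumption: words are separated by whitespace in the sub-task text.
            data['text'].append(mark['details'][0]['value'].split())
            data['slope'].append(slope)
    return extraction


example = {'annotations': [{'value': [
    {'frame': 0,
     'points': [{'x': 756, 'y': 97}, {'x': 856, 'y': 97}, {'x': 956, 'y': 97}],
     'details': [{'value': 'A word'}]}
]}]}
print(sketch_poly_line_text_extractor(example)['frame0']['text'])
# [['A', 'word']]
```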

Line Tool for Text Extractor

This module provides a function to extract panoptes annotations from projects using a line tool to mark lines of text in a transcribed document and provide the text as a sub-task.

panoptes_aggregation.extractors.line_text_extractor.line_text_extractor(classification, gold_standard=False, **kwargs)

Extract annotations from a line tool with a text sub-task

Parameters

classification (dict) – A dictionary containing an annotations key that is a list of panoptes annotations

Returns

extraction – A dictionary with one key for each frame. The value for each frame is a dict with text, a list-of-lists of transcribed lines; points, a dict with the list-of-lists of x and y positions of each line; and slope, a list of the slopes (in deg) of each line drawn. For points and text there is one inner list for each annotation made on the frame.

Return type

dict


Shakespeares World Text Extractor

This module provides a function to extract the text data from annotations made on Shakespeares World and AnnoTate.

panoptes_aggregation.extractors.sw_extractor.clean_text(s)

Clean text from Shakespeares World and AnnoTate classifications to prepare it for aggregation. Unicode characters, XML, and HTML are removed.

Parameters

s (string) – A string to be cleaned

Returns

clean_s – The string with all unicode, xml, and html removed

Return type

string
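A rough idea of this kind of cleaning, assuming that "removed" means tags are deleted and non-ASCII characters are dropped. The function name sketch_clean_text, the tag pattern, and the final whitespace strip are assumptions for illustration; the package's actual rules may be more specific.

```python
import re


def sketch_clean_text(s):
    """Remove markup tags and non-ASCII characters from a string."""
    # Drop anything that looks like an XML or HTML tag, such as <b> or </note>.
    without_tags = re.sub(r'<[^>]*>', '', s)
    # Drop every character outside the ASCII range.
    ascii_only = without_tags.encode('ascii', 'ignore').decode('ascii')
    # Assumption: surrounding whitespace is not meaningful.
    return ascii_only.strip()


print(sketch_clean_text('<b>na\u00efve</b> text '))
# nave text
```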

panoptes_aggregation.extractors.sw_extractor.sw_extractor(classification, gold_standard=False, **kwargs)

Extract text annotations from Shakespeares World and AnnoTate.

Parameters

classification (dict) – A dictionary containing an annotations key that is a list of panoptes annotations

Returns

extraction – A dictionary with one key for each frame. The value for each frame is a dict with text, a list-of-lists of transcribed words; points, a dict with the list-of-lists of x and y positions of each space between words; and slope, a list of the slopes (in deg) of each line drawn. For points and text there is one inner list for each annotation made on the frame.

Return type

dict


Shakespeares World Variants Extractor

This module provides a function to extract the variants data from annotations made on Shakespeares World.

panoptes_aggregation.extractors.sw_variant_extractor.sw_variant_extractor(classification, **kwargs)

Extract all variants in a classification into one list

Parameters

classification (dict) – A dictionary containing an annotations key that is a list of Shakespeares World annotations

Returns

extraction – A dictionary with at most one key, variants, holding the list of all variants in the classification

Return type

dict


Shakespeares World Graphic Extractor

This module provides a function to extract the graphic data from annotations made on Shakespeares World and AnnoTate.

panoptes_aggregation.extractors.sw_graphic_extractor.sw_graphic_extractor(classification, **kwargs)

Extract all graphics data from a classification

Parameters

classification (dict) – A dictionary containing an annotations key that is a list of Shakespeares World or AnnoTate annotations

Returns

extraction – A dictionary containing one key per frame. Each frame contains the x, y, width, and height values for each tool used in the annotation. These are lists that contain one value for each rectangle drawn for each tool.

Return type

dict


Dropdown Extractor

panoptes_aggregation.extractors.dropdown_extractor.dropdown_extractor(classification, **kwargs)

Extract annotations from a dropdown task into a Counter object

Parameters

classification (dict) – A dictionary containing an annotations key that is a list of panoptes annotations

Returns

extraction – A dictionary with a value key that holds a list of Counter dictionaries, one entry for each dropdown list in the task

Return type

dict
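The return structure can be pictured with a small sketch. The input layout is an assumption made for illustration (each annotation value is taken to be a list with one dict per dropdown, whose own value entry is the selected option), and sketch_dropdown_extractor is a hypothetical name, so treat this as a guess at the shape rather than the library's behaviour.

```python
from collections import Counter


def sketch_dropdown_extractor(classification):
    """Build one Counter-style dict per dropdown list in the task."""
    counters = []
    for annotation in classification['annotations']:
        # Assumed layout: one dict per dropdown, with the selection under 'value'.
        for dropdown in annotation['value']:
            counters.append(dict(Counter([str(dropdown['value'])])))
    return {'value': counters}


classification = {'annotations': [
    {'value': [{'value': 'option-a'}, {'value': 'option-c'}]}
]}
print(sketch_dropdown_extractor(classification))
# {'value': [{'option-a': 1}, {'option-c': 1}]}
```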


Text Extractor

This module provides a function to extract text tasks from panoptes annotations.

panoptes_aggregation.extractors.text_extractor.text_extractor(classification, gold_standard=False, **kwargs)

Extract annotations from a text task as a string.

Parameters

classification (dict) – A dictionary containing an annotations key that is a list of panoptes annotations

Returns

extraction – A dictionary with two keys:

  • text – the string entered for the task

  • gold_standard – a bool indicating whether the classification was made in gold standard mode

Return type

dict
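A minimal sketch of the two-key result described above. It assumes a text task contributes a single annotation whose value is the entered string; sketch_text_extractor is a hypothetical name and this is not the library's code.

```python
def sketch_text_extractor(classification, gold_standard=False):
    """Return the entered text together with the gold standard flag."""
    extraction = {}
    annotations = classification['annotations']
    if annotations:
        # Assumption: the first annotation's value is the entered string.
        extraction['text'] = annotations[0]['value']
        extraction['gold_standard'] = gold_standard
    return extraction


print(sketch_text_extractor({'annotations': [{'value': 'some text'}]}))
# {'text': 'some text', 'gold_standard': False}
```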


Intro2Astro Extractor

This module provides a function that converts the pixel annotation to wavelength and uses the subject metadata to precalculate values required by students to use Hubble’s Law to compute galactic velocity.

panoptes_aggregation.extractors.i2a_extractor.i2a_extractor(classification, **kwargs)

Extract annotations from an intro2astro annotation and return calculated values, together with values extracted from the subject metadata, that students need to use Hubble’s Law to compute galactic velocity.

Parameters

classification (dict) – A dictionary containing an annotations key that is a list of panoptes annotations. There should only be one, and the first is the only one considered.

Returns

extraction – A dictionary with a set of keys, including those that were computed by this function and those that were extracted from the subject metadata.

Return type

dict

Examples

>>> classification = {
    "annotations": [
        {
            "task": "T0",
            "value": [
                {
                    "width": 80.36077880859375,
                    "tool": 0,
                    "0": 0,
                    "details": [],
                    "x": 541.7737426757812,
                    "frame": 0
                }
            ]
        }
    ],
    "metadata": {
        "subject_dimensions": [
            {
                "clientWidth": 444,
                "clientHeight": 333,
                "naturalWidth": 1152,
                "naturalHeight": 864
            }
        ]
    },
    "subject": {
        "metadata": {
            "RA": "121.62522",
            "Dec": "17.42804",
            "URL": "http://skyserver.sdss.org/dr12/en/tools/explore/Summary.aspx?ra=121.62522&dec=17.42804",
            "spiral": "0",
            "elliptical": "1",
            "Distance_Mpc": "481.4064706",
            "SVG_filename": "1237665128518320259.svg",
            "#Published_Redshift": "0.1091188"
        }
    }
}
>>> i2a_extractor(classification)
    {
        "galaxy_id": "1237665128518320259",
        "url": "http://skyserver.sdss.org/dr12/en/tools/explore/Summary.aspx?ra=121.62522&dec=17.42804",
        "RA": "121.62522",
        "dec": "17.42804",
        "dist": 481.40647058823527,
        "redshift": 0.1146063992421806,
        "velocity": 34381.91977265418,
        "lambdacen": 438.4527192698966
    }
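The last step of the calculation, turning a central wavelength into a redshift and a velocity, can be sketched as below. The two constants are inferred from the example output above (they reproduce its redshift and velocity values) and are not taken from the library source; the function name is hypothetical, and the pixel-to-wavelength conversion is not attempted here.

```python
# Constants inferred from the example output, not from the library source.
REST_WAVELENGTH = 393.37       # assumed rest wavelength of the measured line
SPEED_OF_LIGHT_KM_S = 3.0e5    # assumed rounded speed of light in km/s


def sketch_redshift_and_velocity(lambdacen):
    """Convert an observed central wavelength into redshift and velocity."""
    # Redshift is the fractional shift of the line from its rest wavelength.
    redshift = (lambdacen - REST_WAVELENGTH) / REST_WAVELENGTH
    # For this exercise, velocity is taken as the speed of light times redshift.
    velocity = SPEED_OF_LIGHT_KM_S * redshift
    return redshift, velocity


redshift, velocity = sketch_redshift_and_velocity(438.4527192698966)
print(round(redshift, 4), round(velocity, 1))
# 0.1146 34381.9
```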

Nfn Extractor

This module provides functions to answer certain questions about a Notes from Nature annotation for use in their Field Book.

class panoptes_aggregation.extractors.nfn_extractor.ClassificationParser(classification, kwargs)

A classification parser


All Tasks Empty Extractor

This module provides a function to determine whether all task values are empty.

panoptes_aggregation.extractors.all_tasks_empty_extractor.all_tasks_empty_extractor(classification, **kwargs)

Determine whether all task values in a classification are empty.

Parameters

classification (dict) – A dictionary containing an annotations key that is a list of panoptes annotations

Returns

extraction – extraction["result"] is True if all task values are None, and False otherwise.

Return type

dict
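The check described above can be sketched in two lines. The name sketch_all_tasks_empty_extractor is hypothetical, and this version treats a missing value entry the same as None, which is an assumption. Note that a classification with no annotations yields True here, since there is nothing non-empty to find.

```python
def sketch_all_tasks_empty_extractor(classification):
    """Report whether every task value in the classification is None."""
    # Assumption: an annotation without a 'value' entry counts as empty.
    values = [annotation.get('value') for annotation in classification['annotations']]
    return {'result': all(value is None for value in values)}


print(sketch_all_tasks_empty_extractor(
    {'annotations': [{'task': 'T0', 'value': None}, {'task': 'T1', 'value': None}]}))
# {'result': True}
print(sketch_all_tasks_empty_extractor(
    {'annotations': [{'task': 'T0', 'value': None}, {'task': 'T1', 'value': 'Yes'}]}))
# {'result': False}
```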