Extractors

Question Extractor

This module provides a function to extract question tasks (single and multiple) from panoptes annotations.

panoptes_aggregation.extractors.question_extractor.question_extractor(classification, **kwargs)

Extract annotations from a question task into a Counter object

Parameters:: classification (dict) – A dictionary containing an annotations key that is a list of panoptes annotations
Returns:: extraction – A dictionary (formatted like a counter) indicating what annotations were made
Return type:: dict

Examples

>>> classification_multiple = {'annotations': [
    {
        'value': ['Blue', 'Green']
    }
]}
>>> question_extractor(classification_multiple)
{'blue': 1, 'green': 1}

>>> classification_single = {'annotations': [
    {'value': 'Yes'}
]}
>>> question_extractor(classification_single)
{'yes': 1}

panoptes_aggregation.extractors.question_extractor.slugify_or_null(s): Slugify value while casting null as a string first
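
The counting behaviour shown in the examples above can be sketched without the library. This is a minimal illustration, not the package's implementation: sketch_slugify_or_null and sketch_question_extractor are hypothetical names, the regex-based slug is a simplified stand-in for a real slugify routine, and looping over every annotation is an assumption.

```python
import re
from collections import Counter


def sketch_slugify_or_null(value):
    """Hypothetical stand-in: cast to str first (so None becomes 'None'), then slugify."""
    text = str(value).lower()
    # Collapse every run of non-alphanumeric characters into a single hyphen.
    return re.sub(r'[^a-z0-9]+', '-', text).strip('-')


def sketch_question_extractor(classification):
    """Count each slugified answer; single answers are treated as one-item lists."""
    counts = Counter()
    for annotation in classification['annotations']:
        value = annotation.get('value')
        answers = value if isinstance(value, list) else [value]
        for answer in answers:
            counts[sketch_slugify_or_null(answer)] += 1
    return dict(counts)


multiple = sketch_question_extractor({'annotations': [{'value': ['Blue', 'Green']}]})
single = sketch_question_extractor({'annotations': [{'value': 'Yes'}]})
```

Casting to a string before slugifying is what lets a null answer be counted under its own key instead of raising an error.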

Slider Extractor

This module provides a function to extract slider tasks from panoptes annotations.

panoptes_aggregation.extractors.slider_extractor.slider_extractor(classification, **kwargs)

Extract annotations from a slider task

Parameters:: classification (dict) – A dictionary containing an annotations key that is a list of panoptes annotations
Returns:: extraction – A dictionary indicating what annotation was made
Return type:: dict

Point Extractor

This module provides a function to extract drawn points from panoptes annotations.

panoptes_aggregation.extractors.point_extractor.point_extractor(classification, **kwargs)

Extract annotations from a point drawing tool into lists. This extractor does not support multi-frame subjects or subtask extraction. If either is needed, use panoptes_aggregation.extractors.point_extractor_by_frame.

Parameters:: classification (dict) – A dictionary containing an annotations key that is a list of panoptes annotations
Returns:: extraction – A dictionary with two keys, x and y, each containing a list of x and y positions for each marked point
Return type:: dict

Examples

>>> classification = {'annotations': [
    {
        'task': 'T0',
        'value': [{'tool': 0, 'x': 5, 'y': 10}]
    }
]}
>>> point_extractor(classification)
{'T0_tool0_x': [5], 'T0_tool0_y': [10]}
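
The key naming in the output above (task, tool index, coordinate) can be reproduced with a short sketch. sketch_point_extractor is a hypothetical name, and the code only mirrors the documented example; it is not the library's own code.

```python
from collections import defaultdict


def sketch_point_extractor(classification):
    """Collect x and y values into lists keyed by '<task>_tool<n>_x' / '_y'."""
    extraction = defaultdict(list)
    for annotation in classification['annotations']:
        task = annotation['task']
        for point in annotation['value']:
            prefix = '{0}_tool{1}'.format(task, point['tool'])
            extraction[prefix + '_x'].append(point['x'])
            extraction[prefix + '_y'].append(point['y'])
    return dict(extraction)


points = sketch_point_extractor({'annotations': [
    {'task': 'T0', 'value': [{'tool': 0, 'x': 5, 'y': 10}, {'tool': 0, 'x': 7, 'y': 12}]}
]})
```

Each additional point drawn with the same tool extends the same two lists, so the x and y lists always have equal length.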

Point Extractor By Frame

This module provides a function to extract drawn points from panoptes annotations.

panoptes_aggregation.extractors.point_extractor_by_frame.point_extractor_by_frame(classification, **kwargs)

Extract annotations from a point drawing tool into lists.

Parameters:: classification (dict) – A dictionary containing an annotations key that is a list of panoptes annotations
Returns:: extraction – A dictionary with one key per frame. Each frame has two keys, x and y, each containing a list of x and y positions for each marked point
Return type:: dict

Examples

>>> classification = {'annotations': [
    {
        'task': 'T0',
        'value': [{'tool': 0, 'x': 5, 'y': 10, 'frame': 0}],
    }
]}
>>> point_extractor_by_frame(classification)
{'frame0': {'T0_tool0_x': [5], 'T0_tool0_y': [10]}}
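
Grouping by frame adds one level of nesting to the flat output of the point extractor. The sketch below assumes a missing frame value means frame 0; that default and the name sketch_point_extractor_by_frame are illustrative assumptions.

```python
def sketch_point_extractor_by_frame(classification):
    """Nest the '<task>_tool<n>_x' / '_y' lists under a 'frame<k>' key."""
    extraction = {}
    for annotation in classification['annotations']:
        task = annotation['task']
        for point in annotation['value']:
            # Assumption: points without a 'frame' entry belong to frame 0.
            frame_key = 'frame{0}'.format(point.get('frame', 0))
            frame = extraction.setdefault(frame_key, {})
            prefix = '{0}_tool{1}'.format(task, point['tool'])
            frame.setdefault(prefix + '_x', []).append(point['x'])
            frame.setdefault(prefix + '_y', []).append(point['y'])
    return extraction


by_frame = sketch_point_extractor_by_frame({'annotations': [
    {'task': 'T0', 'value': [
        {'tool': 0, 'x': 5, 'y': 10, 'frame': 0},
        {'tool': 0, 'x': 8, 'y': 3, 'frame': 1},
    ]}
]})
```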

Rectangle Extractor

This module provides a function to extract drawn rectangles from panoptes annotations.

panoptes_aggregation.extractors.rectangle_extractor.rectangle_extractor(classification, **kwargs)

Extract rectangle data from annotation

Parameters:: classification (dict) – A dictionary containing an annotations key that is a list of panoptes annotations
Returns:: extraction – A dictionary containing one key per frame. Each frame contains the x, y, width, and height values for each tool used in the annotation. These are lists that contain one value for each rectangle drawn for each tool.
Return type:: dict
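
As an illustration of the structure described above, the sketch below builds one list per rectangle parameter, per tool, per frame. The key names follow the '<task>_tool<n>_<field>' pattern shown for the point extractor, which is an assumption here, as is the default of frame 0; sketch_rectangle_extractor is a hypothetical name.

```python
def sketch_rectangle_extractor(classification):
    """Gather x, y, width and height of every rectangle, grouped by frame and tool."""
    fields = ('x', 'y', 'width', 'height')
    extraction = {}
    for annotation in classification['annotations']:
        task = annotation['task']
        for rectangle in annotation['value']:
            # Assumption: rectangles without a 'frame' entry belong to frame 0.
            frame_key = 'frame{0}'.format(rectangle.get('frame', 0))
            frame = extraction.setdefault(frame_key, {})
            prefix = '{0}_tool{1}'.format(task, rectangle['tool'])
            for field in fields:
                key = '{0}_{1}'.format(prefix, field)
                frame.setdefault(key, []).append(rectangle[field])
    return extraction


rectangles = sketch_rectangle_extractor({'annotations': [
    {'task': 'T0', 'value': [
        {'tool': 0, 'frame': 0, 'x': 1, 'y': 2, 'width': 30, 'height': 40},
        {'tool': 0, 'frame': 0, 'x': 5, 'y': 6, 'width': 70, 'height': 80},
    ]}
]})
```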

Shape Extractor

This module provides a function to extract drawn shapes from panoptes annotations.

panoptes_aggregation.extractors.shape_extractor.shape_extractor(classification, **kwargs)

Extract shape data from annotations

Parameters:

classification (dict) – A dictionary containing an annotations key that is a list of panoptes annotations
shape (str, keyword, required) – A string indicating what shape the annotation contains. This should be the name of one of the pre-defined shape tools.

Returns:

extraction – A dictionary containing one key per frame. Each frame contains the shape defining values for each tool used in the annotation. These are lists that contain one value for each shape drawn for each tool.

Return type:

dict

Bezier Tool Extractor

This module provides a function to extract Bezier drawn classifications from panoptes annotations.

panoptes_aggregation.extractors.bezier_extractor.bezier_extractor(classification, **kwargs)

Extract Bezier data from annotation.

See the Bezier wiki for more info about Bezier curves.

The output extraction is full xy curves, based on the individual Bezier curves from input control points stitched together into a single continuous curve. The individual Bezier curves have 10 points constructed from the 3 control points of the quadratic Bezier curve.

Parameters:: classification (dict) – A dictionary containing an annotations key that is a list of panoptes annotations. The x and y data of the classifications needs to be of the format 'points': {'x': x, 'y': y}. It is assumed the input xy data is a continuous set of triplets (the last value of a triplet is the starting value of the next triplet) corresponding to the quadratic Bezier control points.
Returns:: extraction – A dictionary containing one key per frame. Each frame contains lists pathX and pathY. These are lists of lists, where each inner list of pathX is the x values, and each inner list of pathY is the y values, for a particular Bezier drawing.
Return type:: dict
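
The curve construction described above can be sketched as follows: each overlapping triplet of control points is sampled at 10 values of t using the standard quadratic Bezier formula B(t) = (1 - t)^2 P0 + 2(1 - t)t P1 + t^2 P2, and the segments are joined end to end. The function names are hypothetical, and dropping the duplicated join point between segments is an assumption rather than confirmed library behaviour.

```python
def quadratic_bezier(p0, p1, p2, n_points=10):
    """Sample n_points along the quadratic Bezier curve with control points p0, p1, p2."""
    curve = []
    for i in range(n_points):
        t = i / (n_points - 1)
        x = (1 - t) ** 2 * p0[0] + 2 * (1 - t) * t * p1[0] + t ** 2 * p2[0]
        y = (1 - t) ** 2 * p0[1] + 2 * (1 - t) * t * p1[1] + t ** 2 * p2[1]
        curve.append((x, y))
    return curve


def stitch_bezier_path(xs, ys, n_points=10):
    """Join the curves of overlapping triplets (0,1,2), (2,3,4), ... into one path."""
    points = list(zip(xs, ys))
    path_x, path_y = [], []
    for start in range(0, len(points) - 2, 2):
        segment = quadratic_bezier(points[start], points[start + 1], points[start + 2], n_points)
        if path_x:
            # Assumption: skip the first sample, which repeats the previous end point.
            segment = segment[1:]
        for x, y in segment:
            path_x.append(x)
            path_y.append(y)
    return path_x, path_y


path_x, path_y = stitch_bezier_path([0, 1, 2, 3, 4], [0, 2, 0, -2, 0])
```

With two triplets and the shared point dropped once, the stitched path holds 19 samples and starts and ends exactly on the first and last control points.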

Polygon/Freehand Tool Extractor

This module provides a function to extract polygon and freehand drawn classifications from panoptes annotations.

panoptes_aggregation.extractors.polygon_extractor.polygon_extractor(classification, gold_standard=False, **kwargs)

Extract polygon/freehand data from annotation

Parameters:: classification (dict) – A dictionary containing an annotations key that is a list of panoptes annotations. The x and y data of the classifications needs to be in one of the following formats: 'pathX': x and 'pathY': y, or 'points': {'x': x, 'y': y}.
Returns:: extraction – A dictionary containing one key per frame. Each frame contains lists pathX and pathY. These are lists of lists, where each inner list of pathX is the x values, and each inner list of pathY is the y values, for a particular polygon/freehand drawing. The dictionary also indicates whether the data is gold standard.
Return type:: dict

Survey Extractor

This module provides a function to extract choices and sub-questions from panoptes survey tasks.

panoptes_aggregation.extractors.survey_extractor.survey_extractor(classification, **kwargs)

Extract annotations from a survey task into a list

Parameters:: classification (dict) – A dictionary containing an annotations key that is a list of panoptes annotations
Returns:: extraction – A list of dicts each with choice and answers as keys. Each choice made in an annotation is extracted to a different element of the list.
Return type:: list

Examples

>>> classification = {'annotations': [
        {'value':
            [{'choice': 'AGOUTI', 'answers': {'HOWMANY': '1'}}]
        }
    ]}
>>> survey_extractor(classification)
[{'choice': 'agouti', 'answers_howmany': {'1': 1}}]
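
The flattening shown in the example (one list element per choice, one answers_<question> counter per sub-question) can be sketched as below. sketch_slug and sketch_survey_extractor are hypothetical names, and the regex slug is a simplified stand-in for whatever slugify routine the package uses.

```python
import re
from collections import Counter


def sketch_slug(value):
    """Simplified, hypothetical slugify: lowercase and hyphenate non-alphanumerics."""
    return re.sub(r'[^a-z0-9]+', '-', str(value).lower()).strip('-')


def sketch_survey_extractor(classification):
    """Return one dict per choice, with a counter of answers for each sub-question."""
    extraction = []
    for annotation in classification['annotations']:
        for selection in annotation['value']:
            row = {'choice': sketch_slug(selection['choice'])}
            for question, answer in selection.get('answers', {}).items():
                # A sub-question may hold one answer or a list of answers.
                answers = answer if isinstance(answer, list) else [answer]
                key = 'answers_' + sketch_slug(question)
                row[key] = dict(Counter(sketch_slug(a) for a in answers))
            extraction.append(row)
    return extraction


survey = sketch_survey_extractor({'annotations': [
    {'value': [{'choice': 'AGOUTI', 'answers': {'HOWMANY': '1'}}]}
]})
```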

Polygon As Line Tool for Text Extractor

This module provides a function to extract panoptes annotations from projects using a polygon tool to mark words in a transcribed document and provide the transcribed text as a sub-task.

panoptes_aggregation.extractors.poly_line_text_extractor.poly_line_text_extractor(classification, dot_freq='line', gold_standard=False, **kwargs)

Extract annotations from a polygon tool with a text sub-task

Parameters:: classification (dict) – A dictionary containing an annotations key that is a list of panoptes annotations
Returns:: extraction – A dictionary with one key for each frame. The value for each frame is a dict with text, a list-of-lists of transcribed words; points, a dict with the list-of-lists of x and y positions of each space between words; slope, a list of the slopes (in deg) of each line drawn; and gold_standard, a bool indicating whether the annotation was made in gold standard mode in the classifier. For points and text there is one inner list for each annotation made on the frame.
Return type:: dict

Examples

>>> classification = {'annotations': [{
    'value': [
        {
            'frame': 0,
            'points': [
                {'x': 756, 'y': 197},
                {'x': 856, 'y': 197}
            ],
            'details': [
                {'value': '[unclear]Cipher[/unclear]'}
            ],
        },
        {
            'frame': 0,
            'points': [
                {'x': 756, 'y': 97},
                {'x': 856, 'y': 97},
                {'x': 956, 'y': 97}
            ],
            'details': [
                {'value': 'A word'}
            ],
        }
    ]
}]}
>>> poly_line_text_extractor(classification)
{'frame0': {
    'points': {'x': [[756, 856], [756, 856, 956]], 'y': [[197, 197], [97, 97, 97]]},
    'text': [['[unclear]Cipher[/unclear]'], ['A', 'word']],
    'slope': [0, 0],
    'gold_standard': False
}}
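
To make the per-line values in the output above concrete, the sketch below derives them for a single drawn line: the x and y lists, the whitespace-split words, and a slope in degrees. Computing the slope from the first and last points with atan2 is an assumption made for illustration (the library may fit a line instead), and sketch_line_summary is a hypothetical name.

```python
import math


def sketch_line_summary(points, text):
    """Summarise one polygon-as-line annotation: coordinates, words and slope (deg)."""
    xs = [p['x'] for p in points]
    ys = [p['y'] for p in points]
    # Assumption: slope is the angle of the segment joining the first and last points.
    slope_deg = math.degrees(math.atan2(ys[-1] - ys[0], xs[-1] - xs[0]))
    return {'x': xs, 'y': ys, 'text': text.split(), 'slope': slope_deg}


summary = sketch_line_summary(
    [{'x': 756, 'y': 97}, {'x': 856, 'y': 97}, {'x': 956, 'y': 97}],
    'A word',
)
```

For the horizontal line in the example the slope is 0, and the two words correspond to the single gap marked by the middle point.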

Line Tool for Text Extractor

This module provides a function to extract panoptes annotations from projects using a line tool to mark lines of text in a transcribed document and provide the text as a sub-task.

panoptes_aggregation.extractors.line_text_extractor.line_text_extractor(classification, gold_standard=False, **kwargs)

Extract annotations from a line tool with a text sub-task

Parameters:: classification (dict) – A dictionary containing an annotations key that is a list of panoptes annotations
Returns:: extraction – A dictionary with one key for each frame. The value for each frame is a dict with text, a list-of-lists of transcribed lines; points, a dict with the list-of-lists of x and y positions of each line; and slope, a list of the slopes (in deg) of each line drawn. For points and text there is one inner list for each annotation made on the frame.
Return type:: dict

Shakespeares World Text Extractor

This module provides a function to extract the text data from annotations made on Shakespeares World and AnnoTate.

panoptes_aggregation.extractors.sw_extractor.clean_text(s)

Clean text from Shakespeares World and AnnoTate classification to prepare it for aggregation. Unicode characters, xml, and html are removed.

Parameters:: s (string) – A string to be cleaned
Returns:: clean_s – The string with all unicode, xml, and html removed
Return type:: string

panoptes_aggregation.extractors.sw_extractor.sw_extractor(classification, gold_standard=False, **kwargs)

Extract text annotations from Shakespeares World and AnnoTate.

Parameters:: classification (dict) – A dictionary containing an annotations key that is a list of panoptes annotations
Returns:: extraction – A dictionary with one key for each frame. The value for each frame is a dict with text, a list-of-lists of transcribed words; points, a dict with the list-of-lists of x and y positions of each space between words; and slope, a list of the slopes (in deg) of each line drawn. For points and text there is one inner list for each annotation made on the frame.
Return type:: dict

Shakespeares World Variants Extractor

This module provides a function to extract the variants data from annotations made on Shakespeares World.

panoptes_aggregation.extractors.sw_variant_extractor.sw_variant_extractor(classification, **kwargs)

Extract all variants in a classification into one list

Parameters:: classification (dict) – A dictionary containing an annotations key that is a list of Shakespeares World annotations
Returns:: extraction – A dictionary with at most one key, variants, holding the list of all variants in the classification
Return type:: dict

Shakespeares World Graphic Extractor

This module provides a function to extract the graphic data from annotations made on Shakespeares World and AnnoTate.

panoptes_aggregation.extractors.sw_graphic_extractor.sw_graphic_extractor(classification, **kwargs)

Extract all graphics data from a classification

Parameters:: classification (dict) – A dictionary containing an annotations key that is a list of Shakespeares World or AnnoTate annotations
Returns:: extraction – A dictionary containing one key per frame. Each frame contains the x, y, width, and height values for each tool used in the annotation. These are lists that contain one value for each rectangle drawn for each tool.
Return type:: dict

Dropdown Extractor

panoptes_aggregation.extractors.dropdown_extractor.dropdown_extractor(classification, **kwargs)

Extract annotations from a dropdown task into a Counter object

Parameters:: classification (dict) – A dictionary containing an annotations key that is a list of panoptes annotations
Returns:: extraction – A dictionary with a value key that is a list of Counter dictionaries, one entry for each dropdown list in the task
Return type:: dict

Text Extractor

This module provides a function to extract text tasks from panoptes annotations

panoptes_aggregation.extractors.text_extractor.text_extractor(classification, gold_standard=False, **kwargs)

Extract annotations from a text task as a string.

Parameters:: classification (dict) – A dictionary containing an annotations key that is a list of panoptes annotations
Returns:: extraction – A dictionary with two keys:

* text: the string entered for the task
* gold_standard: a bool indicating whether the classification was made in gold standard mode
Return type:: dict
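
A minimal sketch of the two-key result described above. It assumes the text is the value of the first annotation and that gold_standard is simply passed through from the keyword argument; sketch_text_extractor is a hypothetical name, not the library function.

```python
def sketch_text_extractor(classification, gold_standard=False):
    """Return the entered text together with the gold standard flag."""
    # Assumption: a text task stores its string as the first annotation's value.
    annotation = classification['annotations'][0]
    return {
        'text': annotation['value'],
        'gold_standard': gold_standard,
    }


text_result = sketch_text_extractor({'annotations': [{'value': 'some text'}]})
gold_result = sketch_text_extractor({'annotations': [{'value': 'key'}]}, gold_standard=True)
```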

Intro2Astro Extractor

This module provides a function that converts the pixel annotation to wavelength and uses the subject metadata to precalculate values required by students to use Hubble’s Law to compute galactic velocity.

panoptes_aggregation.extractors.i2a_extractor.i2a_extractor(classification, **kwargs)

Extract annotations from an intro2astro annotation and return calculated values, together with values extracted from the subject metadata, that students need in order to use Hubble's Law to compute galactic velocity.

Parameters:: classification (dict) – A dictionary containing an annotations key that is a list of panoptes annotations. There should only be one, and the first is the only one considered.
Returns:: extraction – A dictionary with a set of keys, including those that were computed by this function and those that were extracted from the subject metadata.
Return type:: dict

Examples

>>> classification = {
    "annotations": [
        {
            "task": "T0",
            "value": [
                {
                    "width": 80.36077880859375,
                    "tool": 0,
                    "0": 0,
                    "details": [],
                    "x": 541.7737426757812,
                    "frame": 0
                }
            ]
        }
    ],
    "metadata": {
        "subject_dimensions": [
            {
                "clientWidth": 444,
                "clientHeight": 333,
                "naturalWidth": 1152,
                "naturalHeight": 864
            }
        ]
    },
    "subject": {
        "metadata": {
            "RA": "121.62522",
            "Dec": "17.42804",
            "URL": "http://skyserver.sdss.org/dr12/en/tools/explore/Summary.aspx?ra=121.62522&dec=17.42804",
            "spiral": "0",
            "elliptical": "1",
            "Distance_Mpc": "481.4064706",
            "SVG_filename": "1237665128518320259.svg",
            "#Published_Redshift": "0.1091188"
        }
    }
}
>>> i2a_extractor(classification)
{
    "galaxy_id": "1237665128518320259",
    "url": "http://skyserver.sdss.org/dr12/en/tools/explore/Summary.aspx?ra=121.62522&dec=17.42804",
    "RA": "121.62522",
    "dec": "17.42804",
    "dist": 481.40647058823527,
    "redshift": 0.1146063992421806,
    "velocity": 34381.91977265418,
    "lambdacen": 438.4527192698966
}
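
The redshift and velocity in the example output are numerically consistent with z = (lambda_obs - lambda_rest) / lambda_rest and v = c * z, taking a rest wavelength of 393.37 nm and c = 3 x 10^5 km/s. Both constants are inferred from the numbers in the example, not taken from the library source, so the sketch below is only a hedged illustration of that arithmetic; the function and constant names are hypothetical, and the pixel-to-wavelength conversion is not reproduced.

```python
# Assumption: rest wavelength (nm) inferred from the example output.
REST_WAVELENGTH_NM = 393.37
# Assumption: rounded speed of light (km/s) inferred from the example output.
SPEED_OF_LIGHT_KM_S = 3.0e5


def redshift_from_wavelength(lambda_obs, lambda_rest=REST_WAVELENGTH_NM):
    """Fractional shift of the observed wavelength relative to the rest wavelength."""
    return (lambda_obs - lambda_rest) / lambda_rest


def velocity_from_redshift(z, c=SPEED_OF_LIGHT_KM_S):
    """Recession velocity in km/s using the low-redshift approximation v = c * z."""
    return c * z


z = redshift_from_wavelength(438.4527192698966)
v = velocity_from_redshift(z)
```

With the lambdacen value from the example, z and v agree with the documented redshift and velocity to within floating-point rounding.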

Nfn Extractor

This module provides functions to answer certain questions about a Notes from Nature annotation for use in their Field Book.

class panoptes_aggregation.extractors.nfn_extractor.ClassificationParser(classification, kwargs): A classification parser

All Tasks Empty Extractor

This module provides a function to determine whether all task values in a classification are empty.

panoptes_aggregation.extractors.all_tasks_empty_extractor.all_tasks_empty_extractor(classification, **kwargs)

Determine whether all task values in a classification are empty.

Parameters:: classification (dict) – A dictionary containing an annotations key that is a list of panoptes annotations
Returns:: extraction – extraction["result"] is True if all task values are None, False otherwise.
Return type:: dict
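
The check described above reduces to a single all() over the annotation values. The sketch follows the documented rule literally (only None counts as empty); whether the library also treats empty lists or strings as empty is not stated here, and sketch_all_tasks_empty is a hypothetical name.

```python
def sketch_all_tasks_empty(classification):
    """Return {'result': True} when every annotation value is None."""
    values = [annotation.get('value') for annotation in classification['annotations']]
    # Note: all() over an empty list is True, so no annotations also yields True.
    return {'result': all(value is None for value in values)}


empty = sketch_all_tasks_empty({'annotations': [
    {'task': 'T0', 'value': None},
    {'task': 'T1', 'value': None},
]})
not_empty = sketch_all_tasks_empty({'annotations': [
    {'task': 'T0', 'value': None},
    {'task': 'T1', 'value': 'Yes'},
]})
```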
