# Contributing
## Code Style
PEP8 style is followed in most cases, and `flake8` is used for linting; run `flake8` in the source directory to check the code. The `setup.cfg` file contains the linter's configuration and lists the error codes and files that are ignored.
## Building Documentation
Documentation is built automatically using Sphinx, so add docstrings to any files created and functions written. The documentation can be compiled with the `make_docs.sh` bash script.
## Writing Extractors
Extractors are used to take classifications coming out of Panoptes and extract the relevant data needed to calculate an aggregated answer for one task on a subject. Ideally this extraction should be as flat as possible (i.e. no deeply nested dictionaries), but sometimes this cannot be avoided.
### 1. Make a new function for the extractor
- Create a new file for the function in the `extractors` folder.
- Define a new function `*_extractor` that takes in the raw classification json (as it appears in the classification dump `csv` from Panoptes) and returns a `dict`-like object of the extracted data (a sketch follows this list).
- Use the `@extractor_wrapper` decorator on the function (can be imported with `from .extractor_wrapper import extractor_wrapper`).
- Use the `@subtask_wrapper` and `@tool_wrapper` decorators if the function is for a drawing tool (can be imported with `from .extractor_wrapper import subtask_extractor_wrapper`).
- Write tests for the extractor in the `tests/extractor_tests` folder. The `ExtractorTest` class from the `tests/extractor_tests/base_test_class.py` file should be used to create the test function. This class ensures that both the "offline" and "online" versions of the code are tested and produce the expected results. See the other tests in that folder for examples of how to use the `ExtractorTest` class.
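A minimal sketch of what such an extractor could look like (the function name and annotation handling here are illustrative, not an existing extractor in the codebase):

```python
from collections import Counter
from .extractor_wrapper import extractor_wrapper


@extractor_wrapper
def my_question_extractor(classification, **kwargs):
    '''Count the answer(s) given for a question task.'''
    answers = Counter()
    for annotation in classification['annotations']:
        value = annotation['value']
        # multiple-answer tasks store a list, single-answer tasks a string
        if isinstance(value, list):
            answers.update(value)
        else:
            answers[value] += 1
    # keep the extract flat: one key per answer
    return dict(answers)
```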
#### The `@extractor_wrapper` decorator
This decorator removes the boilerplate code needed to make an extractor function work with both the classification dump csv files (offline) and API requests from caesar (online). If a request is passed into the function, it will pull the data out as json and pass it into the extractor; if anything else is passed in, the function will be called directly. This decorator also does the following:

- filters the classifications using the `task` and `tools` keywords passed into the extractor
- adds the aggregation version number to the final extract
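In offline mode this means the extractor can be called directly on one classification (values below are illustrative):

```python
# one classification as it might appear in the dump csv
classification = {'annotations': [{'task': 'T0', 'value': 'yes'}]}

# the wrapper filters annotations by `task` before calling the extractor
extract = my_question_extractor(classification, task='T0')
# the wrapper also adds the version number to the result, e.g.
# {'yes': 1, 'aggregation_version': '<current version>'}
```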
#### The `@subtask_extractor_wrapper` decorator
This decorator removes the boilerplate code that goes along with extracting subtask data from drawing tasks. It looks for the `details` keyword passed into the extractor function, applies the specified extractor to the proper subtask data, and returns the extracts as a list in the same order the subtasks were presented.

Note: It is assumed that the first level of the extracted dictionary refers to the subject's frame index (e.g. `frame0` or `frame1`), even when the subject only has one frame.
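For example, a point-tool extract might be shaped like this (keys and values are illustrative, not the exact output of any one extractor):

```python
extract = {
    'frame0': {
        'T0_tool0_x': [102.4, 315.0],
        'T0_tool0_y': [86.1, 210.7],
        # one subtask extract per drawn mark, in presentation order
        'T0_tool0_details': [[{'yes': 1}], [{'no': 1}]]
    }
}
```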
#### The `@tool_wrapper` decorator
This decorator removes the boilerplate code for filtering classifications based on the `tools` keyword. This allows each tool of a drawing task to have its extractors set up independently.
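A hypothetical call restricted to a single tool might look like (`my_point_extractor` is an illustrative name, not a function in the codebase):

```python
# only marks made with tool 0 are kept; marks from other tools are filtered out
extract = my_point_extractor(classification, task='T0', tools=[0])
```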
### 2. Create the route to the extractor
The routes are automatically constructed using the `extractors` dictionary in the `__init__.py` file:
- Import the new extractor into the `__init__.py` file with the following format: `from .*_extractor import *_extractor`.
- Add the `*_extractor` function to the `extractors` dictionary with a sensible route name as the `key` (typically the `key` should be the same as the extractor name); see the sketch below.
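Continuing the hypothetical extractor from step 1, the entries would look something like:

```python
from .my_question_extractor import my_question_extractor

extractors = {
    # route name -> extractor function
    'my_question_extractor': my_question_extractor,
}
```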
### 3. Allow the offline version of the code to automatically detect this extractor type from a workflow object
- Update the `workflow_config.py` function with the new task type. The value used for the type should be the same `key` used in the `__init__.py` file (a hedged sketch follows this list).
- Update the `tests/utility_tests/test_workflow_config.py` test with this new task type.
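The exact structure of `workflow_config.py` may differ from this, but conceptually the update maps a workflow task type onto the route key chosen above:

```python
# hypothetical sketch -- follow the existing structure of workflow_config.py
if task['type'] == 'single':
    config = 'my_question_extractor'  # same key as in extractors/__init__.py
```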
### 4. Add to documentation
The code is auto-documented using Sphinx.

- Add a docstring to every function written and a "heading" docstring at the top of any new files created (follow the numpy docstring convention).
- Add a reference to the new file to `docs/source/extractors.rst`.
- Add the extractor to the extractor/reducer lookup table in `docs/source/Task_lookup_table.rst`.
- Build the docs with the `make_docs.sh` bash script.
### 5. Make sure everything still works
- Run `coverage run` and ensure all tests still pass.
- (optional) Run `coverage report` to check the test coverage in each file.
## Writing Reducers
Reducers are functions that take a list of extracts and combine them into aggregated values. Ideally this reduction should be as flat as possible (i.e. no deeply nested dictionaries), but sometimes this cannot be avoided.
### 1. Make new functions for the reducer
Typically two functions need to be defined for a reducer (a sketch follows this list):

- `process_data` is a helper function that takes a list of raw extracted data objects and pre-processes them into a form the main reducer function can use (e.g. arranging the data into arrays, creating `Counter` objects, etc.).
- The `*_reducer` function takes in the output of the `process_data` function and returns the reduced data as a `dict`-like object or a list of `dict`-like objects (sometimes needed to avoid deeply nested dictionaries).
- The `*_reducer` function should use the `@reducer_wrapper` decorator with the `process_data` function passed as the `process_data` keyword.
- If the reducer exposes keywords the user can specify, a `DEFAULTS` dictionary must be defined of the form `DEFAULTS = {'<keyword name>': {'default': <default value>, 'type': <data type>}}`.
- If these keywords are passed into the `process_data` function, the `DEFAULTS` dictionary should be passed into the `@reducer_wrapper` as the `defaults_process` keyword. If these keywords are passed into the main `*_reducer` function, the `DEFAULTS` dictionary should be passed into the `@reducer_wrapper` as the `defaults_data` keyword. Note: any combination of these two can be used.
- Write tests for all the above functions and place them in the `tests/reducer_tests` folder. The decorator exposes the original function on the `._original` attribute of the decorated function, allowing it to be tested directly. The `ReducerTest` class from the `tests/reducer_tests/base_test_class.py` file should be used to create the test function. This class ensures that both the "offline" and "online" versions of the code are tested and produce the expected results. See the other tests in that folder for examples of how to use the `ReducerTest` class.
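A minimal sketch of a matching reducer (the function names and the `pairs` keyword are illustrative, not an existing reducer in the codebase):

```python
from collections import Counter
from .reducer_wrapper import reducer_wrapper

DEFAULTS = {'pairs': {'default': False, 'type': bool}}


def process_data(data, pairs=False):
    '''Turn each raw extract into a Counter; if `pairs` is True, treat
    each extract's full set of answers as a single joined key.'''
    if pairs:
        return [Counter(['+'.join(sorted(d))]) for d in data]
    return [Counter(d) for d in data]


@reducer_wrapper(process_data=process_data, defaults_process=DEFAULTS)
def my_question_reducer(data, **kwargs):
    '''Add up the vote counts across all extracts.'''
    return dict(sum(data, Counter()))
```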
#### The `@reducer_wrapper` decorator
This decorator removes the boilerplate needed to set up a reducer function to work with extractions from either a csv file (offline) or an API request from caesar (online). It will also run the optional `process_data` function and pass the results into the wrapped function. Various user-defined keywords are also passed into either the `process_data` function or the wrapped function. All keywords are parsed and type-checked before being used, so no invalid keywords are passed into either function. This wrapper will also do the following:

- Remove the `aggregation_version` keyword from each extract so it is not passed into the reducer function
- Add the `aggregation_version` keyword to the final reduction dictionary
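For example, using the reducer sketch above (version strings illustrative):

```python
extracts = [
    {'yes': 1, 'aggregation_version': '4.0.0'},
    {'yes': 1, 'aggregation_version': '4.0.0'},
    {'no': 1, 'aggregation_version': '4.0.0'},
]
# `aggregation_version` is stripped before reduction and re-added after
reduction = my_question_reducer(extracts)
# -> {'yes': 2, 'no': 1, 'aggregation_version': '<current version>'}
```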
#### The `@subtask_reducer_wrapper` decorator
This decorator removes the boilerplate code that goes along with reducing subtask data from drawing tasks. It looks for the `details` keyword passed into the reducer function, applies the specified reducer to the proper subtask data within each cluster found on the subject, and returns the reductions as a list in the same order the subtasks were presented.

Note: It is assumed that the first level of the reduced dictionary refers to the subject's frame index (e.g. `frame0` or `frame1`), even when the subject only has one frame.
### 2. Create the route to the reducer
The routes are automatically constructed using the `reducers` dictionary in the `__init__.py` file:
- Import the new reducer into the `__init__.py` file with the following format: `from .*_reducer import *_reducer`.
- Add the `*_reducer` function to the `reducers` dictionary with a sensible route name as the `key` (typically the `key` should be the same as the reducer name); see the sketch below.
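Continuing the hypothetical reducer from step 1:

```python
from .my_question_reducer import my_question_reducer

reducers = {
    # route name -> reducer function
    'my_question_reducer': my_question_reducer,
}
```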
### 3. Add to documentation
The code is auto-documented using Sphinx.

- Add a docstring to every function written and a "heading" docstring at the top of any new files created (follow the numpy docstring convention).
- Add a reference to the new file to `docs/source/reducers.rst`.
- Add the reducer to the extractor/reducer lookup table in `docs/source/Task_lookup_table.rst`.
- Build the docs with the `make_docs.sh` bash script.
### 4. Make sure everything still works
- Run `coverage run` and ensure all tests still pass.
- (optional) Run `coverage report` to check what parts of the code are not covered.
## Copying extractors and reducers
Sometimes it is useful to have two extractor/reducer routes point to the same underlying function (e.g. question and shortcut tasks) so that separate csv files are created in offline mode. Unfortunately, if you place the same function multiple times in the `extractors/__init__.py` or `reducers/__init__.py` dictionaries, flask will crash, since two routes would point to functions with the same name. To help with this, `panoptes_aggregation.copy_function.copy_function` can be used to clone any function with a new name:
```python
from panoptes_aggregation.copy_function import copy_function

new_function = copy_function(old_function, 'new_name')
```
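For example, a shortcut route in `extractors/__init__.py` could reuse the question extractor under a new name (a sketch of the pattern, not the exact file contents):

```python
from panoptes_aggregation.copy_function import copy_function
from .question_extractor import question_extractor

# clone so flask sees a distinct function name for the second route
shortcut_extractor = copy_function(question_extractor, 'shortcut_extractor')

extractors = {
    'question_extractor': question_extractor,
    'shortcut_extractor': shortcut_extractor,
}
```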