Project structure
GLOBALS
DATABASE: SQLite
SAMPLE_SIZE = size of dataframe * 5
encoder = pp.LabelEncoder()
discretizer = pp.KBinsDiscretizer(n_bins=5, encode='ordinal', strategy='quantile')
has_logit=0, use_mixture=0, score_function="K2"
has_logit=1, use_mixture=1, score_function="MI" (different params to avoid isolated vertices)
API
There are 4 main modules and 1 additional module implemented. The additional module contains tests, which we use only in the dev stage. Each module follows this pattern:
Controller. File with query
Service. File with core functions
Models. File with declarations of tables in database
Schema. File with docs.
Other elements for particular route group.
Quick API reference
| Resource | Operation | Description |
|---|---|---|
| AuthToken | Get Token | |
| BN | GET /api/experiment/(string:owner)/(string:name)/(string:dataset)/(bn_params) | Train Bayesian network |
| BNAnalyser | Get difference between 2 networks | |
| BNDownloader | Download bn | |
| BNGetNamesManager | Get list with names of bns | |
| BNManager | Get dict with bns of user and their info | |
| BNRemover | Remove bn | |
| CheckFullness | Check database fullness | |
| DataUploader | Upload dataset | |
| DatasetObserver | Get dataset description | |
| DatasetRemover | Remove dataset | |
| Models | Get available models | |
| Register | Registration | |
| RootNodes | Get root nodes from dataset | |
| Sampler | Get data to display on x- and y-axis and metrics | |
| SignIn | Sign in | |
AuthMod
This module handles communication between the user and the auth system.
Controller
- POST /api/auth/get_token
Authorize user.
- Parameters:
username – user’s name
password – password
- Status Codes:
codes –
200 Success - returns {“token”: token}.
400 Unauthorized - NotFound or incorrect password
- PUT /api/auth/signin
Link token to user.
- Parameters:
username – user’s name
password – password
token – token
- Status Codes:
codes –
200 Success
400 NotFound
- POST /api/auth/signup
User registration.
- Parameters:
username – user’s name
password – password
- Status Codes:
codes –
200 Success - registration successful.
- 400 Bad request
Forbidden name
Empty body
User not found
User already exists
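As a rough illustration of the three auth routes above, the sketch below builds (but does not send) the corresponding requests with the standard library. The base URL and the choice of a JSON body are assumptions, since the docs do not state how parameters are transmitted; `build_auth_request` and the three wrappers are hypothetical helper names.

```python
import json
import urllib.request

# Assumption: host and port are not given in the docs.
BASE_URL = "http://localhost:5000"


def build_auth_request(endpoint, method, payload, base_url=BASE_URL):
    """Build (but do not send) a request for one of the /api/auth routes."""
    # Assumption: parameters travel as a JSON body.
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/auth/{endpoint}",
        data=body,
        method=method,
        headers={"Content-Type": "application/json"},
    )


def signup_request(username, password):
    # POST /api/auth/signup registers a new user.
    return build_auth_request(
        "signup", "POST", {"username": username, "password": password}
    )


def get_token_request(username, password):
    # POST /api/auth/get_token returns {"token": token} on success.
    return build_auth_request(
        "get_token", "POST", {"username": username, "password": password}
    )


def signin_request(username, password, token):
    # PUT /api/auth/signin links the token to the user.
    return build_auth_request(
        "signin", "PUT",
        {"username": username, "password": password, "token": token},
    )
```

Each returned object could be passed to `urllib.request.urlopen` against a running instance; a 400 response would then surface as `urllib.error.HTTPError`.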
Models
Declare the tables related to auth system.
Service
Here we define functions to work with the auth system.
Experiment
One of the most important modules in the application. It is responsible for training a Bayesian network and sampling from it.
Controller
- GET /api/experiment/get_models
Get available models for nodes.
- Parameters:
model_type – str, “regressor” or “classifier”
- Status Codes:
codes –
200 - list with models as strings
500 - server error
- GET /api/experiment/(string: owner)/(string: name)/(string: dataset)/(bn_params)
Train BN and sample from it, then save it to db.
- Parameters:
owner – bn’s owner
name – name of Bayesian network
dataset – dataset name
bn_params – additional parameters
- bn_params json:
{
  "scoring_function": "K2",
  "use_mixture": "true" or True,
  "has_logit": "true" or True,
  "classifier": "LogisticRegression",
  "regressor": "LinearRegression",
  "compare_with_default": "true" or True,
  "params": {
    "remove_init_edges": "true" or True or None,
    "init_edges": [["node1", "node2"], ["node2", "node3"], ["node3", "node4"]] or None,
    "init_nodes": ["node3", "node4"] or None
  }
}
- Status Codes:
codes –
200 Success - returns trained network data (see below)
- 400 Bad request -
net name is too long
use_mixture or has_logit (or both) or scoring_function is not defined
404 NotFound - User is not found
406 - bn’s limit reached
422 - check for uniqueness of name failed
500 - server error
- network json:
{
  "network": {
    "name": name of net,
    "dataset_name": name of dataset bn trained on,
    "edges": edges,
    "nodes": nodes,
    "use_mixture": bool,
    "has_logit": bool,
    "classifier": str,
    "regressor": str,
    "params": {
      "init_edges": None or List[List[str]],
      "init_nodes": None or List[str],
      "white_list": FROZEN FEATURE,
      "bl_add": FROZEN FEATURE,
      "remove_init_edges": bool or none
    },
    "scoring_function": str,
    "descriptor": str, string with dictionary with pairs
  }
}
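The training route takes `bn_params` as its last path segment. The sketch below shows one way a client might assemble that URL and catch the documented 400 case (missing `use_mixture`, `has_logit` or `scoring_function`) before sending anything. It assumes that `bn_params` is passed as URL-encoded JSON, which the docs do not confirm; `build_experiment_url` and `REQUIRED_KEYS` are hypothetical names.

```python
import json
from urllib.parse import quote

# Keys whose absence is documented to produce a 400 response.
REQUIRED_KEYS = ("scoring_function", "use_mixture", "has_logit")


def build_experiment_url(base_url, owner, name, dataset, bn_params):
    """Return the GET URL for training a network.

    Raises ValueError if a required bn_params key is missing.
    """
    missing = [key for key in REQUIRED_KEYS if key not in bn_params]
    if missing:
        raise ValueError(f"bn_params is missing: {', '.join(missing)}")
    # Assumption: bn_params is sent as URL-encoded JSON in the last segment.
    encoded_params = quote(json.dumps(bn_params), safe="")
    # Percent-encode each path part so names with spaces or slashes survive.
    path = "/".join(quote(part, safe="") for part in (owner, name, dataset))
    return f"{base_url}/api/experiment/{path}/{encoded_params}"
```

Encoding every segment with `safe=""` keeps a `/` inside a network or dataset name from being read as a path separator.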
Models
Declare tables with networks and samples.
Service
Core functions to fit Bayesian networks and save them.
BN manager
This module provides operations on Bayesian networks in the database, such as finding BN(s) if they exist, deleting, putting and training them.
Controller
- GET /api/bn_manager/get_equal_edges
Get different edges between 2 nets.
- Parameters:
names – nets name as List[str]
owner – owner of nets
- Status Codes:
codes –
400 Bad request
404 Nets weren’t found
- Return:
{“equal_edges”: List of strings with nodes}
- GET /api/bn_manager/download_BN
Download BN.
- Parameters:
user – username
bn_name – name of bn
- Status Codes:
codes –
200 Success - send file
400 Bad request - user or bn_name wasn’t found in request
- 404 NotFound -
user or bn_name wasn’t found
network was not found
- GET /api/bn_manager/get_display_data/(string: owner)/(string: net_name)/(string: dataset_name)/(string: node)
Get real and sampled data.
- Parameters:
owner – username
net_name – name of network
dataset_name – name of dataset
node – name of node
- Status Codes:
codes –
200 Success - return json with data to display
400 Bad Request
404 NotFound - Sample wasn’t found.
- display json:
{
  'data': List with data for y-axis,
  'xvals': List with data for x-axis,
  'metrics': {metric: val},
  'type': Str, type of node
}
- GET /api/bn_manager/get_BN_names/(string: owner)
Get BN names to validate uniqueness.
- Parameters:
owner – net holder
- Status Codes:
codes –
200 Success - return {“networks”: list with names of nets for owner}
404 NotFound - user not found.
- DELETE /api/bn_manager/remove/(string: owner)/(string: name)
Delete bn and its samples.
- Parameters:
owner – username
name – name of Bayesian network
- Status Codes:
codes –
200 Success
404 NotFound - net was not found
- GET /api/bn_manager/get_BN/(string: owner)
Get BN Data.
- Parameters:
owner – bn’s owner
- Status Codes:
codes –
200 Success - return user’s bns and info about them
404 NotFound
- network json:
{
  "networks": {
    "number": {
      "name": name of net,
      "dataset_name": name of dataset bn trained on,
      "edges": edges,
      "nodes": nodes,
      "use_mixture": bool,
      "has_logit": bool,
      "classifier": str,
      "regressor": str,
      "params": {
        "init_edges": None or List[List[str]],
        "init_nodes": None or List[str],
        "white_list": FROZEN FEATURE,
        "bl_add": FROZEN FEATURE,
        "remove_init_edges": bool or none
      },
      "scoring_function": str,
      "descriptor": str, string with dictionary with pairs
    }
  }
}
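To show how a client might consume the get_BN response, here is a minimal sketch that reduces the documented "networks" dictionary to a per-network summary. It assumes the response body has already been parsed into a Python dict and that "edges" is a list with one entry per edge; `summarize_networks` is a hypothetical helper, not part of the service.

```python
def summarize_networks(response):
    """Map each network name to its dataset, edge count and scoring function."""
    summary = {}
    # The keys of "networks" ("number" in the schema) are not needed here.
    for net in response["networks"].values():
        summary[net["name"]] = {
            "dataset": net["dataset_name"],
            # Assumption: "edges" is a list with one entry per edge.
            "n_edges": len(net["edges"]),
            "scoring_function": net["scoring_function"],
        }
    return summary
```

The same loop would work on the single-network payload returned by the training route after wrapping it, for example as `{"networks": {"0": payload["network"]}}`.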
Service
Core functions to work with samples. It contains the SampleWorker class, which provides sample analysis and processing.
Data manager
This module provides operations on data, such as uploading and downloading datasets, removing them and preprocessing.
Controller
- DELETE /api/data_manager/remove_dataset
Remove dataset.
- Parameters:
name – dataset name
user – username
- Status Codes:
codes –
200 Success
400 BadRequest - Empty location provided
403 - Attempt to delete our data
404 NotFound - Dataset was not found in database
- GET /api/data_manager/get_root_nodes
Return all possible root nodes.
Note that we store our own datasets under the names vk and hack. To get them, you don’t need to pass an owner.
- Parameters:
name – dataset name
owner – OPTIONAL, not needed if the dataset is ours.
- Status Codes:
codes –
200 Success - return json {“root_nodes”: List[str]}
400 BadRequest - Empty parameters or empty location
404 NotFound
- GET /api/data_manager/check_fullness
Return True if the upload_folder is the same as the list of locations from the database.
- Return:
If corrupted, returns the paths; if not, returns the message “Database is full.”
- GET /api/data_manager/get_datasets
Get a list with user’s datasets.
- Parameters:
user – username
- Status Codes:
codes –
200 Success - return dict with {dataset.name:dataset.description}
404 NotFound
422 - user wasn’t found in request body
- DELETE /api/data_manager/wipe_cache
Clean cached samples
- POST /api/data_manager/upload
Put the dataset’s link into the db.
The dataset itself is stored in the user’s folder; the db stores only links.
- Parameters:
name – name of dataset
owner – user’s name
description – Description of dataset
content – Raw file
- Status Codes:
codes –
200 Success
- 400 BadRequest -
name of dataset must be unique
dataset contains “Unnamed:0” column
dataset contains too many rows and/or columns
- 404 NotFound
no file or user not found
empty file or conversion error
405 - dataset’s limit reached
422 BadRequest - cannot read the file
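Several of the 400 conditions listed for the upload route can be checked on the client before sending the file. The sketch below is one possible pre-check for CSV content. It assumes the server reads files with pandas, whose `read_csv` labels a column with an empty header cell as "Unnamed: N"; the actual row and column limits are not documented, so `max_rows` and `max_cols` are hypothetical parameters.

```python
import csv
import io


def precheck_csv(text, max_rows, max_cols):
    """Return a list of problems likely to cause a 400 on upload."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ["empty file"]
    header, body = rows[0], rows[1:]
    problems = []
    # An empty header cell (typical of a CSV written with its index) or a
    # literal "Unnamed: N" name would show up as an "Unnamed" column.
    if any(
        col.strip() == "" or col.replace(" ", "").startswith("Unnamed:")
        for col in header
    ):
        problems.append('dataset contains "Unnamed: 0" column')
    if len(body) > max_rows or len(header) > max_cols:
        problems.append("dataset contains too many rows and/or columns")
    return problems
```

An empty result does not guarantee that the upload succeeds; name uniqueness and the per-user dataset limit can only be checked by the server.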
Models
Declare tables with datasets.
Service
Core functions to upload datasets and save them.
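Because the db stores only links to files on disk, the check_fullness route described earlier amounts to comparing two sets of paths. A minimal sketch of that comparison follows, assuming the stored locations are path strings directly comparable to the paths found under the upload folder; `find_mismatches` is a hypothetical name and not the service's actual function.

```python
from pathlib import Path


def find_mismatches(upload_folder, db_locations):
    """Return (paths only on disk, paths only in the database)."""
    # Collect every regular file under the upload folder, recursively.
    on_disk = {str(p) for p in Path(upload_folder).rglob("*") if p.is_file()}
    in_db = set(db_locations)
    return sorted(on_disk - in_db), sorted(in_db - on_disk)
```

Two empty lists would correspond to the “Database is full.” message; anything else gives the paths to report.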