TC conference call - 9. Dec 2015

Date Start 2015/12/09
Date End 2015/12/09

Agenda

  1. Copenhagen notes - link to Google Drive open on the wiki, remove James' and Paul's manuscript by the 15th of Dec?
  2. DINA-Web Classifications - dockerized taxonomy module https://github.com/DINA-Web/dw-classifications
  3. DINA-Web Collections - dockerized back end https://github.com/DINA-Web/dw-collections
    1. Demo KeyCloak, Ida Li and Markus Skyttner
    2. MySQL, run with Postgres and/or MariaDB, could CAN assist with this task?
  4. Paul's and James' db schema test in liquibase, Paul Morris
  5. Seqdb update, Satpal Bilkhu and Ingimar Erlingsson
  6. DINA Technical workshop, Berlin

Minutes

Present: Karin, Markus S, Ingimar, Ida, Kevin, Al, and Satpal

Copenhagen notes

Karin has compiled and posted the summary from the Copenhagen meeting, focusing on the details for each of the working groups (see the Copenhagen notes).

The files in the Google Drive folder need to be examined, and all non-public content needs to be removed before the 15 December deadline.

Dockerizing DINA-Web modules

The aim of the ongoing work is to package modules as Docker containers that can be deployed as components of a micro-service architecture. For example, data stored in a database server container is exposed to applications through REST APIs such as the DINA-Web Collections REST API, the DINA-Web Media REST API and the DINA-Web Classifications REST API.

The efforts cover the current REST APIs but also aim to address concerns and needs using a novel approach. For example, we wish to provide a minimalist "Data Wrangler's Platform" - a pre-packaged web-deployable toolbox for curators and other data wranglers that enables loading, transforming and importing/exporting of data into the system, and which can also be used to provide (interactive) reports as well as for learning or tutorial purposes.

Markus S reported on DINA-Web development activities, especially two new contributions available as repositories on GitHub.

  • The first is "dw-collections", a Docker application providing a set of containers: the DINA-Web REST API v0 (exposing functionality similar to the Specify 7 web API and using the current database schema), a MySQL container loaded with the sample db provided by the Specify team, and a Keycloak server container. Currently one can load sample data for a collection and its users. The db engine used is the latest MySQL; a suggested future step is to also provide containers with MariaDB and PostgreSQL, and to support loading CSV files through the collections batch tool. The WAR files for the DINA-Web REST API v0 are on GitHub, but the source code is not yet; it will be added in a future step. Please take a look - in order to run it, one needs Docker and Docker Compose. A Keycloak demo will be scheduled after known issues, such as character encoding errors in uploads, are handled. With this Docker setup, one can start running the security and accessibility evaluation tools recommended by Al during recent meetings.
  • The second repository is "dw-classifications", a dockerized version of an integration project for the Pluto-F Classifications module (a Docker parallel of the Vagrant project at https://github.com/DINA-Web/dw-taxonomy). This Docker application now includes a set of containers, including tools for loading data. Currently one can upload classifications data over the REST API, and the data is stored in a PostgreSQL db. The load script currently supports only DynTaxa datasets (XML), not yet CSV data. See the Readme file (https://github.com/DINA-Web/dw-collections/blob/master/README.md), and please submit improvements or suggestions to Markus S. An issue has been created on this GitHub repo requesting that batch loading of .csv data be added to the load script.
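To make the requested CSV batch loading a little more concrete, here is a minimal sketch of how a load script might turn CSV rows into UTF-8 encoded JSON requests for a REST API. It is not the actual dw-classifications load script: the endpoint URL, the column names and the helper names (csv_to_records, build_requests) are assumptions made for illustration, and the requests are only prepared, not sent.

```python
import csv
import io
import json
import urllib.request

# Hypothetical endpoint: the real API paths are not given in these minutes.
API_URL = "http://localhost:8080/api/taxa"


def csv_to_records(csv_text):
    """Parse CSV text into a list of dicts, one per row, with whitespace stripped."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {key.strip(): (value or "").strip() for key, value in row.items()}
        for row in reader
    ]


def build_requests(records, url=API_URL):
    """Build one POST request per record without sending it."""
    prepared = []
    for record in records:
        # Encode explicitly as UTF-8 so non-ASCII names survive the upload.
        body = json.dumps(record, ensure_ascii=False).encode("utf-8")
        prepared.append(
            urllib.request.Request(
                url,
                data=body,
                headers={"Content-Type": "application/json; charset=utf-8"},
                method="POST",
            )
        )
    return prepared


# Hypothetical sample data; the column names are assumptions.
SAMPLE_CSV = "taxon_id,scientific_name,rank\n1,Aves,class\n2, Fågelordning ,order\n"

records = csv_to_records(SAMPLE_CSV)
prepared = build_requests(records)
# Sending is left to the caller, e.g. urllib.request.urlopen(prepared[0]).
```

Keeping parsing and request building separate from sending means the conversion can be checked without a running container.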

There were discussions on the use of other back end databases (PostgreSQL and MariaDB) and the implications of a shift from Vagrant to Docker, with an inquiry as to whether the Canada team could assist with the approaches under consideration. Changes to the information model proposed by James and Paul would be worked into Docker later, after the work with the “embryo” is initiated. Ida implemented a REST API in the "dw-collections" project that runs with any database schema, so evolving it to provide versions after v0 would be possible as increasingly mature iterations of new data models become available.
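As an illustration of what "runs with any database schema" can mean, the following sketch serves rows as JSON by reading table and column names from the database itself instead of hard-coding them. It is not Ida's implementation: it uses an in-memory SQLite database for self-containment, and the table name, column names and sample value are assumptions.

```python
import json
import sqlite3


def list_tables(conn):
    """Return the names of all user tables in the connected database."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    return [row[0] for row in rows]


def fetch_as_json(conn, table):
    """Serialize every row of `table` as JSON, using whatever columns exist."""
    # Only accept table names that really exist; this also guards the query below.
    if table not in list_tables(conn):
        raise KeyError(table)
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    columns = [col[0] for col in cursor.description]
    return json.dumps([dict(zip(columns, row)) for row in cursor.fetchall()])


# Hypothetical schema and sample row, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE collectionobject (id INTEGER PRIMARY KEY, catalognumber TEXT)")
conn.execute("INSERT INTO collectionobject (catalognumber) VALUES ('SAMPLE-0001')")

payload = fetch_as_json(conn, "collectionobject")
```

Because nothing in fetch_as_json names a column, the same function keeps working when a later iteration of the data model adds or renames fields.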

Additional considerations are the management of foreign keys that work across modules and Globally Unique Identifiers (GUIDs). The primary advantage of Docker is the shift to many small boxes (microservices) rather than a few larger ones; the shift from Vagrant is not "huge", and the two are partly complementary. See: http://blog.dina-web.net
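The cross-module foreign key concern can be sketched as follows: when each module owns its own database, a reference between modules is just a GUID, and referential integrity has to be checked by the application rather than by the database engine. The class and function names here (ModuleStore, dangling_references) and the sample records are hypothetical, and the in-memory dictionaries stand in for real module services.

```python
import uuid


class ModuleStore:
    """In-memory stand-in for one module's records, keyed by GUID."""

    def __init__(self, name):
        self.name = name
        self.records = {}

    def add(self, data, refs=()):
        """Store a record and return its newly minted GUID."""
        guid = str(uuid.uuid4())
        self.records[guid] = {"data": data, "refs": list(refs)}
        return guid


def dangling_references(store, other_stores):
    """Return (record GUID, missing GUID) pairs for refs no other module resolves."""
    known = set()
    for other in other_stores:
        known.update(other.records)
    return [
        (guid, ref)
        for guid, record in store.records.items()
        for ref in record["refs"]
        if ref not in known
    ]


taxonomy = ModuleStore("classifications")
collections = ModuleStore("collections")

taxon_guid = taxonomy.add({"name": "Aves"})
ok_guid = collections.add({"catalog": "A1"}, refs=[taxon_guid])
# This record points at a GUID that no module knows about.
bad_guid = collections.add({"catalog": "A2"}, refs=[str(uuid.uuid4())])

problems = dangling_references(collections, [taxonomy])
```

In this sketch only the second collections record is reported, since its reference cannot be resolved in the classifications store.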

Paul's and James' db schema test in liquibase

Ingimar has been in touch with Paul in order to test a Liquibase implementation of the new information model. Current fixes relate to PostgreSQL, while MariaDB fixes are being considered for future development.

SeqDB update

Satpal and Ingimar gave an update on the Stockholm (CGI) local installation of the SeqDB system. Critical updates for Sanger sequencing have been completed; the next modifications will refine genotype data handling based on CGI requirements from Niklas and Ingimar. Niklas will buy the printer, although there is no free third-party alternative to the barcode printer software. SeqDB was showcased as a new standard for molecular labs during recent presentations in Ottawa, suggesting that local institutional support for SeqDB could increase.

DINA TC workshop, Berlin

Answers from TC members on the Doodle suggest that the best dates for the group's next face-to-face meeting are in the first and second weeks of June.

Action Items

  • Thomas will send text to Glen on the referential integrity use case; Glen will then create the corresponding UML interaction diagram.
  • TC members need to remove all sensitive material from the shared Google drive so it can be used as a public repository after 15 December.

Next Meeting

Wednesday, 13. January, 2016 15-17 (13-15 UTC)


This page was last modified on 12 January 2016, at 13:21. Content is available under Attribution-Share Alike Non-commercial 2.5 or later, Unported unless otherwise noted.