The myTea vBench: Close Up

Overview

The main component of the myTea vLab is the Bench. The Bench coordinates three kinds of data: (1) data collected by scientists, such as sequences; (2) automatically generated provenance data, such as when and where that data has been stored and manipulated; and (3) the scientists' own annotations about that data. We will be producing wrappers for application developers so that the applications they create can communicate with the Bench and take advantage of its services. The vBench consists of two core components: the Job Manager, which keeps track of activities on the Bench, and the Datastore, which acts as a repository for data produced by the Bench and is used by the Report.

vBench in Detail (i.e., Geek View)

The vBench is an application based on the myTea framework and BioJava. The application consists of three main visible components: the Data Manager, the Job Manager and the Report Creator.

The Data Manager

The Data Manager allows sequence data to be imported from any format understood by BioJava. This data is stored in the Data Manager permanently, so the bioinformatician will never lose their source data and can always trace their work back to it. Reports can reference data held in the Data Store. Every piece of data has a unique identifier called an LSID (Life Science Identifier), an identifier scheme shared across the life sciences community. Once the scientist has imported some sequences, they can create collections of associated sequences, apply tags as memory cues, or arrange the sequences in whatever hierarchy suits their own referencing needs. These sequence collections can then be exported to any commonly used bioinformatics file type, or worked on from within the myTea system, for example by invoking an alignment tool on them. Should the bioinformatician wish to prune sequences from a collection, they can do so while retaining the original set for future reference. In this way, the bioinformatician avoids the tedious process of editing and saving many versions of the same sequence file, along with the risk of losing the original data or having no explanation of why a particular sequence was removed. Any process the scientist initiates within the myTea system can be annotated, so that the reasoning behind the scientist's actions can be recalled and understood later.

The Job Manager ("the heart of the Bench")

This is the screen within which the scientist does their work. It allows the bioinformatician to:
The tree of jobs executed on data, and the data retrieved from these processes, can be examined either in the single-branch view of the current job tree along the top of the window, which displays the current activity or data and its history, or in the separate job tree view. This allows the bioinformatician to quickly trace any results back to their source data.

The Report Creator

In this window, the scientist can construct reports of their work by searching through and collating relevant events from their research history. They can then annotate this list of events and add headings and hierarchy as they would in a normal report. The data associated with any event is immediately accessible and can be filtered and edited to display the most relevant results. Once the scientist is happy with the final result, they can output the report to various standard formats and then print or simply save it.

What is the myTea framework?

The myTea architecture provides a plug-and-play style framework, written purely in Java and OWL, which comprises the following components:
The system uses a client/server architecture; however, it is designed to operate on a local workstation alone, so that a scientist has constant and private access to all their data and complete control over the system. In other words, the client and server both run on the local workstation. This does not mean that services and applications outside the workstation cannot be used, only that the data sent to and received from these services is retained locally, at the scientist's fingertips. The server side of the system communicates with any client components via Java RMI or web services.

The Data Store - dataStore.jar

This is the storage infrastructure for the myTea system. It is a triple store running on the Sesame framework. The data is governed by a combination of the myTea ontology, the myGrid ontology, and an ontology that reflects the BioJava class hierarchy. The ontologies are all written in OWL. The Data Store has an interface that allows BioJava and myTea classes to be created as triples, as well as methods by which data can be changed, removed or added.

vBench - vBench.jar

This is the heart of the job execution and service locating system. Applications can:

The Report Generator - report.jar

This final component is the only one in the myTea architecture that has a GUI element. Any application can register events with the myTea system that correspond to actions a scientist has taken, for example, importing some sequences. Each event is associated with some data and stored in the Data Store. The Report Generator polls the Data Store for events that have occurred within the myTea system and displays them on the Events Screen. These events are filterable, searchable and browsable, allowing the scientist to quickly find specific events in their process history or to browse their events to see what they have done. The scientist then selects the events from the Events Screen that they wish to include in a report. Once selected or dragged across, these events are visible within the Report Screen. From here, the report can be developed with the usual report-editing techniques: adding further annotations, rearranging or removing data, and adding hierarchy. Once finished, the scientist can save the report in a standard format such as XHTML for printing or submission.
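
The register, poll and filter cycle described for the Report Generator can be illustrated with a small Java sketch. It is not taken from report.jar: the class names (EventLogSketch, EventStore, EventsScreen), the method names and the example LSIDs are all hypothetical, and a plain in-memory list stands in for the Sesame-backed Data Store. It shows only the general idea of polling for events newer than the last one seen and then searching over what has been collected.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical sketch; none of these names come from the myTea code base. */
final class EventLogSketch {

    /** One recorded action, linked to the identifier of the data it touched. */
    static final class Event {
        private final long sequenceNumber;
        private final String description;
        private final String dataId;

        Event(long sequenceNumber, String description, String dataId) {
            this.sequenceNumber = sequenceNumber;
            this.description = description;
            this.dataId = dataId;
        }

        long sequenceNumber() { return sequenceNumber; }
        String description() { return description; }
        String dataId() { return dataId; }
    }

    /** Stand-in for the Data Store: an append-only, in-memory event list. */
    static final class EventStore {
        private final List<Event> events = new ArrayList<>();

        /** Called by an application to register an action the scientist took. */
        synchronized Event register(String description, String dataId) {
            Event e = new Event(events.size() + 1, description, dataId);
            events.add(e);
            return e;
        }

        /** Returns every event registered after the given sequence number. */
        synchronized List<Event> eventsAfter(long lastSeen) {
            List<Event> result = new ArrayList<>();
            for (Event e : events) {
                if (e.sequenceNumber() > lastSeen) result.add(e);
            }
            return result;
        }
    }

    /** Stand-in for the Events Screen: polls the store, then filters what it has seen. */
    static final class EventsScreen {
        private final EventStore store;
        private final List<Event> seen = new ArrayList<>();
        private long lastSeen = 0;

        EventsScreen(EventStore store) { this.store = store; }

        /** One polling pass; returns how many new events were picked up. */
        int poll() {
            List<Event> fresh = store.eventsAfter(lastSeen);
            for (Event e : fresh) {
                seen.add(e);
                lastSeen = e.sequenceNumber();
            }
            return fresh.size();
        }

        /** Case-insensitive keyword search over the descriptions of seen events. */
        List<Event> search(String keyword) {
            String needle = keyword.toLowerCase(Locale.ROOT);
            List<Event> hits = new ArrayList<>();
            for (Event e : seen) {
                if (e.description().toLowerCase(Locale.ROOT).contains(needle)) hits.add(e);
            }
            return hits;
        }
    }

    public static void main(String[] args) {
        EventStore store = new EventStore();
        EventsScreen screen = new EventsScreen(store);

        // The LSIDs below are made-up examples in the urn:lsid:authority:namespace:object form.
        store.register("Imported 12 sequences", "urn:lsid:example.org:sequences:1");
        store.register("Ran alignment on collection", "urn:lsid:example.org:alignments:7");
        screen.poll();

        for (Event e : screen.search("import")) {
            System.out.println(e.sequenceNumber() + ": " + e.description() + " (" + e.dataId() + ")");
        }
    }
}
```

Tracking the last sequence number seen keeps each polling pass incremental, so repeated polls never show the same event twice. A real implementation against a triple store would presumably query by timestamp or event type instead.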