Generate a three-column, tab-separated file from a list of PubMed identifiers, a PDF or a free-text file.
File labelling allows you to classify documents.
Identify entities within documents.
Link entities within documents with database or ontology identifiers.
Compare, combine and create "Gold standard" datasets from various curators.
View publicly available corpora.
Make your corpora available to everyone and help train better text-mining systems.
Submit your corpus
Tutorials & Feedback
Learn how to use MyMiner tools.
Send us your suggestions or comments.
Get involved, find suggestions and get help. MyMiner Google Group
MyMiner mailing list
Links to other annotation tools.
Some links to text-mining or biocuration initiatives.
How to create sets of article abstracts?
To collect article abstracts, use the "Create set" tool.
On this page you can add several PubMed identifiers, separated by tabs, spaces, commas, pipes, line breaks, etc.
Once all the PubMed identifiers have been added, you can run the search.
The system then collects several fields, such as the title and abstract. All the retrieved data are shown in a results table.
You can then save your set of collected items for classification, annotation or linking.
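The separator handling described above can be sketched in a few lines of Python. This is only an illustration of how mixed separators might be normalised before a search, not MyMiner's actual code; the function name parse_pubmed_ids is hypothetical, and the sketch assumes that identifiers are purely numeric and that duplicates should be dropped.

```python
import re

def parse_pubmed_ids(raw: str) -> list[str]:
    """Split free-form input into unique numeric identifiers, keeping input order."""
    # Any run of whitespace (tabs, spaces, line breaks), commas or pipes is a separator.
    tokens = re.split(r"[\s,|]+", raw.strip())
    seen = set()
    ids = []
    for token in tokens:
        # Assumption: a PubMed identifier is made of ASCII digits only.
        if re.fullmatch(r"[0-9]+", token) and token not in seen:
            seen.add(token)
            ids.append(token)
    return ids

print(parse_pubmed_ids("12345\t67890, 13579|24680\n12345"))
# ['12345', '67890', '13579', '24680']
```

Non-numeric tokens are silently ignored here; a stricter variant could report them to the user instead.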
How to use the "File labelling" tool?
To label items such as abstracts, Gene Ontology terms or disease reports, select the "File labelling" tool.
You can specify which file you want to classify.
The file must contain at least 3 elements per line, e.g. Pubmed_id, Title, Abstract, or GO_id, GO term, GO definition. Elements must be separated by a tab.
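Before uploading, it can help to check that a file really follows this layout. The following Python sketch reads a tab-separated file and rejects lines with fewer than three fields. The function name read_three_column_tsv is hypothetical, and skipping blank lines is an assumption rather than documented MyMiner behaviour.

```python
import csv
import io

def read_three_column_tsv(handle):
    """Yield (identifier, title, text) tuples from a tab-separated file."""
    # QUOTE_NONE keeps quotation marks inside abstracts as literal characters.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for line_number, row in enumerate(reader, start=1):
        if not row:
            continue  # assumption: blank lines are ignored
        if len(row) < 3:
            raise ValueError(
                f"line {line_number}: expected at least 3 tab-separated fields, found {len(row)}"
            )
        yield row[0], row[1], row[2]

sample = io.StringIO("101\tMuscle repair\tSatellite cells drive regeneration.\n")
print(list(read_three_column_tsv(sample)))
```

For a file on disk, the same function can be called with open(path, newline="", encoding="utf-8").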
You can also add a file containing the "rules" of the classification to help annotators in ambiguous cases.
Next, specify the options used to label your entities. Here we want to classify articles as related to muscles or not, so we use "Muscle" and "Other". To add more options, click the "add options" link.
A summary of the options used to classify is shown.
Classification can then start. To make labelling easier, two text areas are available for defining positive and negative keywords, suffixes, etc. Users can add terms relevant to the topic so that they are highlighted; here, "musc", "myo" and "cardi".
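To illustrate how word fragments such as "musc", "myo" and "cardi" can drive highlighting, here is a minimal Python sketch. It is not MyMiner's implementation: the function name highlight and the "**" marker are hypothetical, and it assumes that a whole word is highlighted as soon as it contains one of the fragments, regardless of case.

```python
import re

def highlight(text: str, fragments: list[str], marker: str = "**") -> str:
    """Wrap every word that contains one of the fragments in marker characters."""
    fragments = [f for f in fragments if f]  # ignore empty entries
    if not fragments:
        return text
    alternation = "|".join(re.escape(f) for f in fragments)
    # \w* on both sides extends the match to the whole surrounding word.
    pattern = re.compile(rf"\w*(?:{alternation})\w*", re.IGNORECASE)
    return pattern.sub(lambda m: f"{marker}{m.group(0)}{marker}", text)

print(highlight("Cardiac myocytes and skeletal muscle differ.", ["musc", "myo", "cardi"]))
# **Cardiac** **myocytes** and skeletal **muscle** differ.
```

A second call with a different marker could be used for the negative keywords.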
You can save your labelling task using the record button. This saved file can be used to continue labelling later.
During classification, the labels and the time needed to classify are recorded. This information can then be used to compare and create gold standard datasets.
How to compare several "manually" or "automatically" tagged sets?
If you want to compare several sets of articles, manually or automatically classified, select the "Compare files" tool from the menu.
In the form, add the various sets of classified entities to compare; here we used one set of articles classified by three annotators.
You can add one or several sets of entities using the "add a file to compare" option.
Next, launch the tool with the "Go" button to display the comparison.
The report is organized in several parts. The first part displays information of interest for each set, such as the number of articles and all the options.
A common statistical part follows, where users can get a detailed view of all the elements tagged identically or differently.
The last part of the report allows users to export a "Gold Standard" set. They can choose to retrieve only the entities (here, articles) annotated identically by two or three annotators.
The resulting file can then be downloaded and used to train a classifier or with other machine learning techniques.
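The agreement rule behind the "Gold Standard" export can be sketched as follows. This is a simplified illustration, not the tool's actual algorithm: the function name gold_standard, the min_agreement parameter and the sample data are all hypothetical, and the sketch assumes that an item is kept only when its most frequent label is unambiguous and was chosen by at least the requested number of annotators.

```python
from collections import Counter

def gold_standard(annotations: dict[str, dict[str, str]], min_agreement: int = 2) -> dict[str, str]:
    """Keep items whose top label was chosen by at least min_agreement annotators."""
    votes: dict[str, Counter] = {}
    for labels in annotations.values():
        for item_id, label in labels.items():
            votes.setdefault(item_id, Counter())[label] += 1
    gold = {}
    for item_id, counter in votes.items():
        (label, count), *rest = counter.most_common(2)
        tied = bool(rest) and rest[0][1] == count  # two labels with equal support
        if count >= min_agreement and not tied:
            gold[item_id] = label
    return gold

# Hypothetical labels from three annotators for three articles.
annotators = {
    "annotator_1": {"101": "Muscle", "102": "Other", "103": "Muscle"},
    "annotator_2": {"101": "Muscle", "102": "Muscle", "103": "Muscle"},
    "annotator_3": {"101": "Muscle", "102": "Other", "103": "Other"},
}
print(gold_standard(annotators, min_agreement=2))
# {'101': 'Muscle', '102': 'Other', '103': 'Muscle'}
print(gold_standard(annotators, min_agreement=3))
# {'101': 'Muscle'}
```

Requiring all three annotators to agree yields a smaller but cleaner training set; requiring two keeps more examples at the cost of some noise.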
How to tag entities within a scientific document with MyMiner?
To label entities within a scientific document, you can use the entity tagging tool.
First, upload the file containing the documents to tag.
Once this step is done, you can run the automatic ABNER tagging to detect some of the entities present.
Various options are available to facilitate the annotation of entities within a document, such as "tag all occurrences" and "undo".
Wrong ABNER predictions can be corrected by selecting the mistagged elements and clicking on the correct tag.
You can also complement the automatic tagging by adding your own entities, for instance by tagging "organisms" or "diseases", as shown in this movie.
Once the detection is done, record the tagged text by clicking on the "Record" button.
You can then go to the next untagged article, repeat these steps and save the result.
How to report interactions within a document with MyMiner?
To define interactions such as protein-protein interactions, you need to use the "Entity tagging" tool.
As for tagging items, you have to upload a document containing at least 3 tab-separated columns.
You can then use the option to automatically detect all the entities within the document, and add or remove tags.
Once all the relevant biological items have been detected within the document, you can define relationships between the identified elements by clicking on the two-sided arrow, as shown in the tutorial video.
This opens a matrix containing all the tagged items. Check the corresponding check box to report an interaction or a link between items.
Tags and relations must be saved once you have finished defining them.
Finally, export the corresponding file.
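The matrix of check boxes described above can be thought of as a square table indexed by the tagged items. The Python sketch below shows one way such a table could be turned into a list of interacting pairs. It is purely illustrative: the function name, the entity names and the assumption that interactions are undirected (a ticked box in either direction counts once) are not taken from MyMiner.

```python
from itertools import combinations

def interactions_from_matrix(entities, checked):
    """Return the entity pairs whose check box is ticked, each pair listed once."""
    pairs = []
    for i, j in combinations(range(len(entities)), 2):
        # Assumption: interactions are undirected, so either cell counts.
        if checked[i][j] or checked[j][i]:
            pairs.append((entities[i], entities[j]))
    return pairs

# Hypothetical tagged items and check box states.
entities = ["protein_A", "protein_B", "protein_C"]
checked = [
    [False, True, False],
    [False, False, False],
    [False, False, False],
]
print(interactions_from_matrix(entities, checked))
# [('protein_A', 'protein_B')]
```

Cells on the diagonal are never visited, so self-interactions are ignored in this sketch.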
How to link biological entities within a scientific document to specific database identifiers? (Normalisation)
To link some entities within a scientific document to database identifiers, you can use the "entity linking" tool.
First, upload a file containing the documents to link. Warning: the file must have 3 tab-separated columns: identifier (digits only, such as PubMed identifiers), title, abstract.
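The stricter format required here (exactly three columns, numeric identifier) can be checked line by line before upload. The following Python sketch is an assumption-laden illustration, not part of MyMiner; the function name validate_linking_line is hypothetical.

```python
def validate_linking_line(line: str) -> list[str]:
    """Return a list of problems found in one line of an entity-linking input file."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) != 3:
        return [f"expected 3 tab-separated columns, found {len(fields)}"]
    identifier = fields[0]
    problems = []
    # The identifier column must contain digits only, like a PubMed identifier.
    if not (identifier.isascii() and identifier.isdigit()):
        problems.append(f"identifier {identifier!r} must contain only digits")
    return problems

print(validate_linking_line("PMC123\tA title\tAn abstract.\n"))
```

An empty list means the line passed both checks.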
Next, select the element types you want to link (Organism, Protein, Disease).
You can then use previously tagged elements or add your own elements to the different text areas to search the database. Note: you have to submit each element using the add button to complete the list of elements to search. Delete useless or redundant elements to save time.
Next, launch the identifier search.
Check the box of the corresponding database identifier.
Repeat the same operation for protein and disease (if you have selected them).
Save the corresponding linking and go to the next article.
Repeat the same operations.
To finish, save the current task and export the file as shown in the demo video.
- Martin Krallinger - Marc Depaule - Elodie Drula - Ashish Tendulkar - Florian Leitner