Publication: PLoS ONE

An exhaustive, non-euclidean, non-parametric data mining tool for unraveling the complexity of biological systems – novel insights into malaria

Published in PLoS ONE - 09 Sep 2011

Loucoubar C, Paul R, Bar-Hen A, Huret A, Tall A, Sokhna C, Trape JF, Ly AB, Faye J, Badiane A, Diakhaby G, Sarr FD, Diop A, Sakuntabhai A, Bureau JF

PubMed ID (PMID): 21931645

PLoS ONE 2011;6(9):e24085

Complex, high-dimensional data sets pose significant analytical challenges in the post-genomic era. Such data sets are not exclusive to genetic analyses and are also pertinent to epidemiology. There has been considerable effort to develop hypothesis-free data mining and machine learning methodologies. However, current methodologies lack exhaustivity and general applicability. Here we use a novel non-parametric, non-euclidean data mining tool, HyperCube®, to explore exhaustively a complex epidemiological malaria data set by searching for over density of events in m-dimensional space. Hotspots of over density correspond to strings of variables, rules, that determine, in this case, the occurrence of Plasmodium falciparum clinical malaria episodes. The data set contained 46,837 outcome events from 1,653 individuals and 34 explanatory variables. The best predictive rule contained 1,689 events from 148 individuals and was defined as: individuals present during 1992-2003, aged 1-5 years old, having hemoglobin AA, and having had previous Plasmodium malariae malaria parasite infection ≤10 times. These individuals had 3.71 times more P. falciparum clinical malaria episodes than the general population. We validated the rule in two different cohorts. We compared and contrasted the HyperCube® rule with the rules using variables identified by both traditional statistical methods and non-parametric regression tree methods. In addition, we tried all possible sub-stratified quantitative variables. No other model with equal or greater representativity gave a higher Relative Risk. Although three of the four variables in the rule were intuitive, the effect of number of P. malariae episodes was not. HyperCube® efficiently sub-stratified quantitative variables to optimize the rule and was able to identify interactions among the variables, tasks not easy to perform using standard data mining methods. 
Search of local over density in m-dimensional space, explained by easily interpretable rules, is thus seemingly ideal for generating hypotheses for large datasets to unravel the complexity inherent in biological systems.
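
The idea of scoring a rule by its Relative Risk can be illustrated with a minimal sketch. This is not the HyperCube® algorithm, which the abstract does not describe in detail; it only shows how a single, already chosen rule (a conjunction of conditions on variables) might be evaluated. The function names, the record layout, the toy data, and the definition of Relative Risk as the event rate inside the rule divided by the event rate in the whole data set are all assumptions made for illustration.

```python
from typing import Callable, Dict, List

# Assumed layout: one dict per observation, holding explanatory
# variables plus a binary "episode" outcome flag (1 = clinical episode).
Record = Dict[str, object]
# A rule is a conjunction of per-variable conditions (hypothetical format).
Rule = Dict[str, Callable[[object], bool]]


def rule_matches(record: Record, rule: Rule) -> bool:
    """Return True when the record satisfies every condition of the rule."""
    return all(condition(record[var]) for var, condition in rule.items())


def relative_risk(records: List[Record], rule: Rule, outcome: str = "episode") -> float:
    """Event rate inside the rule divided by the event rate in all records.

    Assumption: this ratio stands in for the Relative Risk versus the
    general population; the paper may define it differently.
    """
    inside = [r for r in records if rule_matches(r, rule)]
    if not inside:
        raise ValueError("rule matches no records")
    rate_inside = sum(r[outcome] for r in inside) / len(inside)
    rate_all = sum(r[outcome] for r in records) / len(records)
    if rate_all == 0:
        raise ValueError("no outcome events in the data set")
    return rate_inside / rate_all


# Invented toy data, not taken from the study.
toy_records: List[Record] = [
    {"age": 3, "hemoglobin": "AA", "episode": 1},
    {"age": 4, "hemoglobin": "AA", "episode": 1},
    {"age": 2, "hemoglobin": "AS", "episode": 0},
    {"age": 10, "hemoglobin": "AA", "episode": 0},
    {"age": 30, "hemoglobin": "AA", "episode": 0},
    {"age": 5, "hemoglobin": "AA", "episode": 0},
    {"age": 40, "hemoglobin": "AS", "episode": 0},
    {"age": 1, "hemoglobin": "AS", "episode": 0},
]

# Two of the four conditions named in the abstract, used as an example rule.
toy_rule: Rule = {
    "age": lambda a: 1 <= a <= 5,
    "hemoglobin": lambda h: h == "AA",
}

rr = relative_risk(toy_records, toy_rule)
print(f"Relative risk of the toy rule: {rr:.2f}")
```

An exhaustive search in the spirit of the abstract would evaluate many candidate rules, including different cut-points for quantitative variables such as age, and keep those with high Relative Risk and sufficient coverage; the sketch covers only the scoring step.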

http://www.ncbi.nlm.nih.gov/pubmed/21931645