Partitioning of the NCI Cancer Ontology

The NCI Thesaurus is a public-domain, description-logic-based terminology produced by the National Cancer Institute. The OWL version is almost 33 megabytes in size and contains more than 17,000 concepts.

Original

Modules

The original graph file (see above, quarter used for partitioning) and the cluster file are probably the best starting points if you want to experiment with the partitioning yourself.

Sizes and heights of modules

partition name size height
10 Document Type 96 0,08
16 Population Group 71 0,13
27 Technology 50 0,13
1 sites of care delivery 29 0,13
15 Social Concept 66 0,14
26 Business Rules 61 0,17
42 Biology 41 0,17
4 NCI Advisory boards and Groups 32 0,17
21 Clinical Nursing 15 0,17
97 Cell Line 46 0,2
98 Model System 42 0,2
25 Information Sciences 27 0,2
90 Training and Education 15 0,2
56 Occupation or Discipline 69 0,25
9 Component of the NCI 65 0,25
54 Pathology etc 43 0,25
8 Professional Organization or Group 31 0,25
13 Funding Categories 30 0,25
55 Epidemiology 30 0,25
20 Research Career Programs, K-series 24 0,25
95 Miscellaneous Molecular Biology Terms 23 0,25
47 Sociology 20 0,25
57 Public Health 18 0,25
84 Carcinogenesis Mechanism 17 0,25
96 Cancer Biology 17 0,25
48 Medical Economics 14 0,25
37 Funding 98 0,33
63 Medicine 40 0,33
12 Pharmacology 20 0,33
24 Patient or Public Education 20 0,33
93 Specimen 19 0,33
83 Database 15 0,33
89 Board Certification 12 0,33
29 Costs 9 0,33
104 Information and Media 6 0,33
94 Cancer Science 4 0,33
102 NCI Administrative Concept 4 0,33
103 NCI Kind 4 0,33
108 Conceptual Entity 133 0,5
80 Clinical Sciences 89 0,5
78 Chemistry 28 0,5
76 Nursing 22 0,5
62 Clinical Medicine 15 0,5
105 Media 15 0,5
52 Physics, Physical Chemistry 14 0,5
18 Radiology 12 0,5
5 Component of the Office of the Director 10 0,5
46 Oncologist 7 0,5
92 Tissue Sample 7 0,5
51 Disease Outcome 6 0,5
53 Nucleic Acid Biochemistry 6 0,5
45 Social Sciences 5 0,5
59 Pediatrics 5 0,5
61 Cellular Immunology 5 0,5
77 Physical sciences 5 0,5
100 Prognosis 5 0,5
60 Risk 4 0,5
79 Mathematics 3 0,5
2 country 239 1
75 Physiology 18 1
3 Other Agency or organization 13 1
11 Epidemiology Factors 13 1
14 Toxicology 12 1
17 Surgeon 11 1
19 Cythera ES Cell Line 10 1
23 Clinical Nurse Specialist 10 1
22 Technion ES Cell Line 9 1
81 Geron ES Cell Line 8 1
82 Reliance ES Cell Line 8 1
6 Clinical Trials Cooperative Group 7 1
28 AIDS Treatment Research 7 1
38 Protein Biochemistry 7 1
39 Organic Chemistry 7 1
40 Public Health Nursing 7 1
41 Radiation Biology 7 1
85 ES ES Cell Line 7 1
86 Staging System 7 1
87 Physical Map of Human Genome 7 1
88 Karolinska ES Cell Line 7 1
43 Biophysics 6 1
44 Developmental Physiology, General 6 1
58 Risk Estimates 6 1
91 Wisconsin ES Cell Line 6 1
7 Oncology Group 5 1
33 Program Announcements 5 1
36 Fellowship Programs 5 1
49 Psychiatry 5 1
50 Miscellaneous Chemistry Terms 5 1
73 Internal Medicine 5 1
35 Research Program Projects and Centers, P-Series 4 1
99 Biochemical Pathway 4 1
30 Predoctoral Individual National Research Service Award 3 1
32 Intramural Research Award 3 1
34 Funding Opportunities 3 1
65 Combinatorial Syntheses 3 1
68 Microbial Genetics 3 1
69 Virology, DNA Viruses Papovavirus 3 1
70 Biology of HIV Infection 3 1
71 Sensory Physiology 3 1
74 Hematology 3 1
101 PDQ Information System 3 1
31 Grants 2 1
64 Virology, DNA Viruses General 2 1
66 Synthesis Chemistry 2 1
67 Nursing Sciences General 2 1
72 Neurophysiology 2 1
106 Global Change 2 1
107 Category 2 1
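
If you want to work with the table above programmatically, the rows can be read with a few lines of Python. This is only a minimal sketch under stated assumptions: it assumes each row has exactly the form "partition name size height", that the name may itself contain spaces and commas, and that heights use a decimal comma as printed above. The names Module and parse_row are illustrative and do not come from the original partitioning tools.

```python
import re
from typing import NamedTuple


class Module(NamedTuple):
    """One table row (hypothetical container, not part of the original tools)."""
    partition: int
    name: str
    size: int
    height: float


# partition id, free-text name, integer size, height with optional decimal comma
ROW = re.compile(r"^(\d+)\s+(.+?)\s+(\d+)\s+(\d+(?:,\d+)?)$")


def parse_row(line: str) -> Module:
    """Parse one 'partition name size height' row of the table."""
    match = ROW.match(line.strip())
    if match is None:
        raise ValueError(f"not a table row: {line!r}")
    partition, name, size, height = match.groups()
    # Convert the decimal comma to a decimal point before calling float().
    return Module(int(partition), name, int(size), float(height.replace(",", ".")))


if __name__ == "__main__":
    sample = [
        "10 Document Type 96 0,08",
        "20 Research Career Programs, K-series 24 0,25",
        "2 country 239 1",
    ]
    modules = [parse_row(row) for row in sample]
    # Example query: the largest module among the sample rows.
    print(max(modules, key=lambda m: m.size))
```

The size and height columns are taken from the end of each line, so names containing commas or several words are preserved intact. The header line is not a data row and would raise ValueError, so skip it before parsing.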