Publications (10 of 36)
Corander, J., Diekmann, O. & Koski, T. (2016). A tribute to Mats Gyllenberg, on the occasion of his 60th birthday. Journal of Mathematical Biology, 72(4), 793-795
2016 (English). In: Journal of Mathematical Biology, ISSN 0303-6812, E-ISSN 1432-1416, Vol. 72, no 4, p. 793-795. Article in journal (Refereed). Published.
National Category
Mathematics
Identifiers
urn:nbn:se:kth:diva-183611 (URN), 10.1007/s00285-016-0965-9 (DOI), 000370269200001 (), 26815046 (PubMedID), 2-s2.0-84958113835 (Scopus ID)
Note

QC 20160319

Available from: 2016-03-19. Created: 2016-03-18. Last updated: 2017-11-30. Bibliographically approved.
Koski, T., Sandström, E. & Sandström, U. (2016). Towards field-adjusted production: Estimating research productivity from a zero-truncated distribution. Journal of Informetrics, 10(4), 1143-1152
2016 (English). In: Journal of Informetrics, ISSN 1751-1577, E-ISSN 1875-5879, Vol. 10, no 4, p. 1143-1152. Article in journal (Refereed). Published.
Abstract [en]

Measures of research productivity (e.g. peer-reviewed papers per researcher) are a fundamental part of bibliometric studies, but are often restricted by the properties of the data available. This paper addresses that fundamental issue and presents a detailed method for estimating productivity from data available in bibliographic databases (e.g. Web of Science and Scopus). The method can, for example, be used to estimate average productivity in different fields, and such field reference values can be used to produce field-adjusted production values. Being able to produce such field-adjusted production values could dramatically increase the relevance of bibliometric rankings and other bibliometric performance indicators. The results indicate that the estimations are reasonably stable given a sufficiently large data set.
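The core difficulty the abstract describes is that databases only record researchers with at least one paper, so the observed distribution is zero-truncated. The paper fits a Waring distribution; as a simpler illustration of the same truncation-correction idea (the Poisson assumption and the function name below are mine, not the authors'), one can recover an underlying Poisson rate from the mean of its zero-truncated version:

```python
import math

def fit_zero_truncated_poisson(truncated_mean, tol=1e-10):
    """Recover the Poisson rate lam from the mean of the zero-truncated
    distribution, using that E[X | X > 0] = lam / (1 - exp(-lam)) is
    strictly increasing in lam. Requires truncated_mean > 1."""
    def m(lam):
        return lam / (1.0 - math.exp(-lam))
    # m(lam) > lam for all lam > 0, so the root lies in (0, truncated_mean)
    lo, hi = 1e-12, truncated_mean
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if m(mid) < truncated_mean:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Once lam is estimated, exp(-lam) estimates the unobserved fraction of zero-producers, which is what allows a papers-per-researcher figure to be corrected for the truncation.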

Place, publisher, year, edition, pages
Elsevier, 2016
Keywords
Research productivity, Waring distribution, Field adjusted production, Size-dependent indicators
National Category
Information Studies; Other Social Sciences
Research subject
Industrial Engineering and Management
Identifiers
urn:nbn:se:kth:diva-197085 (URN), 10.1016/j.joi.2016.09.002 (DOI), 000389548900019 (), 2-s2.0-84992025500 (Scopus ID)
Funder
Riksbankens Jubileumsfond, P12-1302:1
Note

QC 20170109

Available from: 2016-11-29. Created: 2016-11-29. Last updated: 2017-11-29. Bibliographically approved.
Westerlind, H., Imrell, K., Ramanujam, R., Myhr, K.-M., Celius, E. G., Harbo, H. F., . . . Hillert, J. (2015). Identity-by-descent mapping in a Scandinavian multiple sclerosis cohort. European Journal of Human Genetics, 23(5), 688-692
2015 (English). In: European Journal of Human Genetics, ISSN 1018-4813, E-ISSN 1476-5438, Vol. 23, no 5, p. 688-692. Article in journal (Refereed). Published.
Abstract [en]

In an attempt to map chromosomal regions carrying rare gene variants contributing to the risk of multiple sclerosis (MS), we identified segments shared identical-by-descent (IBD) using the refined IBD analysis of the software BEAGLE 4.0. IBD mapping aims at identifying segments inherited from a common ancestor and shared more frequently in case-case pairs. A total of 2106 MS patients of Nordic origin and 624 matched controls were genotyped on the Illumina Human Quad 660 chip, and an additional 1352 ethnically matched controls typed on Illumina HumanHap 550 and Illumina 1M were added. Quality control left a total of 441 731 markers for the analysis. After identification of segments shared by descent and significance testing, a filter for markers with low IBD sharing was applied. Four regions, on chromosomes 5, 9, 14 and 19, were found to be significantly associated with the risk for MS. However, all markers but one were located telomerically, including the very distal markers. For methodological reasons, such segments have low sharing of IBD signals and are prone to be false positives. One marker on chromosome 19 reached genome-wide significance and was not one of the distal markers. This marker was located within the GNA11 gene, which has not previously been associated with MS. We conclude that IBD mapping is not sufficiently powered to identify MS risk loci even in ethnically relatively homogeneous populations, or, alternatively, that rare variants are not adequately present.

Keywords
Coronary-Artery-Disease, Susceptibility, Population, Linkage, Locus, Association, Haplotype, Risk, Gene
National Category
Biochemistry and Molecular Biology
Identifiers
urn:nbn:se:kth:diva-166905 (URN), 10.1038/ejhg.2014.155 (DOI), 000353028200022 (), 25159868 (PubMedID), 2-s2.0-84928007936 (Scopus ID)
Funder
Wellcome trust
Note

QC 20150608

Available from: 2015-06-08. Created: 2015-05-21. Last updated: 2017-12-04. Bibliographically approved.
Pensar, J., Nyman, H., Koski, T. & Corander, J. (2015). Labeled directed acyclic graphs: a generalization of context-specific independence in directed graphical models. Data mining and knowledge discovery, 29(2), 503-533
2015 (English). In: Data mining and knowledge discovery, ISSN 1384-5810, E-ISSN 1573-756X, Vol. 29, no 2, p. 503-533. Article in journal (Refereed). Published.
Abstract [en]

We introduce a novel class of labeled directed acyclic graph (LDAG) models for finite sets of discrete variables. LDAGs generalize earlier proposals for allowing local structures in the conditional probability distribution of a node, such that unrestricted label sets determine which edges can be deleted from the underlying directed acyclic graph (DAG) for a given context. Several properties of these models are derived, including a generalization of the concept of Markov equivalence classes. Efficient Bayesian learning of LDAGs is enabled by introducing an LDAG-based factorization of the Dirichlet prior for the model parameters, such that the marginal likelihood can be calculated analytically. In addition, we develop a novel prior distribution for the model structures that can appropriately penalize a model for its labeling complexity. A non-reversible Markov chain Monte Carlo algorithm combined with a greedy hill climbing approach is used for illustrating the useful properties of LDAG models for both real and synthetic data sets.
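The analytically computable marginal likelihood mentioned above rests on Dirichlet-multinomial conjugacy. A minimal sketch of such a local score for one node, assuming a uniform Dirichlet prior per parent configuration (a BDeu-style simplification; the paper's LDAG score additionally merges parent configurations that share a label):

```python
import math

def local_marginal_loglik(counts, alpha=1.0):
    """Dirichlet-multinomial log marginal likelihood for one node.

    counts maps each parent configuration to a list of counts over the
    node's states. With a symmetric Dirichlet(alpha/K) prior per
    configuration, the marginal likelihood has the closed form
    Gamma(alpha)/Gamma(alpha+n) * prod_k Gamma(a_k+n_k)/Gamma(a_k)."""
    score = 0.0
    for cfg, n_k in counts.items():
        k = len(n_k)
        a = alpha / k          # per-state pseudo-count
        n = sum(n_k)
        score += math.lgamma(alpha) - math.lgamma(alpha + n)
        score += sum(math.lgamma(a + c) - math.lgamma(a) for c in n_k)
    return score
```

Because the score decomposes over parent configurations, merging configurations under a shared label (as LDAGs do) simply pools their counts before scoring, which is what makes Bayesian learning of labelings tractable.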

Keywords
Directed acyclic graph, Graphical model, Context-specific independence, Bayesian model learning, Markov chain Monte Carlo
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-161600 (URN), 10.1007/s10618-014-0355-0 (DOI), 000349369300007 (), 2-s2.0-84923215211 (Scopus ID)
Note

QC 20150325

Available from: 2015-03-25. Created: 2015-03-13. Last updated: 2018-01-11. Bibliographically approved.
Nyman, H., Pensar, J., Koski, T. & Corander, J. (2014). Stratified Graphical Models: Context-Specific Independence in Graphical Models. BAYESIAN ANAL, 9(4), 883-908
2014 (English). In: BAYESIAN ANAL, ISSN 1931-6690, Vol. 9, no 4, p. 883-908. Article in journal (Refereed). Published.
Abstract [en]

Theory of graphical models has matured over more than three decades to provide the backbone for several classes of models that are used in a myriad of applications such as genetic mapping of diseases, credit risk evaluation, reliability and computer security. Despite their generic applicability and wide adoption, the constraints imposed by undirected graphical models and Bayesian networks have also been recognized to be unnecessarily stringent under certain circumstances. This observation has led to the proposal of several generalizations that aim at more relaxed constraints by which the models can impose local or context-specific dependence structures. Here we consider an additional class of such models, termed stratified graphical models. We develop a method for Bayesian learning of these models by deriving an analytical expression for the marginal likelihood of data under a specific subclass of decomposable stratified models. A non-reversible Markov chain Monte Carlo approach is further used to identify models that are highly supported by the posterior distribution over the model space. Our method is illustrated and compared with ordinary graphical models through application to several real and synthetic datasets.

Keywords
Graphical Model, Context-Specific Interaction Model, Markov Chain Monte Carlo, Bayesian Model Learning, Multivariate Discrete Distribution
National Category
Mathematics
Identifiers
urn:nbn:se:kth:diva-159988 (URN), 10.1214/14-BA882 (DOI), 000347547500008 (), 2-s2.0-84920285419 (Scopus ID)
Note

QC 20150225

Available from: 2015-02-25. Created: 2015-02-12. Last updated: 2015-02-25. Bibliographically approved.
Corander, J., Koski, T., Pavlenko, T. & Tillander, A. (2013). Bayesian block-diagonal predictive classifier for Gaussian data. In: Synergies of Soft Computing and Statistics for Intelligent Data Analysis. Paper presented at 6th International Conference on Soft Methods in Probability and Statistics, SMPS 2012, 4 October 2012 through 6 October 2012, Konstanz (pp. 543-551). Springer
2013 (English). In: Synergies of Soft Computing and Statistics for Intelligent Data Analysis, Springer, 2013, p. 543-551. Conference paper, Published paper (Refereed).
Abstract [en]

The paper presents a method for constructing a Bayesian predictive classifier in a high-dimensional setting. Given that classes are represented by Gaussian distributions with block-structured covariance matrices, a closed-form expression for the posterior predictive distribution of the data is established. Due to the factorization of this distribution, the resulting Bayesian predictive and marginal classifier provides an efficient solution to the high-dimensional problem by splitting it into smaller tractable problems. In a simulation study we show that the suggested classifier outperforms several alternative algorithms, such as linear discriminant analysis based on block-wise inverse covariance estimators and shrunken centroids regularized discriminant analysis.
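The factorization the abstract invokes can be sketched directly: under a block-diagonal covariance, the joint Gaussian log-density is a sum of small per-block log-densities, so a class score never requires inverting the full high-dimensional matrix. The restriction to 2x2 blocks and all names below are illustrative assumptions, not the paper's implementation:

```python
import math

def mvn2_logpdf(x, mean, cov):
    """Log density of a bivariate normal; the 2x2 covariance is
    inverted by hand via its adjugate."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx0, dx1 = x[0] - mean[0], x[1] - mean[1]
    # quadratic form dx' cov^{-1} dx
    q = (d * dx0 * dx0 - (b + c) * dx0 * dx1 + a * dx1 * dx1) / det
    return -math.log(2.0 * math.pi) - 0.5 * math.log(det) - 0.5 * q

def block_loglik(x, blocks):
    """Class log-density under a block-diagonal Gaussian: independence
    across blocks turns the joint density into a sum of per-block
    log-densities. blocks is a list of (start_index, mean, cov)."""
    return sum(mvn2_logpdf(x[s:s + 2], mean, cov) for s, mean, cov in blocks)
```

Classification then amounts to computing `block_loglik` per class (plus a log prior) and taking the argmax, with cost linear in the number of blocks.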

Place, publisher, year, edition, pages
Springer, 2013
Series
Advances in Intelligent Systems and Computing, ISSN 2194-5357 ; 190 AISC
Keywords
Covariance estimators, discriminant analysis, high-dimensional data, hyperparameters
National Category
Computer and Information Sciences; Mathematics
Identifiers
urn:nbn:se:kth:diva-117792 (URN), 10.1007/978-3-642-33042-1_58 (DOI), 000312969600058 (), 2-s2.0-84870759465 (Scopus ID), 978-364233041-4 (ISBN)
Conference
6th International Conference on Soft Methods in Probability and Statistics, SMPS 2012, 4 October 2012 through 6 October 2012, Konstanz
Note

QC 20130205

Available from: 2013-02-05. Created: 2013-02-05. Last updated: 2018-01-11. Bibliographically approved.
Corander, J., Cui, Y., Koski, T. & Sirén, J. (2013). Have I seen you before?: Principles of Bayesian predictive classification revisited. Statistics and computing, 23(1), 59-73
2013 (English). In: Statistics and computing, ISSN 0960-3174, E-ISSN 1573-1375, Vol. 23, no 1, p. 59-73. Article in journal (Refereed). Published.
Abstract [en]

A general inductive Bayesian classification framework is considered using a simultaneous predictive distribution for test items. We introduce a principle of generative supervised and semi-supervised classification based on marginalizing the joint posterior distribution of labels for all test items. The simultaneous and marginalized classifiers arise under different loss functions, while both acknowledge jointly all uncertainty about the labels of test items and the generating probability measures of the classes. We illustrate for data from multiple finite alphabets that such classifiers achieve higher correct classification rates than a standard marginal predictive classifier which labels all test items independently, when training data are sparse. In the supervised case for multiple finite alphabets the simultaneous and the marginal classifiers are proven to become equal under generalized exchangeability when the amount of training data increases. Hence, the marginal classifier can be interpreted as an asymptotic approximation to the simultaneous classifier for finite sets of training data. It is also shown that such convergence is not guaranteed in the semi-supervised setting, where the marginal classifier does not provide a consistent approximation.
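For intuition, the "standard marginal predictive classifier" used as the baseline above labels each test item independently by the posterior predictive of each class. A minimal sketch for a finite alphabet under a symmetric Dirichlet prior (all names, and the restriction to a flat symmetric prior, are mine):

```python
import math

def marginal_predictive_label(item, train, alpha=1.0):
    """Label one test item by the marginal posterior predictive of each
    class: with a symmetric Dirichlet(alpha) prior over a K-symbol
    alphabet, P(item | class c) = (n_cx + alpha) / (n_c + K * alpha).
    train maps class name -> list of observed symbols; the alphabet is
    taken to be the union of all observed symbols."""
    alphabet = {s for xs in train.values() for s in xs}
    k = len(alphabet)
    best, best_lp = None, -math.inf
    for c, xs in train.items():
        n_cx = sum(1 for s in xs if s == item)
        lp = math.log((n_cx + alpha) / (len(xs) + k * alpha))
        if lp > best_lp:
            best, best_lp = c, lp
    return best
```

The simultaneous classifier studied in the paper instead maximizes over joint labelings of the whole test set, which is exactly where the gain over this item-by-item rule appears when training data are sparse.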

Place, publisher, year, edition, pages
Springer Berlin/Heidelberg, 2013
Keywords
Classification, Exchangeability, Inductive learning, Predictive inference
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:kth:diva-88088 (URN), 10.1007/s11222-011-9291-7 (DOI), 000313731400005 (), 2-s2.0-84872607314 (Scopus ID)
Funder
EU, European Research Council, 239784
Note

QC 20130204

Available from: 2012-02-14. Created: 2012-02-14. Last updated: 2018-01-12. Bibliographically approved.
Corander, J., Cui, Y. & Koski, T. (2013). Inductive Inference and Partition Exchangeability in Classification. In: Dowe, David L. (Ed.), Algorithmic Probability and Friends. Bayesian Prediction and Artificial Intelligence: Papers from the Ray Solomonoff 85th Memorial Conference. Paper presented at Ray Solomonoff 85th Memorial Conference on Algorithmic Probability and Friends: Bayesian Prediction and Artificial Intelligence; Melbourne, VIC; Australia; 30 November 2011 through 2 December 2011 (pp. 91-105). Springer Berlin/Heidelberg
2013 (English). In: Algorithmic Probability and Friends. Bayesian Prediction and Artificial Intelligence: Papers from the Ray Solomonoff 85th Memorial Conference / [ed] Dowe, David L., Springer Berlin/Heidelberg, 2013, p. 91-105. Conference paper, Published paper (Refereed).
Abstract [en]

Inductive inference has been a subject of intensive research efforts over several decades. In particular, for classification problems substantial advances have been made and the field has matured into a wide range of powerful approaches to inductive inference. However, a considerable challenge arises when deriving principles for an inductive supervised classifier in the presence of unpredictable or unanticipated events corresponding to unknown alphabets of observable features. Bayesian inductive theories based on de Finetti type exchangeability which have become popular in supervised classification do not apply to such problems. Here we derive an inductive supervised classifier based on partition exchangeability due to John Kingman. It is proven that, in contrast to classifiers based on de Finetti type exchangeability which can optimally handle test items independently of each other in the presence of infinite amounts of training data, a classifier based on partition exchangeability still continues to benefit from a joint prediction of labels for the whole population of test items. Some remarks about the relation of this work to generic convergence results in predictive inference are also given.
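What makes partition exchangeability suitable for unknown alphabets can be illustrated with the Ewens sampling formula, whose predictive rule reserves probability mass for symbols never seen before. The dispersion parameter theta and the "<new>" placeholder key below are illustrative choices of mine, not taken from the paper:

```python
def ewens_predictive(counts, theta=1.0):
    """Predictive rule under partition exchangeability (Ewens sampling
    formula with dispersion theta): after n observations with
    per-symbol counts `counts`, the next observation equals a
    previously seen symbol s with probability counts[s] / (n + theta)
    and is an entirely new symbol with probability theta / (n + theta)."""
    n = sum(counts.values())
    probs = {s: c / (n + theta) for s, c in counts.items()}
    probs["<new>"] = theta / (n + theta)
    return probs
```

A de Finetti-style classifier over a fixed alphabet assigns zero probability to unanticipated symbols, whereas this rule keeps a nonzero "new symbol" term no matter how much data has been seen, which is the property the abstract exploits.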

Place, publisher, year, edition, pages
Springer Berlin/Heidelberg, 2013
Series
Lecture Notes in Computer Science, ISSN 0302-9743 ; 7070
Keywords
Bayesian learning, classification, exchangeability, inductive inference
National Category
Probability Theory and Statistics
Identifiers
urn:nbn:se:kth:diva-137054 (URN), 2-s2.0-84893200464 (Scopus ID), 978-3-642-44957-4 (ISBN), 978-3-642-44958-1 (ISBN)
Conference
Ray Solomonoff 85th Memorial Conference on Algorithmic Probability and Friends: Bayesian Prediction and Artificial Intelligence; Melbourne, VIC; Australia; 30 November 2011 through 2 December 2011
Funder
Swedish Research Council, 90583401
Note

QC 20140214

Available from: 2013-12-10. Created: 2013-12-10. Last updated: 2017-04-28. Bibliographically approved.
Corander, J., Xiong, J., Cui, Y. & Koski, T. (2013). Optimal Viterbi Bayesian predictive classification for data from finite alphabets. Journal of Statistical Planning and Inference, 143(2), 261-275
2013 (English). In: Journal of Statistical Planning and Inference, ISSN 0378-3758, E-ISSN 1873-1171, Vol. 143, no 2, p. 261-275. Article in journal (Refereed). Published.
Abstract [en]

A family of Viterbi Bayesian predictive classifiers has recently been popularized for speech recognition applications with continuous acoustic signals modeled by finite mixture densities embedded in a hidden Markov framework. Here we generalize such classifiers to sequentially observed data from multiple finite alphabets and derive the optimal predictive classifier under exchangeability of the emitted symbols. We demonstrate that the optimal predictive classifier, which learns from unlabelled test items, improves considerably upon the marginal maximum a posteriori rule in the presence of sparse training data. It is shown that the learning process saturates when the amount of test data tends to infinity, such that no further gain in classification accuracy is possible upon arrival of new test items in the long run.

Keywords
Bayesian learning, Hidden Markov models, Predictive classification
National Category
Mathematics
Identifiers
urn:nbn:se:kth:diva-107602 (URN), 10.1016/j.jspi.2012.07.013 (DOI), 000310942200004 (), 2-s2.0-84867736475 (Scopus ID)
Note

QC 20121214

Available from: 2012-12-14. Created: 2012-12-14. Last updated: 2017-12-06. Bibliographically approved.
Koski, T. & Noble, J. (2012). A Review of Bayesian Networks and Structure Learning. Mathematica Applicanda (Matematyka Stosowana), 40(1), 51-103
2012 (English). In: Mathematica Applicanda (Matematyka Stosowana), ISSN 2299-4009, Vol. 40, no 1, p. 51-103. Article in journal (Refereed). Published.
Abstract [en]

This article reviews the topic of Bayesian networks. A Bayesian network is a factorisation of a probability distribution along a directed acyclic graph. The relation between graphical d-separation and independence is described. A short article from 1853 by Arthur Cayley [8] is discussed, which contains several ideas later used in Bayesian networks: factorisation, the noisy 'or' gate, and applications of algebraic geometry to Bayesian networks. The ideas behind Pearl's intervention calculus when the DAG represents a causal dependence structure, and the relation between the work of Cayley and Pearl, are commented on. Most of the discussion is about structure learning, outlining the two main approaches: search and score versus constraint based. Constraint based algorithms often rely on the assumption of faithfulness: that the data to which the algorithm is applied are generated from distributions in which graphical d-separation and independence are equivalent. The article presents some considerations for constraint based algorithms based on recent data analysis, indicating a variety of situations where the faithfulness assumption does not hold. There is a short discussion about the causal discovery controversy, the idea that causal relations may be learned from data.
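The d-separation criterion central to the review can be checked mechanically with the classic moralized-ancestral-graph construction. A self-contained sketch (the parents-map representation and all names are mine):

```python
from collections import deque

def d_separated(dag, x, y, z):
    """Check whether x and y are d-separated given conditioning set z
    in a DAG given as {node: list_of_parents}. Criterion: restrict to
    ancestors of {x, y} and z, marry co-parents, drop edge directions,
    delete z, and test whether x and y are disconnected."""
    z = set(z)
    # ancestors of {x, y} and z (including the nodes themselves)
    relevant, stack = set(), [x, y, *z]
    while stack:
        n = stack.pop()
        if n in relevant:
            continue
        relevant.add(n)
        stack.extend(dag.get(n, []))
    # undirected moral graph over the relevant nodes
    adj = {n: set() for n in relevant}
    for n in relevant:
        parents = [p for p in dag.get(n, []) if p in relevant]
        for p in parents:
            adj[n].add(p)
            adj[p].add(n)
        for i, p in enumerate(parents):      # marry co-parents
            for q in parents[i + 1:]:
                adj[p].add(q)
                adj[q].add(p)
    # BFS from x to y, avoiding conditioning nodes
    seen, queue = {x}, deque([x])
    while queue:
        n = queue.popleft()
        if n == y:
            return False
        for m in adj[n] - z - seen:
            seen.add(m)
            queue.append(m)
    return True
```

The construction reproduces the familiar special cases: a collider A -> C <- B separates A and B marginally but connects them once C is conditioned on, while conditioning on the middle of a chain A -> B -> C blocks it.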

Place, publisher, year, edition, pages
Polish Mathematical Society, 2012
Keywords
directed acyclic graphs, intervention calculus, Markov graphical models, Markov equivalence
National Category
Probability Theory and Statistics
Identifiers
urn:nbn:se:kth:diva-137073 (URN)
Funder
Swedish Research Council, 90583401
Note

QC 20131217

Available from: 2013-12-10. Created: 2013-12-10. Last updated: 2013-12-17. Bibliographically approved.
Identifiers
ORCID iD: orcid.org/0000-0003-1489-8512
