Publications (10 of 31)
Buda, M., Maki, A. & Mazurowski, M. A. (2018). A systematic study of the class imbalance problem in convolutional neural networks. Neural Networks, 106, 249-259
2018 (English). In: Neural Networks, ISSN 0893-6080, E-ISSN 1879-2782, Vol. 106, p. 249-259. Article in journal (Refereed). Published
Abstract [en]

In this study, we systematically investigate the impact of class imbalance on the classification performance of convolutional neural networks (CNNs) and compare frequently used methods to address the issue. Class imbalance is a common problem that has been comprehensively studied in classical machine learning, yet very limited systematic research is available in the context of deep learning. In our study, we use three benchmark datasets of increasing complexity, MNIST, CIFAR-10 and ImageNet, to investigate the effects of imbalance on classification and perform an extensive comparison of several methods to address the issue: oversampling, undersampling, two-phase training, and thresholding that compensates for prior class probabilities. Our main evaluation metric is the area under the receiver operating characteristic curve (ROC AUC) adjusted to multi-class tasks, since the overall accuracy metric is associated with notable difficulties in the context of imbalanced data. Based on results from our experiments we conclude that (i) the effect of class imbalance on classification performance is detrimental; (ii) the method of addressing class imbalance that emerged as dominant in almost all analyzed scenarios was oversampling; (iii) oversampling should be applied to the level that completely eliminates the imbalance, whereas the optimal undersampling ratio depends on the extent of imbalance; (iv) as opposed to some classical machine learning models, oversampling does not cause overfitting of CNNs; (v) thresholding should be applied to compensate for prior class probabilities when the overall number of properly classified cases is of interest.
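
Two of the methods named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `oversample_to_balance` and `threshold_by_prior` are hypothetical, random duplication is assumed as the oversampling scheme, and thresholding is assumed to mean dividing each class score by its training-set prior before taking the argmax.

```python
import random
from collections import Counter


def oversample_to_balance(samples, labels, seed=0):
    """Randomly duplicate samples until every class matches the largest class.

    Assumption: plain random duplication; the abstract does not specify the scheme.
    """
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        # Draw the missing number of samples with replacement from the same class.
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


def threshold_by_prior(posteriors, priors):
    """Return the class index that maximizes posterior / prior.

    Assumption: priors are the class frequencies of the imbalanced training set.
    """
    adjusted = [p / q for p, q in zip(posteriors, priors)]
    return max(range(len(adjusted)), key=adjusted.__getitem__)


if __name__ == "__main__":
    xs, ys = oversample_to_balance(list(range(10)), [0] * 8 + [1] * 2)
    print(Counter(ys))
    print(threshold_by_prior([0.7, 0.3], [0.9, 0.1]))
```

In the usage at the bottom, the raw scores favor class 0, but after dividing by the priors the minority class 1 has the larger adjusted score, which is the intended compensation effect.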

Place, publisher, year, edition, pages
PERGAMON-ELSEVIER SCIENCE LTD, 2018
Keywords
Class imbalance, Convolutional neural networks, Deep learning, Image classification
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:kth:diva-235561 (URN); 10.1016/j.neunet.2018.07.011 (DOI); 000445015200021 (ISI); 30092410 (PubMedID); 2-s2.0-85050996431 (Scopus ID)
Note

QC 20181001

Available from: 2018-10-01. Created: 2018-10-01. Last updated: 2018-10-01. Bibliographically approved
Nordström, M., Hult, H., Maki, A. & Löfman, F. (2018). Pareto Dose Prediction Using Fully Convolutional Networks Operating in 3D. Paper presented at 60th Annual Meeting of the American Association of Physicists in Medicine, July 29 - August 2, 2018, Nashville, TN. Medical physics (Lancaster), 45(6), E176-E176
2018 (English). In: Medical physics (Lancaster), ISSN 0094-2405, Vol. 45, no 6, p. E176-E176. Article in journal, Meeting abstract (Other academic). Published
Place, publisher, year, edition, pages
WILEY, 2018
National Category
Computer Vision and Robotics (Autonomous Systems)
Identifiers
urn:nbn:se:kth:diva-232417 (URN); 10.1002/mp.12938 (DOI); 000434978000213 (ISI)
Conference
60th Annual Meeting of the American Association of Physicists in Medicine, July 29 - August 2, 2018, Nashville, TN
Note

QC 20180726

Available from: 2018-07-26. Created: 2018-07-26. Last updated: 2018-07-26. Bibliographically approved
Nawata, S., Maki, A. & Hikihara, T. (2018). Power packet transferability via symbol propagation matrix. Proceedings of the Royal Society. Mathematical, Physical and Engineering Sciences, 474(2213), Article ID 20170552.
2018 (English). In: Proceedings of the Royal Society. Mathematical, Physical and Engineering Sciences, ISSN 1364-5021, E-ISSN 1471-2946, Vol. 474, no 2213, article id 20170552. Article in journal (Refereed). Published
Abstract [en]

A power packet is a unit of electric power composed of a power pulse and an information tag. In Shannon’s information theory, messages are represented by symbol sequences in a digitized manner. Referring to this formulation, we define symbols in power packetization as a minimum unit of power transferred by a tagged pulse. Here, power is digitized and quantized. In this paper, we consider packetized power in networks for a finite duration, giving symbols and their energies to the networks. A network structure is defined using a graph whose nodes represent routers, sources and destinations. First, we introduce the concept of a symbol propagation matrix (SPM) in which symbols are transferred at links during unit times. Packetized power is described as a network flow in a spatio-temporal structure. Then, we study the problem of selecting an SPM in terms of transferability, that is, the possibility to represent given energies at sources and destinations during the finite duration. To select an SPM, we consider a network flow problem of packetized power. The problem is formulated as an M-convex submodular flow problem which is a solvable generalization of the minimum cost flow problem. Finally, through examples, we verify that this formulation provides reasonable packetized power.

Place, publisher, year, edition, pages
Royal Society Publishing, 2018
National Category
Other Electrical Engineering, Electronic Engineering, Information Engineering
Research subject
Electrical Engineering
Identifiers
urn:nbn:se:kth:diva-228464 (URN); 10.1098/rspa.2017.0552 (DOI); 000433498800003 (ISI); 2-s2.0-85047560728 (Scopus ID)
Note

QC 20180612

Available from: 2018-05-24. Created: 2018-05-24. Last updated: 2018-06-19. Bibliographically approved
Olczak, J., Fahlberg, N., Maki, A., Razavian, A. S., Jilert, A., Stark, A., . . . Gordon, M. (2017). Artificial intelligence for analyzing orthopedic trauma radiographs: Deep learning algorithms - are they on par with humans for diagnosing fractures? Acta Orthopaedica, 88(6), 581-586
2017 (English). In: Acta Orthopaedica, ISSN 1745-3674, E-ISSN 1745-3682, Vol. 88, no 6, p. 581-586. Article in journal (Refereed). Published
Abstract [en]

Background and purpose - Recent advances in artificial intelligence (deep learning) have shown remarkable performance in classifying non-medical images, and the technology is believed to be the next technological revolution. So far it has never been applied in an orthopedic setting, and in this study we sought to determine the feasibility of using deep learning for skeletal radiographs.
Methods - We extracted 256,000 wrist, hand, and ankle radiographs from Danderyd's Hospital and identified 4 classes: fracture, laterality, body part, and exam view. We then selected 5 openly available deep learning networks that were adapted for these images. The most accurate network was benchmarked against a gold standard for fractures. We furthermore compared the network's performance with 2 senior orthopedic surgeons who reviewed images at the same resolution as the network.
Results - All networks exhibited an accuracy of at least 90% when identifying laterality, body part, and exam view. The final accuracy for fractures was estimated at 83% for the best performing network. The network performed similarly to senior orthopedic surgeons when presented with images at the same resolution as the network. The 2 reviewer Cohen's kappa under these conditions was 0.76.
Interpretation - This study supports the use for orthopedic radiographs of artificial intelligence, which can perform at a human level. While current implementation lacks important features that surgeons require, e.g. risk of dislocation, classifications, measurements, and combining multiple exam views, these problems have technical solutions that are waiting to be implemented for orthopedics.

National Category
Orthopaedics
Identifiers
urn:nbn:se:kth:diva-220304 (URN); 10.1080/17453674.2017.1344459 (DOI); 000416605900005 (ISI); 28681679 (PubMedID)
Note

QC 20171221

Available from: 2017-12-21. Created: 2017-12-21. Last updated: 2018-01-13. Bibliographically approved
Ghadirzadeh, A., Maki, A., Kragic, D. & Björkman, M. (2017). Deep predictive policy training using reinforcement learning. In: 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2017. Paper presented at 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2017, Vancouver, Canada, 24 September 2017 through 28 September 2017 (pp. 2351-2358). Institute of Electrical and Electronics Engineers (IEEE), Article ID 8206046.
2017 (English). In: 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2017, Institute of Electrical and Electronics Engineers (IEEE), 2017, p. 2351-2358, article id 8206046. Conference paper, Published paper (Refereed)
Abstract [en]

Skilled robot task learning is best implemented by predictive action policies due to the inherent latency of sensorimotor processes. However, training such predictive policies is challenging as it involves finding a trajectory of motor activations for the full duration of the action. We propose a data-efficient deep predictive policy training (DPPT) framework with a deep neural network policy architecture which maps an image observation to a sequence of motor activations. The architecture consists of three sub-networks referred to as the perception, policy and behavior super-layers. The perception and behavior super-layers force an abstraction of visual and motor data trained with synthetic and simulated training samples, respectively. The policy super-layer is a small subnetwork with fewer parameters that maps data in-between the abstracted manifolds. It is trained for each task using methods for policy search reinforcement learning. We demonstrate the suitability of the proposed architecture and learning framework by training predictive policies for skilled object grasping and ball throwing on a PR2 robot. The effectiveness of the method is illustrated by the fact that these tasks are trained using only about 180 real robot attempts with qualitative terminal rewards.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2017
National Category
Robotics
Identifiers
urn:nbn:se:kth:diva-224269 (URN); 10.1109/IROS.2017.8206046 (DOI); 2-s2.0-85041944294 (Scopus ID); 9781538626825 (ISBN)
Conference
2017 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2017, Vancouver, Canada, 24 September 2017 through 28 September 2017
Funder
Swedish Research Council; EU, Horizon 2020
Note

QC 20180315

Available from: 2018-03-15. Created: 2018-03-15. Last updated: 2018-05-21. Bibliographically approved
Högman, V., Björkman, M., Maki, A. & Kragic, D. (2016). A sensorimotor learning framework for object categorization. IEEE Transactions on Cognitive and Developmental Systems, 8(1), 15-25
2016 (English). In: IEEE Transactions on Cognitive and Developmental Systems, ISSN 2379-8920, Vol. 8, no 1, p. 15-25. Article in journal (Refereed). Published
Abstract [en]

This paper presents a framework that enables a robot to discover various object categories through interaction. The categories are described using action-effect relations, i.e. sensorimotor contingencies rather than more static shape or appearance representation. The framework provides a functionality to classify objects and the resulting categories, associating a class with a specific module. We demonstrate the performance of the framework by studying a pushing behavior in robots, encoding the sensorimotor contingencies and their predictability with Gaussian Processes. We show how entropy-based action selection can improve object classification and how functional categories emerge from the similarities of effects observed among the objects. We also show how a multidimensional action space can be realized by parameterizing pushing using both position and velocity.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2016
Keywords
sensorimotor learning, object classification, categorization, cognitive robotics, active perception, learning and adaptive system, embodiment, developmental robotics
National Category
Robotics
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-172143 (URN); 10.1109/TAMD.2015.2463728 (DOI); 000388682400003 (ISI)
Funder
Swedish Research Council; EU, European Research Council, H2020-FETPROACT-2014 641321
Note

QC 20160422

Available from: 2016-04-21. Created: 2015-08-13. Last updated: 2017-01-04. Bibliographically approved
Ghadirzadeh, A., Bütepage, J., Maki, A., Kragic, D. & Björkman, M. (2016). A sensorimotor reinforcement learning framework for physical human-robot interaction. In: IEEE International Conference on Intelligent Robots and Systems. Paper presented at 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2016, 9 October 2016 through 14 October 2016 (pp. 2682-2688). IEEE
2016 (English). In: IEEE International Conference on Intelligent Robots and Systems, IEEE, 2016, p. 2682-2688. Conference paper, Published paper (Refereed)
Abstract [en]

Modeling of physical human-robot collaborations is generally a challenging problem due to the unpredictable nature of human behavior. To address this issue, we present a data-efficient reinforcement learning framework which enables a robot to learn how to collaborate with a human partner. The robot learns the task from its own sensorimotor experiences in an unsupervised manner. The uncertainty in the interaction is modeled using Gaussian processes (GP) to implement a forward model and an action-value function. Optimal action selection given the uncertain GP model is ensured by Bayesian optimization. We apply the framework to a scenario in which a human and a PR2 robot jointly control the ball position on a plank based on vision and force/torque data. Our experimental results show the suitability of the proposed method in terms of fast and data-efficient model learning, optimal action selection under uncertainty and equal role sharing between the partners.

Place, publisher, year, edition, pages
IEEE, 2016
Keywords
Behavioral research, Intelligent robots, Reinforcement learning, Robots, Bayesian optimization, Forward modeling, Gaussian process, Human behaviors, Human-robot collaboration, Model learning, Optimal actions, Physical human-robot interactions, Human robot interaction
National Category
Robotics
Identifiers
urn:nbn:se:kth:diva-202121 (URN); 10.1109/IROS.2016.7759417 (DOI); 000391921702127 (ISI); 2-s2.0-85006367922 (Scopus ID); 9781509037629 (ISBN)
Conference
2016 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2016, 9 October 2016 through 14 October 2016
Note

QC 20170228

Available from: 2017-02-28. Created: 2017-02-28. Last updated: 2018-05-21. Bibliographically approved
Azizpour, H., Sharif Razavian, A., Sullivan, J., Maki, A. & Carlsson, S. (2016). Factors of Transferability for a Generic ConvNet Representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 38(9), 1790-1802, Article ID 7328311.
2016 (English). In: IEEE Transactions on Pattern Analysis and Machine Intelligence, ISSN 0162-8828, E-ISSN 1939-3539, Vol. 38, no 9, p. 1790-1802, article id 7328311. Article in journal (Refereed). Published
Abstract [en]

Evidence is mounting that Convolutional Networks (ConvNets) are the most effective representation learning method for visual recognition tasks. In the common scenario, a ConvNet is trained on a large labeled dataset (source) and the feed-forward units activation of the trained network, at a certain layer of the network, is used as a generic representation of an input image for a task with relatively smaller training set (target). Recent studies have shown this form of representation transfer to be suitable for a wide range of target visual recognition tasks. This paper introduces and investigates several factors affecting the transferability of such representations. It includes parameters for training of the source ConvNet such as its architecture, distribution of the training data, etc. and also the parameters of feature extraction such as layer of the trained ConvNet, dimensionality reduction, etc. Then, by optimizing these factors, we show that significant improvements can be achieved on various (17) visual recognition tasks. We further show that these visual recognition tasks can be categorically ordered based on their similarity to the source task such that a correlation between the performance of tasks and their similarity to the source task w.r.t. the proposed factors is observed.

Place, publisher, year, edition, pages
IEEE Computer Society Digital Library, 2016
National Category
Computer Vision and Robotics (Autonomous Systems)
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-177033 (URN); 10.1109/TPAMI.2015.2500224 (DOI); 000381432700006 (ISI); 2-s2.0-84981266620 (Scopus ID)
Note

QC 20161208

Available from: 2015-11-13. Created: 2015-11-13. Last updated: 2018-01-10. Bibliographically approved
Razavian, A. S., Sullivan, J., Carlsson, S. & Maki, A. (2016). Visual instance retrieval with deep convolutional networks. ITE Transactions on Media Technology and Applications, 4(3), 251-258
2016 (English). In: ITE Transactions on Media Technology and Applications, ISSN 2186-7364, Vol. 4, no 3, p. 251-258. Article in journal (Refereed). Published
Abstract [en]

This paper provides an extensive study on the availability of image representations based on convolutional networks (ConvNets) for the task of visual instance retrieval. Besides the choice of convolutional layers, we present an efficient pipeline exploiting multi-scale schemes to extract local features, in particular, by taking geometric invariance into explicit account, i.e. positions, scales and spatial consistency. In our experiments using five standard image retrieval datasets, we demonstrate that generic ConvNet image representations can outperform other state-of-the-art methods if they are extracted appropriately.

Place, publisher, year, edition, pages
Institute of Image Information and Television Engineers, 2016
Keywords
Convolutional network, Learning representation, Multi-resolution search, Visual instance retrieval
National Category
Computer Vision and Robotics (Autonomous Systems)
Identifiers
urn:nbn:se:kth:diva-195472 (URN); 2-s2.0-84979503481 (Scopus ID)
Note

QC 20161125

Available from: 2016-11-25. Created: 2016-11-03. Last updated: 2018-01-13. Bibliographically approved
Sharif Razavian, A., Sullivan, J., Maki, A. & Carlsson, S. (2015). A Baseline for Visual Instance Retrieval with Deep Convolutional Networks. Paper presented at International Conference on Learning Representations, May 7 - 9, 2015, San Diego, CA. San Diego, US: ICLR
2015 (English). Conference paper, Poster (with or without abstract) (Refereed)
Place, publisher, year, edition, pages
San Diego, US: ICLR, 2015
National Category
Computer Systems
Identifiers
urn:nbn:se:kth:diva-165765 (URN)
Conference
International Conference on Learning Representations, May 7 - 9, 2015, San Diego, CA
Note

QC 20150522

Available from: 2015-04-29. Created: 2015-04-29. Last updated: 2015-05-22. Bibliographically approved
Identifiers
ORCID iD: orcid.org/0000-0002-4266-6746