Publications (10 of 34)
Buda, M., Maki, A. & Mazurowski, M. A. (2018). A systematic study of the class imbalance problem in convolutional neural networks. Neural Networks, 106, 249-259
2018 (English). In: Neural Networks, ISSN 0893-6080, E-ISSN 1879-2782, Vol. 106, p. 249-259. Article in journal (Refereed). Published.
Abstract [en]

In this study, we systematically investigate the impact of class imbalance on the classification performance of convolutional neural networks (CNNs) and compare frequently used methods to address the issue. Class imbalance is a common problem that has been comprehensively studied in classical machine learning, yet very limited systematic research is available in the context of deep learning. In our study, we use three benchmark datasets of increasing complexity, MNIST, CIFAR-10 and ImageNet, to investigate the effects of imbalance on classification and perform an extensive comparison of several methods to address the issue: oversampling, undersampling, two-phase training, and thresholding that compensates for prior class probabilities. Our main evaluation metric is the area under the receiver operating characteristic curve (ROC AUC) adjusted to multi-class tasks, since the overall accuracy metric is associated with notable difficulties in the context of imbalanced data. Based on results from our experiments we conclude that (i) the effect of class imbalance on classification performance is detrimental; (ii) the method of addressing class imbalance that emerged as dominant in almost all analyzed scenarios was oversampling; (iii) oversampling should be applied to the level that completely eliminates the imbalance, whereas the optimal undersampling ratio depends on the extent of imbalance; (iv) as opposed to some classical machine learning models, oversampling does not cause overfitting of CNNs; (v) thresholding should be applied to compensate for prior class probabilities when the overall number of properly classified cases is of interest.
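
Two of the methods named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names `oversample_to_balance` and `threshold_by_priors` are hypothetical, random repetition of minority samples is assumed as the oversampling scheme, and thresholding is assumed to mean dividing each predicted class probability by the class's share of the training set and renormalising.

```python
import numpy as np


def oversample_to_balance(labels, rng):
    """Return sample indices in which minority classes are repeated at random
    until every class is as large as the largest one (hypothetical helper)."""
    labels = np.asarray(labels)
    classes, counts = np.unique(labels, return_counts=True)
    target = counts.max()
    picked = []
    for cls, count in zip(classes, counts):
        idx = np.flatnonzero(labels == cls)
        # Draw the missing samples with replacement from the same class.
        extra = rng.choice(idx, size=target - count, replace=True)
        picked.append(np.concatenate([idx, extra]))
    return np.concatenate(picked)


def threshold_by_priors(probs, priors):
    """Divide class probabilities by the training priors and renormalise
    (assumed form of prior-compensating thresholding)."""
    adjusted = np.asarray(probs, dtype=float) / np.asarray(priors, dtype=float)
    return adjusted / adjusted.sum(axis=-1, keepdims=True)


# Toy data: six samples of class 0 and two of class 1.
labels = [0] * 6 + [1] * 2
balanced_idx = oversample_to_balance(labels, np.random.default_rng(0))
adjusted = threshold_by_priors([0.6, 0.4], priors=[0.75, 0.25])
```

In the toy example both classes end up with six samples after oversampling, and a prediction of (0.6, 0.4) made by a model trained with priors (0.75, 0.25) flips to favour the minority class once the priors are divided out.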

Place, publisher, year, edition, pages
PERGAMON-ELSEVIER SCIENCE LTD, 2018
Keywords
Class imbalance, Convolutional neural networks, Deep learning, Image classification
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:kth:diva-235561 (URN); 10.1016/j.neunet.2018.07.011 (DOI); 000445015200021 (); 30092410 (PubMedID); 2-s2.0-85050996431 (Scopus ID)
Note

QC 20181001

Available from: 2018-10-01. Created: 2018-10-01. Last updated: 2018-10-01. Bibliographically approved.
Holesovsky, O. & Maki, A. (2018). Compact ConvNets with Ternary Weights and Binary Activations. Paper presented at The 23rd Computer Vision Winter Workshop. Prague
2018 (English). Conference paper, Oral presentation with published abstract (Refereed).
Abstract [en]

Compact convolutional neural network (CNN) architectures with ternary weights and binary activations are a combination of methods suitable for making neural networks more efficient. We show that the combination of ternary weights and depthwise separable convolutions on the CIFAR-10 benchmark can yield a small neural network of size 32 kB and 83.70% test accuracy. We present a novel dithering binary activation, which we expected to improve the accuracy of networks with binary activations by randomizing quantization error. Our experiments show that it brings only mild improvements. A compact SqueezeNet network with ternary weights and binary activations is more accurate than the same network with binary weights. Nevertheless, the accuracy gap to its full-precision variant remains large.

Place, publisher, year, edition, pages
Prague, 2018
National Category
Computer Systems
Identifiers
urn:nbn:se:kth:diva-250567 (URN)
Conference
The 23rd Computer Vision Winter Workshop
Note

QC 20190624

Available from: 2019-04-30. Created: 2019-04-30. Last updated: 2019-06-24. Bibliographically approved.
Maki, A. (2018). Epilogue to the 6th Sweden-Japan Academic Network: Towards Replicating Our Visual Function: Approaches with Machine Learning. JSPS Stockholm Newsletter (English Edition), Spring 2018(32), 13-13
2018 (English). In: JSPS Stockholm Newsletter (English Edition), Vol. Spring 2018, no 32, p. 13-13. Article in journal (Other (popular science, discussion, etc.)). Published.
Place, publisher, year, edition, pages
Stockholm: JSPS, 2018
National Category
Computer Systems
Identifiers
urn:nbn:se:kth:diva-250565 (URN)
Note

QC 20190520

Available from: 2019-04-30. Created: 2019-04-30. Last updated: 2019-05-20. Bibliographically approved.
Li, V. & Maki, A. (2018). Feature Contraction: New ConvNet Regularization in Image Classification. Paper presented at British Machine Vision Conference.
2018 (English). Conference paper, Oral presentation with published abstract (Refereed).
National Category
Computer Systems
Identifiers
urn:nbn:se:kth:diva-250575 (URN)
Conference
British Machine Vision Conference
Note

QC 20190624

Available from: 2019-04-30. Created: 2019-04-30. Last updated: 2019-06-24. Bibliographically approved.
Nordström, M., Hult, H., Maki, A. & Löfman, F. (2018). Pareto Dose Prediction Using Fully Convolutional Networks Operating in 3D. Paper presented at 60th Annual Meeting of the American-Association-of-Physicists-in-Medicine, JUL 29-AUG 02, 2018, Nashville, TN. Medical physics (Lancaster), 45(6), E176-E176
2018 (English). In: Medical physics (Lancaster), ISSN 0094-2405, Vol. 45, no 6, p. E176-E176. Article in journal, Meeting abstract (Other academic). Published.
Place, publisher, year, edition, pages
WILEY, 2018
National Category
Computer Vision and Robotics (Autonomous Systems)
Identifiers
urn:nbn:se:kth:diva-232417 (URN); 10.1002/mp.12938 (DOI); 000434978000213 ()
Conference
60th Annual Meeting of the American-Association-of-Physicists-in-Medicine, JUL 29-AUG 02, 2018, Nashville, TN
Note

QC 20180726

Available from: 2018-07-26. Created: 2018-07-26. Last updated: 2018-07-26. Bibliographically approved.
Nawata, S., Maki, A. & Takashi, H. (2018). Power packet transferability via symbol propagation matrix. Proceedings of the Royal Society. Mathematical, Physical and Engineering Sciences, 474(2213), Article ID 20170552.
2018 (English). In: Proceedings of the Royal Society. Mathematical, Physical and Engineering Sciences, ISSN 1364-5021, E-ISSN 1471-2946, Vol. 474, no 2213, article id 20170552. Article in journal (Refereed). Published.
Abstract [en]

A power packet is a unit of electric power composed of a power pulse and an information tag. In Shannon’s information theory, messages are represented by symbol sequences in a digitized manner. Referring to this formulation, we define symbols in power packetization as a minimum unit of power transferred by a tagged pulse. Here, power is digitized and quantized. In this paper, we consider packetized power in networks for a finite duration, giving symbols and their energies to the networks. A network structure is defined using a graph whose nodes represent routers, sources and destinations. First, we introduce the concept of a symbol propagation matrix (SPM) in which symbols are transferred at links during unit times. Packetized power is described as a network flow in a spatio-temporal structure. Then, we study the problem of selecting an SPM in terms of transferability, that is, the possibility to represent given energies at sources and destinations during the finite duration. To select an SPM, we consider a network flow problem of packetized power. The problem is formulated as an M-convex submodular flow problem which is a solvable generalization of the minimum cost flow problem. Finally, through examples, we verify that this formulation provides reasonable packetized power.

Place, publisher, year, edition, pages
Royal Society Publishing, 2018
National Category
Other Electrical Engineering, Electronic Engineering, Information Engineering
Research subject
Electrical Engineering
Identifiers
urn:nbn:se:kth:diva-228464 (URN); 10.1098/rspa.2017.0552 (DOI); 000433498800003 (); 2-s2.0-85047560728 (Scopus ID)
Note

QC 20180612

Available from: 2018-05-24. Created: 2018-05-24. Last updated: 2018-06-19. Bibliographically approved.
Olczak, J., Fahlberg, N., Maki, A., Razavian, A. S., Jilert, A., Stark, A., . . . Gordon, M. (2017). Artificial intelligence for analyzing orthopedic trauma radiographs: Deep learning algorithms - are they on par with humans for diagnosing fractures? Acta Orthopaedica, 88(6), 581-586
2017 (English). In: Acta Orthopaedica, ISSN 1745-3674, E-ISSN 1745-3682, Vol. 88, no 6, p. 581-586. Article in journal (Refereed). Published.
Abstract [en]

Background and purpose - Recent advances in artificial intelligence (deep learning) have shown remarkable performance in classifying non-medical images, and the technology is believed to be the next technological revolution. So far it has never been applied in an orthopedic setting, and in this study we sought to determine the feasibility of using deep learning for skeletal radiographs. Methods - We extracted 256,000 wrist, hand, and ankle radiographs from Danderyd's Hospital and identified 4 classes: fracture, laterality, body part, and exam view. We then selected 5 openly available deep learning networks that were adapted for these images. The most accurate network was benchmarked against a gold standard for fractures. We furthermore compared the network's performance with that of 2 senior orthopedic surgeons who reviewed images at the same resolution as the network. Results - All networks exhibited an accuracy of at least 90% when identifying laterality, body part, and exam view. The final accuracy for fractures was estimated at 83% for the best performing network. The network performed similarly to senior orthopedic surgeons when presented with images at the same resolution as the network. The 2-reviewer Cohen's kappa under these conditions was 0.76. Interpretation - This study supports the use of artificial intelligence for orthopedic radiographs, where it can perform at a human level. While the current implementation lacks important features that surgeons require, e.g. risk of dislocation, classifications, measurements, and combining multiple exam views, these problems have technical solutions that are waiting to be implemented for orthopedics.

National Category
Orthopaedics
Identifiers
urn:nbn:se:kth:diva-220304 (URN); 10.1080/17453674.2017.1344459 (DOI); 000416605900005 (); 28681679 (PubMedID)
Note

QC 20171221

Available from: 2017-12-21. Created: 2017-12-21. Last updated: 2018-01-13. Bibliographically approved.
Ghadirzadeh, A., Maki, A., Kragic, D. & Björkman, M. (2017). Deep predictive policy training using reinforcement learning. In: 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2017. Paper presented at 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2017, Vancouver, Canada, 24 September 2017 through 28 September 2017 (pp. 2351-2358). Institute of Electrical and Electronics Engineers (IEEE), Article ID 8206046.
2017 (English). In: 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2017, Institute of Electrical and Electronics Engineers (IEEE), 2017, p. 2351-2358, article id 8206046. Conference paper, Published paper (Refereed).
Abstract [en]

Skilled robot task learning is best implemented by predictive action policies due to the inherent latency of sensorimotor processes. However, training such predictive policies is challenging as it involves finding a trajectory of motor activations for the full duration of the action. We propose a data-efficient deep predictive policy training (DPPT) framework with a deep neural network policy architecture which maps an image observation to a sequence of motor activations. The architecture consists of three sub-networks referred to as the perception, policy and behavior super-layers. The perception and behavior super-layers force an abstraction of visual and motor data trained with synthetic and simulated training samples, respectively. The policy super-layer is a small subnetwork with fewer parameters that maps data in-between the abstracted manifolds. It is trained for each task using methods for policy search reinforcement learning. We demonstrate the suitability of the proposed architecture and learning framework by training predictive policies for skilled object grasping and ball throwing on a PR2 robot. The effectiveness of the method is illustrated by the fact that these tasks are trained using only about 180 real robot attempts with qualitative terminal rewards.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2017
National Category
Robotics
Identifiers
urn:nbn:se:kth:diva-224269 (URN); 10.1109/IROS.2017.8206046 (DOI); 2-s2.0-85041944294 (Scopus ID); 9781538626825 (ISBN)
Conference
2017 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2017, Vancouver, Canada, 24 September 2017 through 28 September 2017
Funder
Swedish Research Council; EU, Horizon 2020
Note

QC 20180315

Available from: 2018-03-15. Created: 2018-03-15. Last updated: 2018-05-21. Bibliographically approved.
Högman, V., Björkman, M., Maki, A. & Kragic, D. (2016). A sensorimotor learning framework for object categorization. IEEE Transactions on Cognitive and Developmental Systems, 8(1), 15-25
2016 (English). In: IEEE Transactions on Cognitive and Developmental Systems, ISSN 2379-8920, Vol. 8, no 1, p. 15-25. Article in journal (Refereed). Published.
Abstract [en]

This paper presents a framework that enables a robot to discover various object categories through interaction. The categories are described using action-effect relations, i.e. sensorimotor contingencies rather than more static shape or appearance representation. The framework provides a functionality to classify objects and the resulting categories, associating a class with a specific module. We demonstrate the performance of the framework by studying a pushing behavior in robots, encoding the sensorimotor contingencies and their predictability with Gaussian Processes. We show how entropy-based action selection can improve object classification and how functional categories emerge from the similarities of effects observed among the objects. We also show how a multidimensional action space can be realized by parameterizing pushing using both position and velocity.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2016
Keywords
sensorimotor learning, object classification, categorization, cognitive robotics, active perception, learning and adaptive system, embodiment, developmental robotics
National Category
Robotics
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-172143 (URN); 10.1109/TAMD.2015.2463728 (DOI); 000388682400003 ()
Funder
Swedish Research Council; EU, European Research Council, H2020-FETPROACT-2014 641321
Note

QC 20160422

Available from: 2016-04-21. Created: 2015-08-13. Last updated: 2017-01-04. Bibliographically approved.
Ghadirzadeh, A., Bütepage, J., Maki, A., Kragic, D. & Björkman, M. (2016). A sensorimotor reinforcement learning framework for physical human-robot interaction. In: IEEE International Conference on Intelligent Robots and Systems. Paper presented at 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2016, 9 October 2016 through 14 October 2016 (pp. 2682-2688). IEEE
2016 (English). In: IEEE International Conference on Intelligent Robots and Systems, IEEE, 2016, p. 2682-2688. Conference paper, Published paper (Refereed).
Abstract [en]

Modeling of physical human-robot collaborations is generally a challenging problem due to the unpredictable nature of human behavior. To address this issue, we present a data-efficient reinforcement learning framework which enables a robot to learn how to collaborate with a human partner. The robot learns the task from its own sensorimotor experiences in an unsupervised manner. The uncertainty in the interaction is modeled using Gaussian processes (GP) to implement a forward model and an action-value function. Optimal action selection given the uncertain GP model is ensured by Bayesian optimization. We apply the framework to a scenario in which a human and a PR2 robot jointly control the ball position on a plank based on vision and force/torque data. Our experimental results show the suitability of the proposed method in terms of fast and data-efficient model learning, optimal action selection under uncertainty and equal role sharing between the partners.

Place, publisher, year, edition, pages
IEEE, 2016
Keywords
Behavioral research, Intelligent robots, Reinforcement learning, Robots, Bayesian optimization, Forward modeling, Gaussian process, Human behaviors, Human-robot collaboration, Model learning, Optimal actions, Physical human-robot interactions, Human robot interaction
National Category
Robotics
Identifiers
urn:nbn:se:kth:diva-202121 (URN); 10.1109/IROS.2016.7759417 (DOI); 000391921702127 (); 2-s2.0-85006367922 (Scopus ID); 9781509037629 (ISBN)
Conference
2016 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2016, 9 October 2016 through 14 October 2016
Note

QC 20170228

Available from: 2017-02-28. Created: 2017-02-28. Last updated: 2019-08-16. Bibliographically approved.
Identifiers
ORCID iD: orcid.org/0000-0002-4266-6746